COMPGROUPS.NET | Search | Post Question | Groups | Stream | About | Register

### Optimization problem #2

• Email
• Follow

```Hi all,

Can anyone give me some hints as to solve the following problem,
preferably in a way that is faster than the way I used to do it, and
without the bug in the current version;

Problem description:

The program processes an array of N cells, with N up to about 4096. The
array is filled from the start, it may not be completely full but there
are no empty cells between full cells. A cell contains a value of
1..255, so logically every cell would be a byte, but if processing
efficiency requires each cell to be a word or even a double-word, that
would be no problem. The values are normalized, they are effectively
indexes into a table, so if the table contains just T entries, the
values would go from 1 to T.

Given the above setup, the problem is to find *all* sub-arrays that
contain 3..T distinct elements and are as short as possible, i.e. if
there are 42 sub-arrays of three elements containing three distinct
values, there is no need to find sub-arrays of four elements containing
three distinct values. Also, a shorter sub-array may *never* be a part
of a longer sub-array containing *only* distinct values.

An example, suppose only the array contains 56 elements, in this case
with, for clarity, values a..j:

1         2         3         4         5     5
.....5....0....5....0....5....0....5....0....5....0....56
aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja

in this case the program should find the following sub-arrays:

len aperture from to value
3     3      40  42 bef
3     3      44  46 fgh
7     7      50  56 gihbfja
8    16      41  56 efffghggggihbfja, containing efghibja
9    23      34  56 cbbbbbbefffghggggihbfja, containing cbefghija
10    25      32  56 dccbbbbbbefffghggggihbfja, containing dcbefghija

It is possible to find sub-arrays with 4, 5, or 6 distinct values, but
they are either longer than the series of 7-in-7 (12-25 contain abcd),
or are part of the 7-in-7 series and as such they should not be included!

My AD 1996 program used to slide a window that started with three
elements (plus a sentinel on either side) over the big array and spit
out the position of the first element if it found three distinct
elements in the window *and* the two sentinels also contained any of the
values in the window. The program included some fairly minor
optimizations, such as immediately sliding the window to the last of a
series of equal values, in the example above, the three position wide
window would start by covering pos 6..8, and next it would move to
11..13, etc. If at any moment the program would find that all elements
in the window *and the sentinel on the right* would be distinct, the
window size would be increased by one and the scan would continue.

Taking the above example, a window with an aperture of four elements
would start its slide at position 6..9. Once it reached position 49,
with "gihb" visible and "g" and "f" in the two sentinels, the window
would be widened to five characters as the "f" is distinct from the four
"visible" characters, with a new sentinel of "j". Given that this is
again distinct from the now five characters in the window, the process
would repeat itself, eventually resulting in the string of seven
distinct characters starting at position 50.

Once the string of seven distinct characters has been found, the window
is widened to *nine* (if there had been a string of 8-in-8 it would have
been found by the previous slide!) characters and the process restarts
at position 6..14, but in the end it fails to find a series of eight
distinct characters and so the process is repeated with windows of 10,
11, 12, 13, 14, 15 and finally 16 characters, when a string of eight
distinct characters is finally found.

The problem is, for relatively low values of N and C, the process may
not be overly efficient, but it works. However, once N and C increase,
(N is theoretically unbounded, C has an upper limit of around 210), the
process becomes horribly inefficient: using the old IBM V2.3.0 OS PL/I
compiler, which had a statement count facility, this method required the
execution of 556,379,518 PL/I statements (and a number of actual machine
instruction that is at least one order of magnitude greater) for the
current values of N(2329) and C(66).

As for some figures on required restarts, the final series of 66 is
found in a sub-array with a length of 1950 elements and with a series of
65 in a sub-array of only 1889 elements, this means that the slide had
to be restarted 61 times! Even more restarts, 312(!), are needed when
going from 61 to 62 elements.

In 1998 I posted the problem to comp.lang.pascal.borland, the post
9,960 (out of 28,517) posts in that group.

Three people claimed that there was a much faster way, and two of them
backed this up with programs, Brent Beach and Paul Green, with Paul's
solution being the fastest by a fair margin. The results of his code
matched my output, I replaced my code with his code, and that seemed to
be the end of the story...

Until a few months back...

.... when I started converting the original Pascal code in assembler.
While doing so I decided that it would be interesting to use this
procedure in another part of the code. However, the results were not
what I expected: result rows were missing. I managed to track down a
paper copy of the original PG program, typed that in, had it checked for
typos by a friend, and it turned out to give the same erroneous results.
PG's email address is no longer valid, the floppy that contained a copy
of my email conversation with him and Brent, including a basic
explanation of the algorithm he used, is, after having been stored on a
cold loft for the best part of the last decade, unreadable. The only
thing I remember about his explanation were the words "clear as mud..."
near the end.

I have spent the last two weeks trying to figure out how it works and
staring at the Virtual Pascal IDE "Watch" window, I've finally decided

So, if anyone can give me any clues as to how I could perform the above
described process in an efficient way, I'd be very grateful. *I'm quite
capable of writing the code myself*, I would just like to get any clues
towards a more efficient algorithm. If that means you want to see the
current Pascal (or PL/I) code and the input data that leads to the
erroneous results, drop me a line, "robert(a)prino(d)org".

Thanks,

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0
Reply spamtrap14 (66) 5/16/2010 6:28:26 PM

See related articles to this posting

```In comp.lang.pl1 Robert AH Prins <spamtrap@prino.org> wrote:

> Can anyone give me some hints as to solve the following problem,
> preferably in a way that is faster than the way I used to do it, and
> without the bug in the current version;

> Problem description:

> The program processes an array of N cells, with N up to about 4096. The
> array is filled from the start, it may not be completely full but there
> are no empty cells between full cells. A cell contains a value of
> 1..255, so logically every cell would be a byte, but if processing
> efficiency requires each cell to be a word or even a double-word, that
> would be no problem. The values are normalized, they are effectively
> indexes into a table, so if the table contains just T entries, the
> values would go from 1 to T.

For alphabetsize of 32 or less, that is, values 0..31, and if I
did it in C, I would probably use the one bit per 32 bit word
storage, which makes testing fairly easy using & and |.
As far as I know, PL/I BIT(32) doesn't have the optimization
that uses a 32 bit word, and UNSPEC is likely even worse.

> Given the above setup, the problem is to find *all* sub-arrays that
> contain 3..T distinct elements and are as short as possible, i.e. if
> there are 42 sub-arrays of three elements containing three distinct
> values, there is no need to find sub-arrays of four elements containing
> three distinct values. Also, a shorter sub-array may *never* be a part
> of a longer sub-array containing *only* distinct values.

I wonder if this can be done using dynamic programming.
DP can speed up many such problems, but it isn't always
so obvious.

For the longer subarrays, it might be that hash tables could
speed up the comparisons.  That would, for example, get you
an O(N) test for a value being in a subarray being considered.

> An example, suppose only the array contains 56 elements, in this case
> with, for clarity, values a..j:

>          1         2         3         4         5     5
> ....5....0....5....0....5....0....5....0....5....0....56
> aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja

> in this case the program should find the following sub-arrays:

> len aperture from to value
>  3     3      40  42 bef
>  3     3      44  46 fgh
>  7     7      50  56 gihbfja
>  8    16      41  56 efffghggggihbfja, containing efghibja
>  9    23      34  56 cbbbbbbefffghggggihbfja, containing cbefghija
> 10    25      32  56 dccbbbbbbefffghggggihbfja, containing dcbefghija

> It is possible to find sub-arrays with 4, 5, or 6 distinct values, but
> they are either longer than the series of 7-in-7 (12-25 contain abcd),
> or are part of the 7-in-7 series and as such they should not be included!

> My AD 1996 program used to slide a window that started with three
> elements (plus a sentinel on either side) over the big array and spit
> out the position of the first element if it found three distinct
> elements in the window *and* the two sentinels also contained any of the
> values in the window. The program included some fairly minor
> optimizations, such as immediately sliding the window to the last of a
> series of equal values, in the example above, the three position wide
> window would start by covering pos 6..8, and next it would move to
> 11..13, etc. If at any moment the program would find that all elements
> in the window *and the sentinel on the right* would be distinct, the
> window size would be increased by one and the scan would continue.

The optimization works if it is faster than a direct test.
For short subarrays, an unrolled compare loop should be pretty fast.
For longer ones, as I said above, a hash table might be fast.

> Taking the above example, a window with an aperture of four elements
> would start its slide at position 6..9. Once it reached position 49,
> with "gihb" visible and "g" and "f" in the two sentinels, the window
> would be widened to five characters as the "f" is distinct from the four
> "visible" characters, with a new sentinel of "j". Given that this is
> again distinct from the now five characters in the window, the process
> would repeat itself, eventually resulting in the string of seven
> distinct characters starting at position 50.

> Once the string of seven distinct characters has been found, the window
> is widened to *nine* (if there had been a string of 8-in-8 it would have
> been found by the previous slide!) characters and the process restarts
> at position 6..14, but in the end it fails to find a series of eight
> distinct characters and so the process is repeated with windows of 10,
> 11, 12, 13, 14, 15 and finally 16 characters, when a string of eight
> distinct characters is finally found.

> The problem is, for relatively low values of N and C, the process may
> not be overly efficient, but it works. However, once N and C increase,
> (N is theoretically unbounded, C has an upper limit of around 210), the
> process becomes horribly inefficient: using the old IBM V2.3.0 OS PL/I
> compiler, which had a statement count facility, this method required the
> execution of 556,379,518 PL/I statements (and a number of actual machine
> instruction that is at least one order of magnitude greater) for the
> current values of N(2329) and C(66).

> As for some figures on required restarts, the final series of 66 is
> found in a sub-array with a length of 1950 elements and with a series of
> 65 in a sub-array of only 1889 elements, this means that the slide had
> to be restarted 61 times! Even more restarts, 312(!), are needed when
> going from 61 to 62 elements.

> In 1998 I posted the problem to comp.lang.pascal.borland, the post
> 9,960 (out of 28,517) posts in that group.

> Three people claimed that there was a much faster way, and two of them
> backed this up with programs, Brent Beach and Paul Green, with Paul's
> solution being the fastest by a fair margin. The results of his code
> matched my output, I replaced my code with his code, and that seemed to
> be the end of the story...

> Until a few months back...

> ... when I started converting the original Pascal code in assembler.

Can you post some code so we can see it?  It might give some good,
or not so good, ideas on how to do it better.

> While doing so I decided that it would be interesting to use this
> procedure in another part of the code. However, the results were not
> what I expected: result rows were missing.

-- glen
```
 0

```In article <85akp6F4bpU1@mid.individual.net>,
Robert AH Prins <spamtrap@prino.org> writes:

> Given the above setup, the problem is to find *all* sub-arrays that
> contain 3..T distinct elements and are as short as possible, i.e. if
> there are 42 sub-arrays of three elements containing three distinct
> values, there is no need to find sub-arrays of four elements containing
> three distinct values. Also, a shorter sub-array may *never* be a part
> of a longer sub-array containing *only* distinct values.
>
> An example, suppose only the array contains 56 elements, in this case
> with, for clarity, values a..j:
>
>           1         2         3         4         5     5
> ....5....0....5....0....5....0....5....0....5....0....56
> aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
>
> in this case the program should find the following sub-arrays:
>
> len aperture from to value
>   3     3      40  42 bef
>   3     3      44  46 fgh
>   7     7      50  56 gihbfja
>   8    16      41  56 efffghggggihbfja, containing efghibja
>   9    23      34  56 cbbbbbbefffghggggihbfja, containing cbefghija
> 10    25      32  56 dccbbbbbbefffghggggihbfja, containing dcbefghija
>
>
> It is possible to find sub-arrays with 4, 5, or 6 distinct values, but
> they are either longer than the series of 7-in-7 (12-25 contain abcd),

Are you saying that if we tabulate the results, then both the "len" and
"aperture" columns should have increasing values?

If my understanding of the requirements is correct then it may be better
to start with a large window and _decrease_ it in each pass. (As finding
the longest length and corresponding aperture is trivial).
For example, after finding len=10 aperture=25 one should only consider
substrings of 25 or less when searching for sequences of less than 10
distinct values.
```
 0

```Dick Wesseling wrote:
> In article<85akp6F4bpU1@mid.individual.net>,
> 	Robert AH Prins<spamtrap@prino.org>  writes:
>
>> Given the above setup, the problem is to find *all* sub-arrays that
>> contain 3..T distinct elements and are as short as possible, i.e. if
>> there are 42 sub-arrays of three elements containing three distinct
>> values, there is no need to find sub-arrays of four elements containing
>> three distinct values. Also, a shorter sub-array may *never* be a part
>> of a longer sub-array containing *only* distinct values.
>>
>> An example, suppose only the array contains 56 elements, in this case
>> with, for clarity, values a..j:
>>
>>            1         2         3         4         5     5
>> ....5....0....5....0....5....0....5....0....5....0....56
>> aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
>>
>> in this case the program should find the following sub-arrays:
>>
>> len aperture from to value
>>    3     3      40  42 bef
>>    3     3      44  46 fgh
>>    7     7      50  56 gihbfja
>>    8    16      41  56 efffghggggihbfja, containing efghibja
>>    9    23      34  56 cbbbbbbefffghggggihbfja, containing cbefghija
>> 10    25      32  56 dccbbbbbbefffghggggihbfja, containing dcbefghija

I haven't thought this all the way through, but my immediate idea was to
start by making a full inverted index, i.e. sort all possible substrings
into alphabetical order.

However, on second consideration this probably doesn't really help. :-(

>>
>>
>> It is possible to find sub-arrays with 4, 5, or 6 distinct values, but
>> they are either longer than the series of 7-in-7 (12-25 contain abcd),
>
> Are you saying that if we tabulate the results, then both the "len" and
> "aperture" columns should have increasing values?
>
> If my understanding of the requirements is correct then it may be better
> to start with a large window and _decrease_ it in each pass. (As finding
> the longest length and corresponding aperture is trivial).
> For example, after finding len=10 aperture=25 one should only consider
> substrings of 25 or less when searching for sequences of less than 10
> distinct values.

This sounds like a good idea!

If we start with a full scan that indexes the locations of each possible
byte value, then we know that the longest sequence will contain each of
the unique byte values found, right?

We can then start with the full input array, then from each end remove
bytes until we get to the only remaining entry for a byte value: This
will be the shortest substring which contains all the unique values.

As you note, this is also the longest substring we need to consider...

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```Robert AH Prins wrote:
> On 2010-05-17 00:29, glen herrmannsfeldt wrote:
>> In comp.lang.pl1 Robert AH Prins<spamtrap@prino.org> wrote:
>>
>>> Can anyone give me some hints as to solve the following problem,
>>> preferably in a way that is faster than the way I used to do it, and
>>> without the bug in the current version;
>>
>>> Problem description:
>>
>>> The program processes an array of N cells, with N up to about 4096. The
>>> array is filled from the start, it may not be completely full but there
>>> are no empty cells between full cells. A cell contains a value of
>>> 1..255, so logically every cell would be a byte, but if processing
>>> efficiency requires each cell to be a word or even a double-word, that
>>> would be no problem. The values are normalized, they are effectively
>>> indexes into a table, so if the table contains just T entries, the
>>> values would go from 1 to T.

The maximum array size of 4096 is interesting, it means that we can
afford to use quite a bit of lookup table space per entry.

Otoh is also means that we cannot afford to use a lot of time to
initialize those tables, since that would take just as long or longer
than the obvious window scanning algorithm.

I would still start by determining the first and last position of each
unique byte, then use that to pick the determine the shortest substring
which contains all the values: This becomes the limit for a window scan
that starts with aperture 3 and goes up.

byte not_found[256]; // Contains 1 for byte values not seen so far

unsigned current_limit = 3; // Minimum len!
unsigned next_aperture = 0; // Wait until we find the first substring

for (unsigned aperture = 3; aperture < limit; aperture++) {
if (aperture == next_aperture)
current_limit++;

for (unsigned start = 0; start <= length-aperture; start++) {
/* This memset might turn out to be the single most expensive
part of the algorithm, at least for short lengths/apertures.
Even using 16 SSE writes, it will take at least 16 cycles
just to get to L1 cache.

One possible optimization is to note that at least half
of all bus write cycles will be idle during the main loop
below, so we could use two independent not_found buffers
and alternate between them: Use one and zero the other
at the same time!
*/
memset(not_found, 1, sizeof(found));

unsigned count = 0;

/* The core of the scanning algorithm:
Count the number of unique byte values by adding together
all those values not seen previously, while making each
entry in the not_found[] array:
*/
for (i = 0; i < aperture; i++) {
byte c = data[start+i];
count += not_found[c];
not_found[c] = 0;
}
/* The running time for this loop should be ~3 cycles/iteration
as long as the loop branch is correctly predicted, and at
least for the shorter apertures the modern multi-level
predictors should do quite well here.
*/

/* This next part will be very rarely executed except for
the shortest apertures/lengths, so we can assume it to
be skipped:
*/
if (count >= current_limit) {
if (count > current_limit)
current_limit = count;
save(start, aperture, count);
next_aperture = aperture+1;
}
}
}

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```On 2010-05-17 04:57, Dick Wesseling wrote:
> In article<85akp6F4bpU1@mid.individual.net>,
> 	Robert AH Prins<spamtrap@prino.org>  writes:
>
>> Given the above setup, the problem is to find *all* sub-arrays that
>> contain 3..T distinct elements and are as short as possible, i.e. if
>> there are 42 sub-arrays of three elements containing three distinct
>> values, there is no need to find sub-arrays of four elements containing
>> three distinct values. Also, a shorter sub-array may *never* be a part
>> of a longer sub-array containing *only* distinct values.
>>
>> An example, suppose only the array contains 56 elements, in this case
>> with, for clarity, values a..j:
>>
>>            1         2         3         4         5     5
>> ....5....0....5....0....5....0....5....0....5....0....56
>> aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
>>
>> in this case the program should find the following sub-arrays:
>>
>> len aperture from to value
>>  3     3      40  42 bef
>>  3     3      44  46 fgh
>>  7     7      50  56 gihbfja
>>  8    16      41  56 efffghggggihbfja, containing efghibja
>>  9    23      34  56 cbbbbbbefffghggggihbfja, containing cbefghija
>>  0    25      32  56 dccbbbbbbefffghggggihbfja, containing dcbefghija
>>
>>
>> It is possible to find sub-arrays with 4, 5, or 6 distinct values, but
>> they are either longer than the series of 7-in-7 (12-25 contain abcd),
>
> Are you saying that if we tabulate the results, then both the "len" and
> "aperture" columns should have increasing values?

If they are temporarily stored in an array or list, then it's trivial to
sort them in any required order before further processing.

> If my understanding of the requirements is correct then it may be better
> to start with a large window and _decrease_ it in each pass. (As finding
> the longest length and corresponding aperture is trivial).

Even if there may be more than one longest interval? This *is* pretty
unlikely. However, the current set of data contains two sub-arrays with
a length of 1879 that contain 64 distinct values.

> For example, after finding len=10 aperture=25 one should only consider
> substrings of 25 or less when searching for sequences of less than 10
> distinct values.

That is pretty obvious, but why would reducing the aperture from 24 down
to potentially 9 to find substrings with 9 distinct values be any more
efficient than starting with an aperture of 9 and increasing it to 24?
If the substring of 9 elements were to be found with an aperture of 16
or 17, i.e. dead in the middle, starting with the smaller aperture would
involve less testing...

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```On 2010-05-17 00:29, glen herrmannsfeldt wrote:
> In comp.lang.pl1 Robert AH Prins<spamtrap@prino.org>  wrote:
>
>> Can anyone give me some hints as to solve the following problem,
>> preferably in a way that is faster than the way I used to do it, and
>> without the bug in the current version;
>
>> Problem description:
>
>> The program processes an array of N cells, with N up to about 4096. The
>> array is filled from the start, it may not be completely full but there
>> are no empty cells between full cells. A cell contains a value of
>> 1..255, so logically every cell would be a byte, but if processing
>> efficiency requires each cell to be a word or even a double-word, that
>> would be no problem. The values are normalized, they are effectively
>> indexes into a table, so if the table contains just T entries, the
>> values would go from 1 to T.
>
> For alphabetsize of 32 or less, that is, values 0..31, and if I
> did it in C, I would probably use the one bit per 32 bit word
> storage, which makes testing fairly easy using&  and |.
> As far as I know, PL/I BIT(32) doesn't have the optimization
> that uses a 32 bit word, and UNSPEC is likely even worse.

Bit operations, especially on aligned bitstrings, were never very bad,
even in the old V2.3.0 compiler, and BIT(1) ALIGNED already uses one bit
per 1 byte.

>> Given the above setup, the problem is to find *all* sub-arrays that
>> contain 3..T distinct elements and are as short as possible, i.e. if
>> there are 42 sub-arrays of three elements containing three distinct
>> values, there is no need to find sub-arrays of four elements containing
>> three distinct values. Also, a shorter sub-array may *never* be a part
>> of a longer sub-array containing *only* distinct values.
>
> I wonder if this can be done using dynamic programming.
> DP can speed up many such problems, but it isn't always
> so obvious.
>
> For the longer subarrays, it might be that hash tables could
> speed up the comparisons.  That would, for example, get you
> an O(N) test for a value being in a subarray being considered.

Never thought of using a hash table, always focused on the sliding
window and tried to do that as optimal as possible.

<snip>

>> Until a few months back...
>
>> ... when I started converting the original Pascal code in assembler.
>
> Can you post some code so we can see it?  It might give some good,
> or not so good, ideas on how to do it better.

I'll pull the code out of the program, probably needs a bit of massaging
to make it runnable. I probably still have the (or at least an
iteration) of sliding window somewhere, but it may not be able to post
it this week as I will be away for a a short break.

>> While doing so I decided that it would be interesting to use this
>> procedure in another part of the code. However, the results were not
>> what I expected: result rows were missing.

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```"Robert AH Prins" <spamtrap@prino.org> wrote in message news:85akp6F4bpU1@mid.individual.net...
| Hi all,
|
| Can anyone give me some hints as to solve the following problem,
| preferably in a way that is faster than the way I used to do it, and
| without the bug in the current version;
|
| Problem description:
|
| The program processes an array of N cells, with N up to about 4096. The
| array is filled from the start, it may not be completely full but there
| are no empty cells between full cells. A cell contains a value of
| 1..255, so logically every cell would be a byte, but if processing
| efficiency requires each cell to be a word or even a double-word, that
| would be no problem.

In general, byte values are quicker to manipulate than word,
halfword, and doubleword values.

| The values are normalized, they are effectively
| indexes into a table, so if the table contains just T entries, the
| values would go from 1 to T.
|
| Given the above setup, the problem is to find *all* sub-arrays that
| contain 3..T distinct elements and are as short as possible, i.e. if
| there are 42 sub-arrays of three elements containing three distinct
| values, there is no need to find sub-arrays of four elements containing
| three distinct values. Also, a shorter sub-array may *never* be a part
| of a longer sub-array containing *only* distinct values.
|
| An example, suppose only the array contains 56 elements, in this case
| with, for clarity, values a..j:
|
|          1         2         3         4         5     5
| ....5....0....5....0....5....0....5....0....5....0....56
| aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
|
| in this case the program should find the following sub-arrays:
|
| len aperture from to value
|  3     3      40  42 bef
|  3     3      44  46 fgh
|  7     7      50  56 gihbfja
|  8    16      41  56 efffghggggihbfja, containing efghibja
|  9    23      34  56 cbbbbbbefffghggggihbfja, containing cbefghija
| 10    25      32  56 dccbbbbbbefffghggggihbfja, containing dcbefghija
|
| It is possible to find sub-arrays with 4, 5, or 6 distinct values, but
| they are either longer than the series of 7-in-7 (12-25 contain abcd),
| or are part of the 7-in-7 series and as such they should not be included!
|
| My AD 1996 program used to slide a window that started with three
| elements (plus a sentinel on either side) over the big array and spit
| out the position of the first element if it found three distinct
| elements in the window *and* the two sentinels also contained any of the
| values in the window.

This sounds like something adaptable to using INDEX.
especially as your data are byte entries. (or maybe I'm
thinking of the wrong part of the search...).

```
 0

```"glen herrmannsfeldt" <gah@ugcs.caltech.edu> wrote in message news:hsq2kv\$4gt\$1@speranza.aioe.org...

| For alphabetsize of 32 or less, that is, values 0..31, and if I
| did it in C, I would probably use the one bit per 32 bit word
| storage, which makes testing fairly easy using & and |.
| As far as I know, PL/I BIT(32) doesn't have the optimization
| that uses a 32 bit word,

It does if it's unioned with a 31-bit FIXED BINARY.

| and UNSPEC is likely even worse.

Did you try it?

```
 0

```Robert AH Prins wrote:
> On 2010-05-17 09:03, Terje Mathisen wrote:
>> Robert AH Prins wrote:
>>> On 2010-05-17 00:29, glen herrmannsfeldt wrote:
>>>> In comp.lang.pl1 Robert AH Prins<spamtrap@prino.org> wrote:
>>>>
>>>>> Can anyone give me some hints as to solve the following problem,
>>>>> preferably in a way that is faster than the way I used to do it, an=
d
>>>>> without the bug in the current version;
>>>>
>>>>> Problem description:
>>>>
>>>>> The program processes an array of N cells, with N up to about 4096.
>>>>> The
>>>>> array is filled from the start, it may not be completely full but
>>>>> there
>>>>> are no empty cells between full cells. A cell contains a value of
>>>>> 1..255, so logically every cell would be a byte, but if processing
>>>>> efficiency requires each cell to be a word or even a double-word, t=
hat
>>>>> would be no problem. The values are normalized, they are effectivel=
y
>>>>> indexes into a table, so if the table contains just T entries, the
>>>>> values would go from 1 to T.
>>
>> The maximum array size of 4096 is interesting, it means that we can
>> afford to use quite a bit of lookup table space per entry.
>
> Memory is cheap, but cache is of course limited.
>
>> Otoh is also means that we cannot afford to use a lot of time to
>> initialize those tables, since that would take just as long or longer
>> than the obvious window scanning algorithm.
>
> The problem is, the sliding window algorithm is fundamentally wrong. I
> spend a bit of time converting the current Pl/I version of the program
> back to something that compiles with the old (AD late 1980's, early
> 1990's compiler) and ran the program with the count option. The old
> sliding window method resulted, as mentioned before, in the execution o=
f
> *556,379,518* PL/I statements. The Paul Green method gets the same
> result by executing a mere *79,386* PL/I statements, a rather staggerin=
g
> difference...

That sounds like the difference between an O(n*n) and O(n*log(n))=20
algorithm. :-)
>
> I'm now going to dig through about a dozen boxes with papers that were
> never unpacked after we moved. I have a faint hope that there may be
> printouts of my email exchange with him (PG) in one of them. Hope I hav=
e
> the time to do it today or tomorrow, as I'm off to the other side of
> Europe later this week, provided the Eyjafjallaj=F6kull allows me...

Britain and several other western parts of Europe seems quite iffy=20
tomorrow morning, according to=20
http://weatheronline.co.uk/cgi-app/volcanic?LANG=3Den&ART=3D3
>
>> I would still start by determining the first and last position of each
>> unique byte, then use that to pick the determine the shortest substrin=
g
>> which contains all the values: This becomes the limit for a window sca=
n
>> that starts with aperture 3 and goes up.
>
> That string may contain all values, but strings to either side of it ma=
y
> contain shorter substrings. In the example I gave, all values occur in
> the substring starting at position 32, but if there had been additional
> values b and c at positions "0" and "-1", your window scan would, if I
> read the above correctly, not pick them up as part of a 3-string.

Right, I did realize that a little bit later.

This is an intriguing problem, I'll have think some more...
:-)

Terje

--=20
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```On 2010-05-17 09:03, Terje Mathisen wrote:
> Robert AH Prins wrote:
>> On 2010-05-17 00:29, glen herrmannsfeldt wrote:
>>> In comp.lang.pl1 Robert AH Prins<spamtrap@prino.org> wrote:
>>>
>>>> Can anyone give me some hints as to solve the following problem,
>>>> preferably in a way that is faster than the way I used to do it, and
>>>> without the bug in the current version;
>>>
>>>> Problem description:
>>>
>>>> The program processes an array of N cells, with N up to about 4096. The
>>>> array is filled from the start, it may not be completely full but there
>>>> are no empty cells between full cells. A cell contains a value of
>>>> 1..255, so logically every cell would be a byte, but if processing
>>>> efficiency requires each cell to be a word or even a double-word, that
>>>> would be no problem. The values are normalized, they are effectively
>>>> indexes into a table, so if the table contains just T entries, the
>>>> values would go from 1 to T.
>
> The maximum array size of 4096 is interesting, it means that we can
> afford to use quite a bit of lookup table space per entry.

Memory is cheap, but cache is of course limited.

> Otoh is also means that we cannot afford to use a lot of time to
> initialize those tables, since that would take just as long or longer
> than the obvious window scanning algorithm.

The problem is, the sliding window algorithm is fundamentally wrong. I
spend a bit of time converting the current Pl/I version of the program
back to something that compiles with the old (AD late 1980's, early
1990's compiler) and ran the program with the count option. The old
sliding window method resulted, as mentioned before, in the execution of
*556,379,518* PL/I statements. The Paul Green method gets the same
result by executing a mere *79,386* PL/I statements, a rather staggering
difference...

I'm now going to dig through about a dozen boxes with papers that were
never unpacked after we moved. I have a faint hope that there may be
printouts of my email exchange with him (PG) in one of them. Hope I have
the time to do it today or tomorrow, as I'm off to the other side of
Europe later this week, provided the Eyjafjallaj�kull allows me...

> I would still start by determining the first and last position of each
> unique byte, then use that to pick the determine the shortest substring
> which contains all the values: This becomes the limit for a window scan
> that starts with aperture 3 and goes up.

That string may contain all values, but strings to either side of it may
contain shorter substrings. In the example I gave, all values occur in
the substring starting at position 32, but if there had been additional
values b and c at positions "0" and "-1", your window scan would, if I
read the above correctly, not pick them up as part of a 3-string.

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```In article <85c8i8FphiU1@mid.individual.net>,
Robert AH Prins <spamtrap@prino.org> writes:
> On 2010-05-17 04:57, Dick Wesseling wrote:
>> In article<85akp6F4bpU1@mid.individual.net>,
>> 	Robert AH Prins<spamtrap@prino.org>  writes:
>>
>>> Given the above setup, the problem is to find *all* sub-arrays that
>>> contain 3..T distinct elements and are as short as possible, i.e. if
>>> there are 42 sub-arrays of three elements containing three distinct
>>> values, there is no need to find sub-arrays of four elements containing
>>> three distinct values. Also, a shorter sub-array may *never* be a part
>>> of a longer sub-array containing *only* distinct values.
>>>
>>> An example, suppose only the array contains 56 elements, in this case
>>> with, for clarity, values a..j:
>>>
>>>            1         2         3         4         5     5
>>> ....5....0....5....0....5....0....5....0....5....0....56
>>> aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
>>>
>>> in this case the program should find the following sub-arrays:
>>>
>>> len aperture from to value
>>>  3     3      40  42 bef
>>>  3     3      44  46 fgh
>>>  7     7      50  56 gihbfja
>>>  8    16      41  56 efffghggggihbfja, containing efghibja
>>>  9    23      34  56 cbbbbbbefffghggggihbfja, containing cbefghija
>>>  0    25      32  56 dccbbbbbbefffghggggihbfja, containing dcbefghija
>>>
>>>
>>> It is possible to find sub-arrays with 4, 5, or 6 distinct values, but
>>> they are either longer than the series of 7-in-7 (12-25 contain abcd),
>>
>> Are you saying that if we tabulate the results, then both the "len" and
>> "aperture" columns should have increasing values?
>
> If they are temporarily stored in an array or list, then it's trivial to
> sort them in any required order before further processing.
>

That's not what I meant. I wanted to make sure that I understood the
problem statement before attempting a solution.

>> If my understanding of the requirements is correct then it may be better
>> to start with a large window and _decrease_ it in each pass. (As finding
>> the longest length and corresponding aperture is trivial).
>
> Even if there may be more than one longest interval? This *is* pretty
> unlikely. However, the current set of data contains two sub-arrays with
> a length of 1879 that contain 64 distinct values.
>
>> For example, after finding len=10 aperture=25 one should only consider
>> substrings of 25 or less when searching for sequences of less than 10
>> distinct values.
>
> That is pretty obvious, but why would reducing the aperture from 24 down
> to potentially 9 to find substrings with 9 distinct values be any more
> efficient than starting with an aperture of 9 and increasing it to 24?
> If the substring of 9 elements were to be found with an aperture of 16
> or 17, i.e. dead in the middle, starting with the smaller aperture would
> involve less testing...
>

The solution that I have in mind is O(N*C) where N is the length of
the input and C is either the size of the alphabet (current version) or
- with a few improvements that left as an exercise - the number of
different lengths (which is 5 for {3,7,8,9,10} in your example).

My solution involves an exclusion list that is searched linearly,
this may ruin the performance, but it should be possible to use a
binary search tree instead. Also left as an exercise.

Anyway, here is what I had in mind. Compiles with gcc.

#include <stdio.h>
#include <string.h>

int verbose=0;

#if 1
// original example

#define TESTSTR "aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja"

#else

/*
That string may contain all values, but strings to either side of it may
contain shorter substrings. In the example I gave, all values occur in
the substring starting at position 32, but if there had been additional
values b and c at positions "0" and "-1", your window scan would, if I
read the above correctly, not pick them up as part of a 3-string.
*/

#define TESTSTR "bcaaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja"
#endif
unsigned char input [] = TESTSTR ;
unsigned N = sizeof(TESTSTR)-1;

#define C 256               // Size of alphabet
#define ONE 1               /* Pascal array bias convention */

void findseqs(unsigned char *data, unsigned len) {
unsigned i;
unsigned uniq = 0;              // Nr unique symbols in window
unsigned apub = ~0;             // Aperture upper bound

unsigned nrfound = /* to keep gcc happy */ 0;
unsigned found[len];

unsigned freq[C];               // Frequency count
bzero(freq, sizeof(freq));

#define inc(c) do { if(!freq[c]) uniq++; freq[c]++; } while(0)
#define dec(c) do { freq[c]--; if(!freq[c]) uniq--; } while(0)

#define WINSIZE ((r-l)+1)

// THIS NEEDS TO BE IMPROVED:
struct {
unsigned start;
unsigned end;                   // Exclusive end
} excluded [len];
int nrexcluded=0;

unsigned l = 0;             // Start of window
unsigned r = len-1;         // Inclusive end of window

for (i=0; i<len; i++) {
inc(data[i]);
}

// Trim left & right edges

while (freq[data[r]]>1) { dec(data[r]); r--; }
while (freq[data[l]]>1) { dec(data[l]); l++; }

// From here on we want to find only sequences that are shorter or
// equal to what we have found above:

unsigned goal = uniq;

haveseq:
if (WINSIZE <= apub ) {         // Short enough?

// A candidate. May be premature if we have shorter sequences
// with the same nr of unique symbols.

if (verbose) {
printf("%s\n", data);
for (i=0; i<l; i++) printf(" ");
for (   ; i<=r; i++) printf("-");
printf("\n");
printf("Possible sequence: len %d aper %d from %d to %d\n",
uniq, WINSIZE, l, r);
}

// Part of exclusion?
// For now brute force. todo: some sort of binary search

for (i=0; i<nrexcluded; i++) {
if ( l >= excluded[i].start &&
r <  excluded[i].end) {
if (verbose) printf("Skip excluded\n");
goto skip;
}
}
if (WINSIZE < apub) {
nrfound = 0;            // Dump old candidates
apub = WINSIZE;         // Next matches must be <= this one
}
found[nrfound] = l;         // Add to list of tentative results
nrfound++;
skip:   ;
}

// Slide window

dec(data[l]); l++;
while (freq[data[l]]>1) { dec(data[l]); l++; }
while (uniq < goal && r<len-1) { r++; inc(data[r]); }

// Hit right edge?
if (uniq == goal) goto haveseq;

// No more matches of this length. Print results

for (i=0; i<nrfound; i++) {

unsigned j;
printf("Found len %2d aperture %2d pos %d\t", goal, apub, ONE+found[i]);
for (j=found[i]; j< apub+found[i]; j++) printf("%c", data[j]);
printf("\n");

if (apub == goal) {     // *only* distinct values?
excluded[nrexcluded].start = found[i];
excluded[nrexcluded].end   = found[i]+goal;
nrexcluded++;
}
}

// Now the next shorter sequence.
//
// If may understanding of the problem is correct it must have
// a smaller aperture than the one found above. So in order to
// find the length and a first estimate of the aperture we slide
// a fixed size window over the input data

#define RESET   \
bzero(freq, sizeof(freq));  \
uniq = l = r = 0;           /* reset everything */

RESET

apub--;                         // Must be shorter the previous

for (r=0; r<apub-1; r++) {
inc(data[r]);
}   inc(data[r]);

unsigned int start = 0;         // Where to start next iteration
unsigned int end   = 0;         // of main loop

goal=0;

// match fixed window at all positions
while (r < (len-1)) {
if (uniq > goal ) {

// Q: does it pay to test exclusion list at this point?
// Yes! That saves quite a few iterations of the main
// loop. E.g. sample data is now O(10*N) whereas
// O(5*N) can be achieved. This is left as an exercise

goal = uniq;
start = l;
end   = r;
}
dec(data[l]); l++;
r++; inc(data[r]);
}

if (goal < 3) return;

RESET
l=start;
for (r=l; r<end; r++) {
inc(data[r]);
}   inc(data[r]);

while (freq[data[l]]>1) { dec(data[l]); l++; }
apub=WINSIZE;
nrfound=0;

if (verbose) {
printf("Next goal %d start %d end %d uniq %d \n",
goal, start, end, uniq );
}
goto haveseq;
}

int main(int argc, char **argv) {
if (argc >1) {
findseqs(argv[1], strlen(argv[1]));
} else {
findseqs(input, N);
}
return 0;
}
```
 0

```Robert AH Prins wrote:
> Hi all,
>
> Can anyone give me some hints as to solve the following problem,
> preferably in a way that is faster than the way I used to do it, and
> without the bug in the current version;
>
....
I believe the program below does what you want and should be a couple of orders of magnitude faster
than the method you outlined (based on your statement about how many "PL/I statements" are executed
on an input of length 2329 with 66 distinct values). For illustrative purposes, the program uses the
character representation you used in your example for input and output, but the algorithm itself
will work for values of c up to 32765 and v up to 32764 if you have enough memory and patience.

The program uses four arrays: y(c), a character array into which the input is read; x(c), an integer
array giving the encoded input with a=1, b=2, etc; a(0:c,v+1), described below; and l(v), with l(j)
the length of the shortest subsequence with j distinct values.

The basic idea is to build an array, a, with a row for each cell and a column for each value. The
i,j element of the array gives the relative cell number in the input array starting from cell i of
the first occurrence of value j, if any, otherwise a(i,j) is greater than max(c,v+1). The array is
built starting from the last row, c, and working back to row 1. Row c is first set to max(c+1,v+2).
Then a(c,x(c)) is set to 1. Working backward for i=c-1 to 1, each element of row i is set to the
minimum of 32766 and the corresponding element of row i+1 plus 1 and then element a(i,x(i)) is set to 1.

Once this construction is complete, it is evident that there is one and only one instance of the
integer 1 in each row, and all values in each row less than c+1 are unique.

Next each row is sorted. After sorting, the a(i,j) element contains the length of the shortest
subsequence starting at position i that contains j distinct values unless there is no such
subsequence, in which case a(i,j) is greater than c. It is also evident that if a(i,j)=j then
a(i,k)=k for 1<=k<=j. Now if a(i,j)=j the subsequence of length j starting at position i has j
distinct values, but it should be excluded from the solution set for j distinct values if it is a
proper subsequence of a longer subsequence of distinct values. This will be so iff a(i,j+1)=j+1 or
a(i-1,j+1)=j+1.

Based on these considerations, the solution sets for each j are constructed by scanning each column
starting with j=3 and proceeding to j=v. For this purpose the array a actually has an additional
row, 0, to contain the subscript of the first cell of the first minimal subsequence for each j, and
an extra column, v+1, containing values greater than c to ensure that the test for inclusion can be
carried out for j=v. For each j, l(j) is initialized to c+1 and k (the previous element in the list
of minimal subsequences) to 0. The scan proceeds as follows: if a(i,j) is less than l(j) the current
subsequence is shorter that the shortest found so far so l(j) is updated and k reset to 0. If a(i,j)
is equal to l(j) the inclusion test is applied and if it passes the current subsequence is added to
the list. Once the entire column has been processed the list is terminated by setting a(k,j)=0.

It is now a simple matter to read off the solution subsequences. The program utilizes sorta to sort
the rows. call sorta(addr(first element),n,w,c)  sorts an array of n elements of width w in place
according the comparison function c. c(u,v) returns '1'b (bit(1) aligned) if the u-th element may
precede the v-th element, i.e., x(u)<=x(v).

Source:

%process mar(2,100) offset;
subsets: proc options(main) reorder;
dcl
(c,v,i,j,k,m,ar init((rank('a')-1))) bin fixed,
(x(c),a(0:c,v+1),l(v)) bin fixed ctl,
y(c) char(1) ctl,
sorta entry(ptr,bin fixed(31),bin fixed(31),
entry(bin fixed(31),bin fixed(31)) returns(bit(1) aligned)),
vfmt entry(bin float(53),bin fixed(15),bin fixed(15)) returns(char(50) var),
sysin file input,
sysprint file print,
get file(sysin) list(c,v);
put file(sysprint) edit('c: ',vfmt(c,10,0),', v: ',vfmt(v,10,0))(col(1),4 a);
allocate x,a,l,y;
get file(sysin) edit(y)(col(1),(c)a(1));
put file(sysprint) edit('Input: ',y)(col(1),a,(c)a(1));
do i=1 to c; x(i)=rank(y(i))-ar; end;
a(c,*)=max(c+1,v+2); a(c,x(c))=1;
do i=c-1 to 1 by -1; a(i,*)=min(32766,a(i+1,*)+1); a(i,x(i))=1; end;
do i=1 to c; call sorta(addr(a(i,1)),v,2,comp); end;
do j=3 to v; l(j)=c+1; k=0; m=j+1; a(0,m)=0;
do i=1 to c;
if a(i,j)<l(j) then do; l(j)=a(i,j); k=0; end;
if a(i,j)=l(j) then if a(i-1,m)~=m then if a(i,m)~=m then do; a(k,j),k=i; end;
end;
a(k,j)=0;
if a(0,j)>0 then do;
put file(sysprint) edit('Distinct: ',vfmt(j,10,0),', length: ',vfmt(l(j),10,0))(col(1),4 a);
i=a(0,j); k=0; do while(i~=0); k+=1;
put file(sysprint) edit(k,': (',vfmt(i,10,0),') ',(y(i+m) do m=0 to l(j)-1))
(col(1),f(10),3 a,(l(j))a(1));
i=a(i,j);
end;
end;
end;
comp: proc(u,v) returns(bit(1) aligned) reorder;
dcl (u,v) bin fixed(31);
return(a(i,u)<=a(i,v));
end comp;
end subsets;

Input:

56 10
aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja

Output:

c: 56, v: 10
Input: aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
Distinct: 3, length: 3
1: (40) bef
2: (44) fgh
Distinct: 7, length: 7
1: (50) gihbfja
Distinct: 8, length: 16
1: (41) efffghggggihbfja
Distinct: 9, length: 23
1: (34) cbbbbbbefffghggggihbfja
Distinct: 10, length: 25
1: (32) dccbbbbbbefffghggggihbfja

Here is the output from another example:

c: 48, v: 10
Input: abcabcabcabcdefgabcabccdabcdeabcaaabbbcccdefghij
Distinct: 3, length: 3
1: (1) abc
2: (2) bca
3: (3) cab
4: (4) abc
5: (5) bca
6: (6) cab
7: (7) abc
8: (8) bca
9: (9) cab
10: (18) bca
11: (19) cab
12: (20) abc
13: (31) bca
Distinct: 4, length: 4
1: (23) cdab
2: (24) dabc
Distinct: 5, length: 5
1: (25) abcde
2: (26) bcdea
3: (27) cdeab
4: (28) deabc
Distinct: 7, length: 7
1: (10) abcdefg
2: (11) bcdefga
3: (12) cdefgab
4: (13) defgabc
Distinct: 8, length: 8
1: (41) cdefghij
Distinct: 9, length: 11
1: (38) bcccdefghij
Distinct: 10, length: 14
1: (35) abbbcccdefghij
```
 0

```Robert AH Prins wrote:
> Hi all,
>
> Can anyone give me some hints as to solve the following problem,
> preferably in a way that is faster than the way I used to do it, and
> without the bug in the current version;
>
....
I believe the program below does what you want and should be a couple of orders of magnitude faster
than the method you outlined (based on your statement about how many "PL/I statements" are executed
on an input of length 2329 with 66 distinct values). For illustrative purposes, the program uses the
character representation you used in your example for input and output, but the algorithm itself
will work for values of c up to 32765 and v up to 32764 if you have enough memory and patience.

The program uses four arrays: y(c), a character array into which the input is read; x(c), an integer
array giving the encoded input with a=1, b=2, etc; a(0:c,v+1), described below; and l(v), with l(j)
the length of the shortest subsequence with j distinct values.

The basic idea is to build an array, a, with a row for each cell and a column for each value. The
i,j element of the array gives the relative cell number in the input array starting from cell i of
the first occurrence of value j, if any, otherwise a(i,j) is greater than max(c,v+1). The array is
built starting from the last row, c, and working back to row 1. Row c is first set to max(c+1,v+2).
Then a(c,x(c)) is set to 1. Working backward for i=c-1 to 1, each element of row i is set to the
minimum of 32766 and the corresponding element of row i+1 plus 1 and then element a(i,x(i)) is set to 1.

Once this construction is complete, it is evident that there is one and only one instance of the
integer 1 in each row, and all values in each row less than c+1 are unique.

Next each row is sorted. After sorting, the a(i,j) element contains the length of the shortest
subsequence starting at position i that contains j distinct values unless there is no such
subsequence, in which case a(i,j) is greater than c. It is also evident that if a(i,j)=j then
a(i,k)=k for 1<=k<=j. Now if a(i,j)=j the subsequence of length j starting at position i has j
distinct values, but it should be excluded from the solution set for j distinct values if it is a
proper subsequence of a longer subsequence of distinct values. This will be so iff a(i,j+1)=j+1 or
a(i-1,j+1)=j+1.

Based on these considerations, the solution sets for each j are constructed by scanning each column
starting with j=3 and proceeding to j=v. For this purpose the array a actually has an additional
row, 0, to contain the subscript of the first cell of the first minimal subsequence for each j, and
an extra column, v+1, containing values greater than c to ensure that the test for inclusion can be
carried out for j=v. For each j, l(j) is initialized to c+1 and k (the previous element in the list
of minimal subsequences) to 0. The scan proceeds as follows: if a(i,j) is less than l(j) the current
subsequence is shorter that the shortest found so far so l(j) is updated and k reset to 0. If a(i,j)
is equal to l(j) the inclusion test is applied and if it passes the current subsequence is added to
the list. Once the entire column has been processed the list is terminated by setting a(k,j)=0.

It is now a simple matter to read off the solution subsequences. The program utilizes sorta to sort
the rows. call sorta(addr(first element),n,w,c)  sorts an array of n elements of width w in place
according the comparison function c. c(u,v) returns '1'b (bit(1) aligned) if the u-th element may
precede the v-th element, i.e., x(u)<=x(v).

Source:

%process mar(2,100) offset;
subsets: proc options(main) reorder;
dcl
(c,v,i,j,k,m,ar init((rank('a')-1))) bin fixed,
(x(c),a(0:c,v+1),l(v)) bin fixed ctl,
y(c) char(1) ctl,
sorta entry(ptr,bin fixed(31),bin fixed(31),
entry(bin fixed(31),bin fixed(31)) returns(bit(1) aligned)),
vfmt entry(bin float(53),bin fixed(15),bin fixed(15)) returns(char(50) var),
sysin file input,
sysprint file print,
get file(sysin) list(c,v);
put file(sysprint) edit('c: ',vfmt(c,10,0),', v: ',vfmt(v,10,0))(col(1),4 a);
allocate x,a,l,y;
get file(sysin) edit(y)(col(1),(c)a(1));
put file(sysprint) edit('Input: ',y)(col(1),a,(c)a(1));
do i=1 to c; x(i)=rank(y(i))-ar; end;
a(c,*)=max(c+1,v+2); a(c,x(c))=1;
do i=c-1 to 1 by -1; a(i,*)=min(32766,a(i+1,*)+1); a(i,x(i))=1; end;
do i=1 to c; call sorta(addr(a(i,1)),v,2,comp); end;
do j=3 to v; l(j)=c+1; k=0; m=j+1; a(0,m)=0;
do i=1 to c;
if a(i,j)<l(j) then do; l(j)=a(i,j); k=0; end;
if a(i,j)=l(j) then if a(i-1,m)~=m then if a(i,m)~=m then do; a(k,j),k=i; end;
end;
a(k,j)=0;
if a(0,j)>0 then do;
put file(sysprint) edit('Distinct: ',vfmt(j,10,0),', length: ',vfmt(l(j),10,0))(col(1),4 a);
i=a(0,j); k=0; do while(i~=0); k+=1;
put file(sysprint) edit(k,': (',vfmt(i,10,0),') ',(y(i+m) do m=0 to l(j)-1))
(col(1),f(10),3 a,(l(j))a(1));
i=a(i,j);
end;
end;
end;
comp: proc(u,v) returns(bit(1) aligned) reorder;
dcl (u,v) bin fixed(31);
return(a(i,u)<=a(i,v));
end comp;
end subsets;

Input:

56 10
aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja

Output:

c: 56, v: 10
Input: aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
Distinct: 3, length: 3
1: (40) bef
2: (44) fgh
Distinct: 7, length: 7
1: (50) gihbfja
Distinct: 8, length: 16
1: (41) efffghggggihbfja
Distinct: 9, length: 23
1: (34) cbbbbbbefffghggggihbfja
Distinct: 10, length: 25
1: (32) dccbbbbbbefffghggggihbfja

Here is the output from another example:

c: 48, v: 10
Input: abcabcabcabcdefgabcabccdabcdeabcaaabbbcccdefghij
Distinct: 3, length: 3
1: (1) abc
2: (2) bca
3: (3) cab
4: (4) abc
5: (5) bca
6: (6) cab
7: (7) abc
8: (8) bca
9: (9) cab
10: (18) bca
11: (19) cab
12: (20) abc
13: (31) bca
Distinct: 4, length: 4
1: (23) cdab
2: (24) dabc
Distinct: 5, length: 5
1: (25) abcde
2: (26) bcdea
3: (27) cdeab
4: (28) deabc
Distinct: 7, length: 7
1: (10) abcdefg
2: (11) bcdefga
3: (12) cdefgab
4: (13) defgabc
Distinct: 8, length: 8
1: (41) cdefghij
Distinct: 9, length: 11
1: (38) bcccdefghij
Distinct: 10, length: 14
1: (35) abbbcccdefghij

```
 0

```Sorry for the double posting.

When I sent the message the first time, a message from my outgoing mail server appeared saying
something to the effect that "this account is valid for email only; postings to newsgroups are ignored."

I looked on the newsgroup and sure enough the message was not there. I messed around a bit with the
copy in my sent folder and tried again. I did not get the message from the isp this time but the
message did not show up on the newsgroup either. I decided to give up and go to bed and try again in
the morning. By that time both messages had appeared.

Can anyone explain what was going on? I have never seen this behavior before.
```
 0

```James J. Weinkam wrote:
> Sorry for the double posting.
>
> When I sent the message the first time, a message from my outgoing mail
> server appeared saying something to the effect that "this account is
> valid for email only; postings to newsgroups are ignored."

Hi James,

The NG comp.lang.asm.x86 is moderated. What this means is that your NNTP
server, instead of posting the article as usual, sends it - by mail - to
the moderator for approval. I've never seen that message, but I suspect
that's where it's from. When the moderator (or the "moderator's
apprentice", me) approves the message, we re-post it to an NNTP server.
This time, bearing the highly arcane moderator's incantation, it gets
posted. In the case of a cross-posted message, this delays posting to
all groups, even the unmoderated ones. We try to keep on top of it, but
there is inevitably some delay.

I strongly suspected I was approving the same message twice, but would
rather do that than risk missing some minor edit. If anyone is terribly
bothered by the double-posting, my abject apologies on behalf of the
moderation team - and my personal advice: "get a life!" :)

Sorry 'bout that - we appreciate your posting!

Best,
Frank

```
 0

```In article <oxAIn.4447\$z%6.25@edtnps83>,
"James J. Weinkam" <jjw@nospicedham.cs.sfu.ca> writes:
> Sorry for the double posting.
>
> When I sent the message the first time, a message from my outgoing mail server appeared saying
> something to the effect that "this account is valid for email only; postings to newsgroups are ignored."
>
> I looked on the newsgroup and sure enough the message was not there.

Which newsgroup? I am reading comp.lang.asm.x86 which is moderated,
so it is not unusual for a message to not show up immediately. I don't know
if the other two groups - comp.lang.pascal.borland and comp.lang.pl1 - are
moderated. What happens when posting to multiple groups when one is moderated
and the others are not?

>                                                                      I messed around a bit with the
> copy in my sent folder and tried again. I did not get the message from the isp this time but the
> message did not show up on the newsgroup either. I decided to give up and go to bed and try again in
> the morning. By that time both messages had appeared.
>
> Can anyone explain what was going on? I have never seen this behavior before.

I noticed some strange things too.
My message <4bf1cbef\$0\$22934\$c5fe704e@news6.xs4all.nl> references three
other messages:

References: <85akp6F4bpU1@mid.individual.net>
<4bf0ccd0\$0\$22945\$c5fe704e@news6.xs4all.nl>
<85c8i8FphiU1@mid.individual.net>

But the message that appeared in clax86 contained only the first
reference. Did the moderation software remove the other two? Never
seen that before.
```
 0

```In comp.lang.pl1 Dick Wesseling <free@nospicedham.securityaudit.val.newsbank.net> wrote:
(snip)

> Which newsgroup? I am reading comp.lang.asm.x86 which is moderated,
(snip)
> What happens when posting to multiple groups when one is moderated
> and the others are not?

Last I knew, it wasn't posted to any until the moderator approved it.

-- glen
```
 0

```Dick Wesseling wrote:
> In article <oxAIn.4447\$z%6.25@edtnps83>,
> 	"James J. Weinkam" <jjw@nospicedham.cs.sfu.ca> writes:
>> Sorry for the double posting.
>>
>> When I sent the message the first time, a message from my outgoing mail server appeared saying
>> something to the effect that "this account is valid for email only; postings to newsgroups are ignored."
>>
>> I looked on the newsgroup and sure enough the message was not there.
>
> Which newsgroup? I am reading comp.lang.asm.x86 which is moderated,
> so it is not unusual for a message to not show up immediately. I don't know
> if the other two groups - comp.lang.pascal.borland and comp.lang.pl1 - are
> moderated. What happens when posting to multiple groups when one is moderated
> and the others are not?

All postings are delayed (I'm not sure why, exactly...) A more(?)
interesting question is, "What if more than one of the groups is
moderated?" I understand that it is considered "impolite" to approve a
message to another group, but I *think* if it's approved for one, it's
approved for all. It is not obvious whether other groups are moderated
or not. Apologies to any other moderators whose toes are being stepped on!

>> copy in my sent folder and tried again. I did not get the message from the isp this time but the
>> message did not show up on the newsgroup either. I decided to give up and go to bed and try again in
>> the morning. By that time both messages had appeared.
>>
>> Can anyone explain what was going on? I have never seen this behavior before.
>
> I noticed some strange things too.
> My message <4bf1cbef\$0\$22934\$c5fe704e@news6.xs4all.nl> references three
> other messages:
>
> References: <85akp6F4bpU1@mid.individual.net>
>     <4bf0ccd0\$0\$22945\$c5fe704e@news6.xs4all.nl>
>     <85c8i8FphiU1@mid.individual.net>
>
> But the message that appeared in clax86 contained only the first
> reference. Did the moderation software remove the other two? Never
> seen that before.

+OK 3319 octets
Return-path: <news@news1.news.xs4all.nl>
Envelope-to: clax86-submit@inspiretomorrow.net
Delivery-date: Sun, 16 May 2010 23:58:00 -0500
by srv16.hosting24.com with esmtp (Exim 4.69)
(envelope-from <news@news1.news.xs4all.nl>)
id 1ODsOu-0000DM-Az
for clax86-submit@inspiretomorrow.net; Sun, 16 May 2010 23:58:00 -0500
by moderators.individual.net (Exim 4.69)
for comp-lang-asm-x86@moderators.isc.org with esmtp
(envelope-from <news@news1.news.xs4all.nl>)
id <1ODsOo-00024O-57>; Mon, 17 May 2010 06:57:59 +0200
by news1.news.xs4all.nl (8.11.6/8.11.6) id o4H4vqL01125;
Mon, 17 May 2010 06:57:52 +0200 (CEST)
(envelope-from news)
To: comp-lang-asm-x86@moderators.isc.org
Mime-Version: 1.0
References: <85akp6F4bpU1@mid.individual.net>
From: free@securityaudit.val.newsbank.net (Dick Wesseling)
Subject: Re: Optimization problem
Newsgroups: comp.lang.pascal.borland,comp.lang.asm.x86,comp.lang.pl1
Content-Type: text/plain; charset=us-ascii
Date: 17 May 2010 04:57:52 GMT
Lines: 40
Message-ID: <4bf0ccd0\$0\$22945\$c5fe704e@news6.xs4all.nl>
NNTP-Posting-Host: 2001:888:11d7:0:210:a7ff:fe0a:d33
X-Trace: 1274072272 news6.xs4all.nl 22945
[2001:888:11d7:0:210:a7ff:fe0a:d33]:40744
X-Complaints-To: abuse@xs4all.nl

As you can see, only one reference showing. We do not intentionally
remove any "References:" lines. We *do* remove "Received:" lines, and a
couple others which, some people feel, reveal too much information about
the poster (sorry about "outing" you). Possible we're not getting this

Again, sorry for the confusion. I suspect "moderation issues" are not of
interest to the other groups, so further discussion should probably go

Best,
Frank
```
 0

```In article <hsv17s\$rte\$1@speranza.aioe.org>,
Frank Kotler <fbkotler@nospicedham.myfairpoint.net> writes:
> Dick Wesseling wrote:
>> In article <oxAIn.4447\$z%6.25@edtnps83>,
>> 	"James J. Weinkam" <jjw@nospicedham.cs.sfu.ca> writes:
>>> Sorry for the double posting.
>>>
>>> When I sent the message the first time, a message from my outgoing mail server appeared saying
>>> something to the effect that "this account is valid for email only; postings to newsgroups are ignored."
>>>
>>> I looked on the newsgroup and sure enough the message was not there.
>>
>> Which newsgroup? I am reading comp.lang.asm.x86 which is moderated,
>> so it is not unusual for a message to not show up immediately. I don't know
>> if the other two groups - comp.lang.pascal.borland and comp.lang.pl1 - are
>> moderated. What happens when posting to multiple groups when one is moderated
>> and the others are not?
>
> All postings are delayed (I'm not sure why, exactly...) A more(?)
> interesting question is, "What if more than one of the groups is
> moderated?" I understand that it is considered "impolite" to approve a
> message to another group, but I *think* if it's approved for one, it's
> approved for all. It is not obvious whether other groups are moderated
> or not. Apologies to any other moderators whose toes are being stepped on!
>
>>> copy in my sent folder and tried again. I did not get the message from the isp this time but the
>>> message did not show up on the newsgroup either. I decided to give up and go to bed and try again in
>>> the morning. By that time both messages had appeared.
>>>
>>> Can anyone explain what was going on? I have never seen this behavior before.
>>
>> I noticed some strange things too.
>> My message <4bf1cbef\$0\$22934\$c5fe704e@news6.xs4all.nl> references three
>> other messages:
>>
>> References: <85akp6F4bpU1@mid.individual.net>
>>     <4bf0ccd0\$0\$22945\$c5fe704e@news6.xs4all.nl>
>>     <85c8i8FphiU1@mid.individual.net>
>>
>> But the message that appeared in clax86 contained only the first
>> reference. Did the moderation software remove the other two? Never
>> seen that before.
>
> Everything, as received by the "submission address" is "supposed" to be
>
> +OK 3319 octets
> Return-path: <news@news1.news.xs4all.nl>
> Envelope-to: clax86-submit@inspiretomorrow.net
> Delivery-date: Sun, 16 May 2010 23:58:00 -0500
> 	by srv16.hosting24.com with esmtp (Exim 4.69)
> 	(envelope-from <news@news1.news.xs4all.nl>)
> 	id 1ODsOu-0000DM-Az
> 	for clax86-submit@inspiretomorrow.net; Sun, 16 May 2010 23:58:00 -0500
>            by moderators.individual.net (Exim 4.69)
>            for comp-lang-asm-x86@moderators.isc.org with esmtp
>            (envelope-from <news@news1.news.xs4all.nl>)
>            id <1ODsOo-00024O-57>; Mon, 17 May 2010 06:57:59 +0200
> 	by news1.news.xs4all.nl (8.11.6/8.11.6) id o4H4vqL01125;
> 	Mon, 17 May 2010 06:57:52 +0200 (CEST)
> 	(envelope-from news)
> To: comp-lang-asm-x86@moderators.isc.org
> Mime-Version: 1.0
> References: <85akp6F4bpU1@mid.individual.net>
> From: free@securityaudit.val.newsbank.net (Dick Wesseling)
> Subject: Re: Optimization problem
> Newsgroups: comp.lang.pascal.borland,comp.lang.asm.x86,comp.lang.pl1
> Content-Type: text/plain; charset=us-ascii
> Date: 17 May 2010 04:57:52 GMT
> Lines: 40
> Message-ID: <4bf0ccd0\$0\$22945\$c5fe704e@news6.xs4all.nl>

That is not <4bf1cbef\$0\$22934\$c5fe704e@news6.xs4all.nl> !

How about the correct log entry?
```
 0

```Dick Wesseling wrote:

....
>> Message-ID: <4bf0ccd0\$0\$22945\$c5fe704e@news6.xs4all.nl>
>
> That is not <4bf1cbef\$0\$22934\$c5fe704e@news6.xs4all.nl> !
>
> How about the correct log entry?

Oh, you mean *this* <4bf1cbef\$0\$22934\$c5fe704e@news6.xs4all.nl>! Sorry
'bout that. (note to self: purge that file more often - way too big!)

+OK 10846 octets
Return-path: <news@news1.news.xs4all.nl>
Envelope-to: clax86-submit@inspiretomorrow.net
Delivery-date: Mon, 17 May 2010 18:06:32 -0500
by srv16.hosting24.com with esmtp (Exim 4.69)
(envelope-from <news@news1.news.xs4all.nl>)
id 1OE9OK-0000ou-53
for clax86-submit@inspiretomorrow.net; Mon, 17 May 2010 18:06:32 -0500
by moderators.individual.net (Exim 4.69)
for comp-lang-asm-x86@moderators.isc.org with esmtp
(envelope-from <news@news1.news.xs4all.nl>)
id <1OE9OD-0003Pw-RR>; Tue, 18 May 2010 01:06:31 +0200
by news1.news.xs4all.nl (8.11.6/8.11.6) id o4HN6N802174;
Tue, 18 May 2010 01:06:23 +0200 (CEST)
(envelope-from news)
To: comp-lang-asm-x86@moderators.isc.org
Mime-Version: 1.0
References: <85akp6F4bpU1@mid.individual.net>
<4bf0ccd0\$0\$22945\$c5fe704e@news6.xs4all.nl>
<85c8i8FphiU1@mid.individual.net>
From: ftu@fi.ruu.nl (Dick Wesseling)
Subject: Re: Optimization problem
Newsgroups: comp.lang.pascal.borland,comp.lang.asm.x86,comp.lang.pl1
Content-Type: text/plain; charset=us-ascii
Date: 17 May 2010 23:06:23 GMT
Lines: 283
Message-ID: <4bf1cbef\$0\$22934\$c5fe704e@news6.xs4all.nl>
NNTP-Posting-Host: 2001:888:11d7:0:210:a7ff:fe0a:d33
X-Trace: 1274137583 news6.xs4all.nl 22934
[2001:888:11d7:0:210:a7ff:fe0a:d33]:41025
X-Complaints-To: abuse@xs4all.nl

Yeah, I do seem to be truncating your references! Presumably because
they're on separate lines... I thought I had that handled. I'll look
into it. Thanks for the heads up!

Best,
Frank
```
 0

```On 5/18/2010 3:27 AM, James J. Weinkam wrote:
> Robert AH Prins wrote:
>> Hi all,
>>
>> Can anyone give me some hints as to solve the following problem,
>> preferably in a way that is faster than the way I used to do it, and
>> without the bug in the current version;
>>
> ...
> I believe the program below does what you want and should be a couple of
> orders of magnitude faster
> than the method you outlined (based on your statement about how many
> "PL/I statements" are executed
> on an input of length 2329 with 66 distinct values). For illustrative
> purposes, the program uses the
> character representation you used in your example for input and output,
> but the algorithm itself
> will work for values of c up to 32765 and v up to 32764 if you have
> enough memory and patience.
>
> The program uses four arrays: y(c), a character array into which the
> input is read; x(c), an integer
> array giving the encoded input with a=1, b=2, etc; a(0:c,v+1), described
> below; and l(v), with l(j)
> the length of the shortest subsequence with j distinct values.
>
> The basic idea is to build an array, a, with a row for each cell and a
> column for each value. The
> i,j element of the array gives the relative cell number in the input
> array starting from cell i of
> the first occurrence of value j, if any, otherwise a(i,j) is greater
> than max(c,v+1). The array is
> built starting from the last row, c, and working back to row 1. Row c is
> first set to max(c+1,v+2).
> Then a(c,x(c)) is set to 1. Working backward for i=c-1 to 1, each
> element of row i is set to the
> minimum of 32766 and the corresponding element of row i+1 plus 1 and
> then element a(i,x(i)) is set to 1.
>
> Once this construction is complete, it is evident that there is one and
> only one instance of the
> integer 1 in each row, and all values in each row less than c+1 are unique.
>
> Next each row is sorted. After sorting, the a(i,j) element contains the
> length of the shortest
> subsequence starting at position i that contains j distinct values
> unless there is no such
> subsequence, in which case a(i,j) is greater than c. It is also evident
> that if a(i,j)=j then
> a(i,k)=k for 1<=k<=j. Now if a(i,j)=j the subsequence of length j
> starting at position i has j
> distinct values, but it should be excluded from the solution set for j
> distinct values if it is a
> proper subsequence of a longer subsequence of distinct values. This will
> be so iff a(i,j+1)=j+1 or
> a(i-1,j+1)=j+1.
>
> Based on these considerations, the solution sets for each j are
> constructed by scanning each column
> starting with j=3 and proceeding to j=v. For this purpose the array a
> row, 0, to contain the subscript of the first cell of the first minimal
> subsequence for each j, and
> an extra column, v+1, containing values greater than c to ensure that
> the test for inclusion can be
> carried out for j=v. For each j, l(j) is initialized to c+1 and k (the
> previous element in the list
> of minimal subsequences) to 0. The scan proceeds as follows: if a(i,j)
> is less than l(j) the current
> subsequence is shorter that the shortest found so far so l(j) is updated
> and k reset to 0. If a(i,j)
> is equal to l(j) the inclusion test is applied and if it passes the
> current subsequence is added to
> the list. Once the entire column has been processed the list is
> terminated by setting a(k,j)=0.
>
> It is now a simple matter to read off the solution subsequences. The
> program utilizes sorta to sort
> the rows. call sorta(addr(first element),n,w,c) sorts an array of n
> elements of width w in place
> according the comparison function c. c(u,v) returns '1'b (bit(1)
> aligned) if the u-th element may
> precede the v-th element, i.e., x(u)<=x(v).
>
>
> Source:
>
> %process mar(2,100) offset;
> subsets: proc options(main) reorder;
> dcl
> (c,v,i,j,k,m,ar init((rank('a')-1))) bin fixed,
> (x(c),a(0:c,v+1),l(v)) bin fixed ctl,
> y(c) char(1) ctl,
> sorta entry(ptr,bin fixed(31),bin fixed(31),
> entry(bin fixed(31),bin fixed(31)) returns(bit(1) aligned)),
> vfmt entry(bin float(53),bin fixed(15),bin fixed(15)) returns(char(50)
> var),
> sysin file input,
> sysprint file print,
> get file(sysin) list(c,v);
> put file(sysprint) edit('c: ',vfmt(c,10,0),', v:
> ',vfmt(v,10,0))(col(1),4 a);
> allocate x,a,l,y;
> get file(sysin) edit(y)(col(1),(c)a(1));
> put file(sysprint) edit('Input: ',y)(col(1),a,(c)a(1));
> do i=1 to c; x(i)=rank(y(i))-ar; end;
> a(c,*)=max(c+1,v+2); a(c,x(c))=1;
> do i=c-1 to 1 by -1; a(i,*)=min(32766,a(i+1,*)+1); a(i,x(i))=1; end;
> do i=1 to c; call sorta(addr(a(i,1)),v,2,comp); end;
> do j=3 to v; l(j)=c+1; k=0; m=j+1; a(0,m)=0;
> do i=1 to c;
> if a(i,j)<l(j) then do; l(j)=a(i,j); k=0; end;
> if a(i,j)=l(j) then if a(i-1,m)~=m then if a(i,m)~=m then do;
> a(k,j),k=i; end;
> end;
> a(k,j)=0;
> if a(0,j)>0 then do;
> put file(sysprint) edit('Distinct: ',vfmt(j,10,0),', length:
> ',vfmt(l(j),10,0))(col(1),4 a);
> i=a(0,j); k=0; do while(i~=0); k+=1;
> put file(sysprint) edit(k,': (',vfmt(i,10,0),') ',(y(i+m) do m=0 to
> l(j)-1))
> (col(1),f(10),3 a,(l(j))a(1));
> i=a(i,j);
> end;
> end;
> end;
> comp: proc(u,v) returns(bit(1) aligned) reorder;
> dcl (u,v) bin fixed(31);
> return(a(i,u)<=a(i,v));
> end comp;
> end subsets;
>
> Input:
>
> 56 10
> aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
>
> Output:
>
> c: 56, v: 10
> Input: aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
> Distinct: 3, length: 3
> 1: (40) bef
> 2: (44) fgh
> Distinct: 7, length: 7
> 1: (50) gihbfja
> Distinct: 8, length: 16
> 1: (41) efffghggggihbfja
> Distinct: 9, length: 23
> 1: (34) cbbbbbbefffghggggihbfja
> Distinct: 10, length: 25
> 1: (32) dccbbbbbbefffghggggihbfja
>
> Here is the output from another example:
>
> c: 48, v: 10
> Input: abcabcabcabcdefgabcabccdabcdeabcaaabbbcccdefghij
> Distinct: 3, length: 3
> 1: (1) abc
> 2: (2) bca
> 3: (3) cab
> 4: (4) abc
> 5: (5) bca
> 6: (6) cab
> 7: (7) abc
> 8: (8) bca
> 9: (9) cab
> 10: (18) bca
> 11: (19) cab
> 12: (20) abc
> 13: (31) bca
> Distinct: 4, length: 4
> 1: (23) cdab
> 2: (24) dabc
> Distinct: 5, length: 5
> 1: (25) abcde
> 2: (26) bcdea
> 3: (27) cdeab
> 4: (28) deabc
> Distinct: 7, length: 7
> 1: (10) abcdefg
> 2: (11) bcdefga
> 3: (12) cdefgab
> 4: (13) defgabc
> Distinct: 8, length: 8
> 1: (41) cdefghij
> Distinct: 9, length: 11
> 1: (38) bcccdefghij
> Distinct: 10, length: 14
> 1: (35) abbbcccdefghij
>
>
Professor Weinkam's elegant algorithm solves the problem using c.v.lg v
operations (his notation), i.e., N.T.lg T (RAH Prins' notation).

In the PL/I code that he gave, Professor Weinkam used two procedures
that are not included in IBM's VAPLI: SORTA and VFMT. However, it is
easy to supply the missing functions. In place of SORTA one may use
quicksort or mergesort, for which PL/I code is available at
http://rosettacode.org/wiki/Sorting_algorithms/Quicksort#PL.2FI and
http://rosettacode.org/wiki/Sort_an_integer_array#PL.2FI. For VFMT, the
following will do, although it allocates memory that is freed only by
exiting the program:

vfmt: proc(v) returns(char(10) var);
dcl (l,v) fixed bin;
dcl str char(10) var ctl;
allocate str;
if v < 10 then l=1;
else if v < 100 then l=2;
else if v < 1000 then l=3;
else l=4;
put string (str) edit (v) (F(l));
return (str);
end vfmt;

-- mecej4
```
 0

```Frank Kotler wrote:

....
> Yeah, I do seem to be truncating your references! Presumably because
> they're on separate lines... I thought I had that handled. I'll look
> into it. Thanks for the heads up!

May have that fixed. Seems I was seeing a space where I was expecting a
tab... Trouble with trial and error coding is you have to keep posting
test messages! Couldn't resist trying to fix other things, too. Blew the
last test... try again...

Best,
Frank

```
 0

```On May 18, 2:58=A0pm, f...@nospicedham.securityaudit.val.newsbank.net
(Dick Wesseling) wrote:
>
> I noticed some strange things too.
> My message <4bf1cbef\$0\$22934\$c5fe7...@news6.xs4all.nl> references three
> other messages:
>
> References: <85akp6F4bpU1@mid.individual.net>
> =A0 =A0 <4bf0ccd0\$0\$22945\$c5fe704e@news6.xs4all.nl>
> =A0 =A0 <85c8i8FphiU1@mid.individual.net>
>
> But the message that appeared in clax86 contained only the first
> reference. Did the moderation software remove the other two? Never
> seen that before.

Wow!  Thanks for helping to debug our "paper mache" moderation
software.  Who writes that kind of thing in ASM?  We must be 'high' or
crazy or something...  {shifty eyes, pretends bewilderment}  ...
yeah, we're nuts.  Bring on the straight jacket!  :)

Nathan.
http://clax.inspiretomorrow.net/index.php
```
 0

```"James J. Weinkam" <jjw@nospicedham.cs.sfu.ca> wrote in message news:oZrIn.4426\$z%6.1359@edtnps83...

| I believe the program below does what you want and should be a couple of orders of magnitude faster
| than the method you outlined (based on your statement about how many "PL/I statements" are executed
| on an input of length 2329 with 66 distinct values). For illustrative purposes, the program uses the
| character representation you used in your example for input and output, but the algorithm itself
| will work for values of c up to 32765 and v up to 32764 if you have enough memory and patience.

Assuming that it works, a very impressive effort!

```
 0

```In comp.lang.pascal.borland message <hsv17s\$rte\$1@speranza.aioe.org>,
Tue, 18 May 2010 17:35:30, Frank Kotler <fbkotler@nospicedham.myfairpoin
t.net> posted:

>All postings are delayed (I'm not sure why, exactly...) A more(?)
>interesting question is, "What if more than one of the groups is
>moderated?" I understand that it is considered "impolite" to approve a
>message to another group, but I *think* if it's approved for one, it's
>approved for all. It is not obvious whether other groups are moderated
>or not. Apologies to any other moderators whose toes are being stepped
>on!

The initial submission should be mailed to the moderation address of the
first moderated newsgroup in the Newsgroups line.  Each moderator is
expected either to reject it or to mail it to the moderation address of
the next listed moderated newsgroup, except that the moderator address
of the last listed moderated newsgroup either rejects or posts.

A moderator is expected to consider suitability only for his own group,
but might be allowed to reject to protect his group (the Newsgroups
header might include a troll-haunt such as alt.palin.a.nutter.for.half.a
century).

Moderation team members should know how the system is meant to work,
even if their software actually handles the matter.

--
(c) John Stockton, nr London UK. ?@merlyn.demon.co.uk  Turnpike v6.05   MIME.
Web  <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
Proper <= 4-line sig. separator as above, a line exactly "-- " (RFCs 5536/7)
Do not Mail News to me. Before a reply, quote with ">" or "> " (RFCs 5536/7)
```
 0

```robin wrote:
> "James J. Weinkam"<jjw@nospicedham.cs.sfu.ca>  wrote in message news:oZrIn.4426\$z%6.1359@edtnps83...
>
> | I believe the program below does what you want and should be a couple of orders of magnitude faster
> | than the method you outlined (based on your statement about how many "PL/I statements" are executed
> | on an input of length 2329 with 66 distinct values). For illustrative purposes, the program uses the
> | character representation you used in your example for input and output, but the algorithm itself
> | will work for values of c up to 32765 and v up to 32764 if you have enough memory and patience.
>
> Assuming that it works, a very impressive effort!

Indeed.

I have not tried to read the code (or explanation) except to note that
it must be far better than O(n*n). :-)

With that given, how about a simple approach:

Scan through the input string from beginning to end, while maintaining a
set of lookup tables, one for each previous starting position. In each
lookup table we will have two things: The values seen so far, and the
total number of different characters.

For each possible length we remember the shortest string that so far has
resulted in this many unique values, updating it when we find a shorter
string.

This way the processing would consist of loading data[i], then for each
of the previous bytes taken as the starting position check if this value
hasn't been seen before, and if so, update the count and check if this
count/length is equal or better to the best seen so far:

for (i = 2; i < data_len; i++) {
byte b = data[i];
for (j = 0; j < i-2; j++) {
if (!seen[j][b]) {
seen[j][b] = 1;
unique_values = ++count[j];
len = i-j+1;
if (len <= shortest[unique_values]) {
save(unique_values, len, j);
shortest[unique_values] = len;
}
}
}
}

The save function will generate a list of all possible "winning"
candidates, so it must be sorted and pruned before the final output but
that is trivial.

The main (only?) problem with the program above is that it is still
O(n*n), with data_len = 4096, the total number of iterations will be
1+2+3+4+...+4093 which is ~8M.

However, since the maximum possible number of unique values is the
character set size, even using all 256 possible byte values would mean
that for the longer sequences, starting early in the input, 15/16 of all
bytes will have been seen previously so the only code executed will be the

if (!seen[j][b]) {

test & branch.

With 4K 256-byte lookup tables we'll need 1 MB of cache, using bits
instead would compress this to 128K, but that is still too large to fit
in L1 cache, so probably not worth it.

I expect the running time to be ~ 10-20 cycles/inner loop iteration, or
80-160M cycles for the 4K input, letting the program finish in less than
a tenth of a second on any currently existing PC.

Is that fast enough?

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```Terje Mathisen wrote:

> I have not tried to read the code (or explanation) except to note that
> it must be far better than O(n*n). :-)

I have studied the code and the explanation. I too am impressed by its
elegance, and now I finally understand the problem statement.
I think that there is a solution with equal or better O() and a smaller
memory requirement.

Let's go back to the problem statement:

> Given the above setup, the problem is to find *all* sub-arrays that
> contain 3..T distinct elements and are as short as possible,

I took me while to understand what is meant by "as short as possible".

In the example it says:

> It is possible to find sub-arrays with 4, 5, or 6 distinct values, but
> they are either longer than the series of 7-in-7 (12-25 contain abcd),
> or are part of the 7-in-7 series and as such they should not be included!

1         2         3         4         5     5
.....5....0....5....0....5....0....5....0....5....0....56
aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
--------------                        -------
4-in-14                               7-in-7

I did not fully understand why 12-15 was to be rejected, after all
"abcd" is not a substring of "gihbfja".
Now I understand. 7-in-7 contains substrings 6-in-6, 5-in-5, 4-in-4
and so on. These are not to be printed because they are proper
substrings of "gihbfja". However, they define "as short as possible".
for lengths <= 7.

In other words, once you know the longest n-in-n sequence all shorter
sequences must consist entirely of unique symbols.

This suggests two different seach strategies, one for long seqences and
one for short sequences.

Plan A:

Starting with the longest possible sequence search for n-in-m
sequences until n=m. This requires multiple passes for the input,
one for each unique value of n with m>=n (Or two passes actually, see
my first attempt).

Plan B:

Once the longests n-in-n sequence is known perform ONE more pass
over the input, searching for remaining maximal sequences with
3 <= length <n that consist of unique symbols only.

Finish touch:
Sort the results if so desired.
```
 0

```Frank Kotler wrote:
> Frank Kotler wrote:
>
> ...
>> Yeah, I do seem to be truncating your references! Presumably because
>> they're on separate lines... I thought I had that handled. I'll look
>> into it. Thanks for the heads up!
>
> May have that fixed. Seems I was seeing a space where I was expecting a
> tab... Trouble with trial and error coding is you have to keep posting
> test messages! Couldn't resist trying to fix other things, too. Blew the
> last test... try again...

Guess my patch worked, eh?

Date: 19 May 2010 19:34:56 GMT
References: <85akp6F4bpU1@mid.individual.net>
<oZrIn.4426\$z%6.1359@edtnps83>
<6fmdc7-0kv1.ln1@ntp.tmsw.no>

Seems I mostly see multiple "References" all on one line... or with a
new line starting with TAB. For some reason, I'm seeing "space" here.
Easily fixed... hope it doesn't catch too much unintended stuff. We
shall see...

"My users are my test suite, I shall not want." :)

Best,
Frank
```
 0

```On 2010-05-19 18:15, Terje Mathisen wrote:
> robin wrote:
>> "James J. Weinkam"<jjw@nospicedham.cs.sfu.ca> wrote in message
>> news:oZrIn.4426\$z%6.1359@edtnps83...
>>
>> | I believe the program below does what you want and should be a
>> couple of orders of magnitude faster
>> | than the method you outlined (based on your statement about how many
>> "PL/I statements" are executed
>> | on an input of length 2329 with 66 distinct values). For
>> illustrative purposes, the program uses the
>> | character representation you used in your example for input and
>> output, but the algorithm itself
>> | will work for values of c up to 32765 and v up to 32764 if you have
>> enough memory and patience.
>>
>> Assuming that it works, a very impressive effort!
>
> Indeed.
>
> I have not tried to read the code (or explanation) except to note that
> it must be far better than O(n*n). :-)

But is does contain a sort...

> With that given, how about a simple approach:
>
> Scan through the input string from beginning to end, while maintaining a
> set of lookup tables, one for each previous starting position. In each
> lookup table we will have two things: The values seen so far, and the
> total number of different characters.
>
> For each possible length we remember the shortest string that so far has
> resulted in this many unique values, updating it when we find a shorter
> string.
>
> This way the processing would consist of loading data[i], then for each
> of the previous bytes taken as the starting position check if this value
> hasn't been seen before, and if so, update the count and check if this
> count/length is equal or better to the best seen so far:
>
> for (i = 2; i < data_len; i++) {
> byte b = data[i];
> for (j = 0; j < i-2; j++) {
> if (!seen[j][b]) {
> seen[j][b] = 1;
> unique_values = ++count[j];
> len = i-j+1;
> if (len <= shortest[unique_values]) {
> save(unique_values, len, j);
> shortest[unique_values] = len;
> }
> }
> }
> }
>
> The save function will generate a list of all possible "winning"
> candidates, so it must be sorted and pruned before the final output but
> that is trivial.
>
> The main (only?) problem with the program above is that it is still
> O(n*n), with data_len = 4096, the total number of iterations will be
> 1+2+3+4+...+4093 which is ~8M.
>
> However, since the maximum possible number of unique values is the
> character set size, even using all 256 possible byte values would mean
> that for the longer sequences, starting early in the input, 15/16 of all
> bytes will have been seen previously so the only code executed will be the
>
> if (!seen[j][b]) {
>
> test & branch.
>
> With 4K 256-byte lookup tables we'll need 1 MB of cache, using bits
> instead would compress this to 128K, but that is still too large to fit
> in L1 cache, so probably not worth it.
>
> I expect the running time to be ~ 10-20 cycles/inner loop iteration, or
> 80-160M cycles for the 4K input, letting the program finish in less than
> a tenth of a second on any currently existing PC.
>
> Is that fast enough?

You seem to have come up with a carbon copy of the approach Paul Green
used in 1998 - I did finally find a printout of our email conversation
from 1998 in one of the unopened boxes on the loft. I will scan and
(hopefully) OCR(ish) it when I come back from Vilnius next week time.

His code is in Pascal, and until I started running it for parts of the
input string I had had blind faith in it. Wrong, as it turns out as
there are three strings that it does not process correctly, and all my
debugging hasn't given me any clues as to the why.

If there is interest, I can also post my original sliding window code.

FWIW, the number of statement executed is obviously a rather crude way
of assessing the efficiency of a program. However, I also ran the PL/I
program with Strobe, a mainframe performance analyzing tool. The result
was even more devastating, *one single loop* to see if a next value is
already in the series was responsible for 95% of the CPU time of the
program. Ouch!

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```Frank Kotler wrote:
> James J. Weinkam wrote:
>> Sorry for the double posting.
>>
>> When I sent the message the first time, a message from my outgoing
>> mail server appeared saying something to the effect that "this account
>> is valid for email only; postings to newsgroups are ignored."
>
> Hi James,
>
> The NG comp.lang.asm.x86 is moderated. What this means is that your NNTP
> server, instead of posting the article as usual, sends it - by mail - to
> the moderator for approval. I've never seen that message, but I suspect
> that's where it's from. When the moderator (or the "moderator's
> apprentice", me) approves the message, we re-post it to an NNTP server.
> This time, bearing the highly arcane moderator's incantation, it gets
> posted. In the case of a cross-posted message, this delays posting to
> all groups, even the unmoderated ones. We try to keep on top of it, but
> there is inevitably some delay.
>
> I strongly suspected I was approving the same message twice, but would
> rather do that than risk missing some minor edit. If anyone is terribly
> bothered by the double-posting, my abject apologies on behalf of the
> moderation team - and my personal advice: "get a life!" :)
>
> Sorry 'bout that - we appreciate your posting!
>
> Best,
> Frank
>
Well this just goes to show that you learn something new every day.

Looking at the subsequent posts in this thread, it strikes me that the process being used to deal
with messages posted to multiple news groups some of which are moderated is ill conceived. The
outcome should be the same as if the message were posted separately to each group. This could easily
be accomplished by sending the measage directly to each unmoderated group and directly to the
moderator of each moderated group. Why should there be any interaction? How is the ordering decided?
Just the order the poster happened to choose?

Anyway, thanks for the explanation. However it still doesn't explain the pop up I got when I first
posted the message. Surely in many years of posting to news groups, I must have posted a message to
multiple groups before sone of which were moderated. Nevertheless, I have never seen that message
from the mail server that the "account is only vaild for email and newsgroups will be ignored."
Moreover that isn't what eventually happened.
```
 0

```On May 19, 10:17=A0pm, "James J. Weinkam" <j...@nospicedham.cs.sfu.ca>
wrote:
> Frank Kotler wrote:
> > James J. Weinkam wrote:
> >> Sorry for the double posting.
>
> >> When I sent the message the first time, a message from my outgoing
> >> mail server appeared saying something to the effect that "this account
> >> is valid for email only; postings to newsgroups are ignored."
>
> > Hi James,
>
> > The NG comp.lang.asm.x86 is moderated. What this means is that your NNT=
P
> > server, instead of posting the article as usual, sends it - by mail - t=
o
> > the moderator for approval. I've never seen that message, but I suspect
> > that's where it's from. When the moderator (or the "moderator's
> > apprentice", me) approves the message, we re-post it to an NNTP server.
> > This time, bearing the highly arcane moderator's incantation, it gets
> > posted. In the case of a cross-posted message, this delays posting to
> > all groups, even the unmoderated ones. We try to keep on top of it, but
> > there is inevitably some delay.
>
> > I strongly suspected I was approving the same message twice, but would
> > rather do that than risk missing some minor edit. If anyone is terribly
> > bothered by the double-posting, my abject apologies on behalf of the
> > moderation team - and my personal advice: "get a life!" :)
>
> > Sorry 'bout that - we appreciate your posting!
>
> > Best,
> > Frank
>
> Well this just goes to show that you learn something new every day.
>
> Looking at the subsequent posts in this thread, it strikes me that the pr=
ocess being used to deal
> with messages posted to multiple news groups some of which are moderated =
is ill conceived. The
> outcome should be the same as if the message were posted separately to ea=
ch group. This could easily
> be accomplished by sending the measage directly to each unmoderated group=
and directly to the
> moderator of each moderated group. Why should there be any interaction? H=
ow is the ordering decided?
> Just the order the poster happened to choose?
>

Well, I am pretty sure these questions were asked when Usenet was
quite young.  I imagine that policies and 'actual practice' have long
been not-quite-in-sync and surely both have evolved much over time.
If you are interested in the details, there are resources specifically
dedicated to these topics:

The Big-8 Management Board
http://www.big-8.org/dokuwiki/doku.php

Internet Systems Consortium
http://www.isc.org/community/reference

And some newsgroups where the above folks 'hang out' to post

news.announce.newsgroups
news.groups.questions

The status of Usenet as a whole seems that it may not be in very good
health:  http://news.slashdot.org/story/10/05/18/2342241/Duke-To-Shut-Down-=
Usenet-Server?art_pos=3D24

Maybe it will come a day that we'll need to set-up our own News
Servers?

Heck, maybe people would see the value of it if there was a Facebook
app that allowed them to read/post to Usenet??  I dunno..  maybe it'll

> Anyway, thanks for the explanation. However it still doesn't explain the =
pop up I got when I first
> posted the message. Surely in many years of posting to news groups, I mus=
t have posted a message to
> multiple groups before sone of which were moderated. Nevertheless, I have=
never seen that message
> from the mail server that the "account is only vaild for email and newsgr=
oups will be ignored."
> Moreover that isn't what eventually happened.

Probably just a quirk of your Newsreader -- it is the only item that
can actually produce a 'pop up' on your machine.

Nathan.
```
 0

```In article  <ht1g4p\$up8\$1@speranza.aioe.org>
fbkotler@nospicedham.myfairpoint.net "Frank Kotler" writes:

> Frank Kotler wrote:
> > Frank Kotler wrote:
> >
> > ...
> >> Yeah, I do seem to be truncating your references! Presumably because
> >> they're on separate lines... I thought I had that handled. I'll look
> >> into it. Thanks for the heads up!
> >
> > May have that fixed. Seems I was seeing a space where I was expecting a
> > tab... Trouble with trial and error coding is you have to keep posting
> > test messages! Couldn't resist trying to fix other things, too. Blew the
> > last test... try again...
>
> Guess my patch worked, eh?
>
> Date: 19 May 2010 19:34:56 GMT
> References: <85akp6F4bpU1@mid.individual.net>
>      <oZrIn.4426\$z%6.1359@edtnps83>
>      <6fmdc7-0kv1.ln1@ntp.tmsw.no>

I'm sure the OP will check it ;-)

> Seems I mostly see multiple "References" all on one line... or with a
> new line starting with TAB. For some reason, I'm seeing "space" here.
> Easily fixed... hope it doesn't catch too much unintended stuff. We
> shall see...

It shouldn't - the RFCs for news messages define "whitespace" as tabs
or spaces (and I think newlines too for when the References: are

But while you're in there, Frank, you might also want to check for
multiple tabs/spaces between the <reference> entries.  The maximum
allowed length of a header line is 1000 characters (and a References
so there might come a point in a busy thread where adding the latest
message-id will take the line over the limit; what one is supposed to
do then is retain as many as possible of the _latest_ references, plus
the _original_ reference/message-id, by removing entries.  So for
example if the references look like GFEDCBA (A is original) and
reference H would blow the 1000 character limit, one should remove
reference B (and if necessary, C, D...) until it all "fits". This will
leave the references as HGFEDCA.

Anyway, getting back to checking for multiple spaces, the
recommendation (which I must say I've never seen...) is to separate
the reference entries with "3 or more" spaces where removals have
taken place; so the example above might look like HGFEDC   A.

The only reason I know of this stuff is that my own newsreader was
trashing References: headers so I had to rewrite that handling to fix
it!

Good luck!
Pete
--
"We have not inherited the earth from our ancestors,
we have borrowed it from our descendants."
```
 0

```Terje Mathisen wrote:
> robin wrote:
>> "James J. Weinkam"<jjw@nospicedham.cs.sfu.ca> wrote in message
>> news:oZrIn.4426\$z%6.1359@edtnps83...
>>
>> | I believe the program below does what you want and should be a
>> couple of orders of magnitude faster
>> | than the method you outlined (based on your statement about how many
>> "PL/I statements" are executed
>> | on an input of length 2329 with 66 distinct values). For
>> illustrative purposes, the program uses the
>> | character representation you used in your example for input and
>> output, but the algorithm itself
>> | will work for values of c up to 32765 and v up to 32764 if you have
>> enough memory and patience.
>>
>> Assuming that it works, a very impressive effort!
>
> Indeed.
>
> I have not tried to read the code (or explanation) except to note that
> it must be far better than O(n*n). :-)
>
> With that given, how about a simple approach:
>
> Scan through the input string from beginning to end, while maintaining a
> set of lookup tables, one for each previous starting position. In each
> lookup table we will have two things: The values seen so far, and the
> total number of different characters.
>
> For each possible length we remember the shortest string that so far has
> resulted in this many unique values, updating it when we find a shorter
> string.
>
> This way the processing would consist of loading data[i], then for each
> of the previous bytes taken as the starting position check if this value
> hasn't been seen before, and if so, update the count and check if this
> count/length is equal or better to the best seen so far:
>
> for (i = 2; i < data_len; i++) {
> byte b = data[i];
> for (j = 0; j < i-2; j++) {
> if (!seen[j][b]) {
> seen[j][b] = 1;
> unique_values = ++count[j];
> len = i-j+1;
> if (len <= shortest[unique_values]) {
> save(unique_values, len, j);
> shortest[unique_values] = len;
> }
> }
> }
> }

I just realized that the actual lookup tables can be skipped, while
making the program significantly faster:

Simply scan backwards through the input string, starting from the
previous position:

For each such position increment the count of unique values seen, until
we reach the first byte, or a duplicate of the current input: At this
point the scan can terminate at once, which will significantly reduce
the expected number of iterations, while getting rid of all the big
lookup tables:

for (i = 2; i < data_len; i++) {
byte b = data[i];
for (j = i-1; j >= 0; j--) {
if (data[j] == b) break;
len = i-j+1;
unique_values = ++count[len];
if (len <= shortest[unique_values]) {
save(unique_values, len, j);
shortest[unique_values] = len;
}
}
}

>
> The save function will generate a list of all possible "winning"
> candidates, so it must be sorted and pruned before the final output but
> that is trivial.

This is still the same.
>
> The main (only?) problem with the program above is that it is still
> O(n*n), with data_len = 4096, the total number of iterations will be
> 1+2+3+4+...+4093 which is ~8M.

According to the birthday paradox, for random inputs the expected number
of samples needed to find a collision (i.e. a non-unique value) in a set
of size n is ~sqrt(n), so in one fell swoop the rewrite above has
reduced the algorithm from O(n*n) to O(n*sqrt(m)) where (n) is the input
array length and (m) is the alphabet size.

I.e. for the case of n=4096 and m=256, the expected iteration count is
now 4096*16=64K, and everything will fit in L1 cache.

Running time should be _well_ below a single ms.

The inner loop can never iterate over more than the last (m) characters,
so we have 4kB of input and 512 bytes (with 16-bit counters) for the
active part of the shortest[] table.

Going to huge inputs (more than 64k of data and a 32-bit (utf8)
alphabet) we end up with n*4 bytes of input and n*4 bytes of shortest[]
table space, for a total of 8*n.

Assuming a 4GB input array consisting of 1G 32-bit random values, the
expected iteration count becomes ~1G * sqrt(1G) = 1G*32k or 32e12.

With 5 instructions in the inner loop, say 10 cycles (all accesses are
sequential so we get maximum use of L1 and L2!), this would take 3-6 hours.

> I expect the running time to be ~ 10-20 cycles/inner loop iteration, or
> 80-160M cycles for the 4K input, letting the program finish in less than
> a tenth of a second on any currently existing PC.
>
> Is that fast enough?

Obviously not. :-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```On 2010-05-18 22:02:31 -0400, mecej4 said:

> On 5/18/2010 3:27 AM, James J. Weinkam wrote:
>> Robert AH Prins wrote:
>>> Hi all,
>>>
>>> Can anyone give me some hints as to solve the following problem,
>>> preferably in a way that is faster than the way I used to do it, and
>>> without the bug in the current version;
>>>
>> ...
>> I believe the program below does what you want and should be a couple of
>> orders of magnitude faster
>> than the method you outlined (based on your statement about how many
>> "PL/I statements" are executed
>> on an input of length 2329 with 66 distinct values). For illustrative
>> purposes, the program uses the
>> character representation you used in your example for input and output,
>> but the algorithm itself
>> will work for values of c up to 32765 and v up to 32764 if you have
>> enough memory and patience.
>>
>> The program uses four arrays: y(c), a character array into which the
>> input is read; x(c), an integer
>> array giving the encoded input with a=1, b=2, etc; a(0:c,v+1), described
>> below; and l(v), with l(j)
>> the length of the shortest subsequence with j distinct values.
>>
>> The basic idea is to build an array, a, with a row for each cell and a
>> column for each value. The
>> i,j element of the array gives the relative cell number in the input
>> array starting from cell i of
>> the first occurrence of value j, if any, otherwise a(i,j) is greater
>> than max(c,v+1). The array is
>> built starting from the last row, c, and working back to row 1. Row c is
>> first set to max(c+1,v+2).
>> Then a(c,x(c)) is set to 1. Working backward for i=c-1 to 1, each
>> element of row i is set to the
>> minimum of 32766 and the corresponding element of row i+1 plus 1 and
>> then element a(i,x(i)) is set to 1.
>>
>> Once this construction is complete, it is evident that there is one and
>> only one instance of the
>> integer 1 in each row, and all values in each row less than c+1 are unique.
>>
>> Next each row is sorted. After sorting, the a(i,j) element contains the
>> length of the shortest
>> subsequence starting at position i that contains j distinct values
>> unless there is no such
>> subsequence, in which case a(i,j) is greater than c. It is also evident
>> that if a(i,j)=j then
>> a(i,k)=k for 1<=k<=j. Now if a(i,j)=j the subsequence of length j
>> starting at position i has j
>> distinct values, but it should be excluded from the solution set for j
>> distinct values if it is a
>> proper subsequence of a longer subsequence of distinct values. This will
>> be so iff a(i,j+1)=j+1 or
>> a(i-1,j+1)=j+1.
>>
>> Based on these considerations, the solution sets for each j are
>> constructed by scanning each column
>> starting with j=3 and proceeding to j=v. For this purpose the array a
>> row, 0, to contain the subscript of the first cell of the first minimal
>> subsequence for each j, and
>> an extra column, v+1, containing values greater than c to ensure that
>> the test for inclusion can be
>> carried out for j=v. For each j, l(j) is initialized to c+1 and k (the
>> previous element in the list
>> of minimal subsequences) to 0. The scan proceeds as follows: if a(i,j)
>> is less than l(j) the current
>> subsequence is shorter that the shortest found so far so l(j) is updated
>> and k reset to 0. If a(i,j)
>> is equal to l(j) the inclusion test is applied and if it passes the
>> current subsequence is added to
>> the list. Once the entire column has been processed the list is
>> terminated by setting a(k,j)=0.
>>
>> It is now a simple matter to read off the solution subsequences. The
>> program utilizes sorta to sort
>> the rows. call sorta(addr(first element),n,w,c) sorts an array of n
>> elements of width w in place
>> according the comparison function c. c(u,v) returns '1'b (bit(1)
>> aligned) if the u-th element may
>> precede the v-th element, i.e., x(u)<=x(v).
>>
>>
>> Source:
>>
>> %process mar(2,100) offset;
>> subsets: proc options(main) reorder;
>> dcl
>> (c,v,i,j,k,m,ar init((rank('a')-1))) bin fixed,
>> (x(c),a(0:c,v+1),l(v)) bin fixed ctl,
>> y(c) char(1) ctl,
>> sorta entry(ptr,bin fixed(31),bin fixed(31),
>> entry(bin fixed(31),bin fixed(31)) returns(bit(1) aligned)),
>> vfmt entry(bin float(53),bin fixed(15),bin fixed(15)) returns(char(50)
>> var),
>> sysin file input,
>> sysprint file print,
>> get file(sysin) list(c,v);
>> put file(sysprint) edit('c: ',vfmt(c,10,0),', v:
>> ',vfmt(v,10,0))(col(1),4 a);
>> allocate x,a,l,y;
>> get file(sysin) edit(y)(col(1),(c)a(1));
>> put file(sysprint) edit('Input: ',y)(col(1),a,(c)a(1));
>> do i=1 to c; x(i)=rank(y(i))-ar; end;
>> a(c,*)=max(c+1,v+2); a(c,x(c))=1;
>> do i=c-1 to 1 by -1; a(i,*)=min(32766,a(i+1,*)+1); a(i,x(i))=1; end;
>> do i=1 to c; call sorta(addr(a(i,1)),v,2,comp); end;
>> do j=3 to v; l(j)=c+1; k=0; m=j+1; a(0,m)=0;
>> do i=1 to c;
>> if a(i,j)<l(j) then do; l(j)=a(i,j); k=0; end;
>> if a(i,j)=l(j) then if a(i-1,m)~=m then if a(i,m)~=m then do;
>> a(k,j),k=i; end;
>> end;
>> a(k,j)=0;
>> if a(0,j)>0 then do;
>> put file(sysprint) edit('Distinct: ',vfmt(j,10,0),', length:
>> ',vfmt(l(j),10,0))(col(1),4 a);
>> i=a(0,j); k=0; do while(i~=0); k+=1;
>> put file(sysprint) edit(k,': (',vfmt(i,10,0),') ',(y(i+m) do m=0 to
>> l(j)-1))
>> (col(1),f(10),3 a,(l(j))a(1));
>> i=a(i,j);
>> end;
>> end;
>> end;
>> comp: proc(u,v) returns(bit(1) aligned) reorder;
>> dcl (u,v) bin fixed(31);
>> return(a(i,u)<=a(i,v));
>> end comp;
>> end subsets;
>>
>> Input:
>>
>> 56 10
>> aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
>>
>> Output:
>>
>> c: 56, v: 10
>> Input: aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
>> Distinct: 3, length: 3
>> 1: (40) bef
>> 2: (44) fgh
>> Distinct: 7, length: 7
>> 1: (50) gihbfja
>> Distinct: 8, length: 16
>> 1: (41) efffghggggihbfja
>> Distinct: 9, length: 23
>> 1: (34) cbbbbbbefffghggggihbfja
>> Distinct: 10, length: 25
>> 1: (32) dccbbbbbbefffghggggihbfja
>>
>> Here is the output from another example:
>>
>> c: 48, v: 10
>> Input: abcabcabcabcdefgabcabccdabcdeabcaaabbbcccdefghij
>> Distinct: 3, length: 3
>> 1: (1) abc
>> 2: (2) bca
>> 3: (3) cab
>> 4: (4) abc
>> 5: (5) bca
>> 6: (6) cab
>> 7: (7) abc
>> 8: (8) bca
>> 9: (9) cab
>> 10: (18) bca
>> 11: (19) cab
>> 12: (20) abc
>> 13: (31) bca
>> Distinct: 4, length: 4
>> 1: (23) cdab
>> 2: (24) dabc
>> Distinct: 5, length: 5
>> 1: (25) abcde
>> 2: (26) bcdea
>> 3: (27) cdeab
>> 4: (28) deabc
>> Distinct: 7, length: 7
>> 1: (10) abcdefg
>> 2: (11) bcdefga
>> 3: (12) cdefgab
>> 4: (13) defgabc
>> Distinct: 8, length: 8
>> 1: (41) cdefghij
>> Distinct: 9, length: 11
>> 1: (38) bcccdefghij
>> Distinct: 10, length: 14
>> 1: (35) abbbcccdefghij
>>
>>
> Professor Weinkam's elegant algorithm solves the problem using c.v.lg v
> operations (his notation), i.e., N.T.lg T (RAH Prins' notation).
>
> In the PL/I code that he gave, Professor Weinkam used two procedures
> that are not included in IBM's VAPLI: SORTA and VFMT. However, it is
> easy to supply the missing functions. In place of SORTA one may use
> quicksort or mergesort, for which PL/I code is available at
> http://rosettacode.org/wiki/Sorting_algorithms/Quicksort#PL.2FI and
> http://rosettacode.org/wiki/Sort_an_integer_array#PL.2FI. For VFMT, the
> following will do, although it allocates memory that is freed only by
> exiting the program:
>
>    vfmt: proc(v) returns(char(10) var);
>    dcl (l,v) fixed bin;
>    dcl str char(10) var ctl;
>    allocate str;
>    if v < 10 then l=1;
>    else if v < 100 then l=2;
>    else if v < 1000 then l=3;
>    else l=4;
>    put string (str) edit (v) (F(l));
>    return (str);
>    end vfmt;
>
> -- mecej4

There is no earthly reason for str to be CONTROLLED.

Indeed, there is no earthly reason for any variable in any of the above
code to be CONTROLLED. Get the bounds, and then declare the arrays as
AUTOMATIC in a BEGIN block. CONTROLLED storage is virtually always a
mistake; why do you think no other language has ever had it? It was a
botched first draft of BASED, which was itself a botched first draft of
HANDLE.

--
John W Kennedy
"The pathetic hope that the White House will turn a Caligula into a
Marcus Aurelius is as na�ve as the fear that ultimate power inevitably
corrupts."
-- James D. Barber (1930-2004)

```
 0
Reply jwkenne (1440) 5/20/2010 9:00:42 PM

```mecej4 wrote:
> On 5/18/2010 3:27 AM, James J. Weinkam wrote
>
.....

> In the PL/I code that he gave, Professor Weinkam used two procedures
> that are not included in IBM's VAPLI: SORTA and VFMT. However, it is
> easy to supply the missing functions. In place of SORTA one may use
> quicksort or mergesort, for which PL/I code is available at
> http://rosettacode.org/wiki/Sorting_algorithms/Quicksort#PL.2FI and
> http://rosettacode.org/wiki/Sort_an_integer_array#PL.2FI. For VFMT, the
> following will do, although it allocates memory that is freed only by
> exiting the program:
>
>   vfmt: proc(v) returns(char(10) var);
>   dcl (l,v) fixed bin;
>   dcl str char(10) var ctl;
>   allocate str;
>   if v < 10 then l=1;
>   else if v < 100 then l=2;
>   else if v < 1000 then l=3;
>   else l=4;
>   put string (str) edit (v) (F(l));
>   return (str);
>   end vfmt;
>
Since the OP indicate he could handle programming details, I didn't think it necessary to include
the sort subroutine or the formatting routine, which is purely cosmetic in any case. I will post the
routines upon request.

> -- mecej4
```
 0

```In <EY0Jn.4632\$z%6.1904@edtnps83>, on 05/20/2010
at 02:17 AM, "James J. Weinkam" <jjw@nospicedham.cs.sfu.ca> said:

>Looking at the subsequent posts in this thread, it strikes me
>that the process being used to deal  with messages posted to
>multiple news groups some of which are moderated is ill
>conceived.

That's the result of taking a superficial look at a complex problem.

>The outcome should be the same as if the message were posted
>separately to each group.

That would be fundamentally wrong, for reasons that have been hashed

--
Shmuel (Seymour J.) Metz, SysProg and JOAT  <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action.  I reserve the
right to publicly post or ridicule any abusive E-mail.  Reply to
domain Patriot dot net user shmuel+news to contact me.  Do not

```
 0

```Robert AH Prins wrote:
>
> His code is in Pascal, and until I started running it for parts of the
> input string I had had blind faith in it. Wrong, as it turns out as
> there are three strings that it does not process correctly, and all my
> debugging hasn't given me any clues as to the why.
>
Can you send a copy of the data set that fails? Also, if it isn't prying into your business, would
you describe the application this problem is abstracted from? You indicated in your original post
that the values (characters in the posted esample) were actually indices into a table in the real
application.

Finally, the code I posted has two bugs that I know of so far; but they didn't prevent the algorithm
from getting the correct answer. Has anyone spotted them?
```
 0

```/*
Here is the final version of my solution.

Let:    N       number of cells, size of the input
V       number of values, size of symbol set.
M       the number of distinct values in the largest sequence.
T       the size of the largest n-in-n sequence, i.e. the
largest sequence consisting of distinct values only.

As explained in my previous message <4bf43d60\$0\$22939\$c5fe704e@news6.xs4all.nl>
the algorithm uses two different search strategies:

Plan A

Find the largest sequence with size M

for n from M downwards search n-in-m sequences, keeping only sequences
with minimal value of m.
Print sequence if n<m. (If n=m the same sequence will be found again
in the next step).

Now n=m=T.

Plan B
Prints all maximal n-in-n sequences.

Plan A scans the input 1+(M-T) .. 2+(M-T) times, depending on the
input position of the longest sequence.
Plan B scans the input once.

The running time is therefore bounded by O(N * (3+(M-T)) which
is typically better than or equal to O(N*C).
For small values of N the cost of erasing the frequency array should
also be accounted for.

Two arrays are used:
found[N]    Intermediate results of plan A.
freq[C]     Frequency count for all symbols in the sliding window

Finding sequences is done using a straightforward sliding window
scan of the input. The values in freq[] are updated when the window
edges move and the numer of unique symbols is updated when a
frequency count changes between zero and non-zero.

The output is unsorted.

*/

#include <stdio.h>
#include <string.h>

#define C 256               // Size of alphabet
#define ONE 1               // Pascal array bias convention

void printres (unsigned char *data, unsigned pos, unsigned l, unsigned w)
{
unsigned i;
printf("len %2d aperture %2d pos %2d\t", l, w, ONE+pos);
for (i=pos; i<pos+w; i++) printf("%c", data[i]);
printf("\n");
}

void findseqs (unsigned char *data, unsigned len)
{
//  printf("%s\n", data);
unsigned i;
unsigned uniq = 0;              // Nr unique symbols in window
unsigned apub = ~0;             // Aperture upper bound (aka "m")

unsigned nrfound = /* to keep gcc happy */ 0;
unsigned found[len];            // intermediate results

unsigned freq[C];               // Frequency count
bzero(freq, sizeof(freq));

#   define inc(c) do { if(!freq[c]) uniq++; freq[c]++; } while(0)
#   define dec(c) do { freq[c]--; if(!freq[c]) uniq--; } while(0)
#   define WINSIZE ((r-l)+1)

unsigned l = 0;                 // Start of window
unsigned r = len-1;             // Inclusive end of window
unsigned lim = len-1;           // inclusive end of data

for (i=0; i<len; i++) {
inc(data[i]);
}
if (uniq < 3) return;

// Trim left & right edges

while (freq[data[r]]>1) { dec(data[r]); r--; }
while (freq[data[l]]>1) { dec(data[l]); l++; }

// From here on we want to find only sequences that are shorter or
// equal to what we have found above:

unsigned goal = uniq;

while(1) {

//  data[l..r] contains "goal" unique symbols
//  freq[]     number of occurances of each symbol in data[l..r]
//  data[l]    is a unique symbol
//  data[r]    is a unique symbol

if (WINSIZE <= apub) {          // Short enough?

// A candidate. May be premature if we have shorter sequences
// with the same nr of unique symbols.

if (WINSIZE < apub) {
nrfound = 0;            // Dump old candidates
apub = WINSIZE;         // Next matches must be <= this one
}
found[nrfound] = l;         // Add to list of tentative results
nrfound++;
}

// Slide window to next match.

dec(data[l]); l++;
while (freq[data[l]]>1) { dec(data[l]); l++; }
while (uniq < goal && r<lim)   { r++; inc(data[r]); }

if (uniq == goal) continue;

// We've hit the right edge

bzero(freq, sizeof(freq));      // Reset window & statistics
uniq = l = r = 0;

if (apub == goal) break;        // If *only* distinct values proceed
// with plan B.

for (i=0; i<nrfound; i++) {     // Print the results
printres(data, found[i], goal, apub);
}

// Now the next shorter sequence.

goal--;

inc(data[l]);
do {
r++;
inc(data[r]);
} while (uniq != goal);

while (freq[data[l]] != 1) {
dec(data[l]);
l++;
}
// Now the precondition holds. Also, the window size is smaller
// than the aperture upper bound from the previous iteration.
}

// Plan B
// Now we consider only sequences will all symbols distinct.
// The freq[] array therefore behaves like a bitmap with values 0&1 only.

freq[data[l]] = 1;

while (r<lim) {
unsigned char cl, cr;
cl = data[l];
cr = data[r+1];

if (freq[cr] != 0) {            // Maximal sequence?
if (WINSIZE >= 3) {
printres(data, l, WINSIZE, WINSIZE);
}
do {                        // Move left edge until
cl = data[l];           // we have all symbols unique.
freq[cl] = 0;
l++;
} while (cl != cr);
}
r++;
freq[data[r]]=1;
}

// Final sequence.

if (WINSIZE >= 3) {
printres(data, l, WINSIZE, WINSIZE);
}
return;
}

int main(int argc, char **argv)
{
if (argc > 2) {
fprintf(stderr, "Usage: %s {string}\n", argv[0]);
return 1;
}
if (argc == 1) {
unsigned char buf[4096];
size_t inlen = fread(buf, sizeof(buf[0]), sizeof(buf), stdin);
findseqs( buf, (unsigned) inlen);
}
if (argc == 2) {
findseqs(argv[1], strlen(argv[1]));
}

return 0;
}
```
 0

```James J. Weinkam wrote:
> Robert AH Prins wrote:
>>
>> His code is in Pascal, and until I started running it for parts of the
>> input string I had had blind faith in it. Wrong, as it turns out as
>> there are three strings that it does not process correctly, and all my
>> debugging hasn't given me any clues as to the why.
>>
> Can you send a copy of the data set that fails? Also, if it isn't prying
> into your business, would you describe the application this problem is
> abstracted from? You indicated in your original post that the values
> (characters in the posted esample) were actually indices into a table in
> the real application.

I would also love to see that, I really do believe that my current
approach is sound.
>
> Finally, the code I posted has two bugs that I know of so far; but they
> didn't prevent the algorithm from getting the correct answer. Has anyone
> spotted them?

No, we have all been to busy debugging our own algorithms! :-)

The raw output (not filtered to only keep the best solution) is as
follows for the sample input:

The rows are in order:

Nr of unique symbols
String length
Starting position
The found string

3   5  11 abbbc
4  14  11 abbbcbcccccccd
3   4  31 dccb
4  10  31 dccbbbbbbe
5  30  11 abbbcbcccccccddddddddccbbbbbbe
3   3  39 bef
4   9  33 cbbbbbbef
5  11  31 dccbbbbbbef
6  31  11 abbbcbcccccccddddddddccbbbbbbef
4   6  39 befffg
6  14  31 dccbbbbbbefffg
7  34  11 abbbcbcccccccddddddddccbbbbbbefffg
3   3  43 fgh
4   6  40 efffgh
5   7  39 befffgh
6  13  33 cbbbbbbefffgh
7  15  31 dccbbbbbbefffgh
8  35  11 abbbcbcccccccddddddddccbbbbbbefffgh
6  12  39 befffghggggi
8  20  31 dccbbbbbbefffghggggi
9  40  11 abbbcbcccccccddddddddccbbbbbbefffghggggi
3   3  49 gih
3   3  50 ihb
4   4  49 gihb
3   3  51 hbf
4   4  50 ihbf
5   5  49 gihbf
3   3  52 bfj
4   4  51 hbfj
5   5  50 ihbfj
6   6  49 gihbfj
7  15  40 efffghggggihbfj
9  24  31 dccbbbbbbefffghggggihbfj
10  44  11 abbbcbcccccccddddddddccbbbbbbefffghggggihbfj
3   3  53 fja
4   4  52 bfja
5   5  51 hbfja
6   6  50 ihbfja
7   7  49 gihbfja
8  16  40 efffghggggihbfja
9  23  33 cbbbbbbefffghggggihbfja
10  25  31 dccbbbbbbefffghggggihbfja

Filtering is trivial:

First remove all non-shortest strings for a given # of unique symbols
Next remove all strings that either start or end at the same position as
another string while being exactly one character shorter. I.e. remove
all pure substrings.

Doing the first stage manually on the set above result in:

3   3  39 bef
3   3  43 fgh
3   3  49 gih
3   3  50 ihb
4   4  49 gihb
3   3  51 hbf
4   4  50 ihbf
5   5  49 gihbf
3   3  52 bfj
4   4  51 hbfj
5   5  50 ihbfj
6   6  49 gihbfj
3   3  53 fja
4   4  52 bfja
5   5  51 hbfja
6   6  50 ihbfja
7   7  49 gihbfja
8  16  40 efffghggggihbfja
9  23  33 cbbbbbbefffghggggihbfja
10  25  31 dccbbbbbbefffghggggihbfja

The second stage removes the proper substrings, starting from the shortest:

3   3  39 bef
3   3  43 fgh
7   7  49 gihbfja
8  16  40 efffghggggihbfja
9  23  33 cbbbbbbefffghggggihbfja
10  25  31 dccbbbbbbefffghggggihbfja

Except for the use of zero-based counting of starting position, this
looks identical to the original request. :-)

Running time is still O(N*sqrt(C)) where N is the input length and C is
the character size, and total memory use is O(N), i.e. the same as the
data itself.

With 4K inputs total data size would be 12KB, but only 4.5 KB would be
within the working set at any given time, so L1 hit rates are very close
to 100%.

int scan(byte *data, int data_len)
{
int c = 0;
for (int i = 2; i < data_len; i++) {
byte b = data[i];
for (int j = i-1; j >= 0; j--) {
// A previously seen character cannot end a new longest sequence!
// It also cannot add to the sequence count for this or any
// longer string, so stop scanning!
if (data[j] == b) break;

int len = i-j+1;
// Increment the count of unique values seen when starting
// from position [j]
int unique_values = ++count[j];
if (len <= shortest[unique_values]) {
if (len >= 3)
save(unique_values, len, j);
shortest[unique_values] = len;
c++;
}
}
}
return c;
}

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```"John W Kennedy" <jwkenne@attglobal.net> wrote in message news:4bf5a2f7\$0\$22525\$607ed4bc@cv.net...
| On 2010-05-18 22:02:31 -0400, mecej4 said:
|
| There is no earthly reason for str to be CONTROLLED.
|
| Indeed, there is no earthly reason for any variable in any of the above
| code to be CONTROLLED. Get the bounds, and then declare the arrays as
| AUTOMATIC in a BEGIN block. CONTROLLED storage is virtually always a
| mistake; why do you think no other language has ever had it?

Fortran has had controlled storage since 1991.

And CONTROLLED storage is/was not a mistake.

| It was a botched first draft of BASED,

No it wassn't.

```
 0
Reply robin51 (256) 5/21/2010 8:41:03 AM

``` >Terje Mathisen wrote:

> [Snip]
>
> With 4K inputs total data size would be 12KB, but only 4.5 KB would be within the working set at
any given time, so L1 hit rates are very close to 100%.
>
> int scan(byte *data, int data_len)
> {
>     int c = 0;
>     for (int i = 2; i < data_len; i++) {
>         byte b = data[i];
>         for (int j = i-1; j >= 0; j--) {
> // A previously seen character cannot end a new longest sequence!
> // It also cannot add to the sequence count for this or any
> // longer string, so stop scanning!
>             if (data[j] == b) break;
>
>             int len = i-j+1;
> // Increment the count of unique values seen when starting
> // from position [j]
>             int unique_values = ++count[j];
>             if (len <= shortest[unique_values]) {
>                 if (len >= 3)
>                     save(unique_values, len, j);
>                 shortest[unique_values] = len;
>                 c++;
>             }
>         }
>     }
>     return c;
> }
As Terje has pointed out a number of times, taking into account
the hit rate for L1 and L2 cache will have a significant impact
on how fast the code runs.  While many of the other contributions
to this thread have ignored this aspect, a program which requires
only L1 cache for most of its data can run many times faster,
perhaps up to ten times as fast based on some of the tests I did

Jerome Fine
```
 0

```On 2010-05-21 04:41:03 -0400, robin said:

> "John W Kennedy" <jwkenne@attglobal.net> wrote in message
> news:4bf5a2f7\$0\$22525\$607ed4bc@cv.net...
> | On 2010-05-18 22:02:31 -0400, mecej4 said:
> |
> | There is no earthly reason for str to be CONTROLLED.
> |
> | Indeed, there is no earthly reason for any variable in any of the above
> | code to be CONTROLLED. Get the bounds, and then declare the arrays as
> | AUTOMATIC in a BEGIN block. CONTROLLED storage is virtually always a
> | mistake; why do you think no other language has ever had it?
>
> Fortran has had controlled storage since 1991.

No, it has had dynamic (allocatable) arrays, which are not one-tenth as
complex.

> And CONTROLLED storage is/was not a mistake.

I repeat: it was a botch that was never repeated.

>> It was a botched first draft of BASED,
>
> No it wassn't.

....which demonstrates conclusively that you know nothing of the
development of PL/I. Anyone who was there at the time knows how in the
early days, the syntax for declaring a based variable was:

DECLARE P POINTER;
DECLARE 1 RECORD CONTROLLED(P), 2...;

--
John W Kennedy
If Bill Gates believes in "intelligent design", why can't he apply it
to Windows?

```
 0
Reply jwkenne (1440) 5/21/2010 2:22:38 PM

```"John W Kennedy" <jwkenne@attglobal.net> wrote in message news:4bf69720\$0\$31260\$607ed4bc@cv.net...
| On 2010-05-21 04:41:03 -0400, robin said:
|
| > "John W Kennedy" <jwkenne@attglobal.net> wrote in message
| > news:4bf5a2f7\$0\$22525\$607ed4bc@cv.net...
| > | On 2010-05-18 22:02:31 -0400, mecej4 said:
| > |
| > | There is no earthly reason for str to be CONTROLLED.
| > |
| > | Indeed, there is no earthly reason for any variable in any of the above
| > | code to be CONTROLLED. Get the bounds, and then declare the arrays as
| > | AUTOMATIC in a BEGIN block. CONTROLLED storage is virtually always a
| > | mistake; why do you think no other language has ever had it?
| >
| > Fortran has had controlled storage since 1991.
|
| No, it has had dynamic (allocatable) arrays, which are not one-tenth as
| complex.

Fortran's ALLOCATABLE array is an example of controlled storage.
Just like PL/I, storage is alllocated and freed explicitly by the
programmer.  The relevent statement correcponding to FREE
is DEALLOCATE.

| > And CONTROLLED storage is/was not a mistake.
|
| I repeat: it was a botch that was never repeated.
|
| >> It was a botched first draft of BASED,
| >
| > No it wassn't.
|
| ...which demonstrates conclusively that you know nothing of the
| development of PL/I. Anyone who was there at the time knows how in the
| early days, the syntax for declaring a based variable was:
|
|    DECLARE P POINTER;
|    DECLARE 1 RECORD CONTROLLED(P), 2...;

We haven't been talking about BASED variables;
the discussion is about controlled storage.

```
 0
Reply robin51 (256) 5/21/2010 2:59:37 PM

```Jerome H. Fine wrote:
>  >Terje Mathisen wrote:
>
>
>  > [Snip]
>  >
>  > With 4K inputs total data size would be 12KB, but only 4.5 KB would
> be within the working set at any given time, so L1 hit rates are very
> close to 100%.
>  >
>  > int scan(byte *data, int data_len)
>  > {
>  > int c = 0;
>  > for (int i = 2; i < data_len; i++) {
>  > byte b = data[i];
>  > for (int j = i-1; j >= 0; j--) {
>  > // A previously seen character cannot end a new longest sequence!
>  > // It also cannot add to the sequence count for this or any
>  > // longer string, so stop scanning!
>  > if (data[j] == b) break;
>  >
>  > int len = i-j+1;
>  > // Increment the count of unique values seen when starting
>  > // from position [j]
>  > int unique_values = ++count[j];
>  > if (len <= shortest[unique_values]) {
>  > if (len >= 3)
>  > save(unique_values, len, j);
>  > shortest[unique_values] = len;
>  > c++;
>  > }
>  > }
>  > }
>  > return c;
>  > }
> As Terje has pointed out a number of times, taking into account
> the hit rate for L1 and L2 cache will have a significant impact
> on how fast the code runs. While many of the other contributions
> to this thread have ignored this aspect, a program which requires
> only L1 cache for most of its data can run many times faster,
> perhaps up to ten times as fast based on some of the tests I did

I have now timed my code:

The full scan for potential solutions with the given 56-byte input
string took about 8K cycles, while my big-O estimate said O(n*sqrt(c))
which in our case is O(56*sqrt(10)) ~= 180, i.e. each iteration took a
little over 40 cycles.

Since the number of potential solutions was quite high (40), the part of
the inner loop that saved each of them used significant amounts of time,
including a significant number of mis-predicted branches.

Still, running at 130+ cycles per input byte is still fast enough to
handle significant amounts of data.

I would love to have a copy of the "problematic" 4K input which the
original poster have alluded to!

Terje

PS. The final filter operation runs in O(n*log(n)) so it should be much
faster than the scan, at least for large (n). With the 56-byte input
however, the filter took about twice as long as the scan, i.e. the
constant terms were significantly higher!

This meant that the total processing time was actually about 25K cycles,
or 400+ cycles/byte. On my laptop (2.2GHz) this corresponds to "only"
5.5 MB/s. :-(

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```On 2010-05-21 10:59:37 -0400, robin said:

> "John W Kennedy" <jwkenne@attglobal.net> wrote in message
> news:4bf69720\$0\$31260\$607ed4bc@cv.net...
> | On 2010-05-21 04:41:03 -0400, robin said:
> |
> | > "John W Kennedy" <jwkenne@attglobal.net> wrote in message
> | > news:4bf5a2f7\$0\$22525\$607ed4bc@cv.net...
> | > | On 2010-05-18 22:02:31 -0400, mecej4 said:
> | > |
> | > | There is no earthly reason for str to be CONTROLLED.
> | > |
> | > | Indeed, there is no earthly reason for any variable in any of the above
> | > | code to be CONTROLLED. Get the bounds, and then declare the arrays as
> | > | AUTOMATIC in a BEGIN block. CONTROLLED storage is virtually always a
> | > | mistake; why do you think no other language has ever had it?
> | >
> | > Fortran has had controlled storage since 1991.
> |
> | No, it has had dynamic (allocatable) arrays, which are not one-tenth as
> | complex.
>
> Fortran's ALLOCATABLE array is an example of controlled storage.
> Just like PL/I, storage is alllocated and freed explicitly by the
> programmer.  The relevent statement correcponding to FREE
> is DEALLOCATE.

And can multiple allocations be stacked? Can they be in COMMON blocks?
Can they be SAVE variables?

>
> | > And CONTROLLED storage is/was not a mistake.
> |
> | I repeat: it was a botch that was never repeated.
> |
> | >> It was a botched first draft of BASED,
> | >
> | > No it wassn't.
> |
> | ...which demonstrates conclusively that you know nothing of the
> | development of PL/I. Anyone who was there at the time knows how in the
> | early days, the syntax for declaring a based variable was:
> |
> |    DECLARE P POINTER;
> |    DECLARE 1 RECORD CONTROLLED(P), 2...;
>
> We haven't been talking about BASED variables;
> the discussion is about controlled storage.

That's a barefaced lie, and the quotes above prove it.

--
John W Kennedy
"There are those who argue that everything breaks even in this old dump
of a world of ours. I suppose these ginks who argue that way hold that
because the rich man gets ice in the summer and the poor man gets it in
the winter things are breaking even for both. Maybe so, but I'll swear
I can't see it that way."
-- The last words of Bat Masterson

```
 0
Reply jwkenne (1440) 5/21/2010 3:43:13 PM

```On 21 Geg, 18:21, Terje Mathisen <"terje.mathisen at
tmsw.no"@giganews.com> wrote:
> Jerome H. Fine wrote:
> > =A0>Terje Mathisen wrote:
>
> > =A0> [Snip]
>
> > =A0> With 4K inputs total data size would be 12KB, but only 4.5 KB woul=
d
> > be within the working set at any given time, so L1 hit rates are very
> > close to 100%.
>
> > =A0> int scan(byte *data, int data_len)
> > =A0> {
> > =A0> int c =3D 0;
> > =A0> for (int i =3D 2; i < data_len; i++) {
> > =A0> byte b =3D data[i];
> > =A0> for (int j =3D i-1; j >=3D 0; j--) {
> > =A0> // A previously seen character cannot end a new longest sequence!
> > =A0> // It also cannot add to the sequence count for this or any
> > =A0> // longer string, so stop scanning!
> > =A0> if (data[j] =3D=3D b) break;
>
> > =A0> int len =3D i-j+1;
> > =A0> // Increment the count of unique values seen when starting
> > =A0> // from position [j]
> > =A0> int unique_values =3D ++count[j];
> > =A0> if (len <=3D shortest[unique_values]) {
> > =A0> if (len >=3D 3)
> > =A0> save(unique_values, len, j);
> > =A0> shortest[unique_values] =3D len;
> > =A0> c++;
> > =A0> }
> > =A0> }
> > =A0> }
> > =A0> return c;
> > =A0> }
> > As Terje has pointed out a number of times, taking into account
> > the hit rate for L1 and L2 cache will have a significant impact
> > on how fast the code runs. While many of the other contributions
> > to this thread have ignored this aspect, a program which requires
> > only L1 cache for most of its data can run many times faster,
> > perhaps up to ten times as fast based on some of the tests I did
> > about a year ago.
>
> I have now timed my code:
>
> The full scan for potential solutions with the given 56-byte input
> string took about 8K cycles, while my big-O estimate said O(n*sqrt(c))
> which in our case is O(56*sqrt(10)) ~=3D 180, i.e. each iteration took a
> little over 40 cycles.
>
> Since the number of potential solutions was quite high (40), the part of
> the inner loop that saved each of them used significant amounts of time,
> including a significant number of mis-predicted branches.
>
> Still, running at 130+ cycles per input byte is still fast enough to
> handle significant amounts of data.
>
> I would love to have a copy of the "problematic" 4K input which the
> original poster have alluded to!

The current set of input data comprises of "just" 2329 values, and
right now it might potentially increase to around 3K. The current set
is processed without problems,  the problem arose when I started using
the code for subsets of this. The output for three of them was plain
wrong, and this could easily be verified manually - am right now in
Lithuania and didn't really want to take it, but the shortest
erroneous input set has, IIRC, *only six or seven elements*.

As for the 4k lmit, there is a chance that some users may pool their
data, and that might increase the "strings" beyond 4k values. The
values themselves will never exceed 255.

I will post the three problematic strings, and the output that results
from them, on Wednesday when I'm back in Belgium - program will be in
Pascal or PL/I, take your pick. :)

> PS. The final filter operation runs in O(n*log(n)) so it should be much
> faster than the scan, at least for large (n). With the 56-byte input
> however, the filter took about twice as long as the scan, i.e. the
> constant terms were significantly higher!
>
> This meant that the total processing time was actually about 25K cycles,
> or 400+ cycles/byte. On my laptop (2.2GHz) this corresponds to "only"
> 5.5 MB/s. :-(

Robert
```
 0

```"John W Kennedy" <jwkenne@attglobal.net> wrote in message news:4bf6aa0e\$0\$22538\$607ed4bc@cv.net...
| On 2010-05-21 10:59:37 -0400, robin said:
|
| > "John W Kennedy" <jwkenne@attglobal.net> wrote in message
| > news:4bf69720\$0\$31260\$607ed4bc@cv.net...
| > | On 2010-05-21 04:41:03 -0400, robin said:
| > |
| > | > "John W Kennedy" <jwkenne@attglobal.net> wrote in message
| > | > news:4bf5a2f7\$0\$22525\$607ed4bc@cv.net...
| > | > | On 2010-05-18 22:02:31 -0400, mecej4 said:
| > | > |
| > | > | There is no earthly reason for str to be CONTROLLED.
| > | > |
| > | > | Indeed, there is no earthly reason for any variable in any of the above
| > | > | code to be CONTROLLED. Get the bounds, and then declare the arrays as
| > | > | AUTOMATIC in a BEGIN block. CONTROLLED storage is virtually always a
| > | > | mistake; why do you think no other language has ever had it?
| > | >
| > | > Fortran has had controlled storage since 1991.
| > |
| > | No, it has had dynamic (allocatable) arrays, which are not one-tenth as
| > | complex.
| >
| > Fortran's ALLOCATABLE array is an example of controlled storage.
| > Just like PL/I, storage is alllocated and freed explicitly by the
| > programmer.  The relevent statement correcponding to FREE
| > is DEALLOCATE.
|
| And can multiple allocations be stacked? Can they be in COMMON blocks?
| Can they be SAVE variables?

That's irrelevant.  You claimed that "no other language has ever
had it", whereas Fortran does have controlled storage, in point of fact.

```
 0
Reply robin51 (256) 5/21/2010 5:35:33 PM

```John W Kennedy wrote:
> On 2010-05-21 04:41:03 -0400, robin said:
>
>> "John W Kennedy" <jwkenne@attglobal.net> wrote in message
>> news:4bf5a2f7\$0\$22525\$607ed4bc@cv.net...
>> | On 2010-05-18 22:02:31 -0400, mecej4 said:
>> |
>> | There is no earthly reason for str to be CONTROLLED.
>> |
>> | Indeed, there is no earthly reason for any variable in any of the above
>> | code to be CONTROLLED. Get the bounds, and then declare the arrays as
>> | AUTOMATIC in a BEGIN block. CONTROLLED storage is virtually always a
>> | mistake; why do you think no other language has ever had it?
>>
>> Fortran has had controlled storage since 1991.
>
> No, it has had dynamic (allocatable) arrays, which are not one-tenth as
> complex.
>
>> And CONTROLLED storage is/was not a mistake.
>
> I repeat: it was a botch that was never repeated.
>
>>> It was a botched first draft of BASED,
>>
>> No it wassn't.
>
> ...which demonstrates conclusively that you know nothing of the
> development of PL/I. Anyone who was there at the time knows how in the
> early days, the syntax for declaring a based variable was:
>
>    DECLARE P POINTER;
>    DECLARE 1 RECORD CONTROLLED(P), 2...;
>

It's not a complete botch.  I've looked at modifying some PL/I code that
used CONTROLLED to BASED for a subset-G type implementation.  It's quite
a bit more complex.  The PL/I philosophy, for better or worse, is to let
the compiler do things if possible.
```
 0
Reply Peter_Flass (956) 5/21/2010 10:01:17 PM

```On 2010-05-21 18:01:17 -0400, Peter Flass said:

> John W Kennedy wrote:
>> On 2010-05-21 04:41:03 -0400, robin said:
>>
>>> "John W Kennedy" <jwkenne@attglobal.net> wrote in message
>>> news:4bf5a2f7\$0\$22525\$607ed4bc@cv.net...
>>> | On 2010-05-18 22:02:31 -0400, mecej4 said:
>>> |
>>> | There is no earthly reason for str to be CONTROLLED.
>>> |
>>> | Indeed, there is no earthly reason for any variable in any of the above
>>> | code to be CONTROLLED. Get the bounds, and then declare the arrays as
>>> | AUTOMATIC in a BEGIN block. CONTROLLED storage is virtually always a
>>> | mistake; why do you think no other language has ever had it?
>>>
>>> Fortran has had controlled storage since 1991.
>>
>> No, it has had dynamic (allocatable) arrays, which are not one-tenth as
>> complex.
>>
>>> And CONTROLLED storage is/was not a mistake.
>>
>> I repeat: it was a botch that was never repeated.
>>
>>>> It was a botched first draft of BASED,
>>>
>>> No it wassn't.
>>
>> ...which demonstrates conclusively that you know nothing of the
>> development of PL/I. Anyone who was there at the time knows how in the
>> early days, the syntax for declaring a based variable was:
>>
>> DECLARE P POINTER;
>> DECLARE 1 RECORD CONTROLLED(P), 2...;
>>
>
> It's not a complete botch.  I've looked at modifying some PL/I code
> that used CONTROLLED to BASED for a subset-G type implementation.  It's
> quite a bit more complex.  The PL/I philosophy, for better or worse, is
> to let the compiler do things if possible.

Did you even need BASED?  99 times out of 100, what you really need

GET (M, N);
BEGIN;
DCL A(M, N);

In over a quarter century of using PL/I for nearly all my programming,
including commercial batch programs, system programs, and home-grown
on-line systems (with just enough assembler to talk to the terminals),
I never found a program that wasn't simpler, more reliable, and faster
with CONTROLLED removed.

--
John W Kennedy
"Those in the seat of power oft forget their failings and seek only the
obeisance of others!  Thus is bad government born!  Hold in your heart
that you and the people are one, human beings all, and good government
shall arise of its own accord!  Such is the path of virtue!"
-- Kazuo Koike.  "Lone Wolf and Cub:  Thirteen Strings" (tr. Dana Lewis)

```
 0
Reply jwkenne (1440) 5/21/2010 10:48:45 PM

```John W Kennedy <jwkenne@attglobal.net> wrote:
(snip)

> Did you even need BASED?  99 times out of 100, what you really need

>   GET (M, N);
>   BEGIN;
>      DCL A(M, N);

I might not believe 99, but I agree that in a large fraction
of the cases that probably works fine, and is more readable.

> In over a quarter century of using PL/I for nearly all my programming,
> including commercial batch programs, system programs, and home-grown
> on-line systems (with just enough assembler to talk to the terminals),
> I never found a program that wasn't simpler, more reliable, and faster
> with CONTROLLED removed.

I only remember doing that one time, with CALL/OS, which doesn't
have CONTROLLED.  I probably would have used CONTROLLED, though,

But it isn't so unusual to have more than one thing allocated,
and to need to deallocate one, or reallocate one, and not the
others such that BEGIN won't do it.

I might say 2 out of 3, though.

-- glen
```
 0
Reply gah (12856) 5/22/2010 12:16:25 AM

```"John W Kennedy" <jwkenne@attglobal.net> wrote in message news:4bf70dbe\$0\$2874\$607ed4bc@cv.net...

| Did you even need BASED?  99 times out of 100, what you really need
|
|   GET (M, N);
|   BEGIN;
|      DCL A(M, N);
|
| In over a quarter century of using PL/I for nearly all my programming,
| including commercial batch programs, system programs, and home-grown
| on-line systems (with just enough assembler to talk to the terminals),
| I never found a program that wasn't simpler, more reliable, and faster
| with CONTROLLED removed.

Linked lists?  Queueing? etc etc etc.

```
 0
Reply robin51 (256) 5/22/2010 12:49:27 AM

```John W Kennedy wrote:
> On 2010-05-21 18:01:17 -0400, Peter Flass said:
>
>> John W Kennedy wrote:
>>> On 2010-05-21 04:41:03 -0400, robin said:
>>>
>>>> "John W Kennedy" <jwkenne@attglobal.net> wrote in message
>>>> news:4bf5a2f7\$0\$22525\$607ed4bc@cv.net...
>>>> | On 2010-05-18 22:02:31 -0400, mecej4 said:
>>>> |
>>>> | There is no earthly reason for str to be CONTROLLED.
>>>> |
>>>> | Indeed, there is no earthly reason for any variable in any of the
>>>> above
>>>> | code to be CONTROLLED. Get the bounds, and then declare the arrays as
>>>> | AUTOMATIC in a BEGIN block. CONTROLLED storage is virtually always a
>>>> | mistake; why do you think no other language has ever had it?
>>>>
>>>> Fortran has had controlled storage since 1991.
>>>
>>> No, it has had dynamic (allocatable) arrays, which are not one-tenth
>>> as complex.
>>>
>>>> And CONTROLLED storage is/was not a mistake.
>>>
>>> I repeat: it was a botch that was never repeated.
>>>
>>>>> It was a botched first draft of BASED,
>>>>
>>>> No it wassn't.
>>>
>>> ...which demonstrates conclusively that you know nothing of the
>>> development of PL/I. Anyone who was there at the time knows how in
>>> the early days, the syntax for declaring a based variable was:
>>>
>>> DECLARE P POINTER;
>>> DECLARE 1 RECORD CONTROLLED(P), 2...;
>>>
>>
>> It's not a complete botch.  I've looked at modifying some PL/I code
>> that used CONTROLLED to BASED for a subset-G type implementation.
>> It's quite a bit more complex.  The PL/I philosophy, for better or
>> worse, is to let the compiler do things if possible.
>
> Did you even need BASED?  99 times out of 100, what you really need
>
>   GET (M, N);
>   BEGIN;
>      DCL A(M, N);
>
> In over a quarter century of using PL/I for nearly all my programming,
> including commercial batch programs, system programs, and home-grown
> on-line systems (with just enough assembler to talk to the terminals), I
> never found a program that wasn't simpler, more reliable, and faster
> with CONTROLLED removed.
>

The program uses CONTROLLED as it was intended -- as a pushdown stack.
I can replicate the behavior with BASED, but I'd need an anchor for each
variable, and have to redefine each to include a link, plus write
allocate and deallocate routines that deal with the chain.  Given a
number of such variables, this isn't trivial.  I'll just wait until Iron
Spring gets CONTROLLED.
```
 0
Reply Peter_Flass (956) 5/22/2010 12:54:16 AM

```In article <11lic7-2u52.ln1@ntp.tmsw.no>,
Terje Mathisen <"terje.mathisen at tmsw.no"@giganews.com> writes:
>
> I have now timed my code:
>
> The full scan for potential solutions with the given 56-byte input
> string took about 8K cycles, while my big-O estimate said O(n*sqrt(c))
> which in our case is O(56*sqrt(10)) ~= 180, i.e. each iteration took a
> little over 40 cycles.
>

That sounds reasonable. I've timed my solution also, and depending on
the CPU and compiler version and switches I get values from 7200 to
9999 cycles.

> Since the number of potential solutions was quite high (40), the part of
> the inner loop that saved each of them used significant amounts of time,
> including a significant number of mis-predicted branches.
>
> Still, running at 130+ cycles per input byte is still fast enough to
> handle significant amounts of data.
>

Sounds like you've beaten me. My O() depends on the size of the longest
sequence minus the size of the longest unique sequence, which is good
for the reference data. If I time 4K of random data I get 7250 cycles per
input byte. Since sqrt(C) is 16 for that dataset you ought to get about
800 cycles/byte ( http://www.fi.uu.nl/~ftu/random.bin )
```
 0

```"Robert AH Prins" <spamtrap@prino.org> ha scritto nel messaggio
news:85akp6F4bpU1@mid.individual.net...
> Hi all,

for answer, you use ot code in asm group.
than the problem in the few i understand
it seems to me about DNA search strings
or change strings; i'm again all this

```
 0

```In comp.lang.pascal.borland message <EY0Jn.4632\$z%6.1904@edtnps83>, Thu,
20 May 2010 02:17:08, James J. Weinkam <jjw@nospicedham.cs.sfu.ca>
posted:

>
>Looking at the subsequent posts in this thread, it strikes me that the
>process being used to deal with messages posted to multiple news groups
>some of which are moderated is ill conceived.

No.  It is well-defined and appropriate.  All moderators should
understand it.  All moderating software should implement it.

> The outcome should be the same as if the message were posted
>separately to each group.

No.  If you want to do that :
(a) you are one or more of : foolish, ignorant, or inconsiderate,
(b) you can do it directly and easily enough.

You should have permission from SFU (commonly considered, IIRC, a
reputable organisation) to cause span to be directed to that address?

--
(c) John Stockton, nr London UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Grandson-Of-RFC1036 is released.  RFC 5536 Netnews Article Format is a
subset of Internet Message Format which is described in RFC 5532.  The
RFCs are read together to determine standard Netnews article format.
```
 0

```In <4bf70dbe\$0\$2874\$607ed4bc@cv.net>, on 05/21/2010
at 06:48 PM, John W Kennedy <jwkenne@attglobal.net> said:

>Did you even need BASED?

>I never found a program that wasn't simpler, more reliable, and
>faster  with CONTROLLED removed.

You've never seen a program that was simpler if it used a stack?

Note: I'm not challenging your account of the history, just looking at
the utility after the fact. Nor am I defending use of based or
controlled in contexts where they are not the appropriate tools for
the job.

BTW, what's the overhead for a BEGIN block with current compilers?
That used to be an excuse[1] for misuse of controlled, although with
the "optimizing" compiler I know that the overhead was reduced.

[1] The same overhead issue applied to the allocate.

--
Shmuel (Seymour J.) Metz, SysProg and JOAT  <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action.  I reserve the
right to publicly post or ridicule any abusive E-mail.  Reply to
domain Patriot dot net user shmuel+news to contact me.  Do not

```
 0
Reply spamtrap16 (3722) 5/23/2010 12:39:33 AM

```"Shmuel (Seymour J.) Metz" <spamtrap@library.lspace.org.invalid> wrote in message
news:4bf87945\$1\$fuzhry+tra\$mr2ice@news.patriot.net...
| In <4bf70dbe\$0\$2874\$607ed4bc@cv.net>, on 05/21/2010
|   at 06:48 PM, John W Kennedy <jwkenne@attglobal.net> said:
|
| >Did you even need BASED?
|
|
| >I never found a program that wasn't simpler, more reliable, and
| >faster  with CONTROLLED removed.
|
| You've never seen a program that was simpler if it used a stack?
|
| Note: I'm not challenging your account of the history, just looking at
| the utility after the fact. Nor am I defending use of based or
| controlled in contexts where they are not the appropriate tools for
| the job.
|
| BTW, what's the overhead for a BEGIN block with current compilers?

A BEGIN block is treated as if it were a procedure.

| That used to be an excuse[1] for misuse of controlled, although with
| the "optimizing" compiler I know that the overhead was reduced.
|
| [1] The same overhead issue applied to the allocate.

Well, no.  Certainly storage is allocated [for ALLOCATE], but only for the
named variables.
But that's the end of it.
For a block, not only must storage be allocated for ALL variables
in the block, including temooraries, but also
there is the prologue and epilogue accociated with a block.
This may require that conditions be set up and dis-established, etc.

So, in the end, ALLOCATE has less overhead than a block.

```
 0
Reply robin51 (256) 5/23/2010 3:15:37 AM

```"Dr J R Stockton" <reply1020@nospicedham.merlyn.demon.co.uk> wrote in
message news:IJmTEXPevA+LFwGc@invalid.uk.co.demon.merlyn.invalid...
> In comp.lang.pascal.borland message <EY0Jn.4632\$z%6.1904@edtnps83>, Thu,
> 20 May 2010 02:17:08, James J. Weinkam <jjw@nospicedham.cs.sfu.ca>
> posted:
>
>>
>>Looking at the subsequent posts in this thread, it strikes me that the
>>process being used to deal with messages posted to multiple news groups
>>some of which are moderated is ill conceived.
>
> No.  It is well-defined and appropriate.  All moderators should
> understand it.  All moderating software should implement it.
>

The word "should" implies a *desired* state.  In practice, well...

For instance, one would expect all such software to pass-along the MIME
headers.  When we were using PyModerator last year, we discovered that it
did not do this.  So we patched it.

>> The outcome should be the same as if the message were posted
>>separately to each group.
>
> No.  If you want to do that :
>   (a) you are one or more of : foolish, ignorant, or inconsiderate,

I'm pretty sure we don't do that here at CLAX.  But, it is possible we might
not be following all of the traditional practice in other areas.  Mainly
because our tree-house is still a bit in an "under construction" state.
However, we are none of the above 'colorful words' -- we are simply
pragmatic.  :)

I am currently in the process of porting our moderation client software from
Linux to Windows (ASM code just "wants" to be OS agnostic!!), so I intend to
revisit all of the related "handbooks", RFCs, etc. during this time.

> You should have permission from SFU (commonly considered, IIRC, a
> reputable organisation) to cause span to be directed to that address?
>

A good point.  But I bet that Mr. Weinkam wants to tell you that it "isn't
his organisation's acronym.  :)

Nathan.

```
 0

```Dick Wesseling wrote:
> In article<11lic7-2u52.ln1@ntp.tmsw.no>,
> 	Terje Mathisen<"terje.mathisen at tmsw.no"@giganews.com>  writes:
>> Still, running at 130+ cycles per input byte is still fast enough to
>> handle significant amounts of data.
>>
>
> Sounds like you've beaten me. My O() depends on the size of the longest
> sequence minus the size of the longest unique sequence, which is good
> for the reference data. If I time 4K of random data I get 7250 cycles per
> input byte. Since sqrt(C) is 16 for that dataset you ought to get about
> 800 cycles/byte ( http://www.fi.uu.nl/~ftu/random.bin )

It is worse (or better, seen from my point):

With 4K random input the scan took 252K cycles, which means that I
averaged about 83 cycles/byte. (I.e. very fast!)

The problem was that with totally random data, there are a _lot_ of both
potential and real solutions, I found 4099 candidates (i.e. just over
one per starting byte!) and 121 after filtering.

The filter process here took a _lot_ of time though: 8M cycles!

This means that it needed 2000 cycles/input byte. :-(

This simply shows that the problem space (and therefore required
algorithm as well) almost certainly won't contain totally random data,
and this will most probably help my code...

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```In <4bf89ddf\$0\$89666\$c30e37c6@exi-reader.telstra.net>, on 05/23/2010
at 01:15 PM, "robin" <robin51@dodo.com.au> said:

>A BEGIN block is treated as if it were a procedure.

Water is wet.

>Well, no.  Certainly storage is allocated [for ALLOCATE], but only
>for the named variables.

The storage for a DSA is pretty much noise; what matters when talking
FREEMAIN.

>This may require that conditions be set up and dis-established, etc.

Conditions are set up by the ON statement.

--
Shmuel (Seymour J.) Metz, SysProg and JOAT  <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action.  I reserve the
right to publicly post or ridicule any abusive E-mail.  Reply to
domain Patriot dot net user shmuel+news to contact me.  Do not

```
 0
Reply spamtrap16 (3722) 5/23/2010 1:17:41 PM

```Robert AH Prins wrote:

> Hi all,
>
> Can anyone give me some hints as to solve the following problem,
> preferably in a way that is faster than the way I used to do it, and
> without the bug in the current version;
>
> Problem description:
>
> The program processes an array of N cells, with N up to about 4096. The
> array is filled from the start, it may not be completely full but there
> are no empty cells between full cells. A cell contains a value of
> 1..255, so logically every cell would be a byte, but if processing
> efficiency requires each cell to be a word or even a double-word, that
> would be no problem. The values are normalized, they are effectively
> indexes into a table, so if the table contains just T entries, the
> values would go from 1 to T.
>
> Given the above setup, the problem is to find *all* sub-arrays that
> contain 3..T distinct elements and are as short as possible, i.e. if
> there are 42 sub-arrays of three elements containing three distinct
> values, there is no need to find sub-arrays of four elements containing
> three distinct values. Also, a shorter sub-array may *never* be a part
> of a longer sub-array containing *only* distinct values.
>
> An example, suppose only the array contains 56 elements, in this case
> with, for clarity, values a..j:
>
> 1 2 3 4 5 5
> ....5....0....5....0....5....0....5....0....5....0....56
> aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
>
> in this case the program should find the following sub-arrays:
>
> len aperture from to value
> 3 3 40 42 bef
> 3 3 44 46 fgh
> 7 7 50 56 gihbfja
> 8 16 41 56 efffghggggihbfja, containing efghibja
> 9 23 34 56 cbbbbbbefffghggggihbfja, containing cbefghija
> 10 25 32 56 dccbbbbbbefffghggggihbfja, containing dcbefghija
>
> It is possible to find sub-arrays with 4, 5, or 6 distinct values, but
> they are either longer than the series of 7-in-7 (12-25 contain abcd),
> or are part of the 7-in-7 series and as such they should not be included!
>
> My AD 1996 program used to slide a window that started with three
> elements (plus a sentinel on either side) over the big array and spit
> out the position of the first element if it found three distinct
> elements in the window *and* the two sentinels also contained any of the
> values in the window. The program included some fairly minor
> optimizations, such as immediately sliding the window to the last of a
> series of equal values, in the example above, the three position wide
> window would start by covering pos 6..8, and next it would move to
> 11..13, etc. If at any moment the program would find that all elements
> in the window *and the sentinel on the right* would be distinct, the
> window size would be increased by one and the scan would continue.
>
> Taking the above example, a window with an aperture of four elements
> would start its slide at position 6..9. Once it reached position 49,
> with "gihb" visible and "g" and "f" in the two sentinels, the window
> would be widened to five characters as the "f" is distinct from the four
> "visible" characters, with a new sentinel of "j". Given that this is
> again distinct from the now five characters in the window, the process
> would repeat itself, eventually resulting in the string of seven
> distinct characters starting at position 50.
>
> Once the string of seven distinct characters has been found, the window
> is widened to *nine* (if there had been a string of 8-in-8 it would have
> been found by the previous slide!) characters and the process restarts
> at position 6..14, but in the end it fails to find a series of eight
> distinct characters and so the process is repeated with windows of 10,
> 11, 12, 13, 14, 15 and finally 16 characters, when a string of eight
> distinct characters is finally found.
>
> The problem is, for relatively low values of N and C, the process may
> not be overly efficient, but it works. However, once N and C increase,
> (N is theoretically unbounded, C has an upper limit of around 210), the
> process becomes horribly inefficient: using the old IBM V2.3.0 OS PL/I
> compiler, which had a statement count facility, this method required the
> execution of 556,379,518 PL/I statements (and a number of actual machine
> instruction that is at least one order of magnitude greater) for the
> current values of N(2329) and C(66).
>
> As for some figures on required restarts, the final series of 66 is
> found in a sub-array with a length of 1950 elements and with a series of
> 65 in a sub-array of only 1889 elements, this means that the slide had
> to be restarted 61 times! Even more restarts, 312(!), are needed when
> going from 61 to 62 elements.
>
> In 1998 I posted the problem to comp.lang.pascal.borland, the post
> 9,960 (out of 28,517) posts in that group.
>
> Three people claimed that there was a much faster way, and two of them
> backed this up with programs, Brent Beach and Paul Green, with Paul's
> solution being the fastest by a fair margin. The results of his code
> matched my output, I replaced my code with his code, and that seemed to
> be the end of the story...
>
> Until a few months back...
>
> ... when I started converting the original Pascal code in assembler.
> While doing so I decided that it would be interesting to use this
> procedure in another part of the code. However, the results were not
> what I expected: result rows were missing.

If you want to do this in assembler, you might
consider to use an alternative approach:

1. Allocate 4,352 byte of memory.
2. Scan through the input:

EDI = address array (1st byte)
ECX = input size

Store EDI, ESI and EBP on stack or another lo-
before it is used (2,048 byte).

....
0:
dec  EBP
movz EAX,BYTE 00[ESI]
movz EBX,BYTE 00[EBP]
cmp DWORD 00[EDI + EAX * 8],0
jne 1f
mov DWORD 00[EDI + EAX * 8],ESI
1:
cmp DWORD 04[EDI + EBX * 8],0
jne 2f
mov DWORD 04[EDI + EBX * 8],EBP
2:
inc ESI
dec ECX
jne 0b
....

Time: ~131,072 clocks or better.

This one pass function stores the addresses of
the 1st and last occurence of a character. The
array is organised as follows:

0000 1st_00, last_00, 1st_01, last_01
0010 1st_02, last_02, 1st_03, last_03
....
07F0 1st_FE, last_FE, 1st_FF, last_FF

Array entries with a value of zero tell us the
entries are equal, the corresponding character
was found only once.

Next, sort out all non-existing characters and
create three work tables:

EDI = address array (1st byte)
EDI + 0x0800 + (n * 4) = 1st occurence
EDI + 0x0C00 + (n * 4) = last occurence
EDI + 0x1000 + (n * 1) = associated char

....
xor EAX,EAX
mov EBX,254
0:
mov ESI,DWORD 0x00[EDI + EBX * 8]
mov EBP,DWORD 0x04[EDI + EBX * 8]
mov ECX,DWORD 0x08[EDI + EBX * 8]
mov EDX,DWORD 0x0C[EDI + EBX * 8]
test ECX,ECX
je 1f
mov DWORD 0x0800[EDI + EAX * 4],ECX
mov DWORD 0x0C00[EDI + EAX * 4],EDX
mov BYTE 0x1000[EDI + EAX * 4],BL
inc EAX
1:
decl EBX
test ESI,ESI
je 2f
mov DWORD 0x0800[EDI + EAX * 4],ESI
mov DWORD 0x0C00[EDI + EAX * 4],EBP
mov BYTE 0x1000[EDI + EAX * 4],BL
inc EAX
2:
decl EBX
jns 0b
....

EAX holds the amount of found chars on exit.

Time: ~5,120 clocks or better.

Tables 1 and 2 contain lists with addresses of
found characters (1st and last), table 3 holds
the associated characters in descending order.
Both - addresses and associated char - are ac-
cessed via indices to get address / char-pairs
for the output. If you move any address to an-
other index (e.g. when sorting), you *have to*
move the other address and the associated cha-
racter to the same index. Otherwise, relation-
ship between any of them will be corrupted!

Searching non-unique patterns is done via com-
paring addresses in both tables. Every char is
included at least once in the area between its
1st and last address, even if they're equal. A
character included in *any* non-unique pattern
must 'sit' at an address equal to or above the
currently processed character. A found pattern
ends at the highest last address.

I do not know which sequences are of interest,
so I cannot provide code for that. If you look
for sequences m...n, a corresponding substring
starts at m(first) and ends at n(last). Only a
copy operation (from n(start) up to m(end)) is
required to retrieve non-unique substrings for
a print-out (address calculation is trivial).

To search for unique patterns - all characters
differ from each other - the entire input must
be scanned one time:

ESI = input_size

....
0:
xor EDX,EDX
mov AL,BYTE 00[EBP + EDX]
inc EDX
1:
mov BL,BYTE 0x00[EBP + EDX]
mov CL,BYTE 0x01[EBP + EDX]
cmp AL,BL
je 4f
inc EDX
cmp BL,CL
je 4f
cmp AL,CL
je 4f
inc EDX
cmp EDX,ESI
jae 4f
push EDX
sub EDX,2
je 3f
2:
cmp BL,BYTE 0x00[EBP + EDX]
je 3f
cmp CL,BYTE 0x00[EBP + EDX]
je 3f
dec EDX
jne 2b
3:
mov EAX,EDX
pop EDX
test EAX,EAX
je 1b
4:
cmp EDX,3
jb 5f
.... # store EBP (address 1st char)
.... # store EDX as pattern length
.... # (store EDX chars as found pattern)
5:
sub ESI,EDX
jbe done
jmp 0b

done:
....

With the largest patterns (255 unique chars in
a row), about 3,000,000 clock cycles are quite
realistic. The smaller the found patterns, the
less time is spent - with any pattern below 32
chars, it probably takes less than half a mil-
lion clocks (over the thumb). A once processed
character is not touched again, when it passed
the inner loop. If you search for one specific
pattern, time can be reduced further (by using
a '1st' address from the table and subtracting
the skipped bytes from ESI, decreasing overall
iterations).

Given clocks are for 4,096 byte arrays (linear
growth with increasing size, except the unique
pattern search).

Clock counts are for AMD Athlon, Family 8+. In
64 bit code, execution times can be reduced to
about 60...75 percent of the given ones. Eight
additional registers are snake oil for complex
functions... ;)

Greetings from Augsburg

Bernhard Schornak
```
 0

```In article <946nc7-rcj2.ln1@ntp.tmsw.no>,
Terje Mathisen <"terje.mathisen at tmsw.no"@giganews.com> writes:
> Dick Wesseling wrote:
>> In article<11lic7-2u52.ln1@ntp.tmsw.no>,
>> 	Terje Mathisen<"terje.mathisen at tmsw.no"@giganews.com>  writes:
>>> Still, running at 130+ cycles per input byte is still fast enough to
>>> handle significant amounts of data.
>>>
>>
>> Sounds like you've beaten me. My O() depends on the size of the longest
>> sequence minus the size of the longest unique sequence, which is good
>> for the reference data. If I time 4K of random data I get 7250 cycles per
>> input byte. Since sqrt(C) is 16 for that dataset you ought to get about
>> 800 cycles/byte ( http://www.fi.uu.nl/~ftu/random.bin )
>
> It is worse (or better, seen from my point):
>
> With 4K random input the scan took 252K cycles, which means that I
> averaged about 83 cycles/byte. (I.e. very fast!)
>

Are you using my data set? I completed and fixed (or broke?) your
code as follows:

#define C 0x100
#define OSIZ 409600
unsigned outpos;

struct {
unsigned len;
unsigned pos;
unsigned w;
} output[OSIZ];

static
void save(unsigned l, unsigned w, unsigned pos)
{
#if 1
output[outpos].len = l  ;
output[outpos].pos = pos;
output[outpos].w   = w  ;
#endif
outpos++;
}

int scan(byte *data, int data_len)
{
unsigned count[data_len];
int      shortest[C+1];
int i;
for (i=0; i<C+1; i++) shortest[i] = 0x7FFFFFFF;
for (i=0; i<data_len; i++) count[i] = 1;

int c = 0;
for (i = 1;             // not 2: else "abc" won't work

This version took about 15M cycles on my athlon, 10M cycles if I
just count the solutions without actually saving them.

> The problem was that with totally random data, there are a _lot_ of both
> potential and real solutions, I found 4099 candidates (i.e. just over
> one per starting byte!) and 121 after filtering.
>

Then we're definitely not using the same data set. When I ran your
program I got 78059 potential solutions instead. My program found
831 solutions, all of which were in your unfiltered output.

> The filter process here took a _lot_ of time though: 8M cycles!
>
> This means that it needed 2000 cycles/input byte. :-(
>
> This simply shows that the problem space (and therefore required
> algorithm as well) almost certainly won't contain totally random data,
> and this will most probably help my code...
>

A slightly optimized version of my program - see below - took
23M cycles on the same data set. Since I only timed the scan phase
of your program and not the filter phase it appears our approaches
are comparable for random data, not as bad as I feared.

Here's my optimized inner loop

#if 0   // 30M cycles for 4K random data
// Note: previous "final" version posted still had a bug here

dec(data[l]); l++;
while (uniq < goal && r<lim)   { r++; inc(data[r]); }
while (freq[data[l]]>1) { dec(data[l]); l++; }
#else
// 23M cycles for 4K random data
// The code generated by gcc still hurts my eyes, but this
// simplified loop is a good starting point for asm.

freq[data[l]]=0; l++; uniq--;  // data[l] is unique

while (r<lim) {              // optimize controlling expression of
unsigned char cr;        // loop
r++;
cr = data[r];
unsigned frc = freq[cr];
freq[cr] = frc+1;
if (frc == 0) {
uniq++;
break;
}
}
while (freq[data[l]]>1) {     // "uniq" does not change
freq[data[l]]--;
l++;
}
#endif

Once I get rid of the variable "uniq", which is not essential, the
remaining variables data[], freq[], l and r will fit into the x86
register set and the inner loop becomes 14 instructions per input
byte per iteration of the outermost loop.
```
 0

```Dick Wesseling wrote:
>> With 4K random input the scan took 252K cycles, which means that I
>> averaged about 83 cycles/byte. (I.e. very fast!)
>>
>
> Are you using my data set? I completed and fixed (or broke?) your

Definitely not the same data set, I am simply using
rand() & 63) + 33;
to generate random (but printable) bytes.

With rand & 255 I get slightly different counts.

> int scan(byte *data, int data_len)
> {
>          unsigned count[data_len];
>          int      shortest[C+1];
>          int i;
>          for (i=0; i<C+1; i++) shortest[i] = 0x7FFFFFFF;
>          for (i=0; i<data_len; i++) count[i] = 1;
>
>          int c = 0;
> 	for (i = 1;             // not 2: else "abc" won't work

You're absolutely right, in order to increment the count to 3 the scan

> Once I get rid of the variable "uniq", which is not essential, the
> remaining variables data[], freq[], l and r will fit into the x86
> register set and the inner loop becomes 14 instructions per input
> byte per iteration of the outermost loop.

OK, I'll have to compare that to some hand-optimised asm. :-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```In article <htbonq\$61b\$1@news.eternal-september.org>,
Bernhard Schornak <schornak@nospicedham.web.de> writes:

>
> If you want to do this in assembler, you might
> consider to use an alternative approach:
>
> 1. Allocate 4,352 byte of memory.
> 2. Scan through the input:
>
[snip]

I haven't studied your table construction and search algorithm
yet,
However, the following cannot be correct:

> Searching non-unique patterns is done via com-
> paring addresses in both tables. Every char is
> included at least once in the area between its
> 1st and last address, even if they're equal. A
> character included in *any* non-unique pattern
> must 'sit' at an address equal to or above the
> currently processed character. A found pattern
> ends at the highest last address.
>
> I do not know which sequences are of interest,
> so I cannot provide code for that. If you look
> for sequences m...n, a corresponding substring
> starts at m(first) and ends at n(last). Only a
> copy operation (from n(start) up to m(end)) is
> required to retrieve non-unique substrings for
> a print-out (address calculation is trivial).
>
> To search for unique patterns - all characters
> differ from each other - the entire input must
> be scanned one time:
>
> ESI = input_size
>
> ...
> 0:
> xor EDX,EDX
> mov AL,BYTE 00[EBP + EDX]
> inc EDX
> 1:
> mov BL,BYTE 0x00[EBP + EDX]
> mov CL,BYTE 0x01[EBP + EDX]
> cmp AL,BL
> je 4f
> inc EDX
> cmp BL,CL
> je 4f
> cmp AL,CL
> je 4f
> inc EDX
> cmp EDX,ESI
> jae 4f
> push EDX
> sub EDX,2
> je 3f
> 2:
> cmp BL,BYTE 0x00[EBP + EDX]
> je 3f
> cmp CL,BYTE 0x00[EBP + EDX]
> je 3f
> dec EDX
> jne 2b
> 3:
> mov EAX,EDX
> pop EDX
> test EAX,EAX
> je 1b

When label 1 is reached from 0 AL contains the first input
character, When label 1 is reached via "je 1b" AL is always
zero.
```
 0

``` >Terje Mathisen wrote:

> >Dick Wesseling wrote:
>
>>> With 4K random input the scan took 252K cycles, which means that I
>>> averaged about 83 cycles/byte. (I.e. very fast!)
>>
>> Are you using my data set? I completed and fixed (or broke?) your
>
> Definitely not the same data set, I am simply using
>     rand() & 63) + 33;
> to generate random (but printable) bytes.
>
> With rand & 255 I get slightly different counts.
>
>> int scan(byte *data, int data_len)
>> {
>>          unsigned count[data_len];
>>          int      shortest[C+1];
>>          int i;
>>          for (i=0; i<C+1; i++) shortest[i] = 0x7FFFFFFF;
>>          for (i=0; i<data_len; i++) count[i] = 1;
>>
>>          int c = 0;
>>     for (i = 1;             // not 2: else "abc" won't work
>
> You're absolutely right, in order to increment the count to 3 the scan
>
>> Once I get rid of the variable "uniq", which is not essential, the
>> remaining variables data[], freq[], l and r will fit into the x86
>> register set and the inner loop becomes 14 instructions per input
>> byte per iteration of the outermost loop.
>
> OK, I'll have to compare that to some hand-optimised as

I am attempting to follow the code for this problem, however, it becomes
a bit
difficult to follow as it changes so often.  This is not a complaint as
I very much
appreciate some real examples.

However, I would be extremely helpful if both Terje and Dick could
include all of
the code and some examples, as was done the first few times,
approximately every
fifth time the code changes.  This will enable the rest of us who want
discussion to put it into context.  The problem is very instructive and
I really enjoy
the description of what you fellows are doing.

Jerome Fine
```
 0

```In article <4bf9c369\$0\$275\$14726298@news.sunsite.dk>,
"Jerome H. Fine" <jhfinedp3k@nospicedham.dp3knospamcompsys.to.dotsrc.org> writes:
>
> I am attempting to follow the code for this problem, however, it becomes
> a bit difficult to follow as it changes so often.

I know. Never say "final version"....

> This is not a complaint as I very much appreciate some real examples.

I'd be happy to provide some provide some real examples. Tomorrow is a
holliday here, so I can afford to spend time on this puzzle without
feeling guilty about it. Depending on how things are going - I am
now coding the final asm version - I may even do so later tonight/today.
I don't have to bring my offspring to school. so I can stay up as late
as I want to.
```
 0

```"Shmuel (Seymour J.) Metz" <spamtrap@library.lspace.org.invalid> wrote in message
news:4bf92af5\$1\$fuzhry+tra\$mr2ice@news.patriot.net...
|   at 01:15 PM, "robin" <robin51@dodo.com.au> said:
|
| >A BEGIN block is treated as if it were a procedure.
|
| Water is wet.

I'm referring to the code that is produced.

| >Well, no.  Certainly storage is allocated [for ALLOCATE], but only
| >for the named variables.
|
| The storage for a DSA is pretty much noise;

But it's an extra overhead that ALLOCATE doesn't have
[apart from the that required for the specificate ALLOCATE].

| what matters when talking
| FREEMAIN.
|
| >This may require that conditions be set up and dis-established, etc.
|
| Conditions are set up by the ON statement.

Conditions are set up by condition prefixes.

```
 0
Reply robin51 (256) 5/24/2010 2:44:44 AM

```Dick Wesseling wrote:

<snip>

> However, the following cannot be correct:

<snip>

>> ESI = input_size
>>
>> ...
>> 0:
>> xor EDX,EDX
>> mov AL,BYTE 00[EBP + EDX]
>> inc EDX
>> 1:
>> mov BL,BYTE 0x00[EBP + EDX]
>> mov CL,BYTE 0x01[EBP + EDX]
>> cmp AL,BL
>> je 4f
>> inc EDX
>> cmp BL,CL
>> je 4f
>> cmp AL,CL
>> je 4f
>> inc EDX
>> cmp EDX,ESI
>> jae 4f
>> push EDX
>> sub EDX,2
>> je 3f

mov AL,CL # AL = char at -1[EBP + EDX]
.......... # for the next iteration

>> 2:
>> cmp BL,BYTE 0x00[EBP + EDX]
>> je 3f
>> cmp CL,BYTE 0x00[EBP + EDX]
>> je 3f
>> dec EDX
>> jne 2b
>> 3:

mov ECX,EDX # ECX is loaded in 1:

>> pop EDX

test ECX,ECX

>> je 1b
>
> When label 1 is reached from 0 AL contains the first input
> character, When label 1 is reached via "je 1b" AL is always
> zero.

Sorry! (I optimised the most important instruction out
of my code.) AL must contain the last character in the

The table stuff is not as complicated as it seems. All
substrings can be determined with the addresses stored
in it (no extra scans required). It uses a fast method
scanning through an input array in both directions si-
multaneously. Getting the first and last positition of
each char in a single pass is (probably) faster than a
repeated scan through the entire array - the addresses
are sufficient to calculate all substrings rather than
to search for them.

Greetings from Augsburg

Bernhard Schornak
```
 0

``` >Dick Wesseling wrote:

>>In article <4bf9c369\$0\$275\$14726298@news.sunsite.dk>,
>	"Jerome H. Fine" writes:
>
>>I am attempting to follow the code for this problem, however, it becomes
>>a bit difficult to follow as it changes so often.
>>
>I know. Never say "final version"....
>
>
Having started on an IBM 650 in SOAP around 1960, I fully realize that
aspect.

>>This is not a complaint as I very much appreciate some real examples.
>>
>I'd be happy to provide some provide some real examples. Tomorrow is a
>holliday here, so I can afford to spend time on this puzzle without
>feeling guilty about it. Depending on how things are going - I am
>now coding the final asm version - I may even do so later tonight/today.
>I don't have to bring my offspring to school. so I can stay up as late
>as I want to.
>
>
The C code is even more instructive in its own way as it provides an
overview that assembly code is unable to do.  The current C code
along with a few examples as you did with your first post would be
VERY appreciated!  It also provides a comparison with the original
and shows the steps taken to make the code faster and more efficient.

As Terje notes, the efficient use of L1 cache is able to improve the
execution speed, but is rarely considered when an algorithm is
evaluated.  The C code might include a few comments as to just
how much storage is required at any point in the process.  As far
as I can determine, the L1 instruction cache will not be a problem
for this algorithm.  Likewise the L1 data cache if care is taken even
though the total storage requirements are much larger.

Perhaps having started with a system that also had an effective L1
cache of about 100 bytes helps to remember how to optimize the
code.  I think that the total memory including the disk which also
held the code and the data was about 20 KBytes.

Jerome Fine
```
 0

```On 2010-05-22 07:54:13 -0400, David W Noon said:

> -----BEGIN PGP SIGNED MESSAGE-----
>
> Hash: SHA1
>
>
>
> On Sat, 22 May 2010 10:49:27 +1000, robin wrote about Re: Optimization
>
> problem:
>
>
>
> >"John W Kennedy" <jwkenne@attglobal.net> wrote in message
>
> >news:4bf70dbe\$0\$2874\$607ed4bc@cv.net...
>
> >
>
> >| Did you even need BASED?  99 times out of 100, what you really need
>
> >| instead of CONTROLLED is:
>
> >|
>
> >|   GET (M, N);
>
> >|   BEGIN;
>
> >|      DCL A(M, N);
>
> >|
>
> >| In over a quarter century of using PL/I for nearly all my
>
> >programming, | including commercial batch programs, system programs,
>
> >and home-grown | on-line systems (with just enough assembler to talk
>
> >to the terminals), | I never found a program that wasn't simpler, more
>
> >reliable, and faster | with CONTROLLED removed.
>
> >
>
> >Linked lists?  Queueing? etc etc etc.
>
>
>
> Lists and queues can be implemented using BASED, which actually gives
>
> faster access than CONTROLLED.
>
>
>
> The real benefit that CONTROLLED offers, to my mind, is block
>
> independence.  A subroutine or function can maintain its internal state
>
> without compromising its reentrability.  Each CONTROLLED variable is
>
> anchored by an external dummy section (on the mainframe, other
>
> platforms use a work-alike) that is allocated in the Pseudo-Register
>
> Vector (PRV), which hangs off the Task Control Area -- that magic piece
>
> of memory addressed by the sacred register 12. This means that state
>
> information can be placed in dynamic storage anchored at task level,
>
> rather than block level. Subsequent calls to the subroutine or function
>
> can retrieve the previous state with ease, but can still be
>
> safely executed by multiple TCB's concurrently.
>
>
>
> Of course, this is a somewhat exotic use case.  But when confronted by
>
> such a requirement, I want CONTROLLED storage.
>

But that is a way to force a language designed in the 60s, implemented
on an operating system designed in the 60s,* to do what more modern
languages do with modules or packages, and still more modern languages
do with objects.

* Even TSS/360 had a better approach to reentrance, conceptually much
the same as is implemented in pretty much all modern operating systems.

--
John W Kennedy
"Those in the seat of power oft forget their failings and seek only the
obeisance of others!  Thus is bad government born!  Hold in your heart
that you and the people are one, human beings all, and good government
shall arise of its own accord!  Such is the path of virtue!"
-- Kazuo Koike.  "Lone Wolf and Cub:  Thirteen Strings" (tr. Dana Lewis)

```
 0
Reply jwkenne (1440) 5/24/2010 3:46:56 AM

```On 2010-05-23 22:44:44 -0400, robin said:

> "Shmuel (Seymour J.) Metz" <spamtrap@library.lspace.org.invalid> wrote
> in message
> news:4bf92af5\$1\$fuzhry+tra\$mr2ice@news.patriot.net...
> | In <4bf89ddf\$0\$89666\$c30e37c6@exi-reader.telstra.net>, on 05/23/2010
> |   at 01:15 PM, "robin" <robin51@dodo.com.au> said:
> |
> | >A BEGIN block is treated as if it were a procedure.
> |
> | Water is wet.
>
> I'm referring to the code that is produced.
>
> | >Well, no.  Certainly storage is allocated [for ALLOCATE], but only
> | >for the named variables.
> |
> | The storage for a DSA is pretty much noise;
>
> But it's an extra overhead that ALLOCATE doesn't have
> [apart from the that required for the specificate ALLOCATE].

You don't know what the Hell you're talking about.

--
John W Kennedy
"The whole modern world has divided itself into Conservatives and
Progressives. The business of Progressives is to go on making mistakes.
The business of the Conservatives is to prevent the mistakes from being
corrected."
-- G. K. Chesterton

```
 0
Reply jwkenne (1440) 5/24/2010 4:11:01 AM

```Jerome H. Fine wrote:
> I am attempting to follow the code for this problem, however, it
> becomes a bit difficult to follow as it changes so often. This is not
> a complaint as I very much appreciate some real examples.

OK, here is my complete program, as it currently exists:
(I finally realized that my potential solution can explode when working
with random data, so I have increased the buffer to store them to the
maximum possible count!)

// unique-strings.cpp : Defines the entry point for the console application.
//

//#include "stdafx.h"
#include <stdio.h>
#include <tchar.h>
#include <stdlib.h>

typedef unsigned char byte;

#define MAX_DATA 4096
int data_len;

// Data to be scanned
byte data[MAX_DATA+1];

// Remember the shortest sequence with n unique symbols
short shortest[256]; // Alphabet size entries
// Running count of unique symbols seen, from [n] to current pos
short count[MAX_DATA];

typedef struct _solution_t {
int uniq;
int len;
int start;
} solution_t;

solution_t solution[MAX_DATA*256]; // Maximum possible count of possible
solutions
int solutions = 0;

void init(void)
{
for (int i = 0; i < MAX_DATA; i++)
shortest[i] = 32767;

for (int i = 0; i < MAX_DATA; i++)
count[i] = 1;

for (int i = 0; i < MAX_DATA; i++)
data[i] = (rand() & 63) + 33;
data_len = MAX_DATA;

// Here are a couple of small sets with known solutions:
//memcpy(data,
"aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja", 56);
//data_len = 56;
//memcpy(data, "abc", 3);
//data_len = 3;
}

void save(int unique, int len, int start)
{
solution[solutions].uniq = unique;
solution[solutions].len = len;
solution[solutions].start = start;
solutions++;
}

int solsort(const void *a, const void *b)
{
solution_t *as = (solution_t *) a;
solution_t *bs = (solution_t *) b;
int u = bs->uniq - as->uniq;		// Return longest unique sets first
if (!u) {
u = as->len - bs->len;			// Return shortest string first for each set
length
if (!u)
u = as->start - bs->start;	// Finally order by starting position
}
return u;
}

int filter(void)
{
int prev_uniq = 0x7fffffff; // Max 32-bit int
int prev_len = 0x7fffffff; // Max 32-bit int

// First get rid of all non-optimal strings for each unique set length:
// Sort reverse by set length, forward by string length, then by
starting position
qsort(solution, solutions, sizeof(solution[0]), solsort);

int sols = 0;
for (int s = 0; s < solutions; s++) {
if (solution[s].uniq == prev_uniq) {
if (solution[s].len > prev_len) { // Filter this entry away!
solution[s].uniq = 0; // Mark as removed
}
else { // Keep this, move down:
solution[sols++] = solution[s];
}
}
else { // (solution[s].uniq < prev_uniq) This is the first entry for a
shorter set:
prev_uniq = solution[s].uniq;
prev_len = solution[s].len;
solution[sols++] = solution[s];
}
}
solutions = sols;

// For each remaining solution, check if any other overlap-1:

for (int s = 0; s < solutions; s++) {
for (int k = s+1; k < solutions; k++) {
if ((solution[s].len == solution[k].len+1) &&
((solution[s].start == solution[k].start) ||
(solution[s].start+1 == solution[k].start)))
solution[k].uniq = 0; // Mark as removed
if (solution[s].len > solution[k].len+1) break;
}
}
// Compact the final solution set:
sols = 0;
for (int s = 0; s < solutions; s++) {
if (solution[s].uniq)
solution[sols++] = solution[s];
}
solutions = sols;

return solutions;
}

int scan(byte *data, int data_len)
{
int c = 0; // # of candidate solutions

// The first 3-byte solution cannot end before the third entry, so
// the loop below starts at i = 2, i.e. the third byte:
for (int i = 1; i < data_len; i++) {
byte b = data[i];
for (int j = i-1; j >= 0; j--) {
// A previously seen character cannot end a new longest sequence!
// It also cannot add to the sequence count for this or any
// longer string, so stop scanning!
if (data[j] == b) break;

// Increment the count of unique values seen when starting from
position [j]
int unique_values = ++count[j];
if (unique_values >= 3) {
int len = i-j+1;
if (len <= shortest[unique_values]) {
save(unique_values, len, j);
shortest[unique_values] = len;
c++;
}
}
}
}
return c;
}

typedef __int64 int64_t;

int64_t rdtsc(void)
{
__asm rdtsc
}

int _tmain(int argc, _TCHAR* argv[])
{
init();
int64_t t0 = rdtsc();
int pot = scan(data, data_len);
int64_t t1 = rdtsc();
int cnt = filter();
int64_t t2 = rdtsc();
// Print the final solutions in opposite order, from shortest to longest:
printf("len aperture from to value\n");
for (int s = cnt-1; s >= 0; s--) {
int start = solution[s].start;
int len = solution[s].len;
printf("%3d %6d %5d %3d ", solution[s].uniq, len, start+1, start+len,
data+start);
if (len > 20) len = 20;
for (int s = start; s < start+len; s++) {
byte b = data[s];
if ((b >= ' ') && (b < 127))
printf("%c", b);
else
printf("\\%03o", b);
}
printf("\n");
}

printf("Time to scan: %12I64d (%d potential solutions)\nTime to filter:
%10I64d (%d filtered solutions)\n",
t1-t0, pot, t2-t1, cnt);

return 0;
}

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```
The basic idea is to create a simple algorithm that performs
one sequential scan over the input for each sequence length.
There is no backtracking. Instead intermediate results are
stored in a table "found", and if a shorter string with
the same number of unique symbols is found the table is reset.

If all of the data fits into the L1 cache then performing
multiple passes over the data should perform well. Apart from
the input data the program needs 2 extra arrays:

- freq[C], counts the frequency of each symbol in the window
- found[?], as explained above this array contains tentative results.

freq is heavily used. It adds 1K to the amount of data that
needs to be in the L1 cache.
The cache usage of "found" is more complicated. In theory it
can become quite large, but my inituition is that it should
behave quite modest in practice.

During each pass the following variables keep track of the state
of the sliding window:
- l, the left margin
- r, the right margin (inclusive)
- uniq, the number of unique symbols within the window.

When the right margin moves, we bring a new symbol into the window,
and freq[new_symbol] must be incremented. If the new symbol does
not yet occur within the window then "uniq" must be incremented
as well:

#   define inc(c) do { if(!freq[c]) uniq++; freq[c]++; } while(0)

When the left margin moves a symbol falls out of the window and
the opposite happens:

#   define dec(c) do { freq[c]--; if(!freq[c]) uniq--; } while(0)

The macros inc() and dec() and the variable uniq are just a convenience
for bootstrapping the algorithm. With careful coding they can be
optimized away from the final version, sacrificing clarity for raw
speed.

Sliding the window needs to maintain a loop invariant. The invariant
is:
- freq[l] must be 1. If the leftmost symbol would occur elsewhere
in the window then this cannot possibly be the shortest string
for the current goal.
- freq[r] must be 1 for the same reason.
- uniq equals the goal of the current pass. Remember that each pass
searches for one particular sequence length only.

That being said, the outline of the program is:

Select a large initial window such that the loop invariant holds.
goal = number of unique symbols in this window.

do
do
slide window from left to right maintaining the invariant
while not at end of input (i.e. invariant still true)
if window size <  previous
dump the found array
and update apub: the current best value for the window
if window size <= apub
od
print results for sequence length "goal".
while goal not minimal (see below)
goal = goal -1
select an initial window such that the loop invariant holds.
od

The choice for looping from a large goal downwards is arbitrary,
it seemed the easiest way to tackle the problem. If you start
with the window being the entire input stream then you just need
to shrink the margins until freq[l]=freq[r]=1. Setting goal=uniq
completes the loop invariant.

The important step is:
slide window from left to right maintaining the invariant

This is done as follows:

#if 0
1) dec(data[l]); l++;
2) while (uniq < goal && r<lim)   { r++; inc(data[r]); }
3) while (freq[data[l]]>1) { dec(data[l]); l++; }
#else
...

1) The window must slide by at least 1 symbol.
We know that the leftmost symbol is unique. If we increment l
the number of unique symbols in the window therefore decreases
by 1.
2) We therefore move the right margin until a new unique symbol
moves into the window. Now equals==goal holds again
3) However, in step 2) we may have skipped over instances of the
leftmost symbol which is now no longer uniq:

before     aabcdbx
----     uniq=goal=4       aperture=4
after 1)   aabcdbx
---     uniq=3 goal-4     aperture=3
after 2)   aabcdbx
-----   uniq=goal=4       aperture=5

now "b" is no longer unique. We must therefore move the
left margin even further:

after 3)   aabcdbx
----   uniq=goal=4       aperture=4

Now the loop invariant holds again, both margins are
at a unique symbol.

This is the inner loop of the program. Since we know that
data[l] and data[r] are unique we can get rid of the inc/dec
macros and expand and optimize these three statements into:

...
#else
1)  freq[data[l]]=0; l++; uniq--;  // data[l] is unique

2)  while (r<lim) {              // optimize controlling expression of
unsigned char cr;        // loop
r++;
cr = data[r];
unsigned frc = freq[cr];
freq[cr] = frc+1;
if (frc == 0) {
uniq++;
break;               // See below
}
}
3)  while (freq[data[l]]>1) {     // no reason to update "uniq"
freq[data[l]]--;
l++;
}
#endif

This is better than the first version. Since only one unique
symbol moved out of the window in step 1) we can break out of
step 2) as soon as we find the first new symbol, i.e. a symbols
whose frequence was zero prior to incrementing.
We can still improve upon this version. The variable uniq is
incremented exactly once and decremented excactly once. We therefore
need not change it at all. (If we hit the end of data with r==lim
then "uniq" no longer serves a purpose).

For the purpose of designing an x86 asm program we now have everyting
in place, the inner loop now uses 4 variables l, r, freq[] and data[].
This will fit into the x86 reqister set, leaving 2 or 3 other registers
for holding intermediate results. Now we're all set to try and beat
the C compiler, which is harder than it seems.

Next we need to implement the remainder of the algorithm outline above:

print results for sequence length "goal".
while goal not minimal (see below)
goal = goal -1
select an initial window such that the loop invariant holds.
od

printing the results is straightforward. Initializing the window
for the next iteration of the main loop is similar to sliding the
window: initialize l to 0, move r to the right until we have
enough unique symbols, and move l in case the leftmost character
is no longer uniqe:

// Now the next shorter sequence.

goal--;

1)  inc(data[l]);
2)  do {
r++;
inc(data[r]);
} while (uniq != goal);

3)  while (freq[data[l]] != 1) {
dec(data[l]);
l++;
}

Using the same input data as in the previous example:

after 1)   aabcdbx
-         uniq1  goal=3     aperture=1
after 2)   aabcdbx
----      uniq=3 goal-4     aperture=4
after 3)   aabcdbx
---      uniq=goal=4       aperture=5

Again, this loop can be heavily optimized to get rid of inc/dec and
uniq, but I'm not doing that in the C prototype, after this is not
the inner loop.

How about the requirement that no sequences that are
substrings of longer unique sequences must be printed?

Good question! This is another reason why looping downwards seemed
the natural thing to do. Let me quote from my previous posting:

<EUREKA>

In the example it says:

> It is possible to find sub-arrays with 4, 5, or 6 distinct values, but
> they are either longer than the series of 7-in-7 (12-25 contain abcd),
> or are part of the 7-in-7 series and as such they should not be included!

1         2         3         4         5     5
.....5....0....5....0....5....0....5....0....5....0....56
aaaaaabbbbbabbbcbcccccccddddddddccbbbbbbefffghggggihbfja
--------------                        -------
4-in-14                               7-in-7

I did not fully understand why 12-15 was to be rejected, after all
"abcd" is not a substring of "gihbfja".
Now I understand. 7-in-7 contains substrings 6-in-6, 5-in-5, 4-in-4
and so on. These are not to be printed because they are proper
substrings of "gihbfja". However, they define "as short as possible".
for lengths <= 7.

IN OTHER WORDS, ONCE YOU KNOW THE LONGEST N-IN-N SEQUENCE ALL SHORTER
SEQUENCES MUST CONSIST ENTIRELY OF UNIQUE SYMBOLS.

This suggests two different seach strategies, one for long seqences and
one for short sequences.

</EUREKA>

In other words, as long as you do not have a n-in-n sequence yet you
need not test for substrings, and once you have a n-in-n sequence
you can switch to a different algorithm that needs to handle n-in-n
sequences only, which is much easier than the general n-in-m loop.

Enter plan B. Plan B performs ONE scan of the input to find the
remaining n-in-n sequences. Again a sliding window is used, but
this time the invariant is simply:

- All symbols in the window must be unique.

This is true for l=r. It remains true as long as the symbol just
beyond the right margin does not occur within the window.
Otherwise print the current window if >= 3, and move l until
all symbols are unique again.

1)  // l=r=0

while (r<lim) {
unsigned char cl, cr;
cl = data[l];
cr = data[r+1];                // Look ahead

if (freq[cr] != 0) {            // Maximal sequence?
if (WINSIZE >= 3) {
printres(data, l, WINSIZE, WINSIZE);
}
do {                        // Move left edge until
cl = data[l];           // we have all symbols unique.
freq[cl] = 0;
2)              l++;
} while (cl != cr);
}
3)      r++;
freq[cr]=1;
}

Example:
after 1)    abcdabcd
-
after 3)    abcdabcd
--
after 3)    abcdabcd
--
after 3)    abcdabcd
---
after 3)    abcdabcd
----
Look ahead, print "abcd" & proceed:

after 2)    abcdabcd
----

Another example:
after 1)    abcdca
-
after 3)    abcdca
--
after 3)    abcdca
---
after 3)    abcdca
----

Look ahead, print "abcd" & proceed:

after 2)    abcdca
--

As for the asm, this C program translates into +- 240 x86 instructions.
The inner loop is 15 instructions. Unfortunately most of these are
tightly interlocked, and there are 6 memory references. The L1
hitrate should be high, but there's not much room for improvement.
Processing the sample went from 9000 to 7000 cycles, but for random
data the gain went down to 2% compare to gcc's code on one CPU and
and -2% one another.
I may play with the asm code some time later, but for now I think
it is not worth bothering. Perhaps Terje is faring better then I.

Here is the next final final version. A TRACE macro is provided.
If you don't have an ansi c++ compiler handy then a trace for the
sample data can be found at http://www.fi.uu.nl/~ftu/trace.txt

-----------------------------------------------------------------------

#include <portos/profile.h>
/*
Here is the final version of my solution.

Let:    N       number of cells, size of the input
C       number of values, size of symbol set.
M       the number of distinct values in the largest sequence.
T       the size of the largest n-in-n sequence, i.e. the
largest sequence consisting of distinct values only.

The algorithm uses two different search strategies:

Plan A

Find the largest sequence with size M

for n from M downwards search n-in-m sequences, keeping only sequences
with minimal value of m.
Print sequence if n<m. (If n=m the same sequence will be found again
in the next step).

Now n=m=T.

Plan B
Print all maximal n-in-n sequences.

Plan A scans the input 1+(M-T) .. 2+(M-T) times, depending on the
input position of the longest sequence.
Plan B scans the input once.

The running time is therefore bounded by O(N * (3+(M-T)) which
is typically better than or equal to O(N*C).
For small values of N the cost of erasing the frequency array should
also be accounted for.

Two arrays are used:
found[N]    Intermediate results of plan A.
freq[C]     Frequency count for all symbols in the sliding window

Finding sequences is done using a straightforward sliding window
scan of the input. The values in freq[] are updated when the window
edges move and the number of unique symbols is updated when a
frequency count changes between zero and non-zero.

The output is unsorted.

*/

#include <stdio.h>
#include <string.h>

#define C 256               // Size of alphabet
//#define ONE 1               // Pascal array bias convention
#define ONE 0               // C array bias convention

struct {
unsigned len;
unsigned pos;
unsigned w;
} output[4096];
unsigned outpos;

static
void printres (unsigned char *data, unsigned pos, unsigned l, unsigned w)
{
#if 0
unsigned i;
printf("len %2d aperture %2d pos %2d\t", l, w, ONE+pos);
for (i=pos; i<pos+w; i++) printf("%c", data[i]);
printf("\n");
#else
output[outpos].len = l  ;
output[outpos].pos = pos;
output[outpos].w   = w  ;
outpos++;
#endif
}

#if 1
void trace(char *comment, unsigned char *data,
unsigned l, unsigned r, unsigned uniq, unsigned *freq)
{
unsigned i;
printf("%s: %d-%d aperture %d unique %d\n", comment, l, r, (r-l)+1, uniq);
printf("%s\t", data);
for (i=0; i<C; i++) {
if (freq[i]) {
printf("%c=%d ", i, freq[i]);
}
}
printf("\n");

for (i=0; i<l;  i++) printf(" ");
for (   ; i<=r; i++) printf("-");
printf("\n");
}
#define TRACE(x) trace((x), data, l, r, uniq, freq)
#else
#define TRACE(x)
#endif

void findseqs (unsigned char *data, unsigned len)
{
//  printf("%s\n", data);
outpos=0;
unsigned i;
unsigned uniq = 0;              // Nr unique symbols in window
unsigned apub = ~0;             // Aperture upper bound (aka "m")

unsigned nrfound = /* to keep gcc happy */ 0;
unsigned found[len];            // intermediate results

unsigned freq[C];               // Frequency count
bzero(freq, sizeof(freq));

#   define inc(c) do { if(!freq[c]) uniq++; freq[c]++; } while(0)
#   define dec(c) do { freq[c]--; if(!freq[c]) uniq--; } while(0)
#   define WINSIZE ((r-l)+1)

unsigned l = 0;                 // Start of window
unsigned r = len-1;             // Inclusive end of window
unsigned lim = len-1;           // Inclusive end of data

for (i=0; i<len; i++) {
inc(data[i]);
}
if (uniq < 3) return;

// Trim left & right edges

TRACE("Entire input");
while (freq[data[r]]>1) { dec(data[r]); r--; }
TRACE("Initial trim r");
while (freq[data[l]]>1) { dec(data[l]); l++; }
TRACE("Initial trim l");

// From here on we want to find only sequences that are shorter or
// equal to what we have found above:

unsigned goal = uniq;

while (1) {

//  data[l..r] contains "goal" unique symbols
//  freq[]     number of occurances of each symbol in data[l..r]
//  data[l]    is a unique symbol
//  data[r]    is a unique symbol

TRACE("Main loop");

if (WINSIZE <= apub) {          // Short enough?

// A candidate. May be premature if we have shorter sequences
// with the same nr of unique symbols.

if (WINSIZE < apub) {
nrfound = 0;            // Dump old candidates
apub = WINSIZE;         // Next matches must be <= this one
}
found[nrfound] = l;         // Add to list of tentative results
nrfound++;
}

// Slide window to next match.

#if 0   // 30M cycles for 4K random data
// Note: previous "final" version posted still had a bug here

dec(data[l]); l++;
while (uniq < goal && r<lim)   { r++; inc(data[r]); }
while (freq[data[l]]>1) { dec(data[l]); l++; }
#else
// 23M cycles for 4K random data
// The code generated by gcc still hurts my eyes, but this
// simplified loop is a good starting point for asm.

freq[data[l]]=0; l++; uniq--;  // data[l] is unique
TRACE("After slide step 1");

while (r<lim) {              // optimize controlling expression of
unsigned char cr;        // loop
r++;
cr = data[r];
unsigned frc = freq[cr];
freq[cr] = frc+1;
if (frc == 0) {
uniq++;
break;
}
}
TRACE("After slide step 3");
while (freq[data[l]]>1) {     // no reason to update "uniq"
freq[data[l]]--;
l++;
}
TRACE("After slide step 3");

#endif

if (uniq == goal) continue;

// We've hit the right edge

bzero(freq, sizeof(freq));      // Reset window & statistics
uniq = l = r = 0;

if (apub == goal) break;        // If *only* distinct values proceed
// with plan B.

for (i=0; i<nrfound; i++) {     // Print the results
printres(data, found[i], goal, apub);
}

// Now the next shorter sequence.

goal--;

inc(data[l]);
do {
r++;
inc(data[r]);
} while (uniq != goal);
TRACE("Find shorter, after slideR");

while (freq[data[l]] != 1) {
dec(data[l]);
l++;
}
TRACE("Find shorter, after slideL");

// Now the precondition holds. Also, the window size is smaller
// than the aperture upper bound from the previous iteration.
}

// Plan B
// Now we consider only sequences with all symbols distinct.
// The freq[] array therefore behaves like a bitmap with values 0&1 only.

freq[data[l]] = 1;

while (r<lim) {
unsigned char cl, cr;
cl = data[l];
cr = data[r+1];                // Look ahead

if (freq[cr] != 0) {            // Maximal sequence?
if (WINSIZE >= 3) {
printres(data, l, WINSIZE, WINSIZE);
}
do {                        // Move left edge until
cl = data[l];           // we have all symbols unique.
freq[cl] = 0;
l++;
} while (cl != cr);
}
r++;
freq[cr]=1;
}

// Final sequence.

if (WINSIZE >= 3) {
printres(data, l, WINSIZE, WINSIZE);
}
return;
}

void printresults() {
// .. qsort "output" if desired
unsigned i;
for (i=0; i<outpos; i++) {
printf("len %3d aperture %3d pos %3d\t",
output[i].len,
output[i].w,
ONE+ output[i].pos);
printf("\n");
}
}

int main(int argc, char **argv)
{
if (argc > 2) {
fprintf(stderr, "Usage: %s {string}\n", argv[0]);
return 1;
}
if (argc == 1) {
unsigned char buf[4096];
size_t inlen = fread(buf, sizeof(buf[0]), sizeof(buf), stdin);
findseqs( buf, (unsigned) inlen);
}
if (argc == 2) {
size_t l = strlen(argv[1]);
findseqs( (unsigned char *)argv[1], l );
}
//printresults();
return 0;
}
```
 0

```"John W Kennedy" <jwkenne@attglobal.net> wrote in message news:4bf9fc55\$0\$4975\$607ed4bc@cv.net...
| On 2010-05-23 22:44:44 -0400, robin said:
|
| > "Shmuel (Seymour J.) Metz" <spamtrap@library.lspace.org.invalid> wrote
| > in message
| > news:4bf92af5\$1\$fuzhry+tra\$mr2ice@news.patriot.net...
| > | In <4bf89ddf\$0\$89666\$c30e37c6@exi-reader.telstra.net>, on 05/23/2010
| > |   at 01:15 PM, "robin" <robin51@dodo.com.au> said:
| > |
| > | >A BEGIN block is treated as if it were a procedure.
| > |
| > | Water is wet.
| >
| > I'm referring to the code that is produced.
| >
| > | >Well, no.  Certainly storage is allocated [for ALLOCATE], but only
| > | >for the named variables.
| > |
| > | The storage for a DSA is pretty much noise;
| >
| > But it's an extra overhead that ALLOCATE doesn't have
| > [apart from the that required for the specificate ALLOCATE].
|
| You don't know what the Hell you're talking about.

Are you just having another bad day?

```
 0
Reply robin51 (256) 5/24/2010 8:57:13 AM

``` >Dick Wesseling wrote:

>The basic idea is to create a simple algorithm that performs
>one sequential scan over the input for each sequence length.
>There is no backtracking. Instead intermediate results are
>stored in a table "found", and if a shorter string with
>the same number of unique symbols is found the table is reset.
>
>If all of the data fits into the L1 cache then performing
>multiple passes over the data should perform well. Apart from
>the input data the program needs 2 extra arrays:
>
>[Snip]
>
>
I just want to say THANK  YOU to both Dick and Terje for making the
complete program (the current version that is) available.  The discussion
of L1 cache usage also provides valuable insight to what your concepts
are and how they impact the code.  While the percentage of individuals
who actually consider cache usage when an algorithm is being designed
is certainly higher than usual with the followers of newsgroups, I doubt
that it is always a high priority.  It would be interesting to develop some
tools to measure hit rates.  In that regard, could Intel and AMD add
any microcode which would help the user?  Does anyone have any idea
how much extra time (if any) it might take to maintain a hit rate register
for cache hits and misses as part of the microcode for the CPU, just as
it is possible to obtain the number of cycles?

Just a thought, but nothing gets done if the questions are not asked.

Jerome Fine
```
 0

```robin wrote:
> "Shmuel (Seymour J.) Metz" <spamtrap@library.lspace.org.invalid> wrote in message
> news:4bf92af5\$1\$fuzhry+tra\$mr2ice@news.patriot.net...
> | In <4bf89ddf\$0\$89666\$c30e37c6@exi-reader.telstra.net>, on 05/23/2010
> |   at 01:15 PM, "robin" <robin51@dodo.com.au> said:
> |
> | >A BEGIN block is treated as if it were a procedure.
> |
> | Water is wet.
>
> I'm referring to the code that is produced.
>
> | >Well, no.  Certainly storage is allocated [for ALLOCATE], but only
> | >for the named variables.
> |
> | The storage for a DSA is pretty much noise;
>
> But it's an extra overhead that ALLOCATE doesn't have
> [apart from the that required for the specificate ALLOCATE].
>

OTOH, although I haven't looked at this in detail, CONTROLLED storage
has interactions with Tasking (maybe not with the Enterprise compilers)
that adds quite another level of overhead.  It makes things much less
simple than just push and pop.

```
 0
Reply Peter_Flass (956) 5/24/2010 10:57:18 AM

```Jerome H. Fine wrote:
> I just want to say THANK YOU to both Dick and Terje for making the
> complete program (the current version that is) available. The discussion
> of L1 cache usage also provides valuable insight to what your concepts
> are and how they impact the code. While the percentage of individuals
> who actually consider cache usage when an algorithm is being designed
> is certainly higher than usual with the followers of newsgroups, I doubt
> that it is always a high priority. It would be interesting to develop some
> tools to measure hit rates. In that regard, could Intel and AMD add
> any microcode which would help the user? Does anyone have any idea
> how much extra time (if any) it might take to maintain a hit rate register
> for cache hits and misses as part of the microcode for the CPU, just as
> it is possible to obtain the number of cycles?

This has existed since the original Pentium, in the form of the EMON
counters!

These were "secret" back then, but I reverse engineered them and wrote a
BYTE article which was published in the summer of 1994. On all later x86
cpu models the EMON counters, which are different for pretty much every
cpu generation, have been documented in the cpu manuals.

And Yes, L1 hit/miss rates are one of the most basic counters, included
on every single version since the original one. :-)

Today the easiest way to access them under Win* is to use Intel's VTune,
but there are similar (and free!) tools available under Linux.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```"John W Kennedy" <jwkenne@attglobal.net> wrote in message news:4bf9fc55\$0\$4975\$607ed4bc@cv.net...
| On 2010-05-23 22:44:44 -0400, robin said:
|
| > "Shmuel (Seymour J.) Metz" <spamtrap@library.lspace.org.invalid> wrote
| > in message
| > news:4bf92af5\$1\$fuzhry+tra\$mr2ice@news.patriot.net...
| > | In <4bf89ddf\$0\$89666\$c30e37c6@exi-reader.telstra.net>, on 05/23/2010
| > |   at 01:15 PM, "robin" <robin51@dodo.com.au> said:
| > |
| > | >A BEGIN block is treated as if it were a procedure.
| > |
| > | Water is wet.
| >
| > I'm referring to the code that is produced.
| >
| > | >Well, no.  Certainly storage is allocated [for ALLOCATE], but only
| > | >for the named variables.
| > |
| > | The storage for a DSA is pretty much noise;
| >
| > But it's an extra overhead that ALLOCATE doesn't have
| > [apart from the that required for the specificate ALLOCATE].
|
| You don't know what the Hell you're talking about.

The time to execute 1,000,000 ALLOCATE and FREE statements
for a vector of size N on my PC, is 1.4 secs.

The time to do the same with a BEGIN block in which
a vector of size N is declared, is 79 secs.

N = 1,000,000.

BEGIN takes more than 50 times longer.

So who doesn't know what the hell he's talking about?
But I knew that answer before conducting the test.

```
 0
Reply robin51 (256) 5/24/2010 3:36:50 PM

```In comp.lang.pl1 Jerome H. Fine <jhfinedp3k@nospicedham.dp3knospamcompsys.to.dotsrc.org> wrote:
(snip)

> I just want to say THANK  YOU to both Dick and Terje for making the
> complete program (the current version that is) available.  The discussion
> of L1 cache usage also provides valuable insight to what your concepts
> are and how they impact the code.  While the percentage of individuals
> who actually consider cache usage when an algorithm is being designed
> is certainly higher than usual with the followers of newsgroups, I doubt
> that it is always a high priority.

The only place I know it commonly done is matrix multiply and
matrix transpose.  In both cases, one matrix is traversed in the
"wrong" direction (for the cache), and so the effect is easily seen.

> It would be interesting to develop some
> tools to measure hit rates.  In that regard, could Intel and AMD add
> any microcode which would help the user?  Does anyone have any idea
> how much extra time (if any) it might take to maintain a hit rate register
> for cache hits and misses as part of the microcode for the CPU, just as
> it is possible to obtain the number of cycles?

-- glen
```
 0

```In comp.lang.pl1 Terje Mathisen <"terje.mathisen at tmsw.no"@giganews.com> wrote:
>> Does anyone have any idea how much extra time (if any) it might
>> take to maintain a hit rate register for cache hits and misses
>> as part of the microcode for the CPU, just as it is possible
>> to obtain the number of cycles?

> This has existed since the original Pentium, in the form of the EMON
> counters!

Otherwise, as far as I understand it, it was first done with emulation.

There is somewhere the description of the work IBM did for the 360/85,
I believe the first machine with cache.  (Before the term cache.)

They used emulation or simulation to measure the data access patterns
of actual code to determine the expected hit rate for the cache.
That was somewhere around 1968.  (Presumably earlier, if the machine
came out in 1968.)

-- glen
```
 0

```On 2010-05-24 06:57:18 -0400, Peter Flass said:

> robin wrote:
>> "Shmuel (Seymour J.) Metz" <spamtrap@library.lspace.org.invalid> wrote
>> in message news:4bf92af5\$1\$fuzhry+tra\$mr2ice@news.patriot.net...
>> | In <4bf89ddf\$0\$89666\$c30e37c6@exi-reader.telstra.net>, on 05/23/2010
>> |   at 01:15 PM, "robin" <robin51@dodo.com.au> said:
>> |
>> | >A BEGIN block is treated as if it were a procedure.
>> |
>> | Water is wet.
>>
>> I'm referring to the code that is produced.
>>
>> | >Well, no.  Certainly storage is allocated [for ALLOCATE], but only
>> | >for the named variables.
>> |
>> | The storage for a DSA is pretty much noise;
>>
>> But it's an extra overhead that ALLOCATE doesn't have
>> [apart from the that required for the specificate ALLOCATE].
>>
>
> OTOH, although I haven't looked at this in detail, CONTROLLED storage
> has interactions with Tasking (maybe not with the Enterprise compilers)
> that adds quite another level of overhead.  It makes things much less
> simple than just push and pop.

Yes, the old PL/I tasking model imposed a great deal of strangeness and
overhead (that alone should be enough to dispose of the old canard that
PL/I was designed specifically for OS/360), and IBM's final approach to
solving the problem was simply to dump it completely and replace it
with little more than a direct interface to the underlying operating

--
John W Kennedy
"Sweet, was Christ crucified to create this chat?"
-- Charles Williams.  "Judgement at Chelmsford"

```
 0
Reply jwkenne (1440) 5/24/2010 6:38:50 PM

```On 2010-05-24 08:14:11 -0400, David W Noon said:

> -----BEGIN PGP SIGNED MESSAGE-----
>
> Hash: SHA1
>
>
>
> On Sun, 23 May 2010 23:46:56 -0400, John W Kennedy wrote about Re:
>
> Optimization problem:
>
>
>
> >On 2010-05-22 07:54:13 -0400, David W Noon said:
>
> [snip]
>
> >> Of course, this is a somewhat exotic use case.  But when confronted
>
> >> by such a requirement, I want CONTROLLED storage.
>
> >
>
> >But that is a way to force a language designed in the 60s, implemented
>
> >on an operating system designed in the 60s,* to do what more modern
>
> >languages do with modules or packages, and still more modern languages
>
> >do with objects.
>
>
>
> The 1960s have nothing to do with whether a software design is good or
>

"Nothing", forsooth? As though decades of hindsight are worthless?

> Declaring a variable at package scope effectively makes it STATIC.

"Static" in some sense or other, but in a modern operating system,
"static" has very different implications from those in OS/360. (This is
why the MVS C prelinker was invented -- the existing pseudo-register
kludge was not adequate for porting Unix C programs.) Only a year or
two after OS/360, TSS/360 introduced PSECTs to avoid the problems in
reentrance combined with static data; I don't know whether the TSS PL/I
compiler took advantage of them -- probably not.

In any case, using a package or object also makes clear the distinction
between things that ought to be stateful and things that ought not.

--
John W Kennedy
"Though a Rothschild you may be
As a Company you've come to utter sorrow--
But the Liquidators say,
'Never mind--you needn't pay,'
So you start another company to-morrow!"
-- Sir William S. Gilbert.  "Utopia Limited"

```
 0
Reply jwkenne (1440) 5/24/2010 6:51:54 PM

```On Sun, 16 May 2010 18:28:26 +0000
Robert AH Prins <spamtrap@prino.org> wrote:

> Hi all,
>
> Can anyone give me some hints as to solve the following problem,
> preferably in a way that is faster than the way I used to do it, and
> without the bug in the current version;

That depends on if you are using hyperthreaded CPU (cheaper variant)
or real multicore?

Greets1

--
http://maxa.homedns.org/

Sometimes online sometimes not

Svima je "dozvoljeno" biti idiot i
> mrak, ali samo neki to odaberu,

```
 0

```John W Kennedy wrote:
> On 2010-05-24 08:14:11 -0400, David W Noon said:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>>
>> Hash: SHA1
>>
>>
>>
>> On Sun, 23 May 2010 23:46:56 -0400, John W Kennedy wrote about Re:
>>
>> Optimization problem:
>>
>>
>>
>> >On 2010-05-22 07:54:13 -0400, David W Noon said:
>>
>> [snip]
>>
>> >> Of course, this is a somewhat exotic use case.  But when confronted
>>
>> >> by such a requirement, I want CONTROLLED storage.
>>
>> >
>>
>> >But that is a way to force a language designed in the 60s, implemented
>> >on an operating system designed in the 60s,* to do what more modern
>> >languages do with modules or packages, and still more modern languages
>> >do with objects.
>>
>>
>>
>> The 1960s have nothing to do with whether a software design is good or
>>
>
> "Nothing", forsooth? As though decades of hindsight are worthless?
>
>> Declaring a variable at package scope effectively makes it STATIC.
>
> "Static" in some sense or other, but in a modern operating system,
> "static" has very different implications from those in OS/360. (This is
> why the MVS C prelinker was invented -- the existing pseudo-register
> kludge was not adequate for porting Unix C programs.) Only a year or two
> after OS/360, TSS/360 introduced PSECTs to avoid the problems in
> reentrance combined with static data; I don't know whether the TSS PL/I
> compiler took advantage of them -- probably not.

I think it did -- Bitsavers has the TSS PL/I manual.  This is somewhat
unfortunate, since the TSS PL/I compiler is apparently copyrighted and
therefore unavailable.

>
> In any case, using a package or object also makes clear the distinction
> between things that ought to be stateful and things that ought not.
>
```
 0
Reply Peter_Flass (956) 5/24/2010 10:39:29 PM

```Peter Flass wrote:
....
>
> OTOH, although I haven't looked at this in detail, CONTROLLED storage
> has interactions with Tasking (maybe not with the Enterprise compilers)
> that adds quite another level of overhead.  It makes things much less
> simple than just push and pop.
>
With the F and Optimizing compilers, each task had its own allocation stack, emanating from the
pseudo register vector, for each controlled variable known within that task. It is not possible for
one task to refer directly to a generation of a controled variable allocated in another task. Any
sharing of controlled data had to be done by parameter passing or pointers, and any needed
synchronization or mutual exclusion was the responsibility of the programmer. If no sharing is

With the enterprise compiler, vaos390, and the workstation products, there is one system wide
allocation stack for each controller variable. Each thread can reference the current generation of
all controlled variables known at the point of reference, as well as allocate new generations or
free existing ones. Again any required synchronization or mutual exclusion is the programmer's
```
 0
Reply jjw (608) 5/25/2010 2:19:33 AM

```On 2010-05-21 06:56, Terje Mathisen wrote:
> James J. Weinkam wrote:
>> Robert AH Prins wrote:
>>>
>>> His code is in Pascal, and until I started running it for parts of the
>>> input string I had had blind faith in it. Wrong, as it turns out as
>>> there are three strings that it does not process correctly, and all my
>>> debugging hasn't given me any clues as to the why.
>>>
>> Can you send a copy of the data set that fails? Also, if it isn't prying
>> into your business, would you describe the application this problem is
>> abstracted from? You indicated in your original post that the values
>> (characters in the posted esample) were actually indices into a table in
>> the real application.
>
> I would also love to see that, I really do believe that my current
> approach is sound.

Group 61 - the x-marked 3-string (FEG) is not found by the Paul Green code:

61  1 A
61  2 A
61  3 B
61  4 C
61  5 D
61  6 E
61  7 F x
61  8 E x
61  9 G x
61 10 G
61 11 G
61 12 G
61 13 G
61 14 G
61 15 G
61 16 G
61 17 G
61 18 G
61 19 H
61 20 I
61 21 C
61 22 D
61 23 A
61 24 A
61 25 A
61 26 A
61 27 A

69  1 A
69  2 A
69  3 A
69  4 A
69  5 A
69  6 B
69  7 B
69  8 C
69  9 C
69 10 D
69 11 E
69 12 F
69 13 G
69 14 H
69 15 H
69 16 H
69 17 H x
69 18 I x
69 19 J x
69 20 J
69 21 J

Will extract the code from the Pascal program, and get it working RSN,
bit tired from travel, it takes as long to get from Vilnius to Charleroi
as from Charleroi to Oostende.

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```On 2010-05-23 08:37, Terje Mathisen wrote:
> Dick Wesseling wrote:
>> In article<11lic7-2u52.ln1@ntp.tmsw.no>,
>> Terje Mathisen<"terje.mathisen at tmsw.no"@giganews.com> writes:
>>> Still, running at 130+ cycles per input byte is still fast enough to
>>> handle significant amounts of data.
>>>
>>
>> Sounds like you've beaten me. My O() depends on the size of the longest
>> sequence minus the size of the longest unique sequence, which is good
>> for the reference data. If I time 4K of random data I get 7250 cycles per
>> input byte. Since sqrt(C) is 16 for that dataset you ought to get about
>> 800 cycles/byte ( http://www.fi.uu.nl/~ftu/random.bin )
>
> It is worse (or better, seen from my point):
>
> With 4K random input the scan took 252K cycles, which means that I
> averaged about 83 cycles/byte. (I.e. very fast!)
>
> The problem was that with totally random data, there are a _lot_ of both
> potential and real solutions, I found 4099 candidates (i.e. just over
> one per starting byte!) and 121 after filtering.
>
> The filter process here took a _lot_ of time though: 8M cycles!
>
> This means that it needed 2000 cycles/input byte. :-(
>
> This simply shows that the problem space (and therefore required
> algorithm as well) almost certainly won't contain totally random data,
> and this will most probably help my code...

The data is in fact not very random, but you'll be hard pressed to
figure out what it represents. Here is the current full set, represented
by 66

abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-/*

printable chars:

eeeebbeebeeeeeefeeegggdggggggggggggggddddddddddddddddccbbbbhbbbbibjjjj
ccccddbbbbbbqlllnhnnnnrhblkaabpbbbbbllknnnnnbbaaaaaabbbbnkaaalbllllhln
nnnsinnkkkkhhpblbbbbbcccbdgdaaaaabbbbbfbtknnnnnnnknbbbbbaaaaaaabbbbucg
gdbhhhhhhhhhhhhhhhhohhhhhhhhrhhhhhhhhhhhhbhhhinjnnnnaaaaaaaiiiiiioojoa
aaaooooojjjjiiiiiioviiiaaaaaabbibrrrrrrrrrrrrrrrnnnnnbrrrrbbaaaiiiiiii
iiiiiiiiiiiiiiiiiiwiiiiiiiiixyiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
iihiiiiiiziiiiiiiiiiiooaaaaaaabbbbfbbbrrrrgginnralarabbaaaiiiiiiiiAiii
iiibiiiiiiiiiiiiiiiiiiiiiiiibiiiiiiiiiiiiiiiiiiiiiiiiiiabbbblbrrairrbr
lnnnjnirrrblbbjaaaaabbbbbaiiiiiiiiiiiiiiiBiiiiiiiiiiriiiiiiiiiiiiiiiii
iiiiiiiiiiiiiiiiiiiiiiiCiiiiiiiiiiiiiiiriiiiiiiiiiiiibiiiiiiiiiiiiiiii
iiiiiiiiiiiiiiiiiiiiiiiiiiiiviiiiiiiiiiiiiiiiiiiiiiabhbiblelrrribnbbbo
bbbbbbbbbbbbbbbGbbbbbbbbbbrrrrrnlrHcbbrbbbaiiiiiiIiiiicoaaaaabbbbbibab
bbbybbbabbbbbbooooaiiiiiiiiiiiiiiiiiiaaaabalbbbbbbblbnrrrrcbbbaiiiiixi
ziiiiiiiiaaaaaabbbabrJbinnaiiiiaaaaaabbGJKLLLLLMLdddddddaaaaaaioiiiiii
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiNiiiiiiiiiiiiiifiiibbhcbbacccbccO
bbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgdd
dgSdddddjcccccccccjaaaaaaafabbNGGLLLLLLLLGGGrbbbaoiiiiiiiiiiiiiiiaaiii
iibbbbbGGizzzziiiTiiiiijbNbLaGGGGKLLLLLLLLLLDDDDDDDDDDDDDDDDaaaaaabbbG
GGGGGGGLLLLGbaaaaaoooobiaiaoiiiiiiiiiiibaMbbUbGGGLLLLLLGGGGGGNbbooiiii
iiiiiiijibmabbGLLLLLLVbbbioiiiiiiiiiiiiiiiiiiiWiiiiiiXiaaaoooiaiiiiiii
iaaaaaGaVaYGZLLLLLLL0KGGGGGGGGbbooooiaiiiiii1iooaabbGaGGGGGGLLLLLLLLbG
GGGGGbaaaaajwiiiiiiiiiihaoGCGLLLLLLLLLLKbaoiiiiiiioabGLLLLLLLLEGGGbbob
ooon2iiiiijjjiooVbVbr3rrbrHHbalhGbbGGGLLLbbb4baaoaiiiiwiiYiiiiiiibaaaa
aaGGLLLLLLLLGGGGGGGGVGGVGbbbaaNbooiaiiiii0ooaYLLLLKGGGGGGLihiiiiijjoog
5zabbbbGLLLiiiiiiiiziiiiiiooaaaabbbGbooboooommmmoooooooaaoaoiooowooabm
mmmBBBBBBBBBI6jjIjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9Yoooooo
iGbbbbbaaaaao+oobGbbbLGLLLLLbbbjiiooobbGbbmo-ooaoJbbbbVbbbbbb/baoooooo
o*bbbbbbbbbbbbbbbbbbaotoommooojaoooo-ooahbbbbbbbaaa3ooooobbllbbrbbbbb*
bmaoooooobobbbbbooo (2329 characters)

The top-3 account for 58.3 % of all values, the top-10 for 85.2%.

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0
Reply spamtrap14 (66) 5/25/2010 11:04:58 PM

```In article <862qk6Fe5gU1@mid.individual.net>,
Robert AH Prins <spamtrap@prino.org> writes:

> Group 61 - the x-marked 3-string (FEG) is not found by the Paul Green code:

What a relief, mine works:

5641 cycles 6 solutions
AABCDEFEGGGGGGGGGGHICDAAAAA
len  win  pos
9   19    1 ABCDEFEGGGGGGGGGGHI
8   17    6 FEGGGGGGGGGGHICDA
7    8    1 ABCDEFEG
6    6    1 ABCDEF
3    3    6 FEG                      <there it is!
6    6   17 GHICDA

5630 cycles 6 solutions
AAAAABBCCDEFGHHHHIJJJ
len  win  pos
10   15    4 ABBCCDEFGHHHHIJ
9   13    6 BCCDEFGHHHHIJ
8   10    4 ABBCCDEFGH
7    8    6 BCCDEFGH
6    6    8 CDEFGH
3    3   16 HIJ                      <there it is!

> Will extract the code from the Pascal program, and get it working RSN,
> bit tired from travel, it takes as long to get from Vilnius to Charleroi
> as from Charleroi to Oostende.

If it isn't asking too much, once you've taken some rest would you mind
posting one of the longer (2K or so) data sets?
Earlier I had to disappoint the readers of clax by saying that it wasn't
worth bothering about an asm version because it outperformed the C
version by a small margin only. However, that was for random data.
Random data suffocates branch prediction causing _any_ version to come
to a grinding halt (*).

For real data such as group 61 and 69 and the original example the asm
outperformed the high level version by 30% or so. Still not something to
get excited about, but I'm curious how it would perform on data sets
that are both real and large.

(*) I expect professor Weinkam's solution to perform much better for
random data. If my understanding is correct most of the time is spend
in:

do i=c-1 to 1 by -1; a(i,*)=min(32766,a(i+1,*)+1); a(i,x(i))=1; end;

a branchless implementation of min() should work fine, and even a
branching one should should have a 99% branch prediction success rate
because overflow will happen only for very large data sets.
```
 0

```Dick Wesseling wrote:
> In article<862qk6Fe5gU1@mid.individual.net>,
> 	Robert AH Prins<spamtrap@prino.org>  writes:
>
>> Group 61 - the x-marked 3-string (FEG) is not found by the Paul Green code:
>
> What a relief, mine works:
>
> 5641 cycles 6 solutions
> AABCDEFEGGGGGGGGGGHICDAAAAA
>   len  win  pos
>     9   19    1 ABCDEFEGGGGGGGGGGHI
>     8   17    6 FEGGGGGGGGGGHICDA
>     7    8    1 ABCDEFEG
>     6    6    1 ABCDEF
>     3    3    6 FEG<there it is!
>     6    6   17 GHICDA

I got these:
C:\>\c2\unique-strings\Release\unique-strings.exe
len aperture from to value
3      3     7   9 FEG
6      6    18  23 GHICDA
6      6     2   7 ABCDEF
7      8     2   9 ABCDEFEG
8     17     7  23 FEGGGGGGGGGGHICDA
9     19     2  20 ABCDEFEGGGGGGGGGGHI
Time to scan:         3278 (26 potential solutions)
Time to filter:      10736 (6 filtered solutions)
>
>
> 5630 cycles 6 solutions
> AAAAABBCCDEFGHHHHIJJJ
>   len  win  pos
>    10   15    4 ABBCCDEFGHHHHIJ
>     9   13    6 BCCDEFGHHHHIJ
>     8   10    4 ABBCCDEFGH
>     7    8    6 BCCDEFGH
>     6    6    8 CDEFGH
>     3    3   16 HIJ<there it is!

and so did I:

len aperture from to value
3      3    17  19 HIJ
6      6     9  14 CDEFGH
7      8     7  14 BCCDEFGH
8     10     5  14 ABBCCDEFGH
9     13     7  19 BCCDEFGHHHHIJ
10     15     5  19 ABBCCDEFGHHHHIJ
Time to scan:         2453 (25 potential solutions)
Time to filter:       9735 (6 filtered solutions)

I have however found a _much_ faster way to filter my potential
solutions, I believe I can make it run in close to linear time. :-)

Terje
>
>> Will extract the code from the Pascal program, and get it working RSN,
>> bit tired from travel, it takes as long to get from Vilnius to Charleroi
>> as from Charleroi to Oostende.
>
> If it isn't asking too much, once you've taken some rest would you mind
> posting one of the longer (2K or so) data sets?
> Earlier I had to disappoint the readers of clax by saying that it wasn't
> worth bothering about an asm version because it outperformed the C
> version by a small margin only. However, that was for random data.
> Random data suffocates branch prediction causing _any_ version to come
> to a grinding halt (*).
>
> For real data such as group 61 and 69 and the original example the asm
> outperformed the high level version by 30% or so. Still not something to
> get excited about, but I'm curious how it would perform on data sets
> that are both real and large.
>
> (*) I expect professor Weinkam's solution to perform much better for
> random data. If my understanding is correct most of the time is spend
> in:
>
>    do i=c-1 to 1 by -1; a(i,*)=min(32766,a(i+1,*)+1); a(i,x(i))=1; end;
>
> a branchless implementation of min() should work fine, and even a
> branching one should should have a 99% branch prediction success rate
> because overflow will happen only for very large data sets.

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```Robert AH Prins wrote:
> On 2010-05-23 08:37, Terje Mathisen wrote:
>> Dick Wesseling wrote:
>>> In article<11lic7-2u52.ln1@ntp.tmsw.no>,
>>> Terje Mathisen<"terje.mathisen at tmsw.no"@giganews.com> writes:
>>>> Still, running at 130+ cycles per input byte is still fast enough to
>>>> handle significant amounts of data.
>>>>
>>>
>>> Sounds like you've beaten me. My O() depends on the size of the longest
>>> sequence minus the size of the longest unique sequence, which is good
>>> for the reference data. If I time 4K of random data I get 7250 cycles
>>> per
>>> input byte. Since sqrt(C) is 16 for that dataset you ought to get about
>>> 800 cycles/byte ( http://www.fi.uu.nl/~ftu/random.bin )
>>
>> It is worse (or better, seen from my point):
>>
>> With 4K random input the scan took 252K cycles, which means that I
>> averaged about 83 cycles/byte. (I.e. very fast!)
>>
>> The problem was that with totally random data, there are a _lot_ of both
>> potential and real solutions, I found 4099 candidates (i.e. just over
>> one per starting byte!) and 121 after filtering.
>>
>> The filter process here took a _lot_ of time though: 8M cycles!
>>
>> This means that it needed 2000 cycles/input byte. :-(
>>
>> This simply shows that the problem space (and therefore required
>> algorithm as well) almost certainly won't contain totally random data,
>> and this will most probably help my code...
>
> The data is in fact not very random, but you'll be hard pressed to
> figure out what it represents. Here is the current full set, represented
> by 66
>
> abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-/*
>
> printable chars:
>
> eeeebbeebeeeeeefeeegggdggggggggggggggddddddddddddddddccbbbbhbbbbibjjjj
> ccccddbbbbbbqlllnhnnnnrhblkaabpbbbbbllknnnnnbbaaaaaabbbbnkaaalbllllhln
> nnnsinnkkkkhhpblbbbbbcccbdgdaaaaabbbbbfbtknnnnnnnknbbbbbaaaaaaabbbbucg
> gdbhhhhhhhhhhhhhhhhohhhhhhhhrhhhhhhhhhhhhbhhhinjnnnnaaaaaaaiiiiiioojoa
> aaaooooojjjjiiiiiioviiiaaaaaabbibrrrrrrrrrrrrrrrnnnnnbrrrrbbaaaiiiiiii
> iiiiiiiiiiiiiiiiiiwiiiiiiiiixyiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
> iihiiiiiiziiiiiiiiiiiooaaaaaaabbbbfbbbrrrrgginnralarabbaaaiiiiiiiiAiii
> iiibiiiiiiiiiiiiiiiiiiiiiiiibiiiiiiiiiiiiiiiiiiiiiiiiiiabbbblbrrairrbr
> lnnnjnirrrblbbjaaaaabbbbbaiiiiiiiiiiiiiiiBiiiiiiiiiiriiiiiiiiiiiiiiiii
> iiiiiiiiiiiiiiiiiiiiiiiCiiiiiiiiiiiiiiiriiiiiiiiiiiiibiiiiiiiiiiiiiiii
> iiiiiiiiiiiiiiiiiiiiiiiiiiiiviiiiiiiiiiiiiiiiiiiiiiabhbiblelrrribnbbbo
> bbbbbbbbbbbbbbbGbbbbbbbbbbrrrrrnlrHcbbrbbbaiiiiiiIiiiicoaaaaabbbbbibab
> bbbybbbabbbbbbooooaiiiiiiiiiiiiiiiiiiaaaabalbbbbbbblbnrrrrcbbbaiiiiixi
> ziiiiiiiiaaaaaabbbabrJbinnaiiiiaaaaaabbGJKLLLLLMLdddddddaaaaaaioiiiiii
> iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiNiiiiiiiiiiiiiifiiibbhcbbacccbccO
> bbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgdd
> dgSdddddjcccccccccjaaaaaaafabbNGGLLLLLLLLGGGrbbbaoiiiiiiiiiiiiiiiaaiii
> iibbbbbGGizzzziiiTiiiiijbNbLaGGGGKLLLLLLLLLLDDDDDDDDDDDDDDDDaaaaaabbbG
> GGGGGGGLLLLGbaaaaaoooobiaiaoiiiiiiiiiiibaMbbUbGGGLLLLLLGGGGGGNbbooiiii
> iiiiiiijibmabbGLLLLLLVbbbioiiiiiiiiiiiiiiiiiiiWiiiiiiXiaaaoooiaiiiiiii
> iaaaaaGaVaYGZLLLLLLL0KGGGGGGGGbbooooiaiiiiii1iooaabbGaGGGGGGLLLLLLLLbG
> GGGGGbaaaaajwiiiiiiiiiihaoGCGLLLLLLLLLLKbaoiiiiiiioabGLLLLLLLLEGGGbbob
> ooon2iiiiijjjiooVbVbr3rrbrHHbalhGbbGGGLLLbbb4baaoaiiiiwiiYiiiiiiibaaaa
> aaGGLLLLLLLLGGGGGGGGVGGVGbbbaaNbooiaiiiii0ooaYLLLLKGGGGGGLihiiiiijjoog
> 5zabbbbGLLLiiiiiiiiziiiiiiooaaaabbbGbooboooommmmoooooooaaoaoiooowooabm
> mmmBBBBBBBBBI6jjIjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9Yoooooo
> iGbbbbbaaaaao+oobGbbbLGLLLLLbbbjiiooobbGbbmo-ooaoJbbbbVbbbbbb/baoooooo
> o*bbbbbbbbbbbbbbbbbbaotoommooojaoooo-ooahbbbbbbbaaa3ooooobbllbbrbbbbb*
> bmaoooooobobbbbbooo (2329 characters)
>
> The top-3 account for 58.3 % of all values, the top-10 for 85.2%.

Here's my results, with the longer result strings truncated:
len aperture from to value
3      3  2291 2293 a3o
3      3  2271 2273 jao
3      3  2270 2272 oja
3      3  2241 2243 o*b
3      3  2201 2203 bji
3      3  2191 2193 bLG
3      3  2186 2188 obG
3      3  2182 2184 ao+
3      3  2159 2161 oMm
3      3  2158 2160 moM
3      3  2117 2119 IjL
3      3  2089 2091 aoi
3      3  2066 2068 Gbo
3      3  2037 2039 bGL
3      3  2010 2012 LKG
3      3  2001 2003 i0o
3      3  1994 1996 oia
3      3  1984 1986 VGb
3      3  1955 1957 iba
3      3  1939 1941 oai
3      3  1935 1937 4ba
3      3  1915 1917 brH
3      3  1906 1908 oVb
3      3  1903 1905 jio
3      3  1882 1884 LEG
3      3  1848 1850 CGL
3      3  1825 1827 Gba
3      3  1818 1820 LbG
3      3  1802 1804 bGa
3      3  1795 1797 1io
3      3  1786 1788 oia
3      3  1757 1759 GaV
3      3  1741 1743 oia
3      3  1734 1736 Xia
3      3  1705 1707 bio
3      3  1701 1703 LVb
3      3  1694 1696 bGL
3      3  1691 1693 mab
3      3  1671 1673 GNb
3      3  1655 1657 UbG
3      3  1651 1653 aMb
3      3  1637 1639 aoi
3      3  1636 1638 iao
3      3  1573 1575 GKL
3      3  1549 1551 Giz
3      3  1514 1516 Grb
3      3  1500 1502 bNG
3      3  1497 1499 fab
3      3  1488 1490 cja
3      3  1478 1480 djc
3      3  1472 1474 gSd
3      3  1471 1473 dgS
3      3  1455 1457 biG
3      3  1437 1439 bao
3      3  1399 1401 cOb
3      3  1392 1394 bac
3      3  1389 1391 hcb
3      3  1388 1390 bhc
3      3  1322 1324 aio
3      3  1308 1310 MLd
3      3  1286 1288 nai
3      3  1259 1261 xiz
3      3  1252 1254 bai
3      3  1248 1250 rcb
3      3  1233 1235 alb
3      3  1232 1234 bal
3      3  1208 1210 oai
3      3  1187 1189 iba
3      3  1162 1164 bai
3      3  1151 1153 rnl
3      3  1049 1051 bor
3      3  1039 1041 elr
3      3  1034 1036 hbi
3      3   924 926 ria
3      3   795 797 bai
3      3   784 786 bja
3      3   780 782 rbl
3      3   765 767 air
3      3   764 766 rai
3      3   761 763 lbr
3      3   755 757 iab
3      3   682 684 rab
3      3   680 682 lar
3      3   674 676 gin
3      3   589 591 xyi
3      3   588 590 ixy
3      3   543 545 nbr
3      3   522 524 ibr
3      3   509 511 ovi
3      3   508 510 iov
3      3   488 490 joa
3      3   400 402 knb
3      3   377 379 gda
3      3   354 356 sin
3      3   353 355 nsi
3      3   348 350 hln
3      3   341 343 alb
3      3   318 320 lkn
3      3   309 311 abp
3      3   296 298 lnh
3      3   292 294 bql
3      3   237 239 dgc
3      3   219 221 apb
3      3   211 213 iba
3      3   207 209 lib
3      3   198 200 nbk
3      3   186 188 kbn
3      3   166 168 cba
3      3   151 153 acb
3      3   140 142 jkb
3      3   135 137 ibj
3      3    60  62 acd
3      3    58  60 bca
3      3    46  48 dea
4      4  2279 2282 oahb
4      4  2260 2263 baot
4      4  2232 2235 /bao
4      4  2218 2221 aoJb
4      4  2212 2215 bmo-
4      4  2170 2173 oiGb
4      4  2162 2165 m9Yo
4      4  2149 2152 Fb0o
4      4  2121 2124 oC7j
4      4  2112 2115 BI6j
4      4  2097 2100 oabm
4      4  2017 2020 GLih
4      4  2004 2007 oaYL
4      4  1990 1993 aNbo
4      4  1909 1912 Vbr3
4      4  1893 1896 on2i
4      4  1831 1834 ajwi
4      4  1770 1773 L0KG
4      4  1649 1652 ibaM
4      4  1632 1635 obia
4      4  1621 1624 LGba
4      4  1563 1566 ijbN
4      4  1518 1521 baoi
4      4  1446 1449 iFra
4      4  1438 1441 aobd
4      4  1279 1282 abrJ
4      4  1242 1245 lbnr
4      4  1174 1177 icoa
4      4  1108 1111 Fahb
4      4  1054 1057 rabE
4      4  1043 1046 ribn
4      4  1036 1039 ible
4      4  1031 1034 iabh
4      4   775 778 jnir
4      4   769 772 brln
4      4   677 680 nral
4      4   465 468 hinj
4      4   421 424 gdbh
4      4   417 420 bucg
4      4   374 377 cbdg
4      4   363 366 hpbl
4      4   336 339 bnka
5      5  2310 2314 *bmao
5      5  2145 2149 0G8bF
5      5  2125 2129 jbmoa
5      5  2119 2123 LjoC7
5      5  1920 1924 alhGb
5      5  1688 1692 jibma
5      5  1566 1570 NbLaG
5      5  1465 1469 LMKgd
5      5  1443 1447 bRriF
5      5  1406 1410 PQBub
5      5  1405 1409 bPQBu
5      5  1299 1303 bGJKL
5      5  1281 1285 rJbin
5      5   389 393 fbtkn
5      5   180 184 ablmk
6      6  2029 2034 og5zab
6      6  1918 1923 HbalhG
6      6  1870 1875 ioabGL
6      6  1859 1864 LKbaoi
6      6  1843 1848 ihaoGC
6      6  1759 1764 VaYGZL
6      6  1447 1452 Frainb
6      6  1152 1157 nlrHcb
7      7   302 308 nrhblka
8     10  2025 2034 ijjoog5zab
8     10  1438 1447 aobdbbRriF
8     10   302 311 nrhblkaabp
9     13  2117 2129 IjLjoC7jjbmoa
9     13  2027 2039 joog5zabbbbGL
9     13  1439 1451 obdbbRriFrain
10     15  2025 2039 ijjoog5zabbbbGL
11     18  2112 2129 BI6jjIjLjoC7jjbmoa
11     18  2017 2034 GLihiiiiijjoog5zab
12     21  1903 1923 jiooVbVbr3rrbrHHbalhG
13     25  2112 2136 BI6jjIjLjoC7jjbmoaoooiooE
14     28  2112 2139 BI6jjIjLjoC7jjbmoaoooiooEEE0
15     33  2117 2149 IjLjoC7jjbmoaoooiooEEE0000000G8bF
16     36  2114 2149 6jjIjLjoC7jjbmoaoooiooEEE0000000G8bF
16     36  2112 2147 BI6jjIjLjoC7jjbmoaoooiooEEE0000000G8
17     38  2112 2149 BI6jjIjLjoC7jjbmoaoooiooEEE0000000G8bF
18     48  2117 2164 IjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9Y
19     51  2114 2164 6jjIjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9Y
20     53  2112 2164 BI6jjIjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9Y
21     70  2095 2164
wooabmmmmBBBBBBBBBI6jjIjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9Y
22     81  1400 1480
ObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgd
22     81  1399 1479
cObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKg
23     91  1389 1479
hcbbacccbccObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGG
23     91  1383 1473
fiiibbhcbbacccbccObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainb
24     97  1383 1479
fiiibbhcbbacccbccObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainb
25    112  1368 1479
NiiiiiiiiiiiiiifiiibbhcbbacccbccObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibb
26    131  2112 2242
BI6jjIjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9YooooooiGbbbbbaaaa
27    148  2095 2242
wooabmmmmBBBBBBBBBI6jjIjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9Y
28    169  2095 2263
wooabmmmmBBBBBBBBBI6jjIjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9Y
29    181  2112 2292
BI6jjIjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9YooooooiGbbbbbaaaa
30    189  2112 2300
BI6jjIjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9YooooooiGbbbbbaaaa
31    193  2112 2304
BI6jjIjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9YooooooiGbbbbbaaaa
32    210  2095 2304
wooabmmmmBBBBBBBBBI6jjIjLjoC7jjbmoaoooiooEEE0000000G8bFb0oooommmoMmm9Y
33    252  1991 2242
Nbooiaiiiii0ooaYLLLLKGGGGGGLihiiiiijjoog5zabbbbGLLLiiiiiiiiziiiiiiooaa
33    252  1912 2163
3rrbrHHbalhGbbGGGLLLbbb4baaoaiiiiwiiYiiiiiiibaaaaaaGGLLLLLLLLGGGGGGGGV
34    267  1894 2160
n2iiiiijjjiooVbVbr3rrbrHHbalhGbbGGGLLLbbb4baaoaiiiiwiiYiiiiiiibaaaaaaG
35    270  1894 2163
n2iiiiijjjiooVbVbr3rrbrHHbalhGbbGGGLLLbbb4baaoaiiiiwiiYiiiiiiibaaaaaaG
36    291  1894 2184
n2iiiiijjjiooVbVbr3rrbrHHbalhGbbGGGLLLbbb4baaoaiiiiwiiYiiiiiiibaaaaaaG
37    314  1991 2304
Nbooiaiiiii0ooaYLLLLKGGGGGGLihiiiiijjoog5zabbbbGLLLiiiiiiiiziiiiiiooaa
38    327  1894 2220
n2iiiiijjjiooVbVbr3rrbrHHbalhGbbGGGLLLbbb4baaoaiiiiwiiYiiiiiiibaaaaaaG
39    339  1894 2232
n2iiiiijjjiooVbVbr3rrbrHHbalhGbbGGGLLLbbb4baaoaiiiiwiiYiiiiiiibaaaaaaG
40    349  1894 2242
n2iiiiijjjiooVbVbr3rrbrHHbalhGbbGGGLLLbbb4baaoaiiiiwiiYiiiiiiibaaaaaaG
41    370  1894 2263
n2iiiiijjjiooVbVbr3rrbrHHbalhGbbGGGLLLbbb4baaoaiiiiwiiYiiiiiiibaaaaaaG
42    469  1795 2263
1iooaabbGaGGGGGGLLLLLLLLbGGGGGGbaaaaajwiiiiiiiiiihaoGCGLLLLLLLLLLKbaoi
43    501  1763 2263
ZLLLLLLL0KGGGGGGGGbbooooiaiiiiii1iooaabbGaGGGGGGLLLLLLLLbGGGGGGbaaaaaj
44    516  1727 2242
WiiiiiiXiaaaoooiaiiiiiiiiaaaaaGaVaYGZLLLLLLL0KGGGGGGGGbbooooiaiiiiii1i
45    536  1400 1935
ObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgd
46    609  1655 2263
UbGGGLLLLLLGGGGGGNbbooiiiiiiiiiiijibmabbGLLLLLLVbbbioiiiiiiiiiiiiiiiii
47    664  1600 2263
DaaaaaabbbGGGGGGGGLLLLGbaaaaaoooobiaiaoiiiiiiiiiiibaMbbUbGGGLLLLLLGGGG
48    706  1558 2263
TiiiiijbNbLaGGGGKLLLLLLLLLLDDDDDDDDDDDDDDDDaaaaaabbbGGGGGGGGLLLLGbaaaa
49    724  1400 2123
ObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgd
50    748  1400 2147
ObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgd
51    764  1400 2163
ObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgd
52    785  1400 2184
ObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgd
53    815  1406 2220
PQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgdddgSdd
54    821  1400 2220
ObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgd
55    833  1400 2232
ObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgd
56    843  1400 2242
ObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgd
57    864  1400 2263
ObbbbbPQBubbbiiiiiiiiiiiiiiiiKiiiiiibbaobdbbRriFrainbbbbiGGGLLLLLLMKgd
58   1005  1259 2263
xiziiiiiiiiaaaaaabbbabrJbinnaiiiiaaaaaabbGJKLLLLLMLdddddddaaaaaaioiiii
59   1070  1194 2263
ybbbabbbbbbooooaiiiiiiiiiiiiiiiiiiaaaabalbbbbbbblbnrrrrcbbbaiiiiixizii
60   1225  1039 2263
61   1255  1009 2263
62   1567   697 2263
Aiiiiiibiiiiiiiiiiiiiiiiiiiiiiiibiiiiiiiiiiiiiiiiiiiiiiiiiiabbbblbrrai
63   1852   391 2242
tknnnnnnnknbbbbbaaaaaaabbbbucggdbhhhhhhhhhhhhhhhhohhhhhhhhrhhhhhhhhhhh
64   1879   364 2242
pblbbbbbcccbdgdaaaaabbbbbfbtknnnnnnnknbbbbbaaaaaaabbbbucggdbhhhhhhhhhh
64   1879   354 2232
sinnkkkkhhpblbbbbbcccbdgdaaaaabbbbbfbtknnnnnnnknbbbbbaaaaaaabbbbucggdb
65   1889   354 2242
sinnkkkkhhpblbbbbbcccbdgdaaaaabbbbbfbtknnnnnnnknbbbbbaaaaaaabbbbucggdb
66   1950   293 2242
qlllnhnnnnrhblkaabpbbbbbllknnnnnbbaaaaaabbbbnkaaalbllllhlnnnnsinnkkkkh

Time to scan:      2285008 (1200 potential solutions)
Time to filter:    2459754 (253 filtered solutions)

Does this seem reasonable?

The total solution time is about 2.2 ms...

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```James J. Weinkam wrote:
> Peter Flass wrote:
> ...
>>
>> OTOH, although I haven't looked at this in detail, CONTROLLED storage
>> has interactions with Tasking (maybe not with the Enterprise
>> compilers) that adds quite another level of overhead.  It makes things
>> much less simple than just push and pop.
>>
> With the F and Optimizing compilers, each task had its own allocation
> stack, emanating from the pseudo register vector, for each controlled
> variable known within that task. It is not possible for one task to
> refer directly to a generation of a controled variable allocated in
> another task. Any sharing of controlled data had to be done by parameter
> passing or pointers, and any needed synchronization or mutual exclusion
> was the responsibility of the programmer. If no sharing is taking place

That's interesting.  The manual describes this behavior, but gives no
idea that it's just an unfortunate artifact of the implementation.  I
was thinking it was intended, for some reason or other.  Thanks.
```
 0
Reply Peter_Flass (956) 5/26/2010 10:45:31 AM

```Robert AH Prins wrote:
> On 2010-05-26 08:43, Terje Mathisen wrote:
> A very quick look at the results tells me they are OK. My results
> perform one other, minor massage of the output by removing duplicates,
> so to take an example
>
> 15 33 2117 2149 IjLjoC7jjbmoaoooiooEEE0000000G8bF
>
> would come out as
>
> 15 33 2117 2149 IjLoC7bmaiE0G8F
>
> Or even more correct, as the 4 character strings indexed by the values
> of these characters. Nit-picking...
>
>> Time to scan: 2285008 (1200 potential solutions)
>> Time to filter: 2459754 (253 filtered solutions)
>>
>> Does this seem reasonable?
>>
>> The total solution time is about 2.2 ms...
>
> I think my very first Pascal (Turbo Pascal 3.01a) solution using a
> sliding window without any optimizations ran for around 26 *minutes*
> (OK, on a 16MHz 386) on a far smaller subset (less than half) of this
> data, so I think 2.2 ms is pretty reasonable. ;)

My first usable compiler (on a PC) was TP 1.0. :-)

Currently my best runtime has been:

65   1889   354 2242
sinnkkkkhhpblbbbbbcccbdgdaaaaabbbbbfbtknnnnnnnknbbbbbaaaaaaabbbbucggdb
66   1950   293 2242
qlllnhnnnnrhblkaabpbbbbbllknnnnnbbaaaaaabbbbnkaaalbllllhlnnnnsinnkkkkh
Time to scan:      1247257 (1200 potential solutions)
Time to filter:    1226621 (253 filtered solutions)

I.e. 2.47M cycles on a 2.2GHz cpu, for a total of about 1.1 ms.

I have also implemented the promised improved filter function, using
that I got:

Time to scan:      1048432 (1200 potential solutions)
Time to filter:     982135 (237 filtered solutions)

I.e. 0.9 ms, but I'm missing 16 of the shorter solutions. :-(

I'll have to recheck my logic...

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```Robert AH Prins wrote:
> A very quick look at the results tells me they are OK. My results
> perform one other, minor massage of the output by removing duplicates,
> so to take an example
>
> 15 33 2117 2149 IjLjoC7jjbmoaoooiooEEE0000000G8bF
>
> would come out as
>
> 15 33 2117 2149 IjLoC7bmaiE0G8F

Here's with the same stripping of repeating result bytes:

len aperture from to value
3      3  2291 2293 a3o
3      3  2271 2273 jao
3      3  2270 2272 oja
3      3  2241 2243 o*b
3      3  2201 2203 bji
3      3  2191 2193 bLG
3      3  2186 2188 obG
3      3  2182 2184 ao+
3      3  2159 2161 oMm
3      3  2158 2160 moM
3      3  2117 2119 IjL
3      3  2089 2091 aoi
3      3  2066 2068 Gbo
3      3  2037 2039 bGL
3      3  2010 2012 LKG
3      3  2001 2003 i0o
3      3  1994 1996 oia
3      3  1984 1986 VGb
3      3  1955 1957 iba
3      3  1939 1941 oai
3      3  1935 1937 4ba
3      3  1915 1917 brH
3      3  1906 1908 oVb
3      3  1903 1905 jio
3      3  1882 1884 LEG
3      3  1848 1850 CGL
3      3  1825 1827 Gba
3      3  1818 1820 LbG
3      3  1802 1804 bGa
3      3  1795 1797 1io
3      3  1786 1788 oia
3      3  1757 1759 GaV
3      3  1741 1743 oia
3      3  1734 1736 Xia
3      3  1705 1707 bio
3      3  1701 1703 LVb
3      3  1694 1696 bGL
3      3  1691 1693 mab
3      3  1671 1673 GNb
3      3  1655 1657 UbG
3      3  1651 1653 aMb
3      3  1637 1639 aoi
3      3  1636 1638 iao
3      3  1573 1575 GKL
3      3  1549 1551 Giz
3      3  1514 1516 Grb
3      3  1500 1502 bNG
3      3  1497 1499 fab
3      3  1488 1490 cja
3      3  1478 1480 djc
3      3  1472 1474 gSd
3      3  1471 1473 dgS
3      3  1455 1457 biG
3      3  1437 1439 bao
3      3  1399 1401 cOb
3      3  1392 1394 bac
3      3  1389 1391 hcb
3      3  1388 1390 bhc
3      3  1322 1324 aio
3      3  1308 1310 MLd
3      3  1286 1288 nai
3      3  1259 1261 xiz
3      3  1252 1254 bai
3      3  1248 1250 rcb
3      3  1233 1235 alb
3      3  1232 1234 bal
3      3  1208 1210 oai
3      3  1187 1189 iba
3      3  1162 1164 bai
3      3  1151 1153 rnl
3      3  1049 1051 bor
3      3  1039 1041 elr
3      3  1034 1036 hbi
3      3   924 926 ria
3      3   795 797 bai
3      3   784 786 bja
3      3   780 782 rbl
3      3   765 767 air
3      3   764 766 rai
3      3   761 763 lbr
3      3   755 757 iab
3      3   682 684 rab
3      3   680 682 lar
3      3   674 676 gin
3      3   589 591 xyi
3      3   588 590 ixy
3      3   543 545 nbr
3      3   522 524 ibr
3      3   509 511 ovi
3      3   508 510 iov
3      3   488 490 joa
3      3   400 402 knb
3      3   377 379 gda
3      3   354 356 sin
3      3   353 355 nsi
3      3   348 350 hln
3      3   341 343 alb
3      3   318 320 lkn
3      3   309 311 abp
3      3   296 298 lnh
3      3   292 294 bql
3      3   237 239 dgc
3      3   219 221 apb
3      3   211 213 iba
3      3   207 209 lib
3      3   198 200 nbk
3      3   186 188 kbn
3      3   166 168 cba
3      3   151 153 acb
3      3   140 142 jkb
3      3   135 137 ibj
3      3    60  62 acd
3      3    58  60 bca
3      3    46  48 dea
4      4  2279 2282 oahb
4      4  2260 2263 baot
4      4  2232 2235 /bao
4      4  2218 2221 aoJb
4      4  2212 2215 bmo-
4      4  2170 2173 oiGb
4      4  2162 2165 m9Yo
4      4  2149 2152 Fb0o
4      4  2121 2124 oC7j
4      4  2112 2115 BI6j
4      4  2097 2100 oabm
4      4  2017 2020 GLih
4      4  2004 2007 oaYL
4      4  1990 1993 aNbo
4      4  1909 1912 Vbr3
4      4  1893 1896 on2i
4      4  1831 1834 ajwi
4      4  1770 1773 L0KG
4      4  1649 1652 ibaM
4      4  1632 1635 obia
4      4  1621 1624 LGba
4      4  1563 1566 ijbN
4      4  1518 1521 baoi
4      4  1446 1449 iFra
4      4  1438 1441 aobd
4      4  1279 1282 abrJ
4      4  1242 1245 lbnr
4      4  1174 1177 icoa
4      4  1108 1111 Fahb
4      4  1054 1057 rabE
4      4  1043 1046 ribn
4      4  1036 1039 ible
4      4  1031 1034 iabh
4      4   775 778 jnir
4      4   769 772 brln
4      4   677 680 nral
4      4   465 468 hinj
4      4   421 424 gdbh
4      4   417 420 bucg
4      4   374 377 cbdg
4      4   363 366 hpbl
4      4   336 339 bnka
5      5  2310 2314 *bmao
5      5  2145 2149 0G8bF
5      5  2125 2129 jbmoa
5      5  2119 2123 LjoC7
5      5  1920 1924 alhGb
5      5  1688 1692 jibma
5      5  1566 1570 NbLaG
5      5  1465 1469 LMKgd
5      5  1443 1447 bRriF
5      5  1406 1410 PQBub
5      5  1405 1409 bPQBu
5      5  1299 1303 bGJKL
5      5  1281 1285 rJbin
5      5   389 393 fbtkn
5      5   180 184 ablmk
6      6  2029 2034 og5zab
6      6  1918 1923 HbalhG
6      6  1870 1875 ioabGL
6      6  1859 1864 LKbaoi
6      6  1843 1848 ihaoGC
6      6  1759 1764 VaYGZL
6      6  1447 1452 Frainb
6      6  1152 1157 nlrHcb
7      7   302 308 nrhblka
8     10  2025 2034 ijog5zab
8     10  1438 1447 aobdRriF
8     10   302 311 nrhblkap
9     13  2117 2129 IjLoC7bma
9     13  2027 2039 jog5zabGL
9     13  1439 1451 obdRriFan
10     15  2025 2039 ijog5zabGL
11     18  2112 2129 BI6jLoC7bma
11     18  2017 2034 GLihjog5zab
12     21  1903 1923 jioVbr3HalhG
13     25  2112 2136 BI6jLoC7bmaiE
14     28  2112 2139 BI6jLoC7bmaiE0
15     33  2117 2149 IjLoC7bmaiE0G8F
16     36  2114 2149 6jILoC7bmaiE0G8F
16     36  2112 2147 BI6jLoC7bmaiE0G8
17     38  2112 2149 BI6jLoC7bmaiE0G8F
18     48  2117 2164 IjLoC7bmaiE0G8FM9Y
19     51  2114 2164 6jILoC7bmaiE0G8FM9Y
20     53  2112 2164 BI6jLoC7bmaiE0G8FM9Y
21     70  2095 2164 woabmBI6jLC7iE0G8FM9Y
22     81  1400 1480 ObPQBuiKaodRrFnGLMgSjc
22     81  1399 1479 cObPQBuiKaodRrFnGLMgSj
23     91  1389 1479 hcbaOPQBuiKodRrFnGLMgSj
23     91  1383 1473 fibhcaOPQBuKodRrFnGLMgS
24     97  1383 1479 fibhcaOPQBuKodRrFnGLMgSj
25    112  1368 1479 NifbhcaOPQBuKodRrFnGLMgSj
26    131  2112 2242 BI6jLoC7bmaiE0G8FM9Y+-JV/*
27    148  2095 2242 woabmBI6jLC7iE0G8FM9Y+-JV/*
28    169  2095 2263 woabmBI6jLC7iE0G8FM9Y+-JV/*t
29    181  2112 2292 BI6jLoC7bmaiE0G8FM9Y+-JV/*th3
30    189  2112 2300 BI6jLoC7bmaiE0G8FM9Y+-JV/*th3l
31    193  2112 2304 BI6jLoC7bmaiE0G8FM9Y+-JV/*th3lr
32    210  2095 2304 woabmBI6jLC7iE0G8FM9Y+-JV/*th3lr
33    252  1991 2242 Nboia0YLKGhjg5zmwBI6C7E8FM9+-JV/*
33    252  1912 2163 3rbHalhGL4oiwYVN0Kjg5zmBI6C7E8FM9
34    267  1894 2160 n2ijoVbr3HalhGL4wYN0Kg5zmBI6C7E8FM
35    270  1894 2163 n2ijoVbr3HalhGL4wYN0Kg5zmBI6C7E8FM9
36    291  1894 2184 n2ijoVbr3HalhGL4wYN0Kg5zmBI6C7E8FM9+
37    314  1991 2304 Nboia0YLKGhjg5zmwBI6C7E8FM9+-JV/*t3lr
38    327  1894 2220 n2ijoVbr3HalhGL4wYN0Kg5zmBI6C7E8FM9+-J
39    339  1894 2232 n2ijoVbr3HalhGL4wYN0Kg5zmBI6C7E8FM9+-J/
40    349  1894 2242 n2ijoVbr3HalhGL4wYN0Kg5zmBI6C7E8FM9+-J/*
41    370  1894 2263 n2ijoVbr3HalhGL4wYN0Kg5zmBI6C7E8FM9+-J/*t
42    469  1795 2263 1ioabGLjwhCKEn2Vr3Hl4YN0g5zmBI678FM9+-J/*t
43    501  1763 2263 ZL0KGboia1jwhCEn2Vr3Hl4YNg5zmBI678FM9+-J/*t
44    516  1727 2242 WiXaoGVYZL0Kb1jwhCEn2r3Hl4Ng5zmBI678FM9+-J/*
45    536  1400 1935 ObPQBuiKaodRrFnGLMgSjcfNzTDUmVWXYZ01whCE23Hl4
46    609  1655 2263 UbGLNoijmaVWXYZ0K1whCEn2r3Hl4g5zBI678FM9+-J/*t
47    664  1600 2263 DabGLoiMUNjmVWXYZ0K1whCEn2r3Hl4g5zBI678F9+-J/*t
48    706  1558 2263 TijbNLaGKDoMUmVWXYZ01whCEn2r3Hl4g5zBI678F9+-J/*t
49    724  1400 2123 ObPQBuiKaodRrFnGLMgSjcfNzTDUmVWXYZ01whCE23Hl45I67
50    748  1400 2147 ObPQBuiKaodRrFnGLMgSjcfNzTDUmVWXYZ01whCE23Hl45I678
51    764  1400 2163 ObPQBuiKaodRrFnGLMgSjcfNzTDUmVWXYZ01whCE23Hl45I6789
52    785  1400 2184 ObPQBuiKaodRrFnGLMgSjcfNzTDUmVWXYZ01whCE23Hl45I6789+
53    815  1406 2220 PQBubiKaodRrFnGLMgSjcfNzTDUmVWXYZ01whCE23Hl45I6789+-J
54    821  1400 2220
ObPQBuiKaodRrFnGLMgSjcfNzTDUmVWXYZ01whCE23Hl45I6789+-J
55    833  1400 2232
ObPQBuiKaodRrFnGLMgSjcfNzTDUmVWXYZ01whCE23Hl45I6789+-J/
56    843  1400 2242
ObPQBuiKaodRrFnGLMgSjcfNzTDUmVWXYZ01whCE23Hl45I6789+-J/*
57    864  1400 2263
ObPQBuiKaodRrFnGLMgSjcfNzTDUmVWXYZ01whCE23Hl45I6789+-J/*t
58   1005  1259 2263
xizabrJnGKLMdoNfhcOPQBuRFgSjTDUmVWXYZ01wCE23Hl45I6789+-/*t
59   1070  1194 2263
ybaoilnrcxzJGKLMdNfhOPQBuRFgSjTDUmVWXYZ01wCE23H45I6789+-/*t
60   1225  1039 2263
elribnoaEdcBjFhGHIyxzJKLMNfOPQuRgSTDUmVWXYZ01wC23456789+-/*t
61   1255  1009 2263
viabhlernoEdcBjFGHIyxzJKLMNfOPQuRgSTDUmVWXYZ01wC23456789+-/*t
62   1567   697 2263
AibalrnjBCDvheoEdcFGHIyxzJKLMNfOPQuRgSTUmVWXYZ01w23456789+-/*t
63   1852   391 2242
tknbaucgdhorijvwxyzflABCDeEFGHIJKLMNOPQRSTUmVWXYZ0123456789+-/*
64   1879   364 2242
pblcdgaftknuhorijvwxyzABCDeEFGHIJKLMNOPQRSTUmVWXYZ0123456789+-/*
64   1879   354 2232
sinkhpblcdgaftuorjvwxyzABCDeEFGHIJKLMNOPQRSTUmVWXYZ0123456789+-/
65   1889   354 2242
sinkhpblcdgaftuorjvwxyzABCDeEFGHIJKLMNOPQRSTUmVWXYZ0123456789+-/*
66   1950   293 2242
qlnhrbkapsicdgftuojvwxyzABCDeEFGHIJKLMNOPQRSTUmVWXYZ0123456789+-/*
Time to scan:       853094 (1200 potential solutions)
Time to filter:     874588 (253 filtered solutions)

1.7M cycles, 0.75ms total time.

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```Robert AH Prins wrote:
> I'm afraid the 2329 set of input data, built up over 29 years, is all
> that I have at the moment. It's likely to grow a bit more in the coming
> years, at a rate of 40 to 80 entries per year (in 2008 there were 138
> new values, but only 5 of them had never been seen, in 2009 there were
> 56 with 3 previously unseen ones, and so far this year there have been
> 67 new entries, but all of them have occurred before) Most of the growth
> will come from the top-10 values with possibly minor additions from the
> values in the remainder of the top-20.
>
> The amount of data from other users is likely to be rather a lot smaller
> and more likely to contain far fewer unique values, I'll see if I can
> get any.

In that case, and with my current code running at sub-ms speed, it
really seems like this is a completely solved problem.

I really can't see how you could need to to re-run the scanner 1000
times/second, every second, for almost a week while waiting for the next
:-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```On 2010-05-26 08:43, Terje Mathisen wrote:
> Robert AH Prins wrote:
>> On 2010-05-23 08:37, Terje Mathisen wrote:
>>> Dick Wesseling wrote:
>>>> In article<11lic7-2u52.ln1@ntp.tmsw.no>,
>>>> Terje Mathisen<"terje.mathisen at tmsw.no"@giganews.com> writes:
>>>>> Still, running at 130+ cycles per input byte is still fast enough to
>>>>> handle significant amounts of data.
>>>>>
>>>>
>>>> Sounds like you've beaten me. My O() depends on the size of the longest
>>>> sequence minus the size of the longest unique sequence, which is good
>>>> for the reference data. If I time 4K of random data I get 7250 cycles
>>>> per
>>>> input byte. Since sqrt(C) is 16 for that dataset you ought to get about
>>>> 800 cycles/byte ( http://www.fi.uu.nl/~ftu/random.bin )
>>>
>>> It is worse (or better, seen from my point):
>>>
>>> With 4K random input the scan took 252K cycles, which means that I
>>> averaged about 83 cycles/byte. (I.e. very fast!)
>>>
>>> The problem was that with totally random data, there are a _lot_ of both
>>> potential and real solutions, I found 4099 candidates (i.e. just over
>>> one per starting byte!) and 121 after filtering.
>>>
>>> The filter process here took a _lot_ of time though: 8M cycles!
>>>
>>> This means that it needed 2000 cycles/input byte. :-(
>>>
>>> This simply shows that the problem space (and therefore required
>>> algorithm as well) almost certainly won't contain totally random data,
>>> and this will most probably help my code...
>>
>> The data is in fact not very random, but you'll be hard pressed to
>> figure out what it represents. Here is the current full set, represented
>> by 66
>>
>> abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-/*
>>
>> printable chars:
>>

<SNIP>

>> bmaoooooobobbbbbooo (2329 characters)
>>
>> The top-3 account for 58.3 % of all values, the top-10 for 85.2%.
>
> Here's my results, with the longer result strings truncated:
> len aperture from to value
> 3 3 2291 2293 a3o

<SNIP>

> 66 1950 293 2242
> qlllnhnnnnrhblkaabpbbbbbllknnnnnbbaaaaaabbbbnkaaalbllllhlnnnnsinnkkkkh

A very quick look at the results tells me they are OK. My results
perform one other, minor massage of the output by removing duplicates,
so to take an example

15 33  2117 2149 IjLjoC7jjbmoaoooiooEEE0000000G8bF

would come out as

15 33  2117 2149 IjLoC7bmaiE0G8F

Or even more correct, as the 4 character strings indexed by the values
of these characters. Nit-picking...

> Time to scan: 2285008 (1200 potential solutions)
> Time to filter: 2459754 (253 filtered solutions)
>
> Does this seem reasonable?
>
> The total solution time is about 2.2 ms...

I think my very first Pascal (Turbo Pascal 3.01a) solution using a
sliding window without any optimizations ran for around 26 *minutes*
(OK, on a 16MHz 386) on a far smaller subset (less than half) of this
data, so I think 2.2 ms is pretty reasonable. ;)

Still cutting the Pascal - the PL/I would be easier, but PL/I compilers
are a bit thin on the ground...

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```On 2010-05-26 03:48, Dick Wesseling wrote:
> In article<862qk6Fe5gU1@mid.individual.net>,
> 	Robert AH Prins<spamtrap@prino.org>  writes:
>
>> Group 61 - the x-marked 3-string (FEG) is not found by the Paul Green code:
>
> What a relief, mine works:
>
> 5641 cycles 6 solutions
> AABCDEFEGGGGGGGGGGHICDAAAAA
>   len  win  pos
>     9   19    1 ABCDEFEGGGGGGGGGGHI
>     8   17    6 FEGGGGGGGGGGHICDA
>     7    8    1 ABCDEFEG
>     6    6    1 ABCDEF
>     3    3    6 FEG<there it is!
>     6    6   17 GHICDA
>
>
> 5630 cycles 6 solutions
> AAAAABBCCDEFGHHHHIJJJ
>   len  win  pos
>    10   15    4 ABBCCDEFGHHHHIJ
>     9   13    6 BCCDEFGHHHHIJ
>     8   10    4 ABBCCDEFGH
>     7    8    6 BCCDEFGH
>     6    6    8 CDEFGH
>     3    3   16 HIJ<there it is!
>
>> Will extract the code from the Pascal program, and get it working RSN,
>> bit tired from travel, it takes as long to get from Vilnius to Charleroi
>> as from Charleroi to Oostende.
>
> If it isn't asking too much, once you've taken some rest would you mind
> posting one of the longer (2K or so) data sets?
> Earlier I had to disappoint the readers of clax by saying that it wasn't
> worth bothering about an asm version because it outperformed the C
> version by a small margin only. However, that was for random data.
> Random data suffocates branch prediction causing _any_ version to come
> to a grinding halt (*).

I'm afraid the 2329 set of input data, built up over 29 years, is all
that I have at the moment. It's likely to grow a bit more in the coming
years, at a rate of 40 to 80 entries per year (in 2008 there were 138
new values, but only 5 of them had never been seen, in 2009 there were
56 with 3 previously unseen ones, and so far this year there have been
67 new entries, but all of them have occurred before) Most of the growth
will come from the top-10 values with possibly minor additions from the
values in the remainder of the top-20.

The amount of data from other users is likely to be rather a lot smaller
and more likely to contain far fewer unique values, I'll see if I can
get any.

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```In article <864cvsFor7U1@mid.individual.net>,
Robert AH Prins <spamtrap@prino.org> writes:
>
[snip]
>> Time to scan: 2285008 (1200 potential solutions)
>> Time to filter: 2459754 (253 filtered solutions)
>>
>> Does this seem reasonable?
>>
>> The total solution time is about 2.2 ms...
>
> I think my very first Pascal (Turbo Pascal 3.01a) solution using a
> sliding window without any optimizations ran for around 26 *minutes*
> (OK, on a 16MHz 386) on a far smaller subset (less than half) of this
> data, so I think 2.2 ms is pretty reasonable. ;)
>
> Still cutting the Pascal - the PL/I would be easier, but PL/I compilers
> are a bit thin on the ground...
>

My results:

2165682 cycles 253 solutions.

Using 2K of random data instead takes 6 times longer.
```
 0

```Peter Flass <Peter_Flass@yahoo.com> wrote:
> James J. Weinkam wrote:
(snip)

>> With the F and Optimizing compilers, each task had its own allocation
>> stack, emanating from the pseudo register vector, for each controlled
>> variable known within that task. It is not possible for one task to
>> refer directly to a generation of a controled variable allocated in
>> another task. Any sharing of controlled data had to be done by parameter
>> passing or pointers, and any needed synchronization or mutual exclusion
>> was the responsibility of the programmer. If no sharing is taking place

> That's interesting.  The manual describes this behavior, but gives no
> idea that it's just an unfortunate artifact of the implementation.  I
> was thinking it was intended, for some reason or other.  Thanks.

It could probably be argued either way.  Since PL/I, and the F compiler,
were designed before much experience with multitasking systems, it
might not have been obvious which way it should work.

keys might stop you from accessing something, but it is addressable.

It seems to me that the definition of threads is such that they
are allowed to run in different address spaces.

-- glen
```
 0
Reply gah (12856) 5/26/2010 5:14:32 PM

```LS0tLS1CRUdJTiBQR1AgU0lHTkVEIE1FU1NBR0UtLS0tLQ0KSGFzaDogU0hBMQ0KDQpPbiBXZWQs
IDI2IE1heSAyMDEwIDE3OjE0OjMyICswMDAwIChVVEMpLCBnbGVuIGhlcnJtYW5uc2ZlbGR0IHdy
b3RlDQphYm91dCBSZTogQUxMT0NBVEUgdnMuIEJFR0lOOg0KDQpbc25pcF0NCj5JdCBjb3VsZCBw
cm9iYWJseSBiZSBhcmd1ZWQgZWl0aGVyIHdheS4gIFNpbmNlIFBML0ksIGFuZCB0aGUgRg0KPmNv
bXBpbGVyLCB3ZXJlIGRlc2lnbmVkIGJlZm9yZSBtdWNoIGV4cGVyaWVuY2Ugd2l0aCBtdWx0aXRh
c2tpbmcNCj5zeXN0ZW1zLCBpdCBtaWdodCBub3QgaGF2ZSBiZWVuIG9idmlvdXMgd2hpY2ggd2F5
IGl0IHNob3VsZCB3b3JrLg0KPg0KPkV2ZW4gbW9yZSwgT1MvMzYwIGhhcyBvbmUgYWRkcmVzcyBz
cGFjZSBmb3IgYWxsIHRhc2tzLiAgU3RvcmFnZQ0KPmtleXMgbWlnaHQgc3RvcCB5b3UgZnJvbSBh
Y2Nlc3Npbmcgc29tZXRoaW5nLCBidXQgaXQgaXMgYWRkcmVzc2FibGUuDQoNCk9TLzM2MCBoYWQg
bXVsdGktdGFza2luZywgYXQgbGVhc3QgaW4gaXRzIE1GVCBhbmQgTVZUIGluY2FybmF0aW9uczsN
CnRoZXNlIGJlY2FtZSBPUy9WUzEgYW5kIE9TL1ZTMiwgcmVzcGVjdGl2ZWx5LCBvbmNlIHBhZ2lu
ZyBhbmQgYSBmZXcNCm90aGVyIGtuaWNrLWtuYWNrcyB3ZXJlIGFkZGVkLg0KDQo+SXQgc2VlbXMg
dG8gbWUgdGhhdCB0aGUgZGVmaW5pdGlvbiBvZiB0aHJlYWRzIGlzIHN1Y2ggdGhhdCB0aGV5DQo+
cnVuIGluIHRoZSBzYW1lIGFkZHJlc3Mgc3BhY2UsDQoNClRoaXMgaXMgdHJ1ZS4NCg0KPndoaWNo
IHRhc2tzIChub3QgY291bnRpbmcgT1MvMzYwKQ0KPmFyZSBhbGxvd2VkIHRvIHJ1biBpbiBkaWZm
ZXJlbnQgYWRkcmVzcyBzcGFjZXMuDQoNClRoaXMgaXMgbm90Lg0KDQpVbnByaXZpbGVnZWQgY29k
ZSB0aGF0IHVzZXMgdGhlIEFUVEFDSCBzdXBlcnZpc29yIGNhbGwgY3JlYXRlcyBhIG5ldw0Kc3Vi
LXRhc2sgaW4gdGhlIGN1cnJlbnQgYWRkcmVzcyBzcGFjZS4gIE1vc3QgcHJpdmlsZWdlZCBjb2Rl
IChlLmcuDQpDSUNTLCBJTVMpIGRvZXMgdGhlIHNhbWUuIEhvd2V2ZXIsIHRoZXJlIGlzIGFuIG9i
c2N1cmUgcGFyYW1ldGVyaXphdGlvbg0Kb2YgQVRUQUNIIHRoYXQgSkVTWzIzXSB1c2VzIHRvIHN0
YXJ0IGluaXRpYXRvcnMgaW4gbmV3IGFkZHJlc3Mgc3BhY2VzLA0KYnV0IGFsbCBvdGhlciBBVFRB
Q0hlcyByZW1haW4gd2l0aGluIHRoZSBhZGRyZXNzIHNwYWNlIGZyb20gd2hpY2ggdGhleQ0Kd2Vy
ZSBpc3N1ZWQuDQotIC0tIA0KUmVnYXJkcywNCg0KRGF2ZSAgW1JMVSAjMzE0NDY1XQ0KPT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT0NCmR3bm9vbkBzcGFtdHJhcC5udGx3b3JsZC5jb20gKERhdmlkIFcgTm9vbikNClJl
bW92ZSBzcGFtIHRyYXAgdG8gcmVwbHkgYnkgZS1tYWlsLg0KPT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0NCi0tLS0t
QkVHSU4gUEdQIFNJR05BVFVSRS0tLS0tDQpWZXJzaW9uOiBHbnVQRyB2Mi4wLjE0IChHTlUvTGlu
dXgpDQoNCmlFWUVBUkVDQUFZRkFrdjllREFBQ2drUTlNcWFVSlF3Mk1uMm9nQ2ZmUkNDaHp1OGRt
ZlZHdUhGM1VhT3dFU1YNCkZPMEFuUlpkOXNyS21pa1NVM2ppS2laV0FKR1pBbE40DQo9SEF6Vg0K
LS0tLS1FTkQgUEdQIFNJR05BVFVSRS0tLS0tDQo=

```
 0
Reply dwnoon (51) 5/26/2010 7:36:09 PM

```David W Noon <dwnoon@spamtrap.ntlworld.com> wrote:
(snip, I wrote)

>>It seems to me that the definition of threads is such that they
>>run in the same address space,

> This is true.

>>are allowed to run in different address spaces.

> This is not.

> Unprivileged code that uses the ATTACH supervisor call creates a new

I was thinking of it in the general computer sense, as, for example,
in the wikipedia entry under Computer_multitasking.

Of course IBM and OS/360 were and are (with successors) important
completely separate from them.

-- glen
```
 0
Reply gah (12856) 5/26/2010 8:11:30 PM

```glen herrmannsfeldt wrote:
> David W Noon <dwnoon@spamtrap.ntlworld.com> wrote:
> (snip, I wrote)
>
>>> It seems to me that the definition of threads is such that they
>>> run in the same address space,
>
>> This is true.
>
>>> which tasks (not counting OS/360)
>>> are allowed to run in different address spaces.
>
>> This is not.
>
>> Unprivileged code that uses the ATTACH supervisor call creates a new
>
> I was thinking of it in the general computer sense, as, for example,
> in the wikipedia entry under Computer_multitasking.
>
> Of course IBM and OS/360 were and are (with successors) important
> completely separate from them.
>

Wiki is probably screwed up.  Those would be "processes".  AFAIK, same
```
 0
Reply Peter_Flass (956) 5/26/2010 11:12:30 PM

```This message was originally posted on 2010/05/21 at 23:09 PDT.
It has not yet appeared on any of the newsgroups; numerous other messages in this thread have
appeared since then.

When I tried to repost it just now, a message box appeared saying "Sending of message failed. You
may only send to one newsgroup at a time."

Accordingly, I am now posting it to each group separately.

Dick Wesseling wrote:
> /*
> Here is the final version of my solution.
>
> Let:    N       number of cells, size of the input
>         V       number of values, size of symbol set.
>         M       the number of distinct values in the largest sequence.
>         T       the size of the largest n-in-n sequence, i.e. the
>                 largest sequence consisting of distinct values only.
>
> As explained in my previous message <4bf43d60\$0\$22939\$c5fe704e@news6.xs4all.nl>
> the algorithm uses two different search strategies:
>
> Plan A
>
> Find the largest sequence with size M
>
> for n from M downwards search n-in-m sequences, keeping only sequences
> with minimal value of m.
> Print sequence if n<m. (If n=m the same sequence will be found again
> in the next step).
>
> Now n=m=T.
>
> Plan B
> Prints all maximal n-in-n sequences.
>
> Plan A scans the input 1+(M-T) .. 2+(M-T) times, depending on the
> input position of the longest sequence.
> Plan B scans the input once.
>
> The running time is therefore bounded by O(N * (3+(M-T)) which
> is typically better than or equal to O(N*C).
> For small values of N the cost of erasing the frequency array should
> also be accounted for.
>
> Two arrays are used:
>     found[N]    Intermediate results of plan A.
>     freq[C]     Frequency count for all symbols in the sliding window
>
>
> Finding sequences is done using a straightforward sliding window
> scan of the input. The values in freq[] are updated when the window
> edges move and the numer of unique symbols is updated when a
> frequency count changes between zero and non-zero.
>
>
> The output is unsorted.
....
When I did my original solution a few days ago, I didn't even consider a moving window approach
because the OP had said that it's performance was terrible. He also mentioned complexities such as
having to restart the scan multiple times, etc.

Seeing your sliding window approach inspired me to give that approach a try.

I have come up with a streamlined version of your algorithm that I believe meets the OP's
specifications. All of us who have participated in discussing this problem have been bandying about
terms such as unique, distinct, minimal, shortest, without making the intended interpretation
completely clear. I had to do a bit of reading between the lines to come up with what I believe are
the OP's intended specifications, which is why I only say that I believe my solution meets his
specifications. I also inadvertently added to the confusion: by the time I wrote my first solution,
I did not have the original post in front of me; I remembered that the OP had referred to cells and
values so I used C for the number of cells and V for the number of values. Only later did I discover
that he had used N and T for these quantities. I have renamed some of the variables in my program to
make it consistent with the OP's notation.

Let me suggest the following definitions and nomenclature:

The data consist of a sequence of n cells stored in an array, x, with subscripts running from 1 to n.

Each cell contains one of a set of t values which are integers between 1 and t inclusive.

A subsequence of length l consists of l consecutive cells from the array x.

A k-subsequence is a subsequence of length l>=k which contains exactly k of the t possible values.

A minimal k-subsequence is one for which the removal of the first or last cell would reduce it to a
(k-1)-subsequence.

A shortest k-subsequence is a minimal k-subsequence whose length is less than of equal to that of
any other minimal k-subsequence in the array.

A shortest possible k-subsequence is a k-subsequence of length k.

If a minimal k-subsequence is not shortest possible, it contains at least one repeated value. Since
every minimal 1- or 2-subsequence is shortest possible shortest k-subsequences are only of interest
for k>=3. The array may but need not contain a shortest possible k-subsequence for 3<=k<=t.
If the array contains a shortest possible k-subsequence, it also contains shortest possible
j-subsequences for 3<=j<k.

Assuming that each of the t values appears in the array at least once, the array contains at least
one shortest k-subsequence for 1<=k<=t.

Within this framework, here is my understanding of the OP's requirements: "For 3<=k<=t, list all
shortest k-subsequences except, for k<t, omit any shortest possible k-subsequences which are
contained within a shortest possible (k+1)-subsequence."

A shortest possible k-subsequence, z, is not contained within a shortest possible k+1-subsequence
iff each of the immediately preceding and following cells (if it exists) contains a value already
present in z. I shall refer to this as the inclusion test. Note that the inclusion test always
passes for k=t.

In my new program, I have abandoned the use of characters for the external representation and
switched to integers between 1 and t. I have also changed the data representation from bin fixed(15)
to bin fixed(16) unsigned. This allows values of n and t up to 65534. Finally, I added timing of the
input, processing, and output.

To facilitate application of the inclusion test, the data array, x, has sentinel cells x(0) and
x(n+1) both set equal to the non existent value 0. The frequency table, f, has an extra entry f(0)
set equal to 1.

For each k, 3<=k<=t, a window delimited by the subscripts i on the left and j on the right is moved
across the entire data array. The width of the window is w=j-i+1. Initially, the window is at cell 1
(i=1) and is empty (w=0). The relationship w=j-i+1 gives j=0. Since no subsequence can be longer
than n, l, the length of the shortest minimal k-subsequence found so far is set to n+1; the number
of shortest minimal k-subsequences found so far, s, is set to 0; the frequency table, f, is
initialized to 0 except for f(0) which is set to 1 as previously explained; finally the number of
distinct values within the window, d, is set to 0.

The scan of the data consists of repeatedly finding the next minimal k-subsequence and deciding
whether or not to remember it. This is done as follows: first the window is increased on the right
one cell at a time until the number of distinct values is equal to k or the end of the data is
reached, in which case there are no more minimal k-subsequences so the scan is terminated.  At this
point, the number of instances of the value at position j is necessarily 1. Next values are removed
from the window on the left until the leftmost value has only 1 instance in the window. At the point
the window contains the leftmost minimal k-subsequence not already examined. Next the width of the
window, w, is computed and the disposition of the subsequence decided. If w>l, the subsequence is
skipped; if w<l, a shorter minimal k-subsequence has been found, so the previous set of
k-subsequences is thrown away and the subsequence becomes the first element in a new set if it
passes the inclusion test, otherwise the set is left empty; otherwise w=l, so the subsequence is
added to the set of remembered k-subsequences if it passes the inclusion test, which is only applied
if w=k.

Finally the first cell is deleted from the window leaving a not necessarily minimal
(k-1)-subsequence and the loop repeats.

Here is the PL/I source code:

%process mar(2,100) offset;
subsets: proc options(main) reorder;
dcl
(n,t,i,j,k,v,d,w,l,s) bin fixed(16) unsigned,
(x(0:n+1),a(n),f(0:t)) bin fixed(16) unsigned ctl,
(ti,to,tc init(0)) bin float(53),
vfmt entry(bin float(53),bin fixed(15),bin fixed(15)) returns(char(50) var),
uttime0 entry, uttime entry returns(bin float(53)),
sysin file input,
output file output stream;
call uttime0;
get file(sysin) list(n,t);
allocate x,a,f;
get file(sysin) list((x(i) do i=1 to n)); x(0),x(n+1)=0;
ti=uttime;
open file(output) title('/stdout:,type(crlf),recsize(32000)');
put file(output) edit('n: ',vfmt(n,10,0),', t: ',vfmt(t,10,0))(col(1),4 a);
put file(output) edit('Input: ',(vfmt(x(i),10,0) do i=1 to n))(col(1),a,(n)(a,x(1)));
to=uttime;
do k=3 to t; l=n+1; f(0)=1; do i=1 to t; f(i)=0; end; d=0; i=1; j=0; s=0;
scan: do forever;
do j=j+1 to n until(d=k); v=x(j); f(v)+=1; if f(v)=1 then d+=1; end; if d<k then leave scan;
v=x(i); do while(f(v)>1); f(v)-=1; i+=1; v=x(i); end; w=j-i+1;
if w>l then;
else if w<l then do; l=w; s=0;
if w>k then do; s=1; a(1)=i; end;
else if f(x(i-1))>0 then if f(x(j+1))>0 then do; s=1; a(1)=i; end;
end;
else do;
if w>k then do; s+=1; a(s)=i; end;
else if f(x(i-1))>0 then if f(x(j+1))>0 then do; s+=1; a(s)=i; end;
end;
f(x(i))=0; d-=1; i+=1;
end;
tc+=uttime;
if s>0 then do;
put file(output) edit('Distinct: ',vfmt(k,10,0),', length: ',vfmt(l,10,0))(col(1),4 a);
do j=1 to s; i=a(j);
put file(output) edit(j,': (',vfmt(i,10,0),') ',(vfmt(x(i+v),10,0) do v=0 to l-1))
(col(1),f(6),3 a,(l)(a,x(1)));
end;
end;
to+=uttime;
end;
put file(output) skip(2) edit('Elapsed time: input ',vfmt(ti,15,8),', processing ',
vfmt(tc,15,8),', output ',vfmt(to,15,8),'.')(col(1),7 a);
end subsets;

Here is the output from one small example of randomly generated data:

n: 64, t: 12
Input: 11 12 10 9 3 7 10 8 6 7 6 3 5 12 10 10 11 2 12 9 6 4 5 4 5 6 7 3 12 5 4 1 1 8 12 6 7 11 3 8 3
11 11 7 11 6 9 8 9 1 3 11 1 6 9 10 2 7 10 8 4 9 7 5
Distinct: 3, length: 3
1: (40) 8 3 11
Distinct: 4, length: 4
1: (7) 10 8 6 7
Distinct: 5, length: 5
1: (44) 7 11 6 9 8
2: (48) 8 9 1 3 11
Distinct: 6, length: 6
1: (1) 11 12 10 9 3 7
2: (4) 9 3 7 10 8 6
3: (10) 7 6 3 5 12 10
4: (24) 4 5 6 7 3 12
5: (35) 12 6 7 11 3 8
6: (57) 2 7 10 8 4 9
7: (59) 10 8 4 9 7 5
Distinct: 7, length: 7
1: (26) 6 7 3 12 5 4 1
2: (33) 1 8 12 6 7 11 3
Distinct: 8, length: 8
1: (16) 10 11 2 12 9 6 4 5
2: (51) 3 11 1 6 9 10 2 7
Distinct: 9, length: 10
1: (30) 5 4 1 1 8 12 6 7 11 3
2: (51) 3 11 1 6 9 10 2 7 10 8
3: (52) 11 1 6 9 10 2 7 10 8 4
Distinct: 10, length: 11
1: (51) 3 11 1 6 9 10 2 7 10 8 4
Distinct: 11, length: 14
1: (51) 3 11 1 6 9 10 2 7 10 8 4 9 7 5
Distinct: 12, length: 19
1: (16) 10 11 2 12 9 6 4 5 4 5 6 7 3 12 5 4 1 1 8

Elapsed time: input 0, processing 0, output 0.

The table below summarizes some features of the results of some larger examples, a couple of which
go way beyond the limits indicated by the OP.

n    t   i    c     o  max shortest possible  shortest minimal t-subsequence
64   12   0    0     0                   8(2)                           19(1)
4096  256 .01  .03   .19                  54(1)                         1181(1)
8192  250 .02  .06   .19                  54(1)                          994(1)
16384  256 .03  .13    .2                  64(2)                          981(1)
32000  250 .06  .21   .31                  64(1)                          931(1)
60000  800 .13 1.18  1.85                 117(1)                         4043(1)
65534 4096 .15 7.42 42.68                 220(1)                        29213(1)

All times are in seconds. The numbers in parentheses are the number of instances; the first example
has 2 shortest possible minimal 8-subsequences and the third example has 2 shortest possible minimal
64-subsequences.

The program is vary fast. The complexity is O(n*t). For n=4096 and t=256 which is the maximum size
contemplated by the OP, the computational portion of the program requires only 30ms.

```
 0

```This message was originally posted on 2010/05/21 at 23:09 PDT.
It has not yet appeared on any of the newsgroups; numerous other messages in this thread have
appeared since then.

When I tried to repost it just now, a message box appeared saying "Sending of message failed. You
may only send to one newsgroup at a time."

Accordingly, I am now posting it to each group separately.

Dick Wesseling wrote:
> /*
> Here is the final version of my solution.
>
> Let:    N       number of cells, size of the input
>         V       number of values, size of symbol set.
>         M       the number of distinct values in the largest sequence.
>         T       the size of the largest n-in-n sequence, i.e. the
>                 largest sequence consisting of distinct values only.
>
> As explained in my previous message <4bf43d60\$0\$22939\$c5fe704e@news6.xs4all.nl>
> the algorithm uses two different search strategies:
>
> Plan A
>
> Find the largest sequence with size M
>
> for n from M downwards search n-in-m sequences, keeping only sequences
> with minimal value of m.
> Print sequence if n<m. (If n=m the same sequence will be found again
> in the next step).
>
> Now n=m=T.
>
> Plan B
> Prints all maximal n-in-n sequences.
>
> Plan A scans the input 1+(M-T) .. 2+(M-T) times, depending on the
> input position of the longest sequence.
> Plan B scans the input once.
>
> The running time is therefore bounded by O(N * (3+(M-T)) which
> is typically better than or equal to O(N*C).
> For small values of N the cost of erasing the frequency array should
> also be accounted for.
>
> Two arrays are used:
>     found[N]    Intermediate results of plan A.
>     freq[C]     Frequency count for all symbols in the sliding window
>
>
> Finding sequences is done using a straightforward sliding window
> scan of the input. The values in freq[] are updated when the window
> edges move and the numer of unique symbols is updated when a
> frequency count changes between zero and non-zero.
>
>
> The output is unsorted.
....
When I did my original solution a few days ago, I didn't even consider a moving window approach
because the OP had said that it's performance was terrible. He also mentioned complexities such as
having to restart the scan multiple times, etc.

Seeing your sliding window approach inspired me to give that approach a try.

I have come up with a streamlined version of your algorithm that I believe meets the OP's
specifications. All of us who have participated in discussing this problem have been bandying about
terms such as unique, distinct, minimal, shortest, without making the intended interpretation
completely clear. I had to do a bit of reading between the lines to come up with what I believe are
the OP's intended specifications, which is why I only say that I believe my solution meets his
specifications. I also inadvertently added to the confusion: by the time I wrote my first solution,
I did not have the original post in front of me; I remembered that the OP had referred to cells and
values so I used C for the number of cells and V for the number of values. Only later did I discover
that he had used N and T for these quantities. I have renamed some of the variables in my program to
make it consistent with the OP's notation.

Let me suggest the following definitions and nomenclature:

The data consist of a sequence of n cells stored in an array, x, with subscripts running from 1 to n.

Each cell contains one of a set of t values which are integers between 1 and t inclusive.

A subsequence of length l consists of l consecutive cells from the array x.

A k-subsequence is a subsequence of length l>=k which contains exactly k of the t possible values.

A minimal k-subsequence is one for which the removal of the first or last cell would reduce it to a
(k-1)-subsequence.

A shortest k-subsequence is a minimal k-subsequence whose length is less than of equal to that of
any other minimal k-subsequence in the array.

A shortest possible k-subsequence is a k-subsequence of length k.

If a minimal k-subsequence is not shortest possible, it contains at least one repeated value. Since
every minimal 1- or 2-subsequence is shortest possible shortest k-subsequences are only of interest
for k>=3. The array may but need not contain a shortest possible k-subsequence for 3<=k<=t.
If the array contains a shortest possible k-subsequence, it also contains shortest possible
j-subsequences for 3<=j<k.

Assuming that each of the t values appears in the array at least once, the array contains at least
one shortest k-subsequence for 1<=k<=t.

Within this framework, here is my understanding of the OP's requirements: "For 3<=k<=t, list all
shortest k-subsequences except, for k<t, omit any shortest possible k-subsequences which are
contained within a shortest possible (k+1)-subsequence."

A shortest possible k-subsequence, z, is not contained within a shortest possible k+1-subsequence
iff each of the immediately preceding and following cells (if it exists) contains a value already
present in z. I shall refer to this as the inclusion test. Note that the inclusion test always
passes for k=t.

In my new program, I have abandoned the use of characters for the external representation and
switched to integers between 1 and t. I have also changed the data representation from bin fixed(15)
to bin fixed(16) unsigned. This allows values of n and t up to 65534. Finally, I added timing of the
input, processing, and output.

To facilitate application of the inclusion test, the data array, x, has sentinel cells x(0) and
x(n+1) both set equal to the non existent value 0. The frequency table, f, has an extra entry f(0)
set equal to 1.

For each k, 3<=k<=t, a window delimited by the subscripts i on the left and j on the right is moved
across the entire data array. The width of the window is w=j-i+1. Initially, the window is at cell 1
(i=1) and is empty (w=0). The relationship w=j-i+1 gives j=0. Since no subsequence can be longer
than n, l, the length of the shortest minimal k-subsequence found so far is set to n+1; the number
of shortest minimal k-subsequences found so far, s, is set to 0; the frequency table, f, is
initialized to 0 except for f(0) which is set to 1 as previously explained; finally the number of
distinct values within the window, d, is set to 0.

The scan of the data consists of repeatedly finding the next minimal k-subsequence and deciding
whether or not to remember it. This is done as follows: first the window is increased on the right
one cell at a time until the number of distinct values is equal to k or the end of the data is
reached, in which case there are no more minimal k-subsequences so the scan is terminated.  At this
point, the number of instances of the value at position j is necessarily 1. Next values are removed
from the window on the left until the leftmost value has only 1 instance in the window. At the point
the window contains the leftmost minimal k-subsequence not already examined. Next the width of the
window, w, is computed and the disposition of the subsequence decided. If w>l, the subsequence is
skipped; if w<l, a shorter minimal k-subsequence has been found, so the previous set of
k-subsequences is thrown away and the subsequence becomes the first element in a new set if it
passes the inclusion test, otherwise the set is left empty; otherwise w=l, so the subsequence is
added to the set of remembered k-subsequences if it passes the inclusion test, which is only applied
if w=k.

Finally the first cell is deleted from the window leaving a not necessarily minimal
(k-1)-subsequence and the loop repeats.

Here is the PL/I source code:

%process mar(2,100) offset;
subsets: proc options(main) reorder;
dcl
(n,t,i,j,k,v,d,w,l,s) bin fixed(16) unsigned,
(x(0:n+1),a(n),f(0:t)) bin fixed(16) unsigned ctl,
(ti,to,tc init(0)) bin float(53),
vfmt entry(bin float(53),bin fixed(15),bin fixed(15)) returns(char(50) var),
uttime0 entry, uttime entry returns(bin float(53)),
sysin file input,
output file output stream;
call uttime0;
get file(sysin) list(n,t);
allocate x,a,f;
get file(sysin) list((x(i) do i=1 to n)); x(0),x(n+1)=0;
ti=uttime;
open file(output) title('/stdout:,type(crlf),recsize(32000)');
put file(output) edit('n: ',vfmt(n,10,0),', t: ',vfmt(t,10,0))(col(1),4 a);
put file(output) edit('Input: ',(vfmt(x(i),10,0) do i=1 to n))(col(1),a,(n)(a,x(1)));
to=uttime;
do k=3 to t; l=n+1; f(0)=1; do i=1 to t; f(i)=0; end; d=0; i=1; j=0; s=0;
scan: do forever;
do j=j+1 to n until(d=k); v=x(j); f(v)+=1; if f(v)=1 then d+=1; end; if d<k then leave scan;
v=x(i); do while(f(v)>1); f(v)-=1; i+=1; v=x(i); end; w=j-i+1;
if w>l then;
else if w<l then do; l=w; s=0;
if w>k then do; s=1; a(1)=i; end;
else if f(x(i-1))>0 then if f(x(j+1))>0 then do; s=1; a(1)=i; end;
end;
else do;
if w>k then do; s+=1; a(s)=i; end;
else if f(x(i-1))>0 then if f(x(j+1))>0 then do; s+=1; a(s)=i; end;
end;
f(x(i))=0; d-=1; i+=1;
end;
tc+=uttime;
if s>0 then do;
put file(output) edit('Distinct: ',vfmt(k,10,0),', length: ',vfmt(l,10,0))(col(1),4 a);
do j=1 to s; i=a(j);
put file(output) edit(j,': (',vfmt(i,10,0),') ',(vfmt(x(i+v),10,0) do v=0 to l-1))
(col(1),f(6),3 a,(l)(a,x(1)));
end;
end;
to+=uttime;
end;
put file(output) skip(2) edit('Elapsed time: input ',vfmt(ti,15,8),', processing ',
vfmt(tc,15,8),', output ',vfmt(to,15,8),'.')(col(1),7 a);
end subsets;

Here is the output from one small example of randomly generated data:

n: 64, t: 12
Input: 11 12 10 9 3 7 10 8 6 7 6 3 5 12 10 10 11 2 12 9 6 4 5 4 5 6 7 3 12 5 4 1 1 8 12 6 7 11 3 8 3
11 11 7 11 6 9 8 9 1 3 11 1 6 9 10 2 7 10 8 4 9 7 5
Distinct: 3, length: 3
1: (40) 8 3 11
Distinct: 4, length: 4
1: (7) 10 8 6 7
Distinct: 5, length: 5
1: (44) 7 11 6 9 8
2: (48) 8 9 1 3 11
Distinct: 6, length: 6
1: (1) 11 12 10 9 3 7
2: (4) 9 3 7 10 8 6
3: (10) 7 6 3 5 12 10
4: (24) 4 5 6 7 3 12
5: (35) 12 6 7 11 3 8
6: (57) 2 7 10 8 4 9
7: (59) 10 8 4 9 7 5
Distinct: 7, length: 7
1: (26) 6 7 3 12 5 4 1
2: (33) 1 8 12 6 7 11 3
Distinct: 8, length: 8
1: (16) 10 11 2 12 9 6 4 5
2: (51) 3 11 1 6 9 10 2 7
Distinct: 9, length: 10
1: (30) 5 4 1 1 8 12 6 7 11 3
2: (51) 3 11 1 6 9 10 2 7 10 8
3: (52) 11 1 6 9 10 2 7 10 8 4
Distinct: 10, length: 11
1: (51) 3 11 1 6 9 10 2 7 10 8 4
Distinct: 11, length: 14
1: (51) 3 11 1 6 9 10 2 7 10 8 4 9 7 5
Distinct: 12, length: 19
1: (16) 10 11 2 12 9 6 4 5 4 5 6 7 3 12 5 4 1 1 8

Elapsed time: input 0, processing 0, output 0.

The table below summarizes some features of the results of some larger examples, a couple of which
go way beyond the limits indicated by the OP.

n    t   i    c     o  max shortest possible  shortest minimal t-subsequence
64   12   0    0     0                   8(2)                           19(1)
4096  256 .01  .03   .19                  54(1)                         1181(1)
8192  250 .02  .06   .19                  54(1)                          994(1)
16384  256 .03  .13    .2                  64(2)                          981(1)
32000  250 .06  .21   .31                  64(1)                          931(1)
60000  800 .13 1.18  1.85                 117(1)                         4043(1)
65534 4096 .15 7.42 42.68                 220(1)                        29213(1)

All times are in seconds. The numbers in parentheses are the number of instances; the first example
has 2 shortest possible minimal 8-subsequences and the third example has 2 shortest possible minimal
64-subsequences.

The program is vary fast. The complexity is O(n*t). For n=4096 and t=256 which is the maximum size
contemplated by the OP, the computational portion of the program requires only 30ms.

```
 0
Reply jjw (608) 5/27/2010 12:49:38 AM

```On 2010-05-26 06:45:31 -0400, Peter Flass said:

> James J. Weinkam wrote:
>> Peter Flass wrote:
>> ...
>>>
>>> OTOH, although I haven't looked at this in detail, CONTROLLED storage
>>> has interactions with Tasking (maybe not with the Enterprise compilers)
>>> that adds quite another level of overhead.  It makes things much less
>>> simple than just push and pop.
>>>
>> With the F and Optimizing compilers, each task had its own allocation
>> stack, emanating from the pseudo register vector, for each controlled
>> variable known within that task. It is not possible for one task to
>> refer directly to a generation of a controled variable allocated in
>> another task. Any sharing of controlled data had to be done by
>> parameter passing or pointers, and any needed synchronization or mutual
>> exclusion was the responsibility of the programmer. If no sharing is
>
> That's interesting.  The manual describes this behavior, but gives no
> idea that it's just an unfortunate artifact of the implementation.  I
> was thinking it was intended, for some reason or other.  Thanks.

As far as I can recall, and as far as the available information on
BitSavers indicates, it isn't an implementation artifact. It is simply
the way the old PL/I tasking features were designed, and was in place
earlier than July, 1966. Compared to the contortions that both the F
and Optimizing runtimes had to go through to make EVENT variables work
as defined, it's simplicity itself.

replaced it with something completely new.

--
John W Kennedy
"But now is a new thing which is very old--
that the rich make themselves richer and not poorer,
which is the true Gospel, for the poor's sake."
-- Charles Williams.  "Judgement at Chelmsford"

```
 0
Reply jwkenne (1440) 5/27/2010 1:13:23 AM

```On 2010-05-26 19:12:30 -0400, Peter Flass said:

> glen herrmannsfeldt wrote:
>> David W Noon <dwnoon@spamtrap.ntlworld.com> wrote:
>> (snip, I wrote)
>>
>>>> It seems to me that the definition of threads is such that they
>>>> run in the same address space,
>>
>>> This is true.
>>
>>>> which tasks (not counting OS/360)
>>>> are allowed to run in different address spaces.
>>
>>> This is not.
>>
>>> Unprivileged code that uses the ATTACH supervisor call creates a new
>>
>> I was thinking of it in the general computer sense, as, for example,
>> in the wikipedia entry under Computer_multitasking.
>>
>> Of course IBM and OS/360 were and are (with successors) important
>> completely separate from them.
>>
>
> Wiki is probably screwed up.  Those would be "processes".  AFAIK, same
> spaces."

OS/360 most certainly did not call anything "address spaces". Prior to
TSO, the term was "job step". However, a job step is not a unit of
the very terms MFT and MVT use "task" in this way.)

You can't really establish any decent isomorphisms here. The OS/360
concept set is wholly different.

And old PL/I confused the issue further. PL/I defined a tree of mother
and daughter tasks, but in the underlying implementation all the
mothers and daughters were represented by first-generation daughters of

--
John W Kennedy
"There are those who argue that everything breaks even in this old dump
of a world of ours. I suppose these ginks who argue that way hold that
because the rich man gets ice in the summer and the poor man gets it in
the winter things are breaking even for both. Maybe so, but I'll swear
I can't see it that way."
-- The last words of Bat Masterson

```
 0
Reply jwkenne (1440) 5/27/2010 1:28:54 AM

```John W Kennedy <jwkenne@attglobal.net> wrote:
(snip)

> As far as I can recall, and as far as the available information on
> BitSavers indicates, it isn't an implementation artifact. It is simply
> the way the old PL/I tasking features were designed, and was in place
> earlier than July, 1966. Compared to the contortions that both the F
> and Optimizing runtimes had to go through to make EVENT variables work
> as defined, it's simplicity itself.

I always thought that there was some connection between EVENT
variables and the OS/360 ECB (event contol block) such as
used by the WAIT macro.

-- glen
```
 0
Reply gah (12856) 5/27/2010 1:34:30 AM

```James J. Weinkam wrote:
> This message was originally posted on 2010/05/21 at 23:09 PDT.
> It has not yet appeared on any of the newsgroups; numerous other
> messages in this thread have
> appeared since then.
>
> When I tried to repost it just now, a message box appeared saying
> "Sending of message failed. You
> may only send to one newsgroup at a time."

Possibly a problem at xs4all(???). I can guarantee that neither this
messages, nor the message you reported earlier, comes from our
moderation process.

You can see "pending" articles in the clax submission inbox:

http://clax.inspiretomorrow.net/msgs.php

If your posting doesn't appear there, we didn't get it, and there isn't
much we can do about that. It shouldn't stay there long. It is not our
usual policy to discuss, in the group, who's on our "whitelist" and
who's not, but your post "should" be auto-approved within 15 minutes, IF
all goes well. It's an imperfect world.

> Accordingly, I am now posting it to each group separately.

Unfortunate you had to do that. Some people have a "colorful" opinion of
the practice. I'm glad you did it, though... better than not posting at
all!!!

Best,
Frank

```
 0

```James J. Weinkam wrote:
> Here is the PL/I source code:
>
> %process mar(2,100) offset;
> subsets: proc options(main) reorder;

This source code really needs some indenting and white space!

As it is it is almost unreadable. :-(

> The program is vary fast. The complexity is O(n*t). For n=4096 and t=256

Since my approach terminates each scan on the first collision, it is
O(n*sqrt(t)), while keeping the working set very low.

> which is the maximum size
> contemplated by the OP, the computational portion of the program
> requires only 30ms.

You need to run your program on all of the same data sets given by the
OP, verifying both the results and the runtime!

As I wrote yesterday, my current program has done the 2300 char set in
less than a ms total, both scanning and sorting/filtering.

OTOH, all of the competing solutions we've come up with are _far_ faster
than what could be needed. :-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```Terje Mathisen wrote:
> As I wrote yesterday, my current program has done the 2300 char set in
> less than a ms total, both scanning and sorting/filtering.

I have to reply to my own post here:

I did a few more micro-optimizations, streamlining the filtering process:

64   1879   364 2242
pblcdgaftknuhorijvwxyzABCDeEFGHIJKLMNOPQRSTUmVWXYZ0123456789+-/*
64   1879   354 2232
sinkhpblcdgaftuorjvwxyzABCDeEFGHIJKLMNOPQRSTUmVWXYZ0123456789+-/
65   1889   354 2242
sinkhpblcdgaftuorjvwxyzABCDeEFGHIJKLMNOPQRSTUmVWXYZ0123456789+-/*
66   1950   293 2242
qlnhrbkapsicdgftuojvwxyzABCDeEFGHIJKLMNOPQRSTUmVWXYZ0123456789+-/*
Time to scan:       768218 (1200 potential solutions)
Time to filter:     707784 (253 filtered solutions)

I.e. 1.47M cycles total, or about 0.67ms.

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```John W Kennedy wrote:
> On 2010-05-26 19:12:30 -0400, Peter Flass said:
>
>> glen herrmannsfeldt wrote:
>>> David W Noon <dwnoon@spamtrap.ntlworld.com> wrote:
>>> (snip, I wrote)
>>>
>>>>> It seems to me that the definition of threads is such that they
>>>>> run in the same address space,
>>>
>>>> This is true.
>>>
>>>>> which tasks (not counting OS/360)
>>>>> are allowed to run in different address spaces.
>>>
>>>> This is not.
>>>
>>>> Unprivileged code that uses the ATTACH supervisor call creates a new
>>>
>>> I was thinking of it in the general computer sense, as, for example,
>>> in the wikipedia entry under Computer_multitasking.
>>>
>>> Of course IBM and OS/360 were and are (with successors) important
>>> completely separate from them.
>>>
>>
>> Wiki is probably screwed up.  Those would be "processes".  AFAIK, same
>> spaces."
>
> OS/360 most certainly did not call anything "address spaces". Prior to
> TSO, the term was "job step".

Right, or "region" or "partition" depending on which flaor of OS you
were running.

However, a job step is not a unit of
> tasking; from that viewpoint, the word "task" was used. (Consider that
> the very terms MFT and MVT use "task" in this way.)
>
> You can't really establish any decent isomorphisms here. The OS/360
> concept set is wholly different.
>
> And old PL/I confused the issue further. PL/I defined a tree of mother
> and daughter tasks, but in the underlying implementation all the mothers
> and daughters were represented by first-generation daughters of a
>
```
 0
Reply Peter_Flass (956) 5/27/2010 10:45:50 AM

```glen herrmannsfeldt wrote:
> John W Kennedy <jwkenne@attglobal.net> wrote:
> (snip)
>
>> As far as I can recall, and as far as the available information on
>> BitSavers indicates, it isn't an implementation artifact. It is simply
>> the way the old PL/I tasking features were designed, and was in place
>> earlier than July, 1966. Compared to the contortions that both the F
>> and Optimizing runtimes had to go through to make EVENT variables work
>> as defined, it's simplicity itself.
>
> I always thought that there was some connection between EVENT
> variables and the OS/360 ECB (event contol block) such as
> used by the WAIT macro.
>

Yes, I did stuff in assembler that posted ECBs that were a PL/I EVENT
```
 0
Reply Peter_Flass (956) 5/27/2010 10:47:17 AM

```In <htk9su\$1ji\$1@news.eternal-september.org>, on 05/26/2010
at 07:12 PM, Peter Flass <Peter_Flass@Yahoo.com> said:

>Wiki is probably screwed up.

Most Wiki editors don't know History. What we now call threads used to
be called light weight processes (LWP).

>AFAIK, same

MVS can start a new Unix process in the same address space.

1. See above.

came along.

--
Shmuel (Seymour J.) Metz, SysProg and JOAT  <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action.  I reserve the
right to publicly post or ridicule any abusive E-mail.  Reply to
domain Patriot dot net user shmuel+news to contact me.  Do not

```
 0
Reply spamtrap16 (3722) 5/27/2010 10:55:54 AM

```In <4bfdcac8\$0\$31283\$607ed4bc@cv.net>, on 05/26/2010
at 09:28 PM, John W Kennedy <jwkenne@attglobal.net> said:

>OS/360 most certainly did not call anything "address spaces". Prior
>to  TSO, the term was "job step".

No, "partition" or "region", depending on which option you were
running. An address space does not terminate when a job step ends, or
even when a batch job ends.

--
Shmuel (Seymour J.) Metz, SysProg and JOAT  <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action.  I reserve the
right to publicly post or ridicule any abusive E-mail.  Reply to
domain Patriot dot net user shmuel+news to contact me.  Do not

```
 0
Reply spamtrap16 (3722) 5/27/2010 10:59:42 AM

```On 2010-05-16 18:28, Robert AH Prins wrote:
> Hi all,
>
> Can anyone give me some hints as to solve the following problem,
> preferably in a way that is faster than the way I used to do it, and
> without the bug in the current version;

Follow-up to the very first post, with compilable source extracted from
the current Pascal program:

{
This was Paul Green's literal comment to the code:

The method I used was as follows. As I read the data file, I keep a
record of the last occurance of each string, in a list sorted by the
line numbers of the occurances. For each line, I read the string and
find the last occurance. Then I check each possible series ending on
the penultament line. Then I update the list and continue. Now let's see
if I can explain this.

The reason this works is because the shortest series, as you describe
it, would both start and end with a string that occurs only once in
the series. If you have a sequence like C A B C D E for example, C is
in it twice; you can shorten the sequence by dropping the C. So that
means you only have to check the length of series that begin with the
last previous occurance of some string. And we have a list of those
lines. In order to check that the string previous to that is in the
series, we check the next last occurance to make sure it is at least
two lines before the first string in the series. In order to make sure
the current line is in the series, you skip series that start after
the last occurance of the current line, because that string must be
included in the series. Finally, I put in one extra test. You don't
have to check a series that extends past the penultamate occurance of
the penultamate line, since that means the last element of the series is
in the series twice. Well that's clear as mud. Does it sound like the
right method though?

My remarks:

Paul's code was fitted into the existing program that used a linked
list due to the unknown amount of records being read in. To simulate
the reading in, an array is allocated and that is later used in the
scanner. Going backwards through the array turned out to be necessary
to get the output in the correct order. Also his original routine
expected four character strings, for speed purposes these are cast into
longints, the real program uses longints and only at the stage where
the data is being printed, it uses these as index into the array with
the 4-character strings.

Below are the two sets of data that fail to return the 3-in-3 strings,
I've been tearing my hair out trying to find why, but without any
result, it's probably a case of Edgar Allan Poe's "The Purloined
Letter", with a solution so obvious that one misses it by staring at
the problem too long. (Wishful thinking?)

The two input files that fail are:

A   61  1
A   61  2
B   61  3
C   61  4
D   61  5
E   61  6
F   61  7
E   61  8
G   61  9
G   61 10
G   61 11
G   61 12
G   61 13
G   61 14
G   61 15
G   61 16
G   61 17
G   61 18
H   61 19
I   61 20
C   61 21
D   61 22
A   61 23
A   61 24
A   61 25
A   61 26
A   61 27

Result:

|    6 |    6 |    2 | A   B   C   D   E   F                   |
|      |      |   18 | G   H   I   C   D   A                   |
|    7 |    8 |    2 | A   B   C   D   E   F   G               |
|    8 |   17 |    7 | F   E   G   H   I   C   D   A           |
|    9 |   19 |    2 | A   B   C   D   E   F   G   H   I       |

Missing:

|    3 |    3 |    7 | F   E   G                               |

and

A   69  1
A   69  2
A   69  3
A   69  4
A   69  5
B   69  6
B   69  7
C   69  8
C   69  9
D   69 10
E   69 11
F   69 12
G   69 13
H   69 14
H   69 15
H   69 16
H   69 17
I   69 18
J   69 19
J   69 20
J   69 21

Result:

|    6 |    6 |    9 | C   D   E   F   G   H                   |
|    7 |    8 |    7 | B   C   D   E   F   G   H               |
|    8 |   10 |    5 | A   B   C   D   E   F   G   H           |
|    9 |   13 |    7 | B   C   D   E   F   G   H   I   J       |
|   10 |   15 |    5 | A   B   C   D   E   F   G   H   I   J   |

Missing:

|    3 |    3 |   17 | H   I   J                               |

}
type
tycona       = array [1..4] of char;
scanptr      = ^scan_list;
runptr       = ^run_list;

scan_list    = record                                     {s  16}
scan_nxt : scanptr;                       {   4}
id       : longint;                       {   4}
sca      : tycona;                        {   4}
fil      : array [1..4] of char;          {   4}
end;

spot_rec     = record                                     {s  16}
sca   : tycona;                           {   4}
atl   : longint;                          {   4}
id    : longint;                          {   4}
fil   : array [1..4] of char;             {   4}
end;

run_list     = record                                     {s1044}
run_nxt: runptr;                          {   4}
run_prv: runptr;                          {   4}
rlen   : longint;                         {   4}
scas   : longint;                         {   4}
id     : longint;                         {   4}
run    : array [1..256] of tycona;        {1024}
end;

run_data     = record                                     {s  16}
run_top: runptr;                          {   4}
run_end: runptr;                          {   4}
run_cnt: longint;                         {   4}
fil    : array [1..4] of char;            {   4}
end;

_sc_ptr      = ^_sc_tab;
_sc_tab      = array [1..65536 div sizeof(scanptr)] of scanptr;

_spot_ptr    = ^_spot_arr;
_spot_arr    = array [0..65536 div sizeof(spot_rec) - 1] of spot_rec;

_run_ptr     = ^_run_arr;
_run_arr     = array [0..65536 div sizeof(run_data) - 1] of run_data;

const sca_skel  : string[24]  = ' |      |      |      | ';
const run_ptr  : runptr    = nil;
const sc_tab   : _sc_ptr   = nil;
const spot_arr : _spot_ptr = nil;
const run_arr  : _run_ptr  = nil;
const _atline  : longint = 0;
const _lnls    : longint = 0;
const _lnls0   : longint = 0;
const _lsat    : longint = 0;
const _i       : longint = 0;
const _nct     : longint = 0;
const _nreq    : longint = 0;
const _scount  : longint = 0;
const
scan_ptr: scanptr = nil;
scan_top: scanptr = nil;
scan_end: scanptr = nil;
const
_line  : string[255]         = '';
_linex : array[1..4] of char = #0#0#0#0;

var w_tycona   : tycona;
var print      : string;
var scanin     : text;

begin
str(_l:4, print);
_line:= _line + print + ' | ';

procedure dispose_run(_i: longint);
var nxt_run: runptr;

begin
run_ptr:= run_arr^[_i].run_top;

while run_ptr <> nil do
begin
nxt_run:= run_ptr^.run_nxt;
dispose(run_ptr);
run_ptr:= nxt_run;
end;

run_arr^[_i].run_top:= nil;
run_arr^[_i].run_cnt:= 0;
end; {dispose_run}

procedure setup_run(_n: longint);
var _in: longint;

begin
new(run_ptr);
fillchar(run_ptr^.run_nxt, sizeof(run_ptr^), #0);
fillchar(run_ptr^.run    , sizeof(run_ptr^.run), ' ');

if run_arr^[_n].run_top = nil then
run_arr^[_n].run_top:= run_ptr
else
begin
run_arr^[_n].run_end^.run_nxt:= run_ptr;
run_ptr^.run_prv             := run_arr^[_n].run_end;
end;

run_arr^[_n].run_end:= run_ptr;

inc(run_arr^[_n].run_cnt);

run_ptr^.rlen := _atline - spot_arr^[_n].atl;
run_ptr^.scas := _n;
run_ptr^.id   := spot_arr^[1].id;

for _in:= 1 to _n do
run_ptr^.run[_in]:= spot_arr^[_in].sca;
end; {setup_run}

procedure check_runs;
var _j: longint;
var _k: longint;
var _l: longint;

begin
_k:= 3;
if _k < _lsat then
_k:= _lsat;

for _j:= _k to _nct do
begin
_l:= spot_arr^[_j].atl;

if _lnls0 <= _l then
if (_j = _nct) or
(_l > spot_arr^[_j + 1].atl + 1) then
begin
if (run_arr^[_j].run_top = nil) or
(run_arr^[_j].run_top^.rlen > _atline - _l) then
begin
dispose_run(_j);
setup_run(_j);
end
else
if (run_arr^[_j].run_top <> nil) and
(run_arr^[_j].run_top^.rlen = _atline - _l) then
setup_run(_j);
end;
end;
end; {check_run}

{
* Scanner:
*
* Part of the code in this procedure was originally written by Paul
* Green as a result of a posting in comp.lang.pascal.borland, asking
* for a faster method of performing scan.
}
procedure scanner;
var _i     : longint;
var _k     : longint;
var _id    : longint;

var scn_in : tycona;

begin;
_i:= (_scount + 2) * sizeof(run_data);
getmem(run_arr, _i);
fillchar(run_arr^, _i, #0);

_i:= (_scount + 2) * sizeof(spot_rec);
getmem(spot_arr, _i);
fillchar(spot_arr^, _i, #0);

writeln('Scanning');

for _i:= _scount downto 1 do
begin
move(sc_tab^[_i]^.sca[1], w_tycona, sizeof(w_tycona));
_id:= sc_tab^[_i]^.id;

inc(_atline);

_lsat:= 1;
while (_lsat             <= _nct) and
(longint(w_tycona) <> longint(spot_arr^[_lsat].sca)) do
inc(_lsat);

if _lsat > _nct then
_lsat:= 0;

_lnls:= 0;
if _lsat <> 0 then
begin
_lnls:= spot_arr^[_lsat].atl;

if _nct >= 3 then
check_runs;
end;

_lnls0:= _lnls;

if _lsat = 0 then
begin
inc(_nct);
_lsat:= _nct;
end;

if _lsat > 1 then
move(spot_arr^[1], spot_arr^[2], sizeof(spot_rec) * pred(_lsat));

spot_arr^[1].sca := w_tycona;
spot_arr^[1].id  := _id;
spot_arr^[1].atl := _atline;
end;

inc(_atline);
_lsat:= 1;
_lnls:= _atline + 1;
check_runs;

for _lnls:= 3 to _nct do
begin
run_ptr:= run_arr^[_lnls].run_end;

if (run_ptr <> nil) and
((_lnls = _nct)   or
(_lnls < _nct)   and
(run_arr^[succ(_lnls)].run_end <> nil) and
(run_ptr^.rlen < run_arr^[succ(_lnls)].run_end^.rlen)) then
begin
_line:= sca_skel;
_line[0]:= #3;

_line[0]:= #24;

while run_ptr <> nil do
begin
_k:= 1;

repeat
move(run_ptr^.run[_k], _line[25], 40);
inc(_line[0], 39);
_line:= _line + ' | ';
writeln(_line);

_line:= sca_skel;
inc(_k, 10);
until run_ptr^.run[_k][1] = ' ';

run_ptr:= run_ptr^.run_prv;

_line[0]:= #17;
if run_ptr <> nil then
_line[0]:= #24;
end;
end;

dispose_run(_lnls);
end;
end; {scanner}

begin
assign(scanin, 'scanin.txt');
reset (scanin);

_scount:= 0;
repeat
_line:= '    ';
new(scan_ptr);

if scan_top = nil then
scan_top:= scan_ptr
else
scan_end^.scan_nxt:= scan_ptr;

scan_end:= scan_ptr;

inc(_scount);
scan_ptr^.scan_nxt:= nil;
scan_ptr^.id:= _scount;
move(_line[1], scan_ptr^.sca, 4);
until eof(scanin);

getmem(sc_tab, _scount * sizeof(scan_ptr));

scan_ptr:= scan_top;
_i      := 0;

while (scan_ptr <> nil) do
begin
inc(_i);
sc_tab^[_i]:= scan_ptr;
scan_ptr   := scan_ptr^.scan_nxt;
end;

scanner;
end.

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```Robert AH Prins wrote:
> On 2010-05-16 18:28, Robert AH Prins wrote:
>> Hi all,
>>
>> Can anyone give me some hints as to solve the following problem,
>> preferably in a way that is faster than the way I used to do it, and
>> without the bug in the current version;
>
> Follow-up to the very first post, with compilable source extracted from
> the current Pascal program:
>
> {
> This was Paul Green's literal comment to the code:
>
> The method I used was as follows. As I read the data file, I keep a
> record of the last occurance of each string, in a list sorted by the
> line numbers of the occurances. For each line, I read the string and
> find the last occurance. Then I check each possible series ending on
> the penultament line. Then I update the list and continue. Now let's see
> if I can explain this.

Robert, there are two critical issues here:

a) Do you have to run this Pascal version of the program? I.e. is there
anything that stops you from simply switching to one of the two fast &
working versions we have written as a result of your OP?

b) How fast is this Pascal algorithm?

If it is significantly faster, then it might be worthwhile to figure out
where it fails and see if a fix is possible that doesn't make it too slow!

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```In article <kac1d7-22h.ln1@ntp.tmsw.no>,
Terje Mathisen <"terje.mathisen at tmsw.no"@giganews.com> writes:
> James J. Weinkam wrote:
>> Here is the PL/I source code:
>>
>> %process mar(2,100) offset;
>> subsets: proc options(main) reorder;
>
>
> This source code really needs some indenting and white space!
>
> As it is it is almost unreadable. :-(
>

So is your reply, unless you put comp.lang.pl1 back in the newsgroups
list.

>> The program is vary fast. The complexity is O(n*t). For n=4096 and t=256
>
> Since my approach terminates each scan on the first collision, it is
> O(n*sqrt(t)), while keeping the working set very low.
>
>> which is the maximum size
>> contemplated by the OP, the computational portion of the program
>> requires only 30ms.
>
> You need to run your program on all of the same data sets given by the
> OP, verifying both the results and the runtime!
>
> As I wrote yesterday, my current program has done the 2300 char set in
> less than a ms total, both scanning and sorting/filtering.
>
> OTOH, all of the competing solutions we've come up with are _far_ faster
> than what could be needed. :-)
>
> Terje
>
```
 0

```On 2010-05-27 16:48, Terje Mathisen wrote:
> Robert AH Prins wrote:
>> On 2010-05-16 18:28, Robert AH Prins wrote:
>>> Hi all,
>>>
>>> Can anyone give me some hints as to solve the following problem,
>>> preferably in a way that is faster than the way I used to do it, and
>>> without the bug in the current version;
>>
>> Follow-up to the very first post, with compilable source extracted from
>> the current Pascal program:
>>
>> {
>> This was Paul Green's literal comment to the code:
>>
>> The method I used was as follows. As I read the data file, I keep a
>> record of the last occurance of each string, in a list sorted by the
>> line numbers of the occurances. For each line, I read the string and
>> find the last occurance. Then I check each possible series ending on
>> the penultament line. Then I update the list and continue. Now let's see
>> if I can explain this.
>
> Robert, there are two critical issues here:
>
> a) Do you have to run this Pascal version of the program? I.e. is there
> anything that stops you from simply switching to one of the two fast &
> working versions we have written as a result of your OP?

No, I can (and quite likely will) change the code any way I like, but
the code will remain Pascal (and PL/I), as this routine is part of a
much larger program - this routine produces just one of the many tables
and the post-processing logic, a few more Pascal programs on the PC and
a bunch of ISPF edit macro's on z/OS rely on the current order of the
tables.

The "problem" is that I have to translate the two fast solutions to
Pascal and as far as yours is concerned... I've got some fairly basic
constructions above my rather modest knowledge of the language.

BTW, I see (in your 2010-05-24 06:09 posting) that you use a qsort. The
PG solution does not need any sorts... Or maybe your as yet unpublished
fastest version has also dispensed with it?

> b) How fast is this Pascal algorithm?
>
> If it is significantly faster, then it might be worthwhile to figure out
> where it fails and see if a fix is possible that doesn't make it too slow!

Just timed it via RDTSC. Actual scan time is around 2.7M cycles, and the
somewhat fancy printout adds around another 4M cycles. Source is
compiled with Virtual Pascal 2.1 build 279, and VP isn't really, to
express it politely, a very optimizing compiler, so I guess it's pretty
good - using the assembler output option of VP it would also be pretty
easy to convert it into in-line assembler, giving considerable scope for
optimizations.

Sadly it cannot handle two cases and although I'm currently out of work,
there is more to life than sitting behind a screen for up to 10 hours a
day, certainly now that Mrs Prins has also returned from Vilnius...

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```On 2010-05-26 21:34:30 -0400, glen herrmannsfeldt said:

> John W Kennedy <jwkenne@attglobal.net> wrote:
> (snip)
>
>> As far as I can recall, and as far as the available information on
>> BitSavers indicates, it isn't an implementation artifact. It is simply
>> the way the old PL/I tasking features were designed, and was in place
>> earlier than July, 1966. Compared to the contortions that both the F
>> and Optimizing runtimes had to go through to make EVENT variables work
>> as defined, it's simplicity itself.
>
> I always thought that there was some connection between EVENT
> variables and the OS/360 ECB (event contol block) such as
> used by the WAIT macro.

Oh, undoubtedly there is a /connection/. But nothing like a one-to-one
mapping. You see, there are two major problems:

Only one OS/360 task can wait on a given ECB, but any number of old
PL/I TASKs could wait on a given EVENT.

Furthermore, as soon as an EVENT went complete, /all/ the old PL/I
TASKs waiting on it would become dispatchable, even if the EVENT went
incomplete again before they could be dispatched.

The contortions the runtimes, both F and Optimizing, had to go through
to fix this functional mismatch were something extraordinary, and
rather expensive. I once tried using PL/I tasking to decouple PLISRTD's
E15 and E35 to produce an interface more resembling the COBOL SORT,
RELEASE, and RETURN verbs, but I found that the CPU overhead entailed
was unacceptable, because each operation involved several shoulder taps
back and forth among the PL/I TASKs and the supernumerary Control Task.

--
John W Kennedy
A proud member of the reality-based community.

```
 0
Reply jwkenne (1440) 5/28/2010 3:19:25 AM

```Robert AH Prins wrote:
> On 2010-05-27 16:48, Terje Mathisen wrote:
>> Robert, there are two critical issues here:
>>
>> a) Do you have to run this Pascal version of the program? I.e. is there
>> anything that stops you from simply switching to one of the two fast &
>> working versions we have written as a result of your OP?
>
> No, I can (and quite likely will) change the code any way I like, but
> the code will remain Pascal (and PL/I), as this routine is part of a
> much larger program - this routine produces just one of the many tables
> and the post-processing logic, a few more Pascal programs on the PC and
> a bunch of ISPF edit macro's on z/OS rely on the current order of the
> tables.

OK. I assume you for some reason cannot link in functions written in C
either?
>
> The "problem" is that I have to translate the two fast solutions to
> Pascal and as far as yours is concerned... I've got some fairly basic
> understanding of C, but even your source is already using language
> constructions above my rather modest knowledge of the language.

That's easy: There's no "funky" C at all in my code, it is really an asm
algorithm written in C shorthand.
>
> BTW, I see (in your 2010-05-24 06:09 posting) that you use a qsort. The
> PG solution does not need any sorts... Or maybe your as yet unpublished
> fastest version has also dispensed with it?

The qsort is actually slower than a tiny custom (sort) routine to get
the potential solutions into the order I need for the final filtering.

I have written many such routines in my Pascal days. :-)
>
>> b) How fast is this Pascal algorithm?
>>
>> If it is significantly faster, then it might be worthwhile to figure out
>> where it fails and see if a fix is possible that doesn't make it too
>> slow!
>
> Just timed it via RDTSC. Actual scan time is around 2.7M cycles, and the
> somewhat fancy printout adds around another 4M cycles. Source is

OK, so it is 4-10 times slower than my best (working) code, and it
doesn't actually work.

> compiled with Virtual Pascal 2.1 build 279, and VP isn't really, to
> express it politely, a very optimizing compiler, so I guess it's pretty
> good - using the assembler output option of VP it would also be pretty
> easy to convert it into in-line assembler, giving considerable scope for
> optimizations.

So maybe an inline asm of my C code would be the easiest? Then you would
just have to change the function definition. :-?

int scan(byte *data, int data_len)
{
/* esi -> data
ecx = data_len
*/
__asm {
mov esi,[data]
mov ecx,[data_len]
push ebp
lea ebp,[solution] // solutions = 0

add ecx,esi	// Points one beyond the end of data array
inc esi		// Skipping the first byte

cmp esi,ecx	// If we had less than 3 chars, then there
jae outer_done // cannot be any solutions!

outer_loop:		// Scanning from i=1 to data_len
mov bl,[esi]	// Current byte
lea edi,[esi-1]	// Start at prev byte
inc esi		// Get ready for next iteration

inner_loop:		// Scanning backwards from i-1 to 0
cmp bl,[edi]	// Same as byte we started with?
je found_equal	// Yes! (Not a new unique char, so abort)

// Increment count of unique values for this starting position:
mov edx,count[edi*4]
mov eax,esi
inc edx
sub eax,edi	// String length
mov count[edi*4],edx
cmp edx,3	// At least 3 unique values?
jb skip_too_short

cmp eax,shortest[edx*4]
ja not_shortest_string

// Save as potential solution:
mov shortest[edx*4],eax
mov [ebp.len],eax
mov [ebp.uniq],edx
mov [ebp.start],edi

not_shortest_string:
sub edi,1
jnc inner_loop

found_equal:
cmp esi,ecx
jb outer_loop
outer_done:
lea eax,[ebp - offset solution]
pop ebp
shr eax,4	// EAX has potential solution count
}
}

Better?

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```On 2010-05-28, Terje Mathisen <"terje.mathisen at tmsw.no"@giganews.com> wrote:
> So maybe an inline asm of my C code would be the easiest? Then you would
> just have to change the function definition. :-?

(and assumptions about calling conventions and other ABI stuff).

But it is more readable than PL/I, and that's what counts :-)

/me ducks for cover.

```
 0

```On 2010-05-28, Robert AH Prins <spamtrap@prino.org> wrote:
>> where it fails and see if a fix is possible that doesn't make it too slow!
>
> Just timed it via RDTSC. Actual scan time is around 2.7M cycles, and the
> somewhat fancy printout adds around another 4M cycles. Source is
> compiled with Virtual Pascal 2.1 build 279, and VP isn't really, to
> express it politely, a very optimizing compiler, so I guess it's pretty
> good - using the assembler output option of VP it would also be pretty
> easy to convert it into in-line assembler, giving considerable scope for
> optimizations.

Any chance to repeat the timing with Delphi or FPC? It would make a great
table.

```
 0

```On 2010-05-28 08:16, Terje Mathisen wrote:
> Robert AH Prins wrote:
>> On 2010-05-27 16:48, Terje Mathisen wrote:
>>> Robert, there are two critical issues here:
>>>
>>> a) Do you have to run this Pascal version of the program? I.e. is there
>>> anything that stops you from simply switching to one of the two fast &
>>> working versions we have written as a result of your OP?
>>
>> No, I can (and quite likely will) change the code any way I like, but
>> the code will remain Pascal (and PL/I), as this routine is part of a
>> much larger program - this routine produces just one of the many tables
>> and the post-processing logic, a few more Pascal programs on the PC and
>> a bunch of ISPF edit macro's on z/OS rely on the current order of the
>> tables.
>
> OK. I assume you for some reason cannot link in functions written in C
> either?
>>
>> The "problem" is that I have to translate the two fast solutions to
>> Pascal and as far as yours is concerned... I've got some fairly basic
>> understanding of C, but even your source is already using language
>> constructions above my rather modest knowledge of the language.
>
> That's easy: There's no "funky" C at all in my code, it is really an asm
> algorithm written in C shorthand.
>>
>> BTW, I see (in your 2010-05-24 06:09 posting) that you use a qsort. The
>> PG solution does not need any sorts... Or maybe your as yet unpublished
>> fastest version has also dispensed with it?
>
> The qsort is actually slower than a tiny custom (sort) routine to get
> the potential solutions into the order I need for the final filtering.
>
> I have written many such routines in my Pascal days. :-)
>>
>>> b) How fast is this Pascal algorithm?
>>>
>>> If it is significantly faster, then it might be worthwhile to figure out
>>> where it fails and see if a fix is possible that doesn't make it too
>>> slow!
>>
>> Just timed it via RDTSC. Actual scan time is around 2.7M cycles, and the
>> somewhat fancy printout adds around another 4M cycles. Source is
>
> OK, so it is 4-10 times slower than my best (working) code, and it
> doesn't actually work.
>
>
>
>> compiled with Virtual Pascal 2.1 build 279, and VP isn't really, to
>> express it politely, a very optimizing compiler, so I guess it's pretty
>> good - using the assembler output option of VP it would also be pretty
>> easy to convert it into in-line assembler, giving considerable scope for
>> optimizations.
>
> So maybe an inline asm of my C code would be the easiest? Then you would
> just have to change the function definition. :-?
>
> int scan(byte *data, int data_len)
> {
> /* esi -> data
> ecx = data_len
> */
> __asm {
> mov esi,[data]
> mov ecx,[data_len]
> push ebp
> lea ebp,[solution] // solutions = 0
>
> add ecx,esi // Points one beyond the end of data array
> inc esi // Skipping the first byte
>
> cmp esi,ecx // If we had less than 3 chars, then there
> jae outer_done // cannot be any solutions!
>
> outer_loop: // Scanning from i=1 to data_len
> mov bl,[esi] // Current byte
> lea edi,[esi-1] // Start at prev byte
> inc esi // Get ready for next iteration
>
> inner_loop: // Scanning backwards from i-1 to 0
> cmp bl,[edi] // Same as byte we started with?
> je found_equal // Yes! (Not a new unique char, so abort)
>
> // Increment count of unique values for this starting position:
> mov edx,count[edi*4]
> mov eax,esi
> inc edx
> sub eax,edi // String length
> mov count[edi*4],edx
> cmp edx,3 // At least 3 unique values?
> jb skip_too_short
>
> cmp eax,shortest[edx*4]
> ja not_shortest_string
>
> // Save as potential solution:
> mov shortest[edx*4],eax
> mov [ebp.len],eax
> mov [ebp.uniq],edx
> mov [ebp.start],edi
>
> not_shortest_string:
> sub edi,1
> jnc inner_loop
>
> found_equal:
> cmp esi,ecx
> jb outer_loop
> outer_done:
> lea eax,[ebp - offset solution]
> pop ebp
> shr eax,4 // EAX has potential solution count
> }
> }
>
> Better?

No! ;) (But thank you anyway!)

I should have done it myself.

Anyway, I need to get back to the C code, as I also need a PL/I version!
But before doing anything, I will spend one more day trying to find the
bug in the existing code.

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```On 2010-05-28 09:05, Marco van de Voort wrote:
> On 2010-05-28, Robert AH Prins<spamtrap@prino.org>  wrote:
>>> where it fails and see if a fix is possible that doesn't make it too slow!
>>
>> Just timed it via RDTSC. Actual scan time is around 2.7M cycles, and the
>> somewhat fancy printout adds around another 4M cycles. Source is
>> compiled with Virtual Pascal 2.1 build 279, and VP isn't really, to
>> express it politely, a very optimizing compiler, so I guess it's pretty
>> good - using the assembler output option of VP it would also be pretty
>> easy to convert it into in-line assembler, giving considerable scope for
>> optimizations.
>
> Any chance to repeat the timing with Delphi or FPC? It would make a great
> table.

I only have an uninstalled version of Turbo Delphi somewhere and after
having found a full blown PL/I compiler for Windows, equivalent to
V3.9.0 on z/OS, I'm not sure if I want to install FPC, when VP does, at
least at the moment, everything I need.

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```["Followup-To:" header set to comp.lang.pascal.borland.]
On 2010-05-28, Robert AH Prins <spamtrap@prino.org> wrote:
> On 2010-05-28 09:05, Marco van de Voort wrote:
>> On 2010-05-28, Robert AH Prins<spamtrap@prino.org>  wrote:
>>>> where it fails and see if a fix is possible that doesn't make it too slow!
>>>
>>> Just timed it via RDTSC. Actual scan time is around 2.7M cycles, and the
>>> somewhat fancy printout adds around another 4M cycles. Source is
>>> compiled with Virtual Pascal 2.1 build 279, and VP isn't really, to
>>> express it politely, a very optimizing compiler, so I guess it's pretty
>>> good - using the assembler output option of VP it would also be pretty
>>> easy to convert it into in-line assembler, giving considerable scope for
>>> optimizations.
>>
>> Any chance to repeat the timing with Delphi or FPC? It would make a great
>> table.
>
> I only have an uninstalled version of Turbo Delphi somewhere and after
> having found a full blown PL/I compiler for Windows, equivalent to
> V3.9.0 on z/OS, I'm not sure if I want to install FPC, when VP does, at
> least at the moment, everything I need.

FPC is relatively self contained (the worst can happen is that the installer
adds the directory to the path, which can be undone easily)

The question was not as much meant to convince you to swap compilers, just
that it would be interesting to see how the compilers fare relatively to
eachother.  (and maybe pit it against a GCC implementation of the same
algorithm)

```
 0

```On Fri, 28 May 2010 10:16:53 +0200
Terje Mathisen <"terje.mathisen at tmsw.no"@giganews.com> wrote:

>
> int scan(byte *data, int data_len)
> {
>      /* esi -> data
>         ecx = data_len
>       */
>      __asm {
> 	mov esi,[data]
> 	mov ecx,[data_len]
> 	push ebp
> 	lea ebp,[solution] // solutions = 0
>
> 	add ecx,esi	// Points one beyond the end of data array
> 	inc esi		// Skipping the first byte
>
> 	cmp esi,ecx	// If we had less than 3 chars, then there
> 	 jae outer_done // cannot be any solutions!
>
> outer_loop:		// Scanning from i=1 to data_len
> 	mov bl,[esi]	// Current byte
> 	lea edi,[esi-1]	// Start at prev byte
> 	inc esi		// Get ready for next iteration
>
> inner_loop:		// Scanning backwards from i-1 to 0
> 	cmp bl,[edi]	// Same as byte we started with?
> 	 je found_equal	// Yes! (Not a new unique char, so
> abort)
>
> // Increment count of unique values for this starting position:
> 	mov edx,count[edi*4]
> 	mov eax,esi
> 	inc edx
> 	sub eax,edi	// String length
> 	mov count[edi*4],edx
> 	cmp edx,3	// At least 3 unique values?
> 	 jb skip_too_short
>
> 	cmp eax,shortest[edx*4]
> 	 ja not_shortest_string
>
> // Save as potential solution:
> 	mov shortest[edx*4],eax
> 	mov [ebp.len],eax
> 	mov [ebp.uniq],edx
> 	mov [ebp.start],edi
>
> not_shortest_string:
> 	sub edi,1
> 	 jnc inner_loop
>
> found_equal:
> 	cmp esi,ecx
> 	 jb outer_loop
> outer_done:
> 	lea eax,[ebp - offset solution]
> 	pop ebp
> 	shr eax,4	// EAX has potential solution count
>      }
> }
>
> Better?
>
> Terje

bmaxa@maxa:~/fasm/better\$ cat test.asm
format ELF64 executable
segment writeable executable

dta rb 256
len dw 256

{
mov ebx, fd
mov ecx, buf
mov edx, size
int 0x80
}
macro sys_write fd, buf, size
{
mov eax, 4 ; sys_write
mov ebx, fd
mov ecx, buf
mov edx, size
int 0x80
}

macro sys_exit rc
{
mov eax,1 ; exit
mov ebx,rc
int 0x80
}

macro scan data, data_len, srch
{
mov edi,dword[data]
mov ecx,dword[data_len]
mov eax,srch
cld
repne scasb
}

entry \$
scan dta,len,'a'
sys_exit 0

;Better?

Greets!

--
http://maxa.homedns.org/

Sometimes online sometimes not

Svima je "dozvoljeno" biti idiot i
> mrak, ali samo neki to odaberu,

```
 0

```On 2010-05-21 01:38, James J. Weinkam wrote:
> Robert AH Prins wrote:
>>
>> His code is in Pascal, and until I started running it for parts of the
>> input string I had had blind faith in it. Wrong, as it turns out as
>> there are three strings that it does not process correctly, and all my
>> debugging hasn't given me any clues as to the why.
>>
> Can you send a copy of the data set that fails? Also, if it isn't prying
> into your business, would you describe the application this problem is
> abstracted from? You indicated in your original post that the values
> (characters in the posted esample) were actually indices into a table in
> the real application.

Somehow I missed this post, apologies!

The values are indices into a table of "DISTINGUISHING SIGNS USED ON
VEHICLES IN INTERNATIONAL TRAFFIC", or in more understandable language,
"international licence plate country code" signs - the current full
official list can be found at

http://www.unece.org/trans/conventn/Distsigns.pdf

one also containing informal ones can be found on Wikipedia,

The upper limit of 255 comes from here.

As for the application itself?

Ahum, it's from a program that processes hitchhike data...

The addition of this particular table (one of about 60 or so) was
inspired in 1995 by a ride with Danish truck driver who took me from
somewhere around J�nk�ping to Stockholm. He was the fourth consecutive
driver with the fourth distinct nationality, D-NL-S-DK. When I told him
so, he asked me if this had happened before and I told him I did not
have a clue, but the eventual addition of the table produced by my
sliding window code told me that it had in fact happened on 13 previous
occasions.

The program is used by a few other hitch-hikers and one of them asked me
if I could change the code to also produce this table for the trips done
in a single year, and it was this change that resulted in the discovery
of the erroneous results for several sets of data.

The hitchhiking angle also explains why the data grows at a rather
leisurely pace, I'm no longer a spring-chicken and I do not hitchhike as
much as I used to do.

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```Robert AH Prins wrote:
> Ahum, it's from a program that processes hitchhike data...
<BG>
>
> The addition of this particular table (one of about 60 or so) was
> inspired in 1995 by a ride with Danish truck driver who took me from
> somewhere around J=F6nk=F6ping to Stockholm. He was the fourth consecut=
ive
> driver with the fourth distinct nationality, D-NL-S-DK. When I told him
> so, he asked me if this had happened before and I told him I did not
> have a clue, but the eventual addition of the table produced by my
> sliding window code told me that it had in fact happened on 13 previous
> occasions.
>
> The program is used by a few other hitch-hikers and one of them asked m=
e
> if I could change the code to also produce this table for the trips don=
e
> in a single year, and it was this change that resulted in the discovery
> of the erroneous results for several sets of data.

So in reality, this program only has to keep up with the rate of new=20
rides, comparing the current string with the previous records, right?
>
> The hitchhiking angle also explains why the data grows at a rather
> leisurely pace, I'm no longer a spring-chicken and I do not hitchhike a=
s
> much as I used to do.

I think I stopped doing that nearly 25 years ago: My wife & I tramped=20
around the US for two months on our honeymoon, flying all the=20
long-distance stretches, hitching rides on many shorter stretches.

In the US you'd have to check state licence plates instead, I guess.:-)

Anyway, I love the idea: Getting a large group of hotshot programmers to=20
optimize a program that could be handled with a small set of=20
handwritten/colored cards. :-)

Terje
>
> Robert

--=20
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```On 2010-06-09 06:11, Terje Mathisen wrote:
> Robert AH Prins wrote:
>> Ahum, it's from a program that processes hitchhike data...
> <BG>
>>
>> The addition of this particular table (one of about 60 or so) was
>> inspired in 1995 by a ride with Danish truck driver who took me from
>> somewhere around J�nk�ping to Stockholm. He was the fourth consecutive
>> driver with the fourth distinct nationality, D-NL-S-DK. When I told him
>> so, he asked me if this had happened before and I told him I did not
>> have a clue, but the eventual addition of the table produced by my
>> sliding window code told me that it had in fact happened on 13 previous
>> occasions.
>>
>> The program is used by a few other hitch-hikers and one of them asked me
>> if I could change the code to also produce this table for the trips done
>> in a single year, and it was this change that resulted in the discovery
>> of the erroneous results for several sets of data.
>
> So in reality, this program only has to keep up with the rate of new
> rides, comparing the current string with the previous records, right?

"Yes", he replied with a blush...

>> The hitchhiking angle also explains why the data grows at a rather
>> leisurely pace, I'm no longer a spring-chicken and I do not hitchhike as
>> much as I used to do.
>
> I think I stopped doing that nearly 25 years ago: My wife & I tramped
> around the US for two months on our honeymoon, flying all the
> long-distance stretches, hitching rides on many shorter stretches.

My wife (I met her at the 4th International Hitch-Hikers Congress in
2000) and I did a bit of hitchhiking on our honeymoon in Japan, and only
short stretches.

> In the US you'd have to check state licence plates instead, I guess.:-)

Your guess is as good as mine, I don't know if any Americans are using
the program.

> Anyway, I love the idea: Getting a large group of hotshot programmers to
> optimize a program that could be handled with a small set of
> handwritten/colored cards. :-)

The very first posting started with "Can anyone give me some *hints* as
to solve the following problem, preferably in a way that is faster than
the way I used to do it, and without the bug in the current version;"

How could I possibly have known that a large group of hotshot
programmers would jump on something one of them referred to as "This is
an intriguing problem, I'll have think some more... :-)" ;)

A few years ago I came across "CSCRX2HT.txt" on
and found that it had a raft of bugs. I managed to eliminate them,
thought that it would be fun to write similar routines for a bunch of
other "legacy" languages, even spent some of my own money on the
Javascript that produces the z/OS ISPF-like scrolling HTML and in the
end I made it freely available under the provisions of the GPL V3. See
http://www.cbttape.org/ftp/cbt/CBT769.zip - the contents are in TSO XMIT
format, use XMIT Manager, http://www.cbttape.org/njw/index.html to
extract the REXX code.

I've got a quote somewhere from a Dutch computer scientist, W.L. (Willem
Louis) van der Poel: "The intrinsic value of software is nil"

I may have cheated a little (or a lot, take your pick) by not telling
what the actual code was used for - if I had done so, the thread might
never have developed into the current pretty high quality-for-Usenet one.

The value of my hitchhike program is nil (or at least pretty close to
it), but to quote another contributor, Jerome Fine, "The problem is
very instructive and I really enjoy the description of what you fellows
are doing."

Robert
--
Robert AH Prins
spamtrap(a)prino(d)org
```
 0

```In comp.lang.pascal.borland message <877iv0FbfdU1@mid.individual.net>,
Tue, 8 Jun 2010 21:11:27, Robert AH Prins <spamtrap@prino.org> posted:

>
>The addition of this particular table (one of about 60 or so) was
>inspired in 1995 by a ride with Danish truck driver who took me from
>somewhere around J�nk�ping to Stockholm. He was the fourth consecutive
>driver with the fourth distinct nationality, D-NL-S-DK. When I told him
>so, he asked me if this had happened before and I told him I did not
>have a clue, but the eventual addition of the table produced by my
>sliding window code told me that it had in fact happened on 13 previous
>occasions.

That's a more understandable description than before, IMHO.

It does indeed seem a close match to what I wrote after and because of
reading that earlier description - JavaScript to find the shortest
interval with (all) 35 different Easter Sunday dates.

My script has a sliding window; the front of the window moves forward
until the content of the window becomes interesting, then the back moves
forward until the content is about to become uninteresting, and reports
if necessary.  Passes are so fast that they can be repeated for
different "interesting".

See <URL:http://www.merlyn.demon.co.uk/estrcons.HTM#ESP>, and "Pop Code"
- the code is essentially pascal-compatible.  Sets are not used; an
array of as many integer elements as there are possible elements is
needed.

FYI, the minimum span is 72 years, for both Gregorian and Julian; the
Julians are in the middle of such a period, but the Gregorians will have
to wait a few millennia for a shortest span.

--
(c) John Stockton, nr London, UK.    ?@merlyn.demon.co.uk     Turnpike v6.05.
Web  <URL:http://www.merlyn.demon.co.uk/> - w. FAQish topics, links, acronyms
PAS EXE etc : <URL:http://www.merlyn.demon.co.uk/programs/> - see 00index.htm
Dates - miscdate.htm estrdate.htm js-dates.htm pas-time.htm critdate.htm etc.
```
 0

```Dr J R Stockton wrote:
> In comp.lang.pascal.borland message<877iv0FbfdU1@mid.individual.net>,
> Tue, 8 Jun 2010 21:11:27, Robert AH Prins<spamtrap@prino.org>  posted:
>
>>
>> The addition of this particular table (one of about 60 or so) was
>> inspired in 1995 by a ride with Danish truck driver who took me from
>> somewhere around J=F6nk=F6ping to Stockholm. He was the fourth consecu=
tive
>> driver with the fourth distinct nationality, D-NL-S-DK. When I told hi=
m
>> so, he asked me if this had happened before and I told him I did not
>> have a clue, but the eventual addition of the table produced by my
>> sliding window code told me that it had in fact happened on 13 previou=
s
>> occasions.
>
>
> That's a more understandable description than before, IMHO.
>
> It does indeed seem a close match to what I wrote after and because of
> reading that earlier description - JavaScript to find the shortest
> interval with (all) 35 different Easter Sunday dates.

Nice problem...
>
> My script has a sliding window; the front of the window moves forward
> until the content of the window becomes interesting, then the back move=
s
> forward until the content is about to become uninteresting, and reports
> if necessary.  Passes are so fast that they can be repeated for
> different "interesting".

I assume you run the scan from any given starting year, then cross out=20
the various dates as you get to them?

As soon as you've found all possible dates, you optimize by running the=20
back end forward until you get to the first date which isn't repeated=20
anywhere later in the interval.

This is very easy to do by using a counting array to cross out each date=20
as you pass it: Increment when the front end finds a date, decrement=20
when the back end passes it.

Increment the total count each time a counter goes from 0->1, decrement=20
for 1->0.

(Actually, in asm you could do a little trick, by using -1 as the=20
initial count for each date and -35 as the initial day count:

;; Forward scan, for each year
;; EAX =3D easter_sunday(year)

add day_count[EAX*4],1	; Generates a carry for -1 -> 0
adc EBX,0			; Total number of different days
jc Found_35_Front

Similarly for the tail end:

sub day_count[EAX*4],1
sbb EBX,0
jc Found_35_Tail

This makes the code totally branchless except for each time you actually=20
find an edge of a new interval.

> See<URL:http://www.merlyn.demon.co.uk/estrcons.HTM#ESP>, and "Pop Code"
> - the code is essentially pascal-compatible.  Sets are not used; an
> array of as many integer elements as there are possible elements is
> needed.
>
> FYI, the minimum span is 72 years, for both Gregorian and Julian; the
> Julians are in the middle of such a period, but the Gregorians will hav=
e
> to wait a few millennia for a shortest span.
>
:-)

Terje

--=20
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
```
 0

```In comp.lang.pascal.borland message <vta6e7-4vv2.ln1@ntp.tmsw.no>, Thu,
10 Jun 2010 07:48:13, Terje Mathisen <"terje.mathisen at
tmsw.no"@giganews.com> posted:

>Dr J R Stockton wrote:

>> It does indeed seem a close match to what I wrote after and because of
>> reading that earlier description - JavaScript to find the shortest
>> interval with (all) 35 different Easter Sunday dates.
>
>Nice problem...
>>
>> My script has a sliding window; the front of the window moves forward
>> until the content of the window becomes interesting, then the back moves
>> forward until the content is about to become uninteresting, and reports
>> if necessary.  Passes are so fast that they can be repeated for
>> different "interesting".
>
>I assume you run the scan from any given starting year, then cross out
>the various dates as you get to them?

You assume incorrectly, to judge by that sentence.  But read on ...

>As soon as you've found all possible dates, you optimize by running the
>back end forward until you get to the first date which isn't repeated
>anywhere later in the interval.

Yes.

>This is very easy to do by using a counting array to cross out each
>date as you pass it:

I don't cross out.

> Increment when the front end finds a date, decrement when the back end
>passes it.

Yes.

>Increment the total count each time a counter goes from 0->1, decrement
>for 1->0.

Yes.

>(Actually, in asm you could do a little trick, by using -1 as the
>initial count for each date and -35 as the initial day count:

It's fast enough in JavaScript, even for the 5700000 year Gregorian
cycle.  There should be no need to optimise ASM code, fun though that
can be.

>> See<URL:http://www.merlyn.demon.co.uk/estrcons.HTM#ESP>, and "Pop Code"
>> - the code is essentially pascal-compatible.  Sets are not used; an
>> array of as many integer elements as there are possible elements is
>> needed.
>>
>> FYI, the minimum span is 72 years, for both Gregorian and Julian; the
>> Julians are in the middle of such a period, but the Gregorians will have
>> to wait a few millennia for a shortest span.

I did think of adapting it for the maximum span not containing all 35
dates, but that must be given by the longest interval between date

Aficionados of ISO 8601 will realise, with a little thought, that the
number of different dates is actually (probably; provably) 35 + 36 + 6.

various JavaScript full-range algorithms for Easter Sunday in
estralgs.txt - could you beat
<http://www.hugi.scene.org/compo/compoold.htm#compo20> in ASM while
retaining traceability to the formal standards (the Papal Bull, the
British Calendar Act, and whatever Norway relies on?!?)?  The rules for
compo20 give Oudin's algorithm, which I can beat.

--
(c) John Stockton, nr London, UK.    ?@merlyn.demon.co.uk     Turnpike v6.05.
Web  <URL:http://www.merlyn.demon.co.uk/> - w. FAQish topics, links, acronyms
PAS EXE etc : <URL:http://www.merlyn.demon.co.uk/programs/> - see 00index.htm
Dates - miscdate.htm estrdate.htm js-dates.htm pas-time.htm critdate.htm etc.
```
 0

131 Replies
161 Views

Similar Articles

11/30/2013 6:51:10 AM
page loaded in 251553 ms -0

Similar Artilces:

Problem...
Hi all, I have a problem with my OGL application.... I realized a 3DLaser Sensor simulator: in a windows I put a point (glVertex)for every misure made by sensor... when I realize the 9434th misure and display the 9434th point I get an error: "A requested resource was not available". I think there is a limit in vertex number in OpenGL, is that True? Is there a way to jump over the problem? Any subjestion will be really appreciated Thank you in advance Walter Myzhar wrote: > Hi all, > I have a problem with my OGL application.... > I realized a 3DLaser Sensor simulator: in a windows I put a point > (glVertex)for every misure made by sensor... when I realize the 9434th > misure and display the 9434th point I get an error: "A requested resource > was not available". > > I think there is a limit in vertex number in OpenGL, is that True? Is there > a way to jump over the problem? No, there is no limit in the number of vertices that you can generate. There are limits on resources, but since I don't know if you are using display lists or vertex lists or neither, I don't know what resource you are running into. -- Andy V

Problem....
;Don't use passwords people can guess'. Thanks all for reply... Ive fixed problem...in legal way...;))) I used my login and pass on my collage ...when i closed explorer the cooke stayed there..so it was easy for who know who to send some abusive lines...it took me a few days too see all shit that someone was written.. cause i havent been on internet in that period .... aah...story is too long...and franky i dont even care anymore... aah damn...i cant even spell right...sorry for grammar errors.....im in a hurry...

IF problem?!
How is possible that My printout prints all value of \$sop, e.g. ----------------- My printout: ERI SOF HTM SOF ------------------ My Perl program: .... if (\$sup=="HTM") { print "\$sup\n"; } .... ------------------ <tihana@kata.com> wyrze�bi�(a): > if (\$sup=="HTM") { change == to eq. that's first. second - what about "new line characters"? depesz -- napisanie do mnie na priv daje 99.9% gwarancj� braku odpowiedzi *-----------------------------------------------------------------* sklep z rzeczami do domu:

problem!!!!
as 'Problem !!!!'. (Some of us won't even bother to read posts with these urgent, non-descriptive labels. But we WILL read posts with well-constructed subjects, if we feel that we can add anything useful. Cheers, Brett -- char(cumsum(... [32 83 -11 7 -10 7 7 -4 -1 -46 40 -3 7 -3 15 -74 64 -5 -1 -58 57 8 7]))

Problem with
calling generated SFunction, cupic_simulator_sfun. > > So it looks like I got a bit further, but now it just won't tell me what's wrong with the function. > > I tried a very simple simulation/model with a very simple embedded function -- same error. I have the same exact problem. Has anyone ever figured this out? The issue was that I needed the Windows Visual Studio 2008. It has the proper *things* to run embedded code in Simulink. Go get it! I only found out about this because a buddy of mine ran into the same problem when working on similar things. I would... have the same exact problem. Has anyone ever figured this out? "Matthew Lenda" <lendam@gmail.com> wrote in message <hqtdhh\$orc\$1@fred.mathworks.com>... > The issue was that I needed the Windows Visual Studio 2008. It has the proper *things* to run embedded code in Simulink. Go get it! > > I only found out about this because a buddy of mine ran into the same problem when working on similar things. I would have had no idea otherwise! Hi Matthew, I was wondering what you did to solve this. I'm having the same problem at the moment, but I've

problem
I send a UDP packet appears the following problem data access Exception current instruction address: 0x00494584 Machine Status Register: 0x0000b030 Data Access Register: 0x5a485a66 Condition Register: 0x44000040 Data storage interrupt Register: 0x40000000 Task: 0xf56c638 "VISW0:3" can you help me? alexbecks wrote: > I send a UDP packet appears the following problem > > data access > Exception current instruction address: 0x00494584 > Machine Status Register: 0x0000b030 > Data Access Register: 0x5a485a66 > Condition Register: 0x44000040 > Data storage interrupt Register: 0x40000000 > Task: 0xf56c638 "VISW0:3" > > > can you help me? Probably not, but my best guess is that you have mangled a pointer, or corrupted the stack. Speaking only for myself, Joe Durusau Joe Durusau <durusau_novirus@bellsouth.net> wrote in message news:<414C4648.9020506@bellsouth.net>... > alexbecks wrote: > > I send a UDP packet appears the following problem > > > > data access > > Exception current instruction address: 0x00494584 > > Machine Status Register: 0x0000b030 > > Data Access

problem with a minimazition problem.
Hello, I have a problem with a minimazition problem: Let my aim is to find x and a where f(x,a) is minimized with the constraint g(x,a)=0. Here "a" is parameter. It is known that for some values of x and a, function g does not have real roots of x. In matlab, I use fmincon function to solve this minimazition problem with "interior-point" algorithm. However, even in situations where g doesnt have real x roots , somehow Fmincon finds solution. I want to determine which solutions have real roots which are not. How can I do that? To which criterian do I have to look. I dont have optimization theory background. So if you help me, I will be very very glad. Thanks a lot. hagii <oercim@yahoo.com> wrote in message <c541e3ed-7416-4873-982b-23e93c4612fc@googlegroups.com>... > > However, even in situations where g doesnt have real x roots > , somehow Fmincon finds solution I don't believe it. That means, either 1) the setup of fmincon is not right, 2) you didn't not check carefully at the output flag of fmincon Bruno On Thursday, August 23, 2012 1:29:07 PM UTC+3, Bruno Luong wrote: > hagii <

mail problem?, DNS problem?
NWSTC.noaa.gov internet address = 195.56.32.3 SESTC.noaa.gov internet address = 196.112.123.245 MESTC.noaa.gov internet address = 197.94.173.133 MWSTC.noaa.gov internet address = 147.175.10.10 > exit Well it looks like there isn't a problem getting the mx records, at least with nslookup.. Is my client .cf file an optionor two short of a full deck? Or should I look else where for the problem.. Stuart Dr. Stuart A. Weinstein Ewa Beach Institute of Tectonics "To err is human, but to really foul things up requires a creationist" -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

A simple problem, but a problem ...
Hello, can anyone solve the following problem ? It looks quite simple to me, but I can't get it right. The advice suggests the following: mr(yes). mr(no). mrs(yes). mrs(no). marcel(yes). marcel(no). jacqu(yes). jacqu(no). grandpa(yes). grandpa(no). but how can I use this to solve the problem ? thanx a lot Maxx Problem: If Mr. Johnson goes, his wife goes too. At least one of the children, Marcel or Jacqueline, goes too. If Marcel goes, grandpa goes too. If Mrs. Johnson goes, grandpa stays at home. If Jacqueline goes, Mr. Johnson and Marcel go too. Write a prolog program, which...: > "If Mr. Johnson goes, his wife goes too." > Can you give me a concret example for this line ?=20 Same problem here, Andr=E9. I have come this far, but I'm missing the=20 if_clause condition: % A family is about to visit their friends. They are a bit complicated=20 as to who % is going. The problematic part of the family consist of Mr. Funky,=20 Mrs. Funky, % the kids Marc and Joseline and grandpa Funky. The rules are as follows:= % % If Mrs. Funky goes, his wife goes as well. % At least one of the kids Marc and Joseline will be going. % If Marc goes, grandpa Funky

Purify problem or compiler problem?
= 0x100000 Stack Use : (0x7ef0db30 0x7ef0dd54), size = 0x224 Thread 6 Stack Limit : (0x7e612000 0x7e616000), size = 0x4000 Stack Use : (0x7e615b28 0x7e615d54), size = 0x22c Thread 8 Stack Limit : (0x7e632000 0x7e634000), size = 0x2000 Stack Use : (0x7e633b28 0x7e633d54), size = 0x22c This is with CC. With gcc or g++ it does not have this problem. cc has the problem too but with a larger number in the array. Is it the compiler/linker bug or it purify making things up? If it is the compiler I assume I am screwing up memory badly. Matt Hello Matthew! 06 Feb 04 12:29, Matthew wrote to All: M> I am using sun's CC (c++ compiler) to compile the follow code: M> #define __REENTRANT M> #include <stdio.h> M> #include <pthread.h> etc, etc You will be better off placing the details of this problem in one of the 'C++' usegroups. Vince

Is this a BIOS problem, a kernel problem or a user problem :-)
Hi all, I have a Dual Core P4, which dmesg reports as (CPU0 and) CPU1: Intel(R) Pentium(R) 5 CPU 2.66GHz stepping 07 Unfortunately, it is also reporting CPU: Hyper-Threading is disabled I went through the BIOS (AMI dated 2005) and saw no option to enable HT. uname -a reports Linux hwa3 2.6.17-10-generic #2 SMP Fri Oct 13 18:45:35 UTC 2006 i686 GNU/Linux so clearly the BIOS and the kernel see the processor as dual core. Is there a utility to do this (enable hyperthreading)? Am I going to have to recompile the kernel? Is there a set of Intel dual core CPU's which does not have H

Spam Problem
Hope someone can help me with this problem... When i first got OS X , i set Mail in training mode to help fight the spam problem... Now i'm getting spam, and i Mail isn't catching it... I tried training mode for almost a month, but still getting through... What can i do to re-train Mail to catch this spam?? TIA!! -- --- (remove x before emailing) "your worst day at the ballpark, is better then your best day at work!!" In article <060920051044103098%denny536x@mac.com>, dennis <denny536x@mac.com> wrote: > Hope someone can help me with this problem... > > When i first got OS X , i set Mail in training mode to help fight the > spam problem... > > Now i'm getting spam, and i Mail isn't catching it... > I tried training mode for almost a month, but still getting through... Did you actually click the "junk" button when you got a spam? It's not enough to turn on training, you actually have to train it. -- Tom "Tom" Harrington Macaroni, Automated System Maintenance for Mac OS X. Version 2.0: Delocalize, Repair Permissions, lots more. See http://www.atomicbird.com/ In article <

Hi all: I have a problem when using the following code to push a file touser: HttpServletRequest response; filename = "foo bar.idx" response.setContentType("application/mubot.bitmap"); response.setHeader( "Content-disposition", "attachment;filename=" + filename); The file name show in the download dialogue is "foo", which isexpected to be "foo bar.idx", I think it may be caused by the spacebetween "foo" and "bar", so how can I fix it? Appreciate your help! On Mar 27, 7:29 pm, "shellon" <zhouchen...@gmail.com> wrote:...> The file name show in the download dialogue is "foo", which is> expected to be "foo bar.idx", I think it may be caused by the space> between "foo" and "bar", so how can I fix it?Note the output;<sscce>import java.io.File;class SpaceInName { public static void main(String[] args) throws Exception { File file = new File("foo bar.idx"); System.out.println( file.toURI().toURL() ); }}</sscce>HTHAndrew T.

boot problem
Greetings, I'm new to NetBSD, having come from Linux. I'm pretty much a unix intermediate user. Having installed NetBSD 2.0.2 on my machine. I then proceeded to use sushi to modify the startup configuration (rc.conf). Everything appeared to go fine. Upon rebooting the machine I get a whole slew of WARNING messages from /etc/rc. All of these messages are of the form: /etc/rc: WARNING: \$<variable> is not set properly - see rc.conf(5). Can anyone help or refer me to help? Thanks. K KCA <kadams@ssd.com> wrote: > I'm new to NetBSD, having come from Li

problem with JDBC
Hi,I have problem with JDBC.I am using netBeans 5.5 along with MySQL 5.0 And connector/J Driverfor the same.I have installed it properly.Even though I am getting databaseconnection in runtime window the program doesn't work.But when I run the program I get 'com.mysql.jdbc.Driver'ClassNotFoundException .What shoud I do to get it corrected?Can anybody hep me? Try pasting the connector in de directory:C:\Archivos de programa\Java\jdk1.6.0\jre\lib\extThat is in my computer; of course, you must translate it into the one equivalent in yours. ElLlaneroSolitario wrote:>Try pasting... 10, 5:56 pm, "Pappu" <prashant.po...@gmail.com> wrote: .... > I have problem with .. ...multi-posting? That is serious. Please find a solution to the multi-posting problem - soon. <http://www.physci.org/codes/javafaq.html#xpost> (N.B. X-post to c.l.j.p./g., w/f-u to c.l.j.p. only) Andrew T.

Problem with SSL
Hi,I am trying to develop an email client which will be able to send mailfrom more than one email accounts.The problem is that I am facing is that one mail account requires SSLand one does not.I am adding the security provider using the following statement:Security.addProvider(new com.sun.net.ssl.internal.ssl.Provider());Now, how do I remove this provider?Because theSeutrity.removeProvider(String)function requires a String specifying the name of the provider which Idon't have.Please help me.thanks. Nikhil wrote:> Now, how do I remove this provider?Because the> Seutrity.removeProvider(String)> function requires a String specifying the name of the provider which I> don't have.You don't have to remove the provider at all, and you don't even have to add it since 1.4. See the current JSSE Reference Guide in the JDK/Guide to Features/Security.

java problem
Hi Sir,I am developing my project in java, I have to work on JAI then whichversion I have to install & send me the path from where I have toinstall?ThanksManisha manisha chawla wrote:..Sub: java problem (/java question)Note that ..1)'Developing with JAI' ..would be a better (more specific) subject line than either of the ones you put.2) It is easier to avoid the problem of remembering toadd 'Mam' by not putting any 'Hi..' at the beginning.Nobody would miss it.>I am developing my project in java, I have to work on JAI then which>version I have to install ..It is usually best to use the latest version available.>..& send me the path from where I have to>install?Usually Java API's do not need to be 'installed',they just need to be available to the application at compile and runtime. Put them wherever is most convenient. OTOH, a google for 'JAI install' points to here (2ndtop link at instant of search)..<http://java.sun.com/products/java-media/jai/INSTALL-1_1_2.html>The Google description based on that link also shows 'The executable will try to install JAI within the latest JDK (or JRE) version that it finds

CDR Problem