f

#### 64-bit KISS RNGs

```64-bit KISS RNGs

Consistent with the Keep It Simple Stupid (KISS) principle,
I have previously suggested 32-bit KISS Random Number
Generators (RNGs) that seem to have been frequently adopted.

Having had requests for 64-bit KISSes, and now that
64-bit integers are becoming more available, I will
describe here a 64-bit KISS RNG, with comments on
implementation for various languages, speed, periods
and performance after extensive tests of randomness.

This 64-bit KISS RNG has three components, each nearly
good enough to serve alone.    The components are:
Multiply-With-Carry (MWC), period (2^121+2^63-1)
Xorshift (XSH), period 2^64-1
Congruential (CNG), period 2^64

Compact C and Fortran listings are given below.  They
can be cut, pasted, compiled and run to see if, after
100 million calls, results agree with that provided
by theory, assuming the default seeds.

Users may want to put the content in other forms, and,
for general use, provide means to set the 250 seed bits
required in the variables x,y,z (64 bits) and c (58 bits)
that have been given default values in the test versions.

The C version uses #define macros to enumerate the few
instructions that MWC, XSH and CNG require.   The KISS
macro adds MWC+XSH+CNG mod 2^64, so that KISS can be
inserted at any place in a C program where a random 64-bit
integer is required.
Fortran's requirement that integers be signed makes the
necessary code more complicated, hence a function KISS().

C version; test by invoking macro KISS 100 million times
-----------------------------------------------------------------
#include <stdio.h>

static unsigned long long
x=1234567890987654321ULL,c=123456123456123456ULL,
y=362436362436362436ULL,z=1066149217761810ULL,t;

#define MWC (t=(x<<58)+c, c=(x>>6), x+=t, c+=(x<t), x)
#define XSH ( y^=(y<<13), y^=(y>>17), y^=(y<<43) )
#define CNG ( z=6906969069LL*z+1234567 )
#define KISS (MWC+XSH+CNG)

int main(void)
{int i;
for(i=0;i<100000000;i++) t=KISS;
(t==1666297717051644203ULL) ?
printf("100 million uses of KISS OK"):
printf("Fail");
}

---------------------------------------------------------------
Fortran version; test by calling KISS() 100 million times
---------------------------------------------------------------
program testkiss
implicit integer*8(a-z)
do i=1,100000000; t=KISS(); end do
if(t.eq.1666297717051644203_8) then
print*,"100 million calls to KISS() OK"
else; print*,"Fail"
end if; end

function KISS()
implicit integer*8(a-z)
data x,y,z,c /1234567890987654321_8, 362436362436362436_8,&
1066149217761810_8, 123456123456123456_8/
save x,y,z,c
m(x,k)=ieor(x,ishft(x,k))  !statement function
s(x)=ishft(x,-63)             !statement function
t=ishft(x,58)+c
if(s(x).eq.s(t)) then; c=ishft(x,-6)+s(x)
else; c=ishft(x,-6)+1-s(x+t); endif
x=t+x
y=m(m(m(y,13_8),-17_8),43_8)
z=6906969069_8*z+1234567
KISS=x+y+z
return; end
---------------------------------------------------------------

Output from using the macro KISS or the function KISS()	is
MWC+XSH+CNG mod 2^64.

CNG is easily implemented on machines with 64-bit integers,
as arithmetic is automatically mod 2^64, whether integers
are considered signed or unsigned.   The CNG statement is
z=6906969069*z+1234567.
When I established the lattice structure of congruential
generators in the 60's, a search produced 69069 as an easy-
to-remember multiplier with nearly cubic lattices in 2,3,4,5-
space, so I tried concatenating, using 6906969069 as
my first test multiplier. Remarkably---a seemingly one in many
hundreds chance---it turned out to also have excellent lattice
structure in 2,3,4,5-space, so that's the one chosen.
(I doubt if lattice structure of CNG has much influence on the
composite 64-bit KISS produced via MWC+XSH+CNG mod 2^64.)

XSH, the Xorshift component, described in
www.jstatsoft.org/v08/i14/paper
uses three invocations of an integer "xor"ed with a shifted
version of itself.
The XSH component used for this KISS is, in C notation:
y^=(y<<13); y^=(y>>17); y^=(y<<43)
with Fortran equivalents y=ieor(y,ishft(y,13)), etc., although
this can be effected by a Fortran statement function:
f(y,k)=ieor(y,ishft(y,k))
y=f(f(f(y,13),-17),43)
As with lattice structure, choice of the triple 13,-17,43 is
probably of no particular importance; any one of the 275 full-
period triples listed in the above article is likely to provide
a  satisfactory component XSH for the composite MWC+XSH+CNG.

The choice of multiplier 'a' for the multiply-with-carry (MWC)
component of KISS is not so easily made.  In effect, a multiply-
with-carry sequence has a current value x and current "carry" c,
and from each given x,c a new x,c pair is constructed by forming
t=a*x+c, then x=t mod b=2^64 and c=floor(t/b).
This is easily implemented for 32-bit computers that permit
forming a*t+c in 64 bits, from which the new x is the bottom and
the new c the top 32-bits.

When a,x and c are 64-bits, not many computers seem to have an easy
way to form t=a*x+c in 128 bits, then extract the top and bottom
64-bit segments.  For that reason, special choices for 'a' are
needed among those that satisfy the general requirement that
p=a*b-1 is a prime for which b=2^64 has order (p-1)/2.

My choice---and the only one of this form---is a=2^58+1. Then the
top 64 bits of an imagined 128-bit t=a*x+c may be obtained as
(using C notation)  (x>>6)+ 1 or 0, depending
on whether the 64-bit parts of (x<<58)+c+x cause an overflow.
Since (x<<58)+c cannot itself cause overflow (c will always be <a),
we get the carry as c=(x>>6) plus overflow from (x<<58)+x.

This is easily done in C with unsigned integers, using  a different
kind of 't':   t=(x<<58)+c; c=(x>>6); x=t+x; c=c+(x<t);
For Fortran and others that permit only signed integers, more work
is needed.
Equivalent mod 2^64 versions of t=(x<<58)+c and c=(x>>6) are easy,
and if s(x) represents (x>>63) in C or ishft(x,-66) in Fortran,
then for signed integers, the new carry c comes from the rule
if s(x) equals s(t) then c=(x>>6)+s(x) else c=(x>>6)+1-s(x+t)

Speed:
A C version of this KISS RNG takes 18 nanosecs for each
64-bit random number on my desktop (Vista) PC, thus
producing KISSes at a rate exceeding 55 million per second.
Fortran or other integers-must-be-signed compilers might get
"only" around 40 million per second.

Setting seeds:
Use of KISS or KISS() as a general 64-bit RNG requires specifying
3*64+58=250 bits for seeds, 64 bits each for x,y,z and 58 for c,
resulting in a composite sequence with period around 2^250.
The actual period is
(2^250+2^192+2^64-2^186-2^129)/6 ~= 2^(247.42)  or 10^(74.48).
We "lose" 1+1.58=2.58 bits from maximum possible period, one bit
because b=2^64, a square, cannot be a primitive root of p=ab-1,
so the best possible order for b is (p-1)/2.
The periods of MWC and  XSH have gcd 3=2^1.58, so another  1.58
bits  are "lost" from the best possible period we could expect
from 250 seed bits.

Some users may think 250 seed bits are an unreasonable requirement.
A good seeding  procedure might be to assume the default seed
values then let the user choose none, one, two,..., or all
of x,y,z, and c to be reseeded.

Tests:
Latest tests in The Diehard Battery, available at
http://i.cs.hku.hk/~diehard/
were applied extensively. Those tests that specifically required
32-bit integers were applied to the leftmost 32 bits
(e,g, KISS>>32;), then to the middle 32-bits ((KISS<<16)>>32;)
then to the rightmost 32 bits, ( (KISS<<32)>>32).
There were no extremes in the more than 700 p-values returned
by the tests, nor indeed for similar tests applied to just two of the
KISS components: MWC+XSH, then MWC+CNG, then XSH+CNG.

The simplicity, speed, period around 2^250 and performance on
tests of randomness---as well as ability to produce exactly
the same 64-bit patterns, whether considered signed or unsigned
integers---make this 64-bit KISS well worth considering for
as has been done for 32-bit KISSes.

George Marsaglia

```
 0
gmarsaglia (23)
2/28/2009 12:30:48 PM
comp.lang.fortran 11941 articles. 2 followers.

26 Replies
1766 Views

Similar Articles

[PageSpeed] 34

```On Sat, 28 Feb 2009 04:30:48 -0800 (PST), geo <gmarsaglia@gmail.com>
wrote:

>64-bit KISS RNGs
>
>Consistent with the Keep It Simple Stupid (KISS) principle,
>I have previously suggested 32-bit KISS Random Number
>Generators (RNGs) that seem to have been frequently adopted.
>
>Having had requests for 64-bit KISSes, and now that
>64-bit integers are becoming more available, I will
>describe here a 64-bit KISS RNG, with comments on
>implementation for various languages, speed, periods
>and performance after extensive tests of randomness.
>
>This 64-bit KISS RNG has three components, each nearly
>good enough to serve alone.    The components are:
>Multiply-With-Carry (MWC), period (2^121+2^63-1)
>Xorshift (XSH), period 2^64-1
>Congruential (CNG), period 2^64
>
>Compact C and Fortran listings are given below.  They
>can be cut, pasted, compiled and run to see if, after
>100 million calls, results agree with that provided
>by theory, assuming the default seeds.
>
>Users may want to put the content in other forms, and,
>for general use, provide means to set the 250 seed bits
>required in the variables x,y,z (64 bits) and c (58 bits)
>that have been given default values in the test versions.
>
>The C version uses #define macros to enumerate the few
>instructions that MWC, XSH and CNG require.   The KISS
>macro adds MWC+XSH+CNG mod 2^64, so that KISS can be
>inserted at any place in a C program where a random 64-bit
>integer is required.
>Fortran's requirement that integers be signed makes the
>necessary code more complicated, hence a function KISS().
>
>C version; test by invoking macro KISS 100 million times
>-----------------------------------------------------------------
>#include <stdio.h>
>
>static unsigned long long
>x=1234567890987654321ULL,c=123456123456123456ULL,
>y=362436362436362436ULL,z=1066149217761810ULL,t;
>
>#define MWC (t=(x<<58)+c, c=(x>>6), x+=t, c+=(x<t), x)
>#define XSH ( y^=(y<<13), y^=(y>>17), y^=(y<<43) )
>#define CNG ( z=6906969069LL*z+1234567 )
>#define KISS (MWC+XSH+CNG)
>
> int main(void)
> {int i;
>  for(i=0;i<100000000;i++) t=KISS;
>  (t==1666297717051644203ULL) ?
>  printf("100 million uses of KISS OK"):
>  printf("Fail");
> }
>
>---------------------------------------------------------------
>Fortran version; test by calling KISS() 100 million times
>---------------------------------------------------------------
> program testkiss
> implicit integer*8(a-z)
> do i=1,100000000; t=KISS(); end do
> if(t.eq.1666297717051644203_8) then
> print*,"100 million calls to KISS() OK"
> else; print*,"Fail"
> end if; end
>
> function KISS()
> implicit integer*8(a-z)
> data x,y,z,c /1234567890987654321_8, 362436362436362436_8,&
> 1066149217761810_8, 123456123456123456_8/
> save x,y,z,c
>  m(x,k)=ieor(x,ishft(x,k))  !statement function
>  s(x)=ishft(x,-63)             !statement function
> t=ishft(x,58)+c
> if(s(x).eq.s(t)) then; c=ishft(x,-6)+s(x)
>      else; c=ishft(x,-6)+1-s(x+t); endif
> x=t+x
> y=m(m(m(y,13_8),-17_8),43_8)
> z=6906969069_8*z+1234567
> KISS=x+y+z
> return; end
>---------------------------------------------------------------
>
>Output from using the macro KISS or the function KISS()	is
>MWC+XSH+CNG mod 2^64.
>
>CNG is easily implemented on machines with 64-bit integers,
>as arithmetic is automatically mod 2^64, whether integers
>are considered signed or unsigned.   The CNG statement is
>    z=6906969069*z+1234567.
>When I established the lattice structure of congruential
>generators in the 60's, a search produced 69069 as an easy-
>to-remember multiplier with nearly cubic lattices in 2,3,4,5-
>space, so I tried concatenating, using 6906969069 as
>my first test multiplier. Remarkably---a seemingly one in many
>hundreds chance---it turned out to also have excellent lattice
>structure in 2,3,4,5-space, so that's the one chosen.
>(I doubt if lattice structure of CNG has much influence on the
>composite 64-bit KISS produced via MWC+XSH+CNG mod 2^64.)
>
>
>XSH, the Xorshift component, described in
>     www.jstatsoft.org/v08/i14/paper
>uses three invocations of an integer "xor"ed with a shifted
>version of itself.
>The XSH component used for this KISS is, in C notation:
>   y^=(y<<13); y^=(y>>17); y^=(y<<43)
>with Fortran equivalents y=ieor(y,ishft(y,13)), etc., although
>this can be effected by a Fortran statement function:
>     f(y,k)=ieor(y,ishft(y,k))
>     y=f(f(f(y,13),-17),43)
>As with lattice structure, choice of the triple 13,-17,43 is
>probably of no particular importance; any one of the 275 full-
>period triples listed in the above article is likely to provide
>a  satisfactory component XSH for the composite MWC+XSH+CNG.
>
>The choice of multiplier 'a' for the multiply-with-carry (MWC)
>component of KISS is not so easily made.  In effect, a multiply-
>with-carry sequence has a current value x and current "carry" c,
>and from each given x,c a new x,c pair is constructed by forming
>  t=a*x+c, then x=t mod b=2^64 and c=floor(t/b).
>This is easily implemented for 32-bit computers that permit
>forming a*t+c in 64 bits, from which the new x is the bottom and
>the new c the top 32-bits.
>
>When a,x and c are 64-bits, not many computers seem to have an easy
>way to form t=a*x+c in 128 bits, then extract the top and bottom
>64-bit segments.  For that reason, special choices for 'a' are
>needed among those that satisfy the general requirement that
>p=a*b-1 is a prime for which b=2^64 has order (p-1)/2.
>
>My choice---and the only one of this form---is a=2^58+1. Then the
>top 64 bits of an imagined 128-bit t=a*x+c may be obtained as
>(using C notation)  (x>>6)+ 1 or 0, depending
>on whether the 64-bit parts of (x<<58)+c+x cause an overflow.
>Since (x<<58)+c cannot itself cause overflow (c will always be <a),
>we get the carry as c=(x>>6) plus overflow from (x<<58)+x.
>
>This is easily done in C with unsigned integers, using  a different
>kind of 't':   t=(x<<58)+c; c=(x>>6); x=t+x; c=c+(x<t);
>For Fortran and others that permit only signed integers, more work
>is needed.
>Equivalent mod 2^64 versions of t=(x<<58)+c and c=(x>>6) are easy,
>and if s(x) represents (x>>63) in C or ishft(x,-66) in Fortran,
>then for signed integers, the new carry c comes from the rule
> if s(x) equals s(t) then c=(x>>6)+s(x) else c=(x>>6)+1-s(x+t)
>
>Speed:
>A C version of this KISS RNG takes 18 nanosecs for each
>64-bit random number on my desktop (Vista) PC, thus
>producing KISSes at a rate exceeding 55 million per second.
>Fortran or other integers-must-be-signed compilers might get
>"only" around 40 million per second.
>
>Setting seeds:
>Use of KISS or KISS() as a general 64-bit RNG requires specifying
>3*64+58=250 bits for seeds, 64 bits each for x,y,z and 58 for c,
>resulting in a composite sequence with period around 2^250.
>The actual period is
>  (2^250+2^192+2^64-2^186-2^129)/6 ~= 2^(247.42)  or 10^(74.48).
>We "lose" 1+1.58=2.58 bits from maximum possible period, one bit
>because b=2^64, a square, cannot be a primitive root of p=ab-1,
>so the best possible order for b is (p-1)/2.
>The periods of MWC and  XSH have gcd 3=2^1.58, so another  1.58
>bits  are "lost" from the best possible period we could expect
>from 250 seed bits.
>
>Some users may think 250 seed bits are an unreasonable requirement.
>A good seeding  procedure might be to assume the default seed
>values then let the user choose none, one, two,..., or all
>of x,y,z, and c to be reseeded.
>
>Tests:
>Latest tests in The Diehard Battery, available at
>   http://i.cs.hku.hk/~diehard/
>were applied extensively. Those tests that specifically required
>32-bit integers were applied to the leftmost 32 bits
>(e,g, KISS>>32;), then to the middle 32-bits ((KISS<<16)>>32;)
>then to the rightmost 32 bits, ( (KISS<<32)>>32).
>There were no extremes in the more than 700 p-values returned
>by the tests, nor indeed for similar tests applied to just two of the
>KISS components: MWC+XSH, then MWC+CNG, then XSH+CNG.
>
>The simplicity, speed, period around 2^250 and performance on
>tests of randomness---as well as ability to produce exactly
>the same 64-bit patterns, whether considered signed or unsigned
>integers---make this 64-bit KISS well worth considering for
>as has been done for 32-bit KISSes.
>
>George Marsaglia

Thank you for that George.  I have put together a demonstration
version in Java.  It is coded for clarity and not for speed.  There
are many obvious speed improvements possible for production code.

My code was based on your Fortran version as Java also uses signed
longs; the signBit() method is equivalent to your s(x) statement
function.  I have tweaked the test method to allow it to print out the
hex values of x, y, z and c for the first few times through the test
loop.  This was a great help in my debugging of my initial attempt so
I have included the output after the code.

My class extends the standard Java Random class.  Since Kiss64 is 64
bit internally I overrode nextLong() as well as the usual next() so
that longs are taken directly from Kiss64 rather than built up from
separate calls to next() inside Random as usually happens.

To si.math and comp.lang.java.programmer only.

rossum

// --- Begin Code ---

import java.util.Random;

/**
* An implementation of George Marsaglia's KISS 64 PRNG.
*
* This is written for clarity rather than speed.
*
* @author rossum
* @version 1.0; February 2009
*/
public class Kiss64 extends Random {

//////////
// Data //
//////////

private final static long DEFAULT_X = 1234567890987654321L;
private final static long DEFAULT_Y =  362436362436362436L;
private final static long DEFAULT_Z =    1066149217761810L;
private final static long DEFAULT_C =  123456123456123456L;

private long mX = DEFAULT_X;
private long mY = DEFAULT_Y;
private long mZ = DEFAULT_Z;
private long mC = DEFAULT_C;

//////////////////
// Constructors //
//////////////////

public Kiss64() { }

public Kiss64(long x) {
mX = x;
}

public Kiss64(long x, long y) {
mX = x;
mY = y;
}

public Kiss64(long x, long y, long z) {
mX = x;
mY = y;
mZ = z;
}

public Kiss64(long x, long y, long z, long c) {
mX = x;
mY = y;
mZ = z;
mC = c & 0x3FFFFFFFFFFFFFFL;  // 58 bit mask
}

/////////////
// Methods //
/////////////

@Override
public long nextLong() {
mwc();
xsh();
cng();
return (mX + mY + mZ);
}

/**
* Tests Kiss64 code with default settings.
*
* @param printWanted true if hex printout wanted,
* false otherwise.
* @return true if the test was passed, false otherwise.
*/
public static boolean testKiss64(boolean printWanted) {
final long expected = 1666297717051644203L;
final int printLimit = 5;

Kiss64 k64 = new Kiss64();
long r = 0;
int i;
for (i = 0; i < printLimit; ++i) {
if (printWanted) {
System.out.println("i = " + i);
System.out.printf("   x = 0x%X%n", k64.mX);
System.out.printf("   y = 0x%X%n", k64.mY);
System.out.printf("   z = 0x%X%n", k64.mZ);
System.out.printf("   c = 0x%X%n", k64.mC);
}
r = k64.nextLong();
}
for ( ; i < 100000000; ++i) {
r = k64.nextLong();
}
if (r == expected) {
System.out.println("100 million uses of Kiss64 OK.");
return true;
} else {
System.out.println("Kiss64 failed: r == " +
r + ", not " + expected);
return false;
}
}

@Override
protected int next(int bits) {
return (int)(nextLong() >>> (48 - bits));
}

private void mwc() {
long t = (mX << 58) + mC;
mC = (mX >>> 6);
if (signBit(mX) == signBit(t)) {
mC += signBit(mX);
} else {
mC += 1L - signBit(mX + t);
}
mX += t;
}

private void xsh() {
mY ^= (mY << 13);
mY ^= (mY >>> 17);
mY ^= (mY << 43);
}

private void cng() {
mZ *= 6906969069L;
mZ += 1234567L;
}

private long signBit(long num) {
return num >>> 63;
}

} // end class Kiss64

// --- End Code ---

/* --- Begin Output --- *

i = 0
x = 0x112210F4B16C1CB1
y = 0x507A1F38CB440C4
z = 0x3C9A83566FA12
c = 0x1B69AB0AFF2F240
i = 1
x = 0xD6D8ABA5615F0EF1
y = 0x32D38F9EC9E4292
z = 0xA1F271F53FE5FF31
c = 0x448843D2C5B072
i = 2
x = 0x9B1D33E93424BF63
y = 0x6CB5F773267910F4
z = 0x476BE49FC6B421E4
c = 0x35B62AE95857C3C
i = 3
x = 0x2A789697C9AA3B9F
y = 0x1ECDC291CDB992C7
z = 0xB5475749C8ECC29B
c = 0x26C74CFA4D092FE
i = 4
x = 0xA8E50B676E7ACE9D
y = 0x36F6106902720D37
z = 0xE6A59D970705F906
c = 0xA9E25A5F26A8EE
100 million uses of Kiss64 OK.

/* --- End Output --- */

```
 0
rossum48 (719)
2/28/2009 6:24:02 PM
```In article <d0d9069e-cfff-4520-a0fe-96715b25852d@j8g2000yql.googlegroups.com>,
geo  <gmarsaglia@gmail.com> wrote:
>64-bit KISS RNGs
>
>Consistent with the Keep It Simple Stupid (KISS) principle,
>I have previously suggested 32-bit KISS Random Number
>Generators (RNGs) that seem to have been frequently adopted.
>
>Having had requests for 64-bit KISSes, and now that
>64-bit integers are becoming more available, I will
>describe here a 64-bit KISS RNG, with comments on
>implementation for various languages, speed, periods
>and performance after extensive tests of randomness.
>
>This 64-bit KISS RNG has three components, each nearly
>good enough to serve alone.    The components are:
>Multiply-With-Carry (MWC), period (2^121+2^63-1)
>Xorshift (XSH), period 2^64-1
>Congruential (CNG), period 2^64

While I hesitate to follow up to George Marsaglia on random numbers,
there are a few things here that need at least clarification.
I am NOT referring to the generator as such, where I have no
disagreement, but to some of the other remarks.

Nobody should EVER use 32-bit generators for more than about ten
million numbers in an analysis without careful analysis, because
the discreteness will start to show through in at least some real
analyses.

No Xorshift generator is good enough to use on its own, because
they have some evil properties.  We knew about them before 1970
(Knuth refers to it) and at least some of us knew the reason but
could not prove it.  The person who did was Martin Luescher of
CERN, in the 1990s (if I recall).

When using generators in parallel programs, it is as important to
worry about their quasi-independence as their serial properties;
the common practice of using different seeds to the same generator
is NOT a good idea.  Multiplicative congruential generators with
coprime moduli are quasi-independent over the whole period, as are
multiply-with-carry ones with the same modulus but different, 'safe
prime' multipliers - unfortunately, complementary multiply-with-carry
ones are not.

Most of this is published, but perhaps not the last sentence.
More work on the quasi-independence of generators is needed!  In
particular, my belief is that the best generators for parallel
work are multiply-with-carry ones with the same modulus and
different, 'safe prime' multipliers, but I have been out of this
area for many years and my study of them has been cursory.  A
quick Web search didn't find anything useful, but there were an
awful lot of irrelevant hits, so I may have missed something.

The problem here is that full-period properties do not necessarily
map to the properties of shorter sequences (that was the problem
with Xorshift generators), and analysing the latter is mathematically
evil.

Regards,
Nick Maclaren.
```
 0
nmm12 (1380)
2/28/2009 7:18:01 PM
```On Feb 28, 4:30=A0am, geo <gmarsag...@gmail.com> wrote:
> 64-bit KISS RNGs
>
> Consistent with the Keep It Simple Stupid (KISS) principle,
> I have previously suggested 32-bit KISS Random Number
> Generators (RNGs) that seem to have been frequently adopted.
>
> Having had requests for 64-bit KISSes, and now that
> 64-bit integers are becoming more available, I will
> describe here a 64-bit KISS RNG, with comments on
> implementation for various languages, speed, periods
> and performance after extensive tests of randomness.
>
> This 64-bit KISS RNG has three components, each nearly
> good enough to serve alone. =A0 =A0The components are:
> Multiply-With-Carry (MWC), period (2^121+2^63-1)
> Xorshift (XSH), period 2^64-1
> Congruential (CNG), period 2^64
>
> Compact C and Fortran listings are given below. =A0They
> can be cut, pasted, compiled and run to see if, after
> 100 million calls, results agree with that provided
> by theory, assuming the default seeds.
>
> Users may want to put the content in other forms, and,
> for general use, provide means to set the 250 seed bits
> required in the variables x,y,z (64 bits) and c (58 bits)
> that have been given default values in the test versions.
>
> The C version uses #define macros to enumerate the few
> instructions that MWC, XSH and CNG require. =A0 The KISS
> macro adds MWC+XSH+CNG mod 2^64, so that KISS can be
> inserted at any place in a C program where a random 64-bit
> integer is required.
> Fortran's requirement that integers be signed makes the
> necessary code more complicated, hence a function KISS().
>
> C version; test by invoking macro KISS 100 million times
> -----------------------------------------------------------------
> #include <stdio.h>
>
> static unsigned long long
> x=3D1234567890987654321ULL,c=3D123456123456123456ULL,
> y=3D362436362436362436ULL,z=3D1066149217761810ULL,t;
>
> #define MWC (t=3D(x<<58)+c, c=3D(x>>6), x+=3Dt, c+=3D(x<t), x)
> #define XSH ( y^=3D(y<<13), y^=3D(y>>17), y^=3D(y<<43) )
> #define CNG ( z=3D6906969069LL*z+1234567 )
> #define KISS (MWC+XSH+CNG)
>
> =A0int main(void)
> =A0{int i;
> =A0 for(i=3D0;i<100000000;i++) t=3DKISS;
> =A0 (t=3D=3D1666297717051644203ULL) ?
> =A0 printf("100 million uses of KISS OK"):
> =A0 printf("Fail");
> =A0}
>
> ---------------------------------------------------------------
> Fortran version; test by calling KISS() 100 million times
> ---------------------------------------------------------------
> =A0program testkiss
> =A0implicit integer*8(a-z)
> =A0do i=3D1,100000000; t=3DKISS(); end do
> =A0if(t.eq.1666297717051644203_8) then
> =A0print*,"100 million calls to KISS() OK"
> =A0else; print*,"Fail"
> =A0end if; end
>
> =A0function KISS()
> =A0implicit integer*8(a-z)
> =A0data x,y,z,c /1234567890987654321_8, 362436362436362436_8,&
> =A01066149217761810_8, 123456123456123456_8/
> =A0save x,y,z,c
> =A0 m(x,k)=3Dieor(x,ishft(x,k)) =A0!statement function
> =A0 s(x)=3Dishft(x,-63) =A0 =A0 =A0 =A0 =A0 =A0 !statement function
> =A0t=3Dishft(x,58)+c
> =A0if(s(x).eq.s(t)) then; c=3Dishft(x,-6)+s(x)
> =A0 =A0 =A0 else; c=3Dishft(x,-6)+1-s(x+t); endif
> =A0x=3Dt+x
> =A0y=3Dm(m(m(y,13_8),-17_8),43_8)
> =A0z=3D6906969069_8*z+1234567
> =A0KISS=3Dx+y+z
> =A0return; end
> ---------------------------------------------------------------
>
> Output from using the macro KISS or the function KISS() is
> MWC+XSH+CNG mod 2^64.
>
> CNG is easily implemented on machines with 64-bit integers,
> as arithmetic is automatically mod 2^64, whether integers
> are considered signed or unsigned. =A0 The CNG statement is
> =A0 =A0 z=3D6906969069*z+1234567.
> When I established the lattice structure of congruential
> generators in the 60's, a search produced 69069 as an easy-
> to-remember multiplier with nearly cubic lattices in 2,3,4,5-
> space, so I tried concatenating, using 6906969069 as
> my first test multiplier. Remarkably---a seemingly one in many
> hundreds chance---it turned out to also have excellent lattice
> structure in 2,3,4,5-space, so that's the one chosen.
> (I doubt if lattice structure of CNG has much influence on the
> composite 64-bit KISS produced via MWC+XSH+CNG mod 2^64.)
>
> XSH, the Xorshift component, described in
> =A0 =A0 =A0www.jstatsoft.org/v08/i14/paper
> uses three invocations of an integer "xor"ed with a shifted
> version of itself.
> The XSH component used for this KISS is, in C notation:
> =A0 =A0y^=3D(y<<13); y^=3D(y>>17); y^=3D(y<<43)
> with Fortran equivalents y=3Dieor(y,ishft(y,13)), etc., although
> this can be effected by a Fortran statement function:
> =A0 =A0 =A0f(y,k)=3Dieor(y,ishft(y,k))
> =A0 =A0 =A0y=3Df(f(f(y,13),-17),43)
> As with lattice structure, choice of the triple 13,-17,43 is
> probably of no particular importance; any one of the 275 full-
> period triples listed in the above article is likely to provide
> a =A0satisfactory component XSH for the composite MWC+XSH+CNG.
>
> The choice of multiplier 'a' for the multiply-with-carry (MWC)
> component of KISS is not so easily made. =A0In effect, a multiply-
> with-carry sequence has a current value x and current "carry" c,
> and from each given x,c a new x,c pair is constructed by forming
> =A0 t=3Da*x+c, then x=3Dt mod b=3D2^64 and c=3Dfloor(t/b).
> This is easily implemented for 32-bit computers that permit
> forming a*t+c in 64 bits, from which the new x is the bottom and
> the new c the top 32-bits.
>
> When a,x and c are 64-bits, not many computers seem to have an easy
> way to form t=3Da*x+c in 128 bits, then extract the top and bottom
> 64-bit segments. =A0For that reason, special choices for 'a' are
> needed among those that satisfy the general requirement that
> p=3Da*b-1 is a prime for which b=3D2^64 has order (p-1)/2.
>
> My choice---and the only one of this form---is a=3D2^58+1. Then the
> top 64 bits of an imagined 128-bit t=3Da*x+c may be obtained as
> (using C notation) =A0(x>>6)+ 1 or 0, depending
> on whether the 64-bit parts of (x<<58)+c+x cause an overflow.
> Since (x<<58)+c cannot itself cause overflow (c will always be <a),
> we get the carry as c=3D(x>>6) plus overflow from (x<<58)+x.
>
> This is easily done in C with unsigned integers, using =A0a different
> kind of 't': =A0 t=3D(x<<58)+c; c=3D(x>>6); x=3Dt+x; c=3Dc+(x<t);
> For Fortran and others that permit only signed integers, more work
> is needed.
> Equivalent mod 2^64 versions of t=3D(x<<58)+c and c=3D(x>>6) are easy,
> and if s(x) represents (x>>63) in C or ishft(x,-66) in Fortran,
> then for signed integers, the new carry c comes from the rule
> =A0if s(x) equals s(t) then c=3D(x>>6)+s(x) else c=3D(x>>6)+1-s(x+t)
>
> Speed:
> A C version of this KISS RNG takes 18 nanosecs for each
> 64-bit random number on my desktop (Vista) PC, thus
> producing KISSes at a rate exceeding 55 million per second.
> Fortran or other integers-must-be-signed compilers might get
> "only" around 40 million per second.
>
> Setting seeds:
> Use of KISS or KISS() as a general 64-bit RNG requires specifying
> 3*64+58=3D250 bits for seeds, 64 bits each for x,y,z and 58 for c,
> resulting in a composite sequence with period around 2^250.
> The actual period is
> =A0 (2^250+2^192+2^64-2^186-2^129)/6 ~=3D 2^(247.42) =A0or 10^(74.48).
> We "lose" 1+1.58=3D2.58 bits from maximum possible period, one bit
> because b=3D2^64, a square, cannot be a primitive root of p=3Dab-1,
> so the best possible order for b is (p-1)/2.
> The periods of MWC and =A0XSH have gcd 3=3D2^1.58, so another =A01.58
> bits =A0are "lost" from the best possible period we could expect
> from 250 seed bits.
>
> Some users may think 250 seed bits are an unreasonable requirement.
> A good seeding =A0procedure might be to assume the default seed
> values then let the user choose none, one, two,..., or all
> of x,y,z, and c to be reseeded.
>
> Tests:
> Latest tests in The Diehard Battery, available at
> =A0 =A0http://i.cs.hku.hk/~diehard/
> were applied extensively. Those tests that specifically required
> 32-bit integers were applied to the leftmost 32 bits
> (e,g, KISS>>32;), then to the middle 32-bits ((KISS<<16)>>32;)
> then to the rightmost 32 bits, ( (KISS<<32)>>32).
> There were no extremes in the more than 700 p-values returned
> by the tests, nor indeed for similar tests applied to just two of the
> KISS components: MWC+XSH, then MWC+CNG, then XSH+CNG.
>
> The simplicity, speed, period around 2^250 and performance on
> tests of randomness---as well as ability to produce exactly
> the same 64-bit patterns, whether considered signed or unsigned
> integers---make this 64-bit KISS well worth considering for
> as has been done for 32-bit KISSes.

george!

as chief architect for a leading gaming (gambling) manufacturer in
vegas
and one of the few engineers with a mathematics background
i am often working with our internal rng
to provide better tools for our game catalog

for years
which served us quite well for it's nice distribution
speed
and fair period

however
some of our newest gaming ideas
have required rng's with much greater periods
(to cover the possibility space of the game)
and our latest platform has moved over to use mersenne twisters

(unfortunately
it has been an industry standard
to just add more rng's (the same ones instantiated multiply!)
and i had to fight to ensure my company didn't go that path
to ensure unintended correlations were not introduced)

this improvement of yours would have certainly been considered
if available at the time
but i do fear the period may still have been too small
for some of our more extreme future needs
(there are already products on the market from several manufacturers
that allow a player to play 100 simultaneous poker or keno games)

i'm very happy i saw this post of yours

thank you for all your contributions!

-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=
=3D-=3D-=3D-=3D-
galathaea: prankster, fablist, magician, liar
```
 0
galathaea (74)
3/1/2009 1:53:15 AM
```In article
galathaea <galathaea@gmail.com> writes:

RANLUX is slow, but at the highest "luxury level" all 24 bits of the
mantissa are chaotic.  So, one could just stick these together to create
numbers containing more bits.

```
 0
helbig (5064)
3/1/2009 9:30:19 AM
```In article <godkjb\$kn6\$1@online.de>,
Phillip Helbig---remove CLOTHES to reply <helbig@astro.multiCLOTHESvax.de> wrote:
>In article
>galathaea <galathaea@gmail.com> writes:
>
>RANLUX is slow, but at the highest "luxury level" all 24 bits of the
>mantissa are chaotic.  So, one could just stick these together to create
>numbers containing more bits.

That wasn't the issue she (I assume) was addressing - it was one that
I did.  Yes, that technique works, for both RANLUX and 32-bit KISS.
I use my own double-precision generator, of course, which has some
theoretical advantages over both and is marginally simpler than (and
similar to) KISS.

Galathaea's concern was about the period, and she is very right to
be so concerned.  While a long period does not guarantee pseudo-
randomness, it is a prerequisite for it - in particular, the pseudo-
random properties in N dimensions are often limited by the Nth root
of the period.  And, despite common belief, that is NOT solely true
for multiplicative congruential generators.

Regards,
Nick Maclaren.
```
 0
nmm12 (1380)
3/1/2009 9:45:43 AM
```In article <godlg7\$a4j\$1@soup.linux.pwf.cam.ac.uk>, nmm1@cam.ac.uk
writes:

> Galathaea's concern was about the period, and she is very right to
> be so concerned.  While a long period does not guarantee pseudo-
> randomness, it is a prerequisite for it - in particular, the pseudo-
> random properties in N dimensions are often limited by the Nth root
> of the period.  And, despite common belief, that is NOT solely true
> for multiplicative congruential generators.

The period of RANLUX is huge---10**164 or something.  (In other words,
many orders of magnitude larger than the number of distinct bit
combinations.  Many lesser generators have a period much LESS than the
number of distinct bit combinations and some algorithms can have a
period at most as long as the number of distinct bit combinations.  In
these cases, of course, a given number x is always followed by a given
number y, which with RANLUX is not the case.)

```
 0
helbig (5064)
3/1/2009 9:49:52 AM
```In article <godlo0\$lg3\$1@online.de>,
Phillip Helbig---remove CLOTHES to reply <helbig@astro.multiCLOTHESvax.de> wrote:
>In article <godlg7\$a4j\$1@soup.linux.pwf.cam.ac.uk>, nmm1@cam.ac.uk
>writes:
>
>> Galathaea's concern was about the period, and she is very right to
>> be so concerned.  While a long period does not guarantee pseudo-
>> randomness, it is a prerequisite for it - in particular, the pseudo-
>> random properties in N dimensions are often limited by the Nth root
>> of the period.  And, despite common belief, that is NOT solely true
>> for multiplicative congruential generators.
>
>The period of RANLUX is huge---10**164 or something.  (In other words,
>many orders of magnitude larger than the number of distinct bit
>combinations.  Many lesser generators have a period much LESS than the
>number of distinct bit combinations and some algorithms can have a
>period at most as long as the number of distinct bit combinations.  In
>these cases, of course, a given number x is always followed by a given
>number y, which with RANLUX is not the case.)

!!!!!  ALL such generators have been known to be trash (and, yes, I
mean trash) since the late 1960s!  One of my papers proves (and I
mean mathematically rigorously) that you should never use more than
period^(2/3) numbers in a simulation - the rule for never using more
than period^(1/2) for cryptographic purposes has been known since
time immemorial.

64-bit KISS has a period of about 2^249, according to that posting.

I could explain the potential defects of RANLUX, but would have to
rake quite a lot of my memories from where they are archived (on
tape, perhaps?)  And please note that the word is 'potential' - to
know if they were actual needs forms of analysis that I believe are
still beyond the state of the art (and well beyond my mathematical
ability).  However, I believe that it is almost certainly reliable
(on mathematical grounds, incidentally).

Similar remarks can be made about KISS and my own generator but,
there, I am almost certain that the required analysis is WAY beyond
the state of the art.

[ The analysis I am referring to is the one that maps the full-period
uniformity, which is a quasi-random property, to the pseudo-randomness
of short sequences.  The word 'hairy' springs to mind! ]

Regards,
Nick Maclaren.
```
 0
nmm12 (1380)
3/1/2009 10:29:28 AM
```"geo" <gmarsaglia@gmail.com> wrote in message
> 64-bit KISS RNGs
>
> ---------------------------------------------------------------
> Fortran version; test by calling KISS() 100 million times
> ---------------------------------------------------------------
>  program testkiss
>  implicit integer*8(a-z)
>  do i=1,100000000; t=KISS(); end do
>  if(t.eq.1666297717051644203_8) then
>  print*,"100 million calls to KISS() OK"
>  else; print*,"Fail"
>  end if; end
>
>  function KISS()
>  implicit integer*8(a-z)
>  data x,y,z,c /1234567890987654321_8, 362436362436362436_8,&
>  1066149217761810_8, 123456123456123456_8/
>  save x,y,z,c
>   m(x,k)=ieor(x,ishft(x,k))  !statement function
>   s(x)=ishft(x,-63)             !statement function
>  t=ishft(x,58)+c
>  if(s(x).eq.s(t)) then; c=ishft(x,-6)+s(x)
>       else; c=ishft(x,-6)+1-s(x+t); endif
>  x=t+x
>  y=m(m(m(y,13_8),-17_8),43_8)
>  z=6906969069_8*z+1234567
>  KISS=x+y+z
>  return; end
> ---------------------------------------------------------------

George, this code is not portable Fortran.
If you want to specify a kind, you need to use
SELECTED_INT_KIND or equivalent action
in order for it to be portable.

```
 0
robin_v (2737)
3/1/2009 1:31:23 PM
```"geo" <gmarsaglia@gmail.com> wrote in message

> The simplicity, speed, period around 2^250 and performance on
> tests of randomness---as well as ability to produce exactly
> the same 64-bit patterns, whether considered signed or unsigned
> integers---make this 64-bit KISS well worth considering for
> as has been done for 32-bit KISSes.
>
> George Marsaglia

A PL/I version:

KISS: procedure() returns (fixed binary (64) unsigned) options (reorder);
declare (x initial (1234567890987654321),
y initial ( 362436362436362436),
z initial (   1066149217761810),
c initial ( 123456123456123456),
t                             ) fixed binary (64) unsigned static;

t=isll(x,58)+c;
if isrl(x,63) = isrl(t,63) then
c=isrl(x,6)+isrl(x, 63);
else
c=isrl(x,6)+1-isrl(x+t,63);
x=t+x;
t = ieor(y, isll(y, 13));
t = ieor(t, isrl(t, 17));
y = ieor(t, isll(t, 43));
z=6906969069*z+1234567;
return (x+y+z);
end KISS;

```
 0
robin_v (2737)
3/13/2009 12:20:50 PM
```On Mar 1, 3:29=A0am, n...@cam.ac.uk wrote:
> In article <godlo0\$lg...@online.de>,
> Phillip Helbig---remove CLOTHES to reply <hel...@astro.multiCLOTHESvax.de=
> wrote:
>
>
>
>
>
> >In article <godlg7\$a4...@soup.linux.pwf.cam.ac.uk>, n...@cam.ac.uk
> >writes:
>
> >> Galathaea's concern was about the period, and she is very right to
> >> be so concerned. =A0While a long period does not guarantee pseudo-
> >> randomness, it is a prerequisite for it - in particular, the pseudo-
> >> random properties in N dimensions are often limited by the Nth root
> >> of the period. =A0And, despite common belief, that is NOT solely true
> >> for multiplicative congruential generators.
>
> >The period of RANLUX is huge---10**164 or something. =A0(In other words,
> >many orders of magnitude larger than the number of distinct bit
> >combinations. =A0Many lesser generators have a period much LESS than the
> >number of distinct bit combinations and some algorithms can have a
> >period at most as long as the number of distinct bit combinations. =A0In
> >these cases, of course, a given number x is always followed by a given
> >number y, which with RANLUX is not the case.)
>
> !!!!! =A0ALL such generators have been known to be trash (and, yes, I
> mean trash) since the late 1960s! =A0One of my papers proves (and I
> mean mathematically rigorously) that you should never use more than
> period^(2/3) numbers in a simulation - the rule for never using more
> than period^(1/2) for cryptographic purposes has been known since
> time immemorial.
>
> 64-bit KISS has a period of about 2^249, according to that posting.
>
> I could explain the potential defects of RANLUX, but would have to
> rake quite a lot of my memories from where they are archived (on
> tape, perhaps?) =A0And please note that the word is 'potential' - to
> know if they were actual needs forms of analysis that I believe are
> still beyond the state of the art (and well beyond my mathematical
> ability). =A0However, I believe that it is almost certainly reliable
> (on mathematical grounds, incidentally).
>
> Similar remarks can be made about KISS and my own generator but,
> there, I am almost certain that the required analysis is WAY beyond
> the state of the art.
>
> [ The analysis I am referring to is the one that maps the full-period
> uniformity, which is a quasi-random property, to the pseudo-randomness
> of short sequences. =A0The word 'hairy' springs to mind! ]

I recommend giving your PRNG a pounding with TESTU01:
tests&aq=3D1&oq=3Dtestu01+

TestU01 is a fabulous battery of tests that has a very broad spectrum
and also implementations for just about every existing, popular prng.

Highly recommended.
```
 0
dcorbit (2697)
3/13/2009 8:13:10 PM
```In article <96d3ac88-3b21-49fe-8383-1c924b90c211@q11g2000yqh.googlegroups.com>,
user923005  <dcorbit@connx.com> wrote:
>
>I recommend giving your PRNG a pounding with TESTU01:
>tests&aq=3D1&oq=3Dtestu01+
>
>TestU01 is a fabulous battery of tests that has a very broad spectrum
>and also implementations for just about every existing, popular prng.
>
>Highly recommended.

That URL is completely mangled.  You don't use (ugh) 'Rich Text' by
any chance?

In my view Pierre L'Ecuyer is the leading expert on this area active
today[*].  I would do so were I still active in this area myself; it
would unquestionably pass most of them, as I have tried.  I have some
other tests, one of which is harsh enough that it is one of very few
that will fail most of Marsaglia's; my generator passed, but that is
to be expected :-)

[*} George Marsaglia was the previous holder of that position, as
most people will agree, preceded by Donald Knuth.

Regards,
Nick Maclaren.
```
 0
nmm12 (1380)
3/13/2009 8:46:52 PM
```<nmm1@cam.ac.uk> wrote in message news:gpegns\$ra5\$1@soup.linux.pwf.cam.ac.uk...

> In my view Pierre L'Ecuyer is the leading expert on this area active
> today[*].  I would do so were I still active in this area myself; it
> would unquestionably pass most of them, as I have tried.  I have some
> other tests, one of which is harsh enough that it is one of very few
> that will fail most of Marsaglia's;

You are only guessing.
Until you have actually run all such tests, your post is just hype.

```
 0
robin_v (2737)
3/14/2009 12:03:58 AM
```In article <OvCul.28450\$cu.8821@news-server.bigpond.net.au>,
robin <robin_v@bigpond.com> wrote:
>
>> In my view Pierre L'Ecuyer is the leading expert on this area active
>> today[*].  I would do so were I still active in this area myself; it
>> would unquestionably pass most of them, as I have tried.  I have some
>> other tests, one of which is harsh enough that it is one of very few
>> that will fail most of Marsaglia's;
>
>You are only guessing.
>Until you have actually run all such tests, your post is just hype.

Back under your bridge with you!

Regards,
Nick Maclaren.
```
 0
nmm12 (1380)
3/14/2009 9:42:53 AM
```nmm1@cam.ac.uk wrote:
> In article <OvCul.28450\$cu.8821@news-server.bigpond.net.au>,
> robin <robin_v@bigpond.com> wrote:
> >
> >> In my view Pierre L'Ecuyer is the leading expert on this area active
> >> today[*].  I would do so were I still active in this area myself; it
> >> would unquestionably pass most of them, as I have tried.  I have some
> >> other tests, one of which is harsh enough that it is one of very few
> >> that will fail most of Marsaglia's;
> >
> >You are only guessing.
> >Until you have actually run all such tests, your post is just hype.
>
> Back under your bridge with you!

actual ones with you. Good programmers don't even need to profile/test
their code; they just know!
```
 0
blargg.ei3 (369)
3/14/2009 8:11:01 PM
```On Mar 14, 1:11=A0pm, blargg....@gishpuppy.com (blargg) wrote:
> n...@cam.ac.uk wrote:
> > In article <OvCul.28450\$cu.8...@news-server.bigpond.net.au>,
> > robin <robi...@bigpond.com> wrote:
>
> > >> In my view Pierre L'Ecuyer is the leading expert on this area active
> > >> today[*]. =A0I would do so were I still active in this area myself; =
it
> > >> would unquestionably pass most of them, as I have tried. =A0I have s=
ome
> > >> other tests, one of which is harsh enough that it is one of very few
> > >> that will fail most of Marsaglia's;
>
> > >You are only guessing.
> > >Until you have actually run all such tests, your post is just hype.
>
> > Back under your bridge with you!
>
> actual ones with you. Good programmers don't even need to profile/test
> their code; they just know!

On the other hand, Nick Maclaren is a pretty sharp guy.  I guess that
he has done all the testing he needs to do to determine what he needs

He had some spot on advice for me in 1999.  I recall he said that a
particular technique might "run like the clappers", and by golly it
did (when I finally got around to thoroughly benching all of the ideas
at my disposal, his idea was best).  Here is the thread I am referring
to:
667dd/df18e776736c332f?q=3Dclappers+author:nick

At any rate, I guess you have picked the wrong target, though the

Humorously, I think that there is also some truth in this:
"Good programmers don't even need to profile/test their code; they
just know!"

If I choose an idea that is O(n) and all the other ideas at my
disposal are O(n*log(n)), and my implementation is fast enough when
the data is small, then I don't need to bother testing the performance
because I can't improve it.
When the problem gets large enough, the O(n) solution will dominate,
and it is fast enough for small data sets.  There is nothing that
testing can tell me that I do not already know.

```
 0
dcorbit (2697)
3/14/2009 11:23:04 PM
```In article <91f7aa39-dfde-4bd2-bc68-702c9cd2a66d@o36g2000yqh.googlegroups.com>,
user923005  <dcorbit@connx.com> wrote:
>
>Humorously, I think that there is also some truth in this:
>"Good programmers don't even need to profile/test their code; they
>just know!"
>
>If I choose an idea that is O(n) and all the other ideas at my
>disposal are O(n*log(n)), and my implementation is fast enough when
>the data is small, then I don't need to bother testing the performance
>because I can't improve it.
>When the problem gets large enough, the O(n) solution will dominate,
>and it is fast enough for small data sets.  There is nothing that
>testing can tell me that I do not already know.

Absolutely.  Analyse first.  Design second.  Code and debug third.
Every good engineer knows that rule!

You need to run a couple of simple tests to check that your code
matches your analysis, and there isn't a major flaw in the analysis
itself, but you don't need to do more than that.  The main testing
comes where you can't analyse the problem, or where you suspect the
analysis may be unreliable.

Regards,
Nick Maclaren.
```
 0
nmm12 (1380)
3/15/2009 10:31:34 AM
```In article <gpile6\$26f\$1@soup.linux.pwf.cam.ac.uk>, nmm1@cam.ac.uk
wrote:

> In article
> user923005  <dcorbit@connx.com> wrote:
> >
> >Humorously, I think that there is also some truth in this:
> >"Good programmers don't even need to profile/test their code; they
> >just know!"
> >
> >If I choose an idea that is O(n) and all the other ideas at my
> >disposal are O(n*log(n)), and my implementation is fast enough when
> >the data is small, then I don't need to bother testing the performance
> >because I can't improve it.
> >When the problem gets large enough, the O(n) solution will dominate,
> >and it is fast enough for small data sets.  There is nothing that
> >testing can tell me that I do not already know.
>
> Absolutely.  Analyse first.  Design second.  Code and debug third.
> Every good engineer knows that rule!
>
> You need to run a couple of simple tests to check that your code
> matches your analysis, and there isn't a major flaw in the analysis
> itself, but you don't need to do more than that.  The main testing
> comes where you can't analyse the problem, or where you suspect the
> analysis may be unreliable.

Sometimes you divide your problem into different domains, large,
medium, and small, or high and low, or combinations of different
ranges for different variables, and you use different algorithms to
solve the different domains.  Part of the optimal solution then
consists of switching to the best algorithm at the right time.
Sorting is one of those problems, where with small data sets you
might use one algorithm, and you switch to different algorithms as
the data sets grow.  Also, some sorting algorithms involve
partitioning the original large data set into smaller subsets,
sometimes doing this recursively, and you might change algorithms as
you move up and down the different levels of recursion.

The exact switchover points might be machine dependent, depending
say on the relative efficiency of various floating point and integer
instructions, or the relative speeds of different parts of the
memory hierarchy, or the communication speed with some external
device.  Much of this must be done in a brute force way where you
just search for the switchover points, and then store or interpolate
those switchover points for later use.  This is the way ATLAS
optimizes linear algebra operations.

\$.02 -Ron Shepard
```
 0
ron-shepard (1197)
3/15/2009 5:09:21 PM
```geo wrote:
> Having had requests for 64-bit KISSes, and now that
> 64-bit integers are becoming more available, I will
> describe here a 64-bit KISS RNG, with comments on
> implementation for various languages, speed, periods
> and performance after extensive tests of randomness.
>.
> C version; test by invoking macro KISS 100 million times

Recoded for MS VC (MSC) compilers, which don't understand
the 'long long' datatype, and with minor corrections....

#include <stdio.h>

#if _WIN32
typedef unsigned __int64       ullong_t;
#else
typedef unsigned long long     ullong_t;
#endif

static ullong_t         x = 1234567890987654321/*ULL*/;
static ullong_t	        c = 123456123456123456/*ULL*/;
static ullong_t         y = 362436362436362436/*ULL*/;
static ullong_t         z = 1066149217761810/*ULL*/;
static ullong_t         t;

#define MWC     (t = (x<<58)+c, c = (x>>6), x+=t, c+=(x<t), x)
#define XSH     (y ^= (y<<13), y ^= (y>>17), y ^= (y<<43))
#define CNG     (z = 6906969069/*LL*/ * z + 1234567)
#define KISS    (MWC + XSH + CNG)

int main(void)
{
int     i;

for (i = 0;  i < 100000000;  i++)
t = KISS;

if (t == 1666297717051644203/*ULL*/)
printf("100 million uses of KISS OK");
else
printf("Fail");
return 0;
}

-drt
```
 0
david1417 (99)
3/15/2009 8:44:45 PM
```David R Tribble wrote:
> geo wrote:
>> Having had requests for 64-bit KISSes, and now that
>> 64-bit integers are becoming more available, I will
>> describe here a 64-bit KISS RNG, with comments on
>> implementation for various languages, speed, periods
>> and performance after extensive tests of randomness.
>> .
>> C version; test by invoking macro KISS 100 million times
>
> Recoded for MS VC (MSC) compilers, which don't understand
> the 'long long' datatype, and with minor corrections....
>
> #include <stdio.h>
>
> #if _WIN32
>  typedef unsigned __int64       ullong_t;
> #else
>  typedef unsigned long long     ullong_t;
> #endif
>
> static ullong_t         x = 1234567890987654321/*ULL*/;
> static ullong_t	        c = 123456123456123456/*ULL*/;
> static ullong_t         y = 362436362436362436/*ULL*/;
> static ullong_t         z = 1066149217761810/*ULL*/;
> static ullong_t         t;
>
> #define MWC     (t = (x<<58)+c, c = (x>>6), x+=t, c+=(x<t), x)
> #define XSH     (y ^= (y<<13), y ^= (y>>17), y ^= (y<<43))
> #define CNG     (z = 6906969069/*LL*/ * z + 1234567)
> #define KISS    (MWC + XSH + CNG)
>
> int main(void)
> {
>     int     i;
>
>     for (i = 0;  i < 100000000;  i++)
>         t = KISS;
>
>     if (t == 1666297717051644203/*ULL*/)
>         printf("100 million uses of KISS OK");
>     else
>         printf("Fail");
>     return 0;
> }
>
> -drt

Does that make sense on a 32 bit Windows?
```
 0
3/15/2009 9:01:31 PM
```nmm1@cam.ac.uk wrote:

> Absolutely.  Analyse first.  Design second.  Code and debug third.
> Every good engineer knows that rule!
>
> You need to run a couple of simple tests to check that your code
> matches your analysis, and there isn't a major flaw in the analysis
> itself, but you don't need to do more than that.  The main testing
> comes where you can't analyse the problem, or where you suspect the
> analysis may be unreliable.

Which, in practice, is all the time. I mean, have _you_ ever spoken to a
customer who knew what he wanted, completely, correctly, and up front?
I know I haven't.

Richard
```
 0
raltbos (821)
3/16/2009 2:57:58 PM
```In article <49be607d.12561515@news.xs4all.nl>,
Richard Bos <rlbos@xs4all.nl> wrote:
>
>> Absolutely.  Analyse first.  Design second.  Code and debug third.
>> Every good engineer knows that rule!
>>
>> You need to run a couple of simple tests to check that your code
>> matches your analysis, and there isn't a major flaw in the analysis
>> itself, but you don't need to do more than that.  The main testing
>> comes where you can't analyse the problem, or where you suspect the
>> analysis may be unreliable.
>
>Which, in practice, is all the time. I mean, have _you_ ever spoken to a
>customer who knew what he wanted, completely, correctly, and up front?
>I know I haven't.

Oh, yes, but you had better deliver the solution within a couple of
days - if not, the requirement will have started drifting ....

Regards,
Nick Maclaren.
```
 0
nmm12 (1380)
3/16/2009 4:42:57 PM
```Richard Bos <raltbos@xs4all.nl> wrote:
>
> Which, in practice, is all the time. I mean, have _you_ ever spoken to a
> customer who knew what he wanted, completely, correctly, and up front?

Yes.  Of course, what he wanted wasn't what he actually needed....
--
Larry Jones

Just when I thought this junk was beginning to make sense. -- Calvin
```
 0
3/16/2009 6:14:26 PM
```On Mar 16, 9:57=A0am, ralt...@xs4all.nl (Richard Bos) wrote:
> n...@cam.ac.uk wrote:
> > Absolutely. =A0Analyse first. =A0Design second. =A0Code and debug third=
..
> > Every good engineer knows that rule!
>
> > You need to run a couple of simple tests to check that your code
> > matches your analysis, and there isn't a major flaw in the analysis
> > itself, but you don't need to do more than that. =A0The main testing
> > comes where you can't analyse the problem, or where you suspect the
> > analysis may be unreliable.
>
> Which, in practice, is all the time. I mean, have _you_ ever spoken to a
> customer who knew what he wanted, completely, correctly, and up front?
> I know I haven't.
>
> Richard

I did. It was wonderful. The University Registrar's Office. My boss
had grown up with punchcards and chewed JCL and MARKIV (now called, I
think, Vision:Builder) in her sleep. My job was to run database
queries to make
address labels. NAME STREET APT CITY ST ZIP.
Lovely. First I made a WinBatch front-end to automate
the TN3270 emulator (hit TAB once instead of 7 times, etc.), then I
a more sensible query language to MARKIV, then I made
a compiler to MARKIV. As long as the labels kept flowing,
I basically optimized my job away (but I got to stay,
just with massive amounts of paid time).

lxt
```
 0
mijoryx (966)
3/17/2009 3:24:18 AM
```"blargg" <blargg.ei3@gishpuppy.com> wrote in message news:blargg.ei3-1403091511020001@192.168.1.4...
> nmm1@cam.ac.uk (Nick Maclaren) wrote:
> > In article <OvCul.28450\$cu.8821@news-server.bigpond.net.au>,
> > robin <robin_v@bigpond.com> wrote:
> > >
> > >> In my view Pierre L'Ecuyer is the leading expert on this area active
> > >> today[*].  I would do so were I still active in this area myself; it
> > >> would unquestionably pass most of them, as I have tried.  I have some
> > >> other tests, one of which is harsh enough that it is one of very few
> > >> that will fail most of Marsaglia's;
> > >
> > >You are only guessing.
> > >Until you have actually run all such tests, your post is just hype.
> >
> > Back under your bridge with you!
>
> actual ones with you. Good programmers don't even need to profile/test
> their code; they just know!

There's no substitute for rigorous testing.

Nor is there any excuse for denigrating  an expert's work
-- as Maclaren has done --
without having done a single test.

```
 0
robin_v (2737)
3/17/2009 12:10:40 PM
```On Tue, 17 Mar 2009 12:10:40 GMT, "robin" <robin_v@bigpond.com>
wrote:

>"blargg" <blargg.ei3@gishpuppy.com> wrote in message news:blargg.ei3-1403091511020001@192.168.1.4...
>> nmm1@cam.ac.uk (Nick Maclaren) wrote:
>> > In article <OvCul.28450\$cu.8821@news-server.bigpond.net.au>,
>> > robin <robin_v@bigpond.com> wrote:
>> > >
>> > >> In my view Pierre L'Ecuyer is the leading expert on this area active
>> > >> today[*].  I would do so were I still active in this area myself; it
>> > >> would unquestionably pass most of them, as I have tried.  I have some
>> > >> other tests, one of which is harsh enough that it is one of very few
>> > >> that will fail most of Marsaglia's;
>> > >
>> > >You are only guessing.
>> > >Until you have actually run all such tests, your post is just hype.
>> >
>> > Back under your bridge with you!
>>
>> actual ones with you. Good programmers don't even need to profile/test
>> their code; they just know!
>
>There's no substitute for rigorous testing.
>
>Nor is there any excuse for denigrating  an expert's work
>-- as Maclaren has done --
>without having done a single test.

Oh dear, not much into irony and sarcasm are we?

Richard Harter, cri@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
If I do not see as far as others, it is because
I stand in the footprints of giants.
```
 0
cri (1432)
3/17/2009 2:58:07 PM
```On Feb 28, 8:30=A0am, geo <gmarsag...@gmail.com> wrote:> 64-bit KISS
RNGs> > Consistent with the Keep It Simple Stupid (KISS) principle,> I
have previously suggested 32-bit KISS Random Number> Generators (RNGs)
that seem to have been frequently adopted.> > Having had requests for
64-bit KISSes, and now that> 64-bit integers are becoming more
available, I will> describe here a 64-bit KISS RNG, with comments on>
implementation for various languages, speed, periods> and performance
after extensive tests of randomness.> > This 64-bit KISS RNG has three
components, each nearly> good enough to serve alone. =A0 =A0The components
are:> Multiply-With-Carry (MWC), period (2^121+2^63-1)> Xorshift
(XSH), period 2^64-1> Congruential (CNG), period 2^64> > Compact C and
Fortran listings are given below. =A0They> can be cut, pasted, compiled
and run to see if, after> 100 million calls, results agree with that
provided> by theory, assuming the default seeds.> > Users may want to
put the content in other forms, and,> for general use, provide means
to set the 250 seed bits> required in the variables x,y,z (64 bits)
and c (58 bits)> that have been given default values in the test
versions.> > The C version uses #define macros to enumerate the few>
instructions that MWC, XSH and CNG require. =A0 The KISS> macro adds MWC
+XSH+CNG mod 2^64, so that KISS can be> inserted at any place in a C
program where a random 64-bit> integer is required.> Fortran's
requirement that integers be signed makes the> necessary code more
complicated, hence a function KISS().> > C version; test by invoking
macro KISS 100 million times>
----------------------------------------------------------------->
#include <stdio.h>> > static unsigned long long>
x=3D1234567890987654321ULL,c=3D123456123456123456ULL,>
y=3D362436362436362436ULL,z=3D1066149217761810ULL,t;> > #define MWC (t=3D
(x<<58)+c, c=3D(x>>6), x+=3Dt, c+=3D(x<t), x)> #define XSH ( y^=3D(y<<13), =
y^=3D
(y>>17), y^=3D(y<<43) )> #define CNG ( z=3D6906969069LL*z+1234567 )>
#define KISS (MWC+XSH+CNG)> > =A0int main(void)> =A0{int i;> =A0 for
(i=3D0;i<100000000;i++) t=3DKISS;> =A0 (t=3D=3D1666297717051644203ULL) ?> =
=A0
printf("100 million uses of KISS OK"):> =A0 printf("Fail");> =A0}> >
--------------------------------------------------------------->
Fortran version; test by calling KISS() 100 million times>
--------------------------------------------------------------->
=A0program testkiss> =A0implicit integer*8(a-z)> =A0do i=3D1,100000000; t=
=3DKISS
(); end do> =A0if(t.eq.1666297717051644203_8) then> =A0print*,"100 million
calls to KISS() OK"> =A0else; print*,"Fail"> =A0end if; end> > =A0function
KISS()> =A0implicit integer*8(a-z)> =A0data x,y,z,c /
1234567890987654321_8, 362436362436362436_8,&> =A01066149217761810_8,
123456123456123456_8/> =A0save x,y,z,c> =A0 m(x,k)=3Dieor(x,ishft(x,k)) =A0=
!
statement function> =A0 s(x)=3Dishft(x,-63) =A0 =A0 =A0 =A0 =A0 =A0 !statem=
ent
function> =A0t=3Dishft(x,58)+c> =A0if(s(x).eq.s(t)) then; c=3Dishft(x,-6)+s=
(x)
> =A0 =A0 =A0 else; c=3Dishft(x,-6)+1-s(x+t); endif> =A0x=3Dt+x> =A0y=3Dm(m=
(m(y,
13_8),-17_8),43_8)> =A0z=3D6906969069_8*z+1234567> =A0KISS=3Dx+y+z> =A0retu=
rn;
end> --------------------------------------------------------------->
> Output from using the macro KISS or the function KISS() is> MWC+XSH
+CNG mod 2^64.> > CNG is easily implemented on machines with 64-bit
integers,> as arithmetic is automatically mod 2^64, whether integers>
are considered signed or unsigned. =A0 The CNG statement is> =A0 =A0
z=3D6906969069*z+1234567.> When I established the lattice structure of
congruential> generators in the 60's, a search produced 69069 as an
easy-> to-remember multiplier with nearly cubic lattices in 2,3,4,5->
space, so I tried concatenating, using 6906969069 as> my first test
multiplier. Remarkably---a seemingly one in many> hundreds chance---it
turned out to also have excellent lattice> structure in 2,3,4,5-space,
so that's the one chosen.> (I doubt if lattice structure of CNG has
much influence on the> composite 64-bit KISS produced via MWC+XSH+CNG
mod 2^64.)> > XSH, the Xorshift component, described in> =A0 =A0
=A0www.jstatsoft.org/v08/i14/paper> uses three invocations of an integer
"xor"ed with a shifted> version of itself.> The XSH component used for
this KISS is, in C notation:> =A0 =A0y^=3D(y<<13); y^=3D(y>>17); y^=3D(y<<4=
3)>
with Fortran equivalents y=3Dieor(y,ishft(y,13)), etc., although> this
can be effected by a Fortran statement function:> =A0 =A0 =A0f(y,k)=3Dieor
(y,ishft(y,k))> =A0 =A0 =A0y=3Df(f(f(y,13),-17),43)> As with lattice
structure, choice of the triple 13,-17,43 is> probably of no
particular importance; any one of the 275 full-> period triples listed
in the above article is likely to provide> a =A0satisfactory component
XSH for the composite MWC+XSH+CNG.> > The choice of multiplier 'a' for
the multiply-with-carry (MWC)> component of KISS is not so easily
made. =A0In effect, a multiply-> with-carry sequence has a current value
x and current "carry" c,> and from each given x,c a new x,c pair is
constructed by forming> =A0 t=3Da*x+c, then x=3Dt mod b=3D2^64 and c=3Dfloo=
r(t/
b).> This is easily implemented for 32-bit computers that permit>
forming a*t+c in 64 bits, from which the new x is the bottom and> the
new c the top 32-bits.> > When a,x and c are 64-bits, not many
computers seem to have an easy> way to form t=3Da*x+c in 128 bits, then
extract the top and bottom> 64-bit segments. =A0For that reason, special
choices for 'a' are> needed among those that satisfy the general
requirement that> p=3Da*b-1 is a prime for which b=3D2^64 has order (p-1)/
2.> > My choice---and the only one of this form---is a=3D2^58+1. Then
the> top 64 bits of an imagined 128-bit t=3Da*x+c may be obtained as>
(using C notation) =A0(x>>6)+ 1 or 0, depending> on whether the 64-bit
parts of (x<<58)+c+x cause an overflow.> Since (x<<58)+c cannot itself
cause overflow (c will always be <a),> we get the carry as c=3D(x>>6)
plus overflow from (x<<58)+x.> > This is easily done in C with
unsigned integers, using =A0a different> kind of 't': =A0 t=3D(x<<58)+c; c=
=3D
(x>>6); x=3Dt+x; c=3Dc+(x<t);> For Fortran and others that permit only
signed integers, more work> is needed.> Equivalent mod 2^64 versions
of t=3D(x<<58)+c and c=3D(x>>6) are easy,> and if s(x) represents (x>>63)
in C or ishft(x,-66) in Fortran,> then for signed integers, the new
carry c comes from the rule> =A0if s(x) equals s(t) then c=3D(x>>6)+s(x)
else c=3D(x>>6)+1-s(x+t)> > Speed:> A C version of this KISS RNG takes
18 nanosecs for each> 64-bit random number on my desktop (Vista) PC,
thus> producing KISSes at a rate exceeding 55 million per second.>
Fortran or other integers-must-be-signed compilers might get> "only"
around 40 million per second.> > Setting seeds:> Use of KISS or KISS()
as a general 64-bit RNG requires specifying> 3*64+58=3D250 bits for
seeds, 64 bits each for x,y,z and 58 for c,> resulting in a composite
sequence with period around 2^250.> The actual period is> =A0
(2^250+2^192+2^64-2^186-2^129)/6 ~=3D 2^(247.42) =A0or 10^(74.48).> We
"lose" 1+1.58=3D2.58 bits from maximum possible period, one bit> because
b=3D2^64, a square, cannot be a primitive root of p=3Dab-1,> so the best
possible order for b is (p-1)/2.> The periods of MWC and =A0XSH have gcd
3=3D2^1.58, so another =A01.58> bits =A0are "lost" from the best possible
period we could expect> from 250 seed bits.> > Some users may think
250 seed bits are an unreasonable requirement.> A good seeding
=A0procedure might be to assume the default seed> values then let the
user choose none, one, two,..., or all> of x,y,z, and c to be
reseeded.> > Tests:> Latest tests in The Diehard Battery, available
at> =A0 =A0http://i.cs.hku.hk/~diehard/> were applied extensively. Those
tests that specifically required> 32-bit integers were applied to the
leftmost 32 bits> (e,g, KISS>>32;), then to the middle 32-bits
((KISS<<16)>>32;)> then to the rightmost 32 bits, ( (KISS<<32)>>32).>
There were no extremes in the more than 700 p-values returned> by the
tests, nor indeed for similar tests applied to just two of the> KISS
components: MWC+XSH, then MWC+CNG, then XSH+CNG.> > The simplicity,
speed, period around 2^250 and performance on> tests of randomness---
as well as ability to produce exactly> the same 64-bit patterns,
whether considered signed or unsigned> integers---make this 64-bit