Monkey and a poet's work ?

  • Follow


Hi,
I came across the 'Infinite Monkey Theorem'.
http://en.wikipedia.org/wiki/Infinite_monkey_theorem

I wonder how can a monkey hitting keys at random on
a typewriter keyboard for an infinite amount of time will
almost surely type a given text, such as the complete
works of William Shakespeare ? And why was
monkey chosen to convey this theorem ?

How far is this theorem true ? Has any monkey
proved this now :-) ??

Thx in advans,
Karthik Balaguru
0
Reply karthikbalaguru79 (791) 3/7/2010 5:34:34 PM

Karthik Balaguru wrote:

> Hi,
> I came across the 'Infinite Monkey Theorem'.
> http://en.wikipedia.org/wiki/Infinite_monkey_theorem
> 
> I wonder how can a monkey hitting keys at random on
> a typewriter keyboard for an infinite amount of time will
> almost surely type a given text, such as the complete
> works of William Shakespeare ? 

What part of

"In this context, "almost surely" is a mathematical term with a precise 
meaning, and the "monkey" is not an actual monkey, but a metaphor for an 
abstract device that produces a random sequence of letters ad infinitum."

and of this page

http://en.wikipedia.org/wiki/Almost_surely

is not clear?

If that's not enough, there's a mathematical proof down the page:

http://en.wikipedia.org/wiki/Infinite_monkey_theorem#Solution

> And why was monkey chosen to convey this theorem ?

http://en.wikipedia.org/wiki/Infinite_monkey_theorem#History

0
Reply pk (424) 3/7/2010 5:47:34 PM


On Mar 7, 10:34=A0pm, Karthik Balaguru <karthikbalagur...@gmail.com>
wrote:
> Hi,
> I came across the 'Infinite Monkey Theorem'.http://en.wikipedia.org/wiki/=
Infinite_monkey_theorem
>
> I wonder how can a monkey hitting keys at random on
> a typewriter keyboard for an infinite amount of time will
> almost surely type a given text, such as the complete
> works of William Shakespeare ? And why was
> monkey chosen to convey this theorem ?
>

On continuing to understand this theorem the below
link shed some light over this -
http://en.wikipedia.org/wiki/Infinite_monkey_theorem#Real_monkeys
The same link also has the 'Direct proofs' and 'probabilities'
sub-sections that set the things very clear.

It seems that the 'the probability' is zero.

Thx ,
Karthik Balaguru
0
Reply karthikbalaguru79 (791) 3/7/2010 5:49:25 PM

In comp.unix.programmer Karthik Balaguru <karthikbalaguru79@gmail.com> wrote:
> I came across the 'Infinite Monkey Theorem'.
> http://en.wikipedia.org/wiki/Infinite_monkey_theorem

> I wonder how can a monkey hitting keys at random on
> a typewriter keyboard for an infinite amount of time will
> almost surely type a given text, such as the complete
> works of William Shakespeare ?

What could be not in an infinite set? You will have not
only the works of Shakespeare, but also all his works
with all kinds of typos, readers digest versions etc.;-)

> And why was monkey chosen to convey this theorem ?

Because at the time someone came up with this idea there
weren't any keyboards that cats use for sleeping on. That
later led to the theorem that given enough cats, keyboards
and time all possible Perl scripts will be created.

> How far is this theorem true ? Has any monkey
> proved this now :-) ??

Well, instead of using a single monkey, giving it infinite
time, you can use a large number of monkeys for a shorter
time. Now, since the works of Shakespeare actually have been
written (assming that Shakespeare was a kind of monkey and
you don't instsit on the typewriter part), the theorem thus
has been experimentally proven (as a possibly uninteded side
effect of the mice having earth produced for finding "the"
question).

For another take on this have a look at the story "The Library
of Babel" by Jorge Luis Borges (who was, BTW, the model for
the blind librarian in Eco's "The Name of the Rose").

                               Regards, Jens
-- 
  \   Jens Thoms Toerring  ___      jt@toerring.de
   \__________________________      http://toerring.de
0
Reply jt68 (1134) 3/7/2010 6:02:12 PM

On Mar 7, 10:47=A0pm, pk <p...@pk.invalid> wrote:
> Karthik Balaguru wrote:
> > Hi,
> > I came across the 'Infinite Monkey Theorem'.
> >http://en.wikipedia.org/wiki/Infinite_monkey_theorem
>
> > I wonder how can a monkey hitting keys at random on
> > a typewriter keyboard for an infinite amount of time will
> > almost surely type a given text, such as the complete
> > works of William Shakespeare ?
>
> What part of
>
> "In this context, "almost surely" is a mathematical term with a precise
> meaning, and the "monkey" is not an actual monkey, but a metaphor for an
> abstract device that produces a random sequence of letters ad infinitum."
>
> and of this page
>
> http://en.wikipedia.org/wiki/Almost_surely
>
> is not clear?
>
> If that's not enough, there's a mathematical proof down the page:
>
> http://en.wikipedia.org/wiki/Infinite_monkey_theorem#Solution
>
> > And why was monkey chosen to convey this theorem ?
>
> http://en.wikipedia.org/wiki/Infinite_monkey_theorem#History

Yes, i saw that 'Direct Proof' , 'Probabilities', 'Real Monkey'
and 'History' section that shed more light over this, I found
those while continuing to understand more about this
theorem and hence i sent another mail in quick succession
to this. Anways, Thx a lot.

Thx,
Karthik Balaguru
0
Reply karthikbalaguru79 (791) 3/7/2010 6:02:59 PM

On Mar 7, 11:02=A0pm, j...@toerring.de (Jens Thoms Toerring) wrote:
> In comp.unix.programmer Karthik Balaguru <karthikbalagur...@gmail.com> wr=
ote:
>
> > I came across the 'Infinite Monkey Theorem'.
> >http://en.wikipedia.org/wiki/Infinite_monkey_theorem
> > I wonder how can a monkey hitting keys at random on
> > a typewriter keyboard for an infinite amount of time will
> > almost surely type a given text, such as the complete
> > works of William Shakespeare ?
>
> What could be not in an infinite set? You will have not
> only the works of Shakespeare, but also all his works
> with all kinds of typos, readers digest versions etc.;-)
>

Readers digest version too :-) :-) :-)
Yeah, an infinite set can have all kind of works !

> > And why was monkey chosen to convey this theorem ?
>
> Because at the time someone came up with this idea there
> weren't any keyboards that cats use for sleeping on.

:-) :-)
Maybe, cat might eat the mouse ;-)

> That
> later led to the theorem that given enough cats, keyboards
> and time all possible Perl scripts will be created.
>
> > How far is this theorem true ? Has any monkey
> > proved this now :-) ??
>
> Well, instead of using a single monkey, giving it infinite
> time, you can use a large number of monkeys for a shorter
> time.

:-) :-)

> Now, since the works of Shakespeare actually have been
> written (assming that Shakespeare was a kind of monkey and
> you don't instsit on the typewriter part), the theorem thus
> has been experimentally proven (as a possibly uninteded side
> effect of the mice having earth produced for finding "the"
> question).
>
> For another take on this have a look at the story "The Library
> of Babel" by Jorge Luis Borges (who was, BTW, the model for
> the blind librarian in Eco's "The Name of the Rose").
>

:-)

Karthik Balaguru
0
Reply karthikbalaguru79 (791) 3/7/2010 6:39:46 PM

>
>>
>> How far is this theorem true? Has any monkey proved this now?
>>
> Well, instead of using a single monkey, giving it infinite time, you 
> can use a large number of monkeys for a shorter time.
>
.... because, as we all know, infinity divided by a finite number is ... 
erm ... finite and ... erm ... smaller.

Robert Wilensky is proven right by this very thread.

0
Reply J.deBoynePollard-newsgroups (95) 3/7/2010 7:00:37 PM

jt@toerring.de (Jens Thoms Toerring) writes:

> In comp.unix.programmer Karthik Balaguru <karthikbalaguru79@gmail.com> wrote:
>> I came across the 'Infinite Monkey Theorem'.
>> http://en.wikipedia.org/wiki/Infinite_monkey_theorem
>
>> I wonder how can a monkey hitting keys at random on
>> a typewriter keyboard for an infinite amount of time will
>> almost surely type a given text, such as the complete
>> works of William Shakespeare ?
>
> What could be not in an infinite set? You will have not
> only the works of Shakespeare, but also all his works
> with all kinds of typos, readers digest versions etc.;-)

You probably should clarify this: "what could not be in an infinite
set constructed like this?"[1].  The mere fact that the set is
infinite is not enough to ensure that even one word of Shakespeare is
almost surely present.

[1] or "what could not be in such a set?".

<snip>
-- 
Ben.
0
Reply ben.usenet (6515) 3/7/2010 7:40:28 PM

Ben Bacarisse  wrote in message
<0.9da34275b07558e5d425.20100307194028GMT.87d3zf6hub.fsf@bsb.me.uk>:
>				   The mere fact that the set is
> infinite is not enough to ensure that even one word of Shakespeare is
> almost surely present.

For example, the infinite number of monkeys could not type a single chapter
of Victor Hugo in a dumb text editor with an US keyboard.
0
Reply george54 (194) 3/7/2010 8:12:08 PM

>> For another take on this have a look at the story "The Library
>> of Babel" by Jorge Luis Borges (who was, BTW, the model for
>> the blind librarian in Eco's "The Name of the Rose").

You might like the "Pierre Menard, Author of the Quixote" as well, for
a different take on it.


        Stefan
0
Reply monnier (195) 3/7/2010 8:14:30 PM

On 7 Mar, 18:02, j...@toerring.de (Jens Thoms Toerring) wrote:
> In comp.unix.programmer Karthik Balaguru <karthikbalagur...@gmail.com> wrote:

> > I came across the 'Infinite Monkey Theorem'.

> >http://en.wikipedia.org/wiki/Infinite_monkey_theorem
> > I wonder how can a monkey hitting keys at random on
> > a typewriter keyboard for an infinite amount of time will
> > almost surely type a given text, such as the complete
> > works of William Shakespeare ?
>
> What could be not in an infinite set?

lots of things. An infinite set doesn't have to contain all values
with equal probability. It is far from clear that pi expressed as a
decimal fraction and then mapped to ascii in some reasonable manner /
has/ to contain the complete works of shakespere. Nasty stuff
infinity.

<snip>

> > How far is this theorem true ? Has any monkey
> > proved this now :-) ??
>
> Well, instead of using a single monkey, giving it infinite
> time, you can use a large number of monkeys for a shorter
> time.

so if I used 10 monkeys how much time would I save?


> Now, since the works of Shakespeare actually have been
> written (assming that Shakespeare was a kind of monkey and
> you don't instsit on the typewriter part), the theorem thus
> has been experimentally proven (as a possibly uninteded side
> effect of the mice having earth produced for finding "the"
> question).

shakespere didn't generate his plays by random means.

> For another take on this have a look at the story "The Library
> of Babel" by Jorge Luis Borges (who was, BTW, the model for
> the blind librarian in Eco's "The Name of the Rose").


--
Nick Keighley

"Anyone attempting to generate random numbers by deterministic
means is, of course, living in a state of sin."
        -- John Von Neumann
0
Reply nick_keighley_nospam (4574) 3/8/2010 9:56:54 AM

Jens Thoms Toerring wrote:
> In comp.unix.programmer Karthik Balaguru <karthikbalaguru79@gmail.com> wrote:
>> I came across the 'Infinite Monkey Theorem'.
>> http://en.wikipedia.org/wiki/Infinite_monkey_theorem
> 
>> I wonder how can a monkey hitting keys at random on
>> a typewriter keyboard for an infinite amount of time will
>> almost surely type a given text, such as the complete
>> works of William Shakespeare ?
> 
> What could be not in an infinite set?

The infinite set of even integers contains only half the integers. None 
of the odd integers (of which there are infinitely many) is present in 
the set.

The infinite set of prime integers contains almost none of the integers 
- just an infinite handful that share an odd characteristic. Most of the 
integers are absent from that infinite set.

The infinite set of integer powers of two contains practically no 
integers at all - merely infinitely many. If the infinite set of 
integers were an infinitely large barrel, the infinite set of integer 
powers of two would be an infinitesimally small apple rolling around at 
the bottom. (Admittedly it would be an infinitely large infinitesimally 
small apple...)

The set of things that could not be in a given infinite set is 
infinitely large. In fact, there are many such sets. Now, consider the 
set of *all* such sets... Er, actually no, let's not go there...

> You will have not
> only the works of Shakespeare, but also all his works
> with all kinds of typos, readers digest versions etc.;-)

That isn't certain by any means. See above.

<snip>

> Well, instead of using a single monkey, giving it infinite
> time, you can use a large number of monkeys for a shorter
> time. Now, since the works of Shakespeare actually have been
> written (assming that Shakespeare was a kind of monkey and
> you don't instsit on the typewriter part), the theorem thus
> has been experimentally proven (as a possibly uninteded side
> effect of the mice having earth produced for finding "the"
> question).

I think we can agree that the works of Shakespeare have actually been 
written. But you are assuming:

(a) that Shakespeare wrote them, which is apparently a matter of some 
dispute;

(b) that Shakespeare was a kind of monkey, which is very unlikely to be 
true. Both are primates, but then cars and bicycles are both vehicles; 
that doesn't mean a bike is a kind of car or vice versa.

And in any case the theory is about duplicating the works of 
Shakespeare, not originating them.

Let's try to quantify it a little, shall we?

We start by defining our terms and our tools. We will begin with a 
simple monkey, eventually replacing him with a computer.

Let's assume that we have *one* monkey, then, hitting *one* typewriter 
*once* per second, purely at random. The typewriter has *two* keys, 0 
and 1. (When we install the MonkeyBrain XII later on, we can soup up the 
speed a bit.)

Here is a Shakespeare sonnetette: 10010111. We want to know whether, if 
we'd installed our monkey attadawnatime, he would have a reasonable 
probability of having produced this sonnetette by now. Of course, we'd 
like to be more general than that.

Attadawnatime, they say, was 14,000,000,000 years ago. That's about 
441504000000000000 seconds, which we'll double (in case the scientists 
got it wrong, which is not exactly unheard of), giving us a million 
million million seconds to work with. Bags of time.

Now, the first seven times our monkey hits the keys, he can't possibly 
produce the sonnetette. But the eighth time, he *may* have produced a 
sonnetette. That is, if the bit length of the target text T is L, then 
we have to produce at least L bits.

The probability of the first L bits producing T is 1/(2^L). More 
specifically, for L=8 it's 1/256 = 0.00390625 = p (probability of 
matching L bits of data against all L bits of text, in a given position 
in the bitstream).

Now if we produce *more* bits, it gets a bit(sigh) more interesting. 
Let's assume we produce 9 bits. We now have two stabs at the T cherry: a 
match of T against the first 8 bits of the 9, and a match against the 
last 8 bits of the 9. Any one match will do, so both matches have to 
fail for the monkey to fail. We calculate this failure probability F by 
raising (1-p) to the power of the number of match attempts:

F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625

More generally, if we produce B bits (where B>=L):

F = (1 - (1/(2^L))^(B+1-L)

and this will take us B/R seconds, where R is our bit production rate 
(bits per second). Clearly, the probability P of success is 1-F.

At this point, we have a program spec.

Inputs: L (the length of the test corpus, in bits)
         R (bit production rate)
         K (seconds sinceadawnatime - assume 10^18)

Calculation:
         Attempts = ((R * K) + 1) - L
         SingleFailure = 1.0 - (1.0 / pow(2.0, L))
         F = pow(SingleFailure, Attempts)
         P = 1.0 - F

The implementation is left as an exercise for the terminally bored reader.

-- 
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
0
Reply rjh (10789) 3/8/2010 10:59:09 AM

Richard Heathfield wrote:

> Let's assume that we have *one* monkey, then, hitting *one* typewriter
> *once* per second, purely at random. The typewriter has *two* keys, 0
> and 1. (When we install the MonkeyBrain XII later on, we can soup up the
> speed a bit.)
>
> Here is a Shakespeare sonnetette: 10010111. We want to know whether, if
> we'd installed our monkey attadawnatime, he would have a reasonable
> probability of having produced this sonnetette by now. Of course, we'd
> like to be more general than that.
>
> Attadawnatime, they say, was 14,000,000,000 years ago. That's about
> 441504000000000000 seconds, which we'll double (in case the scientists
> got it wrong, which is not exactly unheard of), giving us a million
> million million seconds to work with. Bags of time.
>
> Now, the first seven times our monkey hits the keys, he can't possibly
> produce the sonnetette. But the eighth time, he *may* have produced a
> sonnetette. That is, if the bit length of the target text T is L, then
> we have to produce at least L bits.
>
> The probability of the first L bits producing T is 1/(2^L). More
> specifically, for L=8 it's 1/256 = 0.00390625 = p (probability of
> matching L bits of data against all L bits of text, in a given position
> in the bitstream).
>
> Now if we produce *more* bits, it gets a bit(sigh) more interesting.
> Let's assume we produce 9 bits. We now have two stabs at the T cherry: a
> match of T against the first 8 bits of the 9, and a match against the
> last 8 bits of the 9. Any one match will do, so both matches have to
> fail for the monkey to fail. We calculate this failure probability F by
> raising (1-p) to the power of the number of match attempts:
>
> F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625

I don't think so. The two attempts are not independent.
0
Reply Noob 3/8/2010 12:35:23 PM

On Mon, 08 Mar 2010 01:56:54 -0800, Nick Keighley wrote:

> shakespere didn't generate his plays by random means.

It was random that there even was a Shakespeare.
0
Reply stonerfish (284) 3/8/2010 2:44:55 PM

Noob wrote:
> Richard Heathfield wrote:

<snip>

>> We calculate this failure probability F by
>> raising (1-p) to the power of the number of match attempts:
>>
>> F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625
> 
> I don't think so. The two attempts are not independent.

It's a fair cop. Perhaps you could explain how to perform the 
calculation correctly?

-- 
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
0
Reply rjh (10789) 3/8/2010 11:44:07 PM

>
>>
>> shakespere didn't generate his plays by random means.
>>
> It was random that there even was a Shakespeare.
>
.... but not that there was an Isaac Newton.  That is according to his 
epitaph, at any rate.

0
Reply J.deBoynePollard-newsgroups (95) 3/9/2010 2:36:27 AM

In article <yKWdnfd7BvIrFgjWnZ2dnUVZ8txi4p2d@bt.com>, 
rjh@see.sig.invalid says...
> Noob wrote:
> > Richard Heathfield wrote:
> 
> <snip>
> 
> >> We calculate this failure probability F by
> >> raising (1-p) to the power of the number of match attempts:
> >>
> >> F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625
> > 
> > I don't think so. The two attempts are not independent.
> 
> It's a fair cop. Perhaps you could explain how to perform the 
> calculation correctly?
> 
Unfortunately, the calculation depends on the specific text*. So for 
your binary version of Hamlet (or was it Othello) there is no simple 
answer.

* to prove this, imagine that the works of Shakespeare can be expressed 
as a two character binary string (maybe the Condendsed Books version) 
and the random monkey has typed three symbols. If the condensed string 
is '00' then there is a 5/8 chance that it will not appear in a random 
three character string, but if the string is '01' then there is a 1/2 
chance it will not be found in a random three character string.

Mike
0
Reply m.fee (39) 3/9/2010 9:02:40 PM

On Sun, 07 Mar 2010 15:14:30 -0500, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:

>>> For another take on this have a look at the story "The Library
>>> of Babel" by Jorge Luis Borges
http://jubal.westnet.com/hyperdiscordia/library_of_babel.html

>>>(who was, BTW, the model for
>>> the blind librarian in Eco's "The Name of the Rose").
>
>You might like the "Pierre Menard, Author of the Quixote" as well, for
>a different take on it.
http://www.coldbacon.com/writing/borges-quixote.html


And for a third suggestion by the same author try "The Book of Sand"
http://artificeeternity.com/bookofsand/

which compresses the entire library into a single book (though it does
have quite a few pages).

rossum

>
>
>        Stefan

0
Reply rossum48 (643) 3/9/2010 11:01:21 PM

On 8 Mar, 14:44, jellybean stonerfish <stonerf...@geocities.com>
wrote:
> On Mon, 08 Mar 2010 01:56:54 -0800, Nick Keighley wrote:

> > shakespere didn't generate his plays by random means.
>
> It was random that there even was a Shakespeare.

only if you use an odd definition of "random"
0
Reply nick_keighley_nospam (4574) 3/10/2010 10:29:05 AM

On 10-Mar-10 11:29 AM, Nick Keighley wrote:
> On 8 Mar, 14:44, jellybean stonerfish<stonerf...@geocities.com>
> wrote:
>> On Mon, 08 Mar 2010 01:56:54 -0800, Nick Keighley wrote:
>
>>> shakespere didn't generate his plays by random means.
>>
>> It was random that there even was a Shakespeare.
>
> only if you use an odd definition of "random"

Well, I disagree. Given the number of permutations of his parents' 
genes, and those of his grandparents, etc., it's a downright miracle 
there ever was a Shakespeare. And he landed in just the right time, too 
-- we will never know if there was a very good Neanderthal playwright 
because they didn't write.

But don't feel depressed; the odds of YOUR existence were just as slim, 
and yet the universe pulled that one off.

[Jw]
0
Reply jongware (25) 3/10/2010 11:01:58 AM

Karthik Balaguru wrote:
> 
> Hi,
> I came across the 'Infinite Monkey Theorem'.
> http://en.wikipedia.org/wiki/Infinite_monkey_theorem
> 
> I wonder how can a monkey hitting keys at random on
> a typewriter keyboard for an infinite amount of time will
> almost surely type a given text, such as the complete
> works of William Shakespeare ? And why was
> monkey chosen to convey this theorem ?
> 
> How far is this theorem true ? Has any monkey
> proved this now :-) ??

The monkey is supposed to represent a random text generator,
which is capable of generating any text.
The infinite amount of time, is a condition which can not be met.

(A) And (Not A), Implies (B).

-- 
pete
0
Reply pfiland (6613) 3/10/2010 11:55:19 AM

On Wed, 10 Mar 2010 02:29:05 -0800, Nick Keighley wrote:

> On 8 Mar, 14:44, jellybean stonerfish <stonerf...@geocities.com> wrote:
>> On Mon, 08 Mar 2010 01:56:54 -0800, Nick Keighley wrote:
> 
>> > shakespere didn't generate his plays by random means.
>>
>> It was random that there even was a Shakespeare.
> 
> only if you use an odd definition of "random"

I used a pseudo-random definition of random.

0
Reply stonerfish (284) 3/10/2010 2:34:42 PM

Jens Thoms Toerring wrote:
> 
> In comp.unix.programmer Karthik Balaguru <karthikbalaguru79@gmail.com> wrote:
> > I came across the 'Infinite Monkey Theorem'.
> > http://en.wikipedia.org/wiki/Infinite_monkey_theorem
> 
> > I wonder how can a monkey hitting keys at random on
> > a typewriter keyboard for an infinite amount of time will
> > almost surely type a given text, such as the complete
> > works of William Shakespeare ?
> 
> What could be not in an infinite set? You will have not
> only the works of Shakespeare, but also all his works
> with all kinds of typos, readers digest versions etc.;-)

It is not an infinite set, it is an infinite sequence.  If the
characters in the sequence are treated as digits and the sequence is
construed as a real number (with an invisible "decimal" point at its
left hand end say) and if that real number is normal in �mile Borel's
sense, then you'll get the works of Shakespeare.  But does random equal
normal?

-- 
I can't go on, I'll go on.
0
Reply frederick.williams2 (82) 3/10/2010 3:46:27 PM

On 3/8/2010 4:56 AM, Nick Keighley wrote:
> [...]
> shakespere didn't generate his plays by random means.

     True.  His use of randomization was confined to the
spelling of his name.

-- 
Eric Sosman
esosman@ieee-dot-org.invalid
0
Reply esosman2 (2945) 3/10/2010 7:58:58 PM

In article <4B97BED3.D3D182AA@tesco.net>, frederick.williams2@tesco.net=20
says...
> Jens Thoms Toerring wrote:
> >=20
> > In comp.unix.programmer Karthik Balaguru <karthikbalaguru79@gmail.com> =
wrote:
> > > I came across the 'Infinite Monkey Theorem'.
> > > http://en.wikipedia.org/wiki/Infinite_monkey_theorem
> >=20
> > > I wonder how can a monkey hitting keys at random on
> > > a typewriter keyboard for an infinite amount of time will
> > > almost surely type a given text, such as the complete
> > > works of William Shakespeare ?
> >=20
> > What could be not in an infinite set? You will have not
> > only the works of Shakespeare, but also all his works
> > with all kinds of typos, readers digest versions etc.;-)
>=20
> It is not an infinite set, it is an infinite sequence.  If the
> characters in the sequence are treated as digits and the sequence is
> construed as a real number (with an invisible "decimal" point at its
> left hand end say) and if that real number is normal in =C9mile Borel's
> sense, then you'll get the works of Shakespeare.  But does random equal
> normal?
>=20
It doesn't require an infinite sequence, just a sufficiently long one.

The complete works comprise 888,429 words or approximately 4.86 million=20
characters so, even without compression, a random binary string of=20
around 10^(11,700,000) 0/1 characters should be long enough to almost=20
guarantee to contain an ascii representation of the works. Bigger than a=20
google, but much smaller than a googleplex.

Mike
0
Reply m.fee (39) 3/10/2010 10:45:33 PM

>
>>
>> shakespere didn't generate his plays by random means.
>>
> True.  His use of randomization was confined to the spelling of his name.
>
Life imitates humour.  One of the recent proposed extensions to the DNS 
protocol involves random capitalization of domain names.
0
Reply J.deBoynePollard-newsgroups (95) 3/11/2010 12:18:02 AM

On 7 Mar, 17:34, Karthik Balaguru <karthikbalagur...@gmail.com> wrote:
> Hi,
> I came across the 'Infinite Monkey Theorem'.http://en.wikipedia.org/wiki/Infinite_monkey_theorem
>
> I wonder how can a monkey hitting keys at random on
> a typewriter keyboard for an infinite amount of time will
> almost surely type a given text, such as the complete
> works of William Shakespeare ?

How can a *random* process produce every
other work of literature and *avoid* that one?



0
Reply gremnebulin 3/11/2010 11:33:47 AM

On 7 Mar, 18:02, j...@toerring.de (Jens Thoms Toerring) wrote:
> In comp.unix.programmer Karthik Balaguru <karthikbalagur...@gmail.com> wrote:
>
> > I came across the 'Infinite Monkey Theorem'.
> >http://en.wikipedia.org/wiki/Infinite_monkey_theorem
> > I wonder how can a monkey hitting keys at random on
> > a typewriter keyboard for an infinite amount of time will
> > almost surely type a given text, such as the complete
> > works of William Shakespeare ?
>
> What could be not in an infinite set?

You mean infinite random set.


> Well, instead of using a single monkey, giving it infinite
> time, you can use a large number of monkeys for a shorter
> time. Now, since the works of Shakespeare actually have been
> written (assming that Shakespeare was a kind of monkey and
> you don't instsit on the typewriter part),

Neither infinite nor random, so not much of a proof of the monkey
theorem
0
Reply gremnebulin 3/11/2010 11:36:34 AM

On 8 Mar, 09:56, Nick Keighley <nick_keighley_nos...@hotmail.com>
wrote:

> > What could be not in an infinite set?
>
> lots of things. An infinite set doesn't have to contain all values
> with equal probability.

But we are talking about an inifinite *random* set

> It is far from clear that pi expressed as a
> decimal fraction and then mapped to ascii in some reasonable manner /
> has/ to contain the complete works of shakespere.

pi is far from random. See Chaitin.


0
Reply gremnebulin 3/11/2010 11:47:33 AM

gremnebulin <peterdjones@yahoo.com> writes:
> On 7 Mar, 17:34, Karthik Balaguru <karthikbalagur...@gmail.com> wrote:
>> Hi,
>> I came across the 'Infinite Monkey Theorem'.http://en.wikipedia.org/wiki/Infinite_monkey_theorem
>>
>> I wonder how can a monkey hitting keys at random on
>> a typewriter keyboard for an infinite amount of time will
>> almost surely type a given text, such as the complete
>> works of William Shakespeare ?
>
> How can a *random* process produce every
> other work of literature and *avoid* that one?

By being a random process. A random process can produce a sequence of
alternating ones and zeroes for any arbitrarily large observation
interval. But 'monkeys' don't act randomily, anyway.

0
Reply rweikusat (2679) 3/11/2010 2:07:08 PM

mike wrote:
> In article <yKWdnfd7BvIrFgjWnZ2dnUVZ8txi4p2d@bt.com>, 
> rjh@see.sig.invalid says...
>> Noob wrote:
>>> Richard Heathfield wrote:
>> <snip>
>>
>>>> We calculate this failure probability F by
>>>> raising (1-p) to the power of the number of match attempts:
>>>>
>>>> F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625
>>> I don't think so. The two attempts are not independent.
>> It's a fair cop. Perhaps you could explain how to perform the 
>> calculation correctly?
>>
> Unfortunately, the calculation depends on the specific text*. So for 
> your binary version of Hamlet (or was it Othello) there is no simple 
> answer.
> 
> * to prove this, imagine that the works of Shakespeare can be expressed 
> as a two character binary string (maybe the Condendsed Books version) 

Readers Digest, eat your heart out. :-)

> and the random monkey has typed three symbols. If the condensed string 
> is '00' then there is a 5/8 chance that it will not appear in a random 
> three character string, but if the string is '01' then there is a 1/2 
> chance it will not be found in a random three character string.

ITYM 6/8 rather than 5/8? But anyway, yes, fair enough. Let me, then, 
ask the question a different way:

My calculation, albeit flawed, might reasonably be said to give a 
ballpark figure. Is it possible to come up with a reasonably simple 
calculation that gives at least an order of magnitude reduction in the 
error level?

-- 
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
0
Reply rjh (10789) 3/14/2010 8:52:58 AM

In comp.programming jellybean stonerfish <stonerfish@geocities.com> wrote:
> On Wed, 10 Mar 2010 02:29:05 -0800, Nick Keighley wrote:
> 
>> On 8 Mar, 14:44, jellybean stonerfish <stonerf...@geocities.com> wrote:
>>> On Mon, 08 Mar 2010 01:56:54 -0800, Nick Keighley wrote:
>> 
>>> > shakespere didn't generate his plays by random means.
>>>
>>> It was random that there even was a Shakespeare.
>> 
>> only if you use an odd definition of "random"
> 
> I used a pseudo-random definition of random.
> 

I'll responde to this as I do in real life.

..... Jerk.

--Ashton
0
Reply ashton (15) 3/14/2010 9:36:24 PM

In article <EL2dnXUKpJB6PgHWnZ2dnUVZ7qydnZ2d@bt.com>, 
rjh@see.sig.invalid says...
> mike wrote:
> > In article <yKWdnfd7BvIrFgjWnZ2dnUVZ8txi4p2d@bt.com>, 
> > rjh@see.sig.invalid says...
> >> Noob wrote:
> >>> Richard Heathfield wrote:
> >> <snip>
> >>
> >>>> We calculate this failure probability F by
> >>>> raising (1-p) to the power of the number of match attempts:
> >>>>
> >>>> F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625
> >>> I don't think so. The two attempts are not independent.
> >> It's a fair cop. Perhaps you could explain how to perform the 
> >> calculation correctly?
> >>
> > Unfortunately, the calculation depends on the specific text*. So for 
> > your binary version of Hamlet (or was it Othello) there is no simple 
> > answer.
> > 
> > * to prove this, imagine that the works of Shakespeare can be expressed 
> > as a two character binary string (maybe the Condendsed Books version) 
> 
> Readers Digest, eat your heart out. :-)
> 
> > and the random monkey has typed three symbols. If the condensed string 
> > is '00' then there is a 5/8 chance that it will not appear in a random 
> > three character string, but if the string is '01' then there is a 1/2 
> > chance it will not be found in a random three character string.
> 
> ITYM 6/8 rather than 5/8? But anyway, yes, fair enough. Let me, then, 
> ask the question a different way:

Presumably you missed one of '000' or '001' or '010'.
> 
> My calculation, albeit flawed, might reasonably be said to give a 
> ballpark figure. Is it possible to come up with a reasonably simple 
> calculation that gives at least an order of magnitude reduction in the 
> error level?
> 
Imagine you are searching for the substring "to be or not to be" and at 
some point you reach a substring "...zzzto be or not to", you are 
obviously a bit more optimistic of finding the substring in the next few 
characters than you would be if your latest characters were "...zzzto".
In the first case you have 15 out of 18 characters correct, whereas in 
the second case you only have 2 out of 18.

You can use a Markov-chain type calculation to determine the probability 
of switching between p out of n characters correct and q out of n 
correct.

As a simple example, looking at the two letter binary Shakespeare you 
can bild a matrix showing the chance of mopving from p correct to q 
correct.

EG  searching for '00'

M = [1/2 1/2 0  ]      (with fixed-width fonts)
    [1/2 0   1/2]
    [0   0   1  ]

where M(p,q) (p = row, q = column) = the probability of transitioning 
between p correct and q correct.  So if you have so far found the string 
'0' then you have p=1 letters correct and row 1 of the matrix (the 
matrix is zero indexed so it starts at (0,0) and goes to (2,2)) shows 
that you have 1/2 chance of ending up in column 0 (i.e you get a '1' 
next so your string is now '01' and you have to start again from 0 
corect), and a 1/2 chance of ending up in column 2 (i.e you get a '0' so 
now your string is '00' with 2 correct letters). Note that the 1 in 
position (2,2) just means that if you already have 2 characters correct 
(row 2) then you already have the probability of 1 (100%) of 
transitioning to 2 characters correct (column 2).  Hope this all makes 
sense.

Now, the probability of going from some p corect to some q correct in n 
turns (i.e in reading the next n characters of your search string) is 
just M^n. Starting from the beginning, with 0 characters correct, we can 
just look at the value of M(0,2)^n to find the probability of getting 2 
characters correct (somewhere within the n search characters).

So with M as above,


M^1 = 0.500 0.500 0.000
      0.500 0.000 0.500
      0.000 0.000 1.000

M^2 = 0.500 0.250 0.250
      0.250 0.250 0.500
      0.000 0.000 1.000

M^3 = 0.375 0.250 0.375
      0.250 0.125 0.625
      0.000 0.000 1.000

M^4 = 0.313 0.188 0.500
      0.188 0.125 0.688
      0.000 0.000 1.000

M^8 = 0.133 0.082 0.786
      0.082 0.051 0.868
      0.000 0.000 1.000

etc, and you now know that in a 1,2,3,4,8 character random binary string 
you have probability 0.000, 0.250, 0.375, 0.500, 0.786 of 'finding' the 
substring '00'.

You can do the same thing with the transition matrix for '01' which, if 
I think will be

M = [1/2 1/2 0  ]
    [1/2 1/2 0  ]
    [0   0   1  ],  or extend the same idea to longer and/or non binary 
strings.

Cheers,
Mike

0
Reply m.fee (39) 3/14/2010 10:58:18 PM

mike wrote:
> In article <EL2dnXUKpJB6PgHWnZ2dnUVZ7qydnZ2d@bt.com>, 
> rjh@see.sig.invalid says...
>> mike wrote:

<snip>

>>> and the random monkey has typed three symbols. If the condensed string 
>>> is '00' then there is a 5/8 chance that it will not appear in a random 
>>> three character string, but if the string is '01' then there is a 1/2 
>>> chance it will not be found in a random three character string.
>> ITYM 6/8 rather than 5/8? But anyway, yes, fair enough. Let me, then, 
>> ask the question a different way:
> 
> Presumably you missed one of '000' or '001' or '010'.

No, because '010' doesn't contain '00' as a substring. Actually, it was 
'100' that I missed! But yes, I missed one, obviously.

>> My calculation, albeit flawed, might reasonably be said to give a 
>> ballpark figure. Is it possible to come up with a reasonably simple 
>> calculation that gives at least an order of magnitude reduction in the 
>> error level?
>>
> Imagine you are searching for the substring "to be or not to be" and at 
> some point you reach a substring "...zzzto be or not to", you are 
> obviously a bit more optimistic of finding the substring in the next few 
> characters than you would be if your latest characters were "...zzzto".

(a) a miss is as good as a mile;
(b) I would actually start comparing at the /end/ of the substring - cf 
Boyer-Moore;
(c) cf Bob Newhart: "Hey, Harry, I think Station 63 has something... I 
think this is famous or something... 'To be or not to be, that is the 
gzornenplatz'".

> In the first case you have 15 out of 18 characters correct, whereas in 
> the second case you only have 2 out of 18.
> 
> You can use a Markov-chain type calculation to determine the probability 
> of switching between p out of n characters correct and q out of n 
> correct.

You lost me already. Given n characters, x of them are correct, with 0 
<= x <= n. If p != q, then either p == x or q == x or (p != x && q != 
x). You don't get to switch from p to q correct by some probability. I 
am guessing that you meant to write something different, but (if so) I 
can't glean your intent from what you've actually written. But I'll keep 
reading in case it becomes clearer...

> As a simple example, looking at the two letter binary Shakespeare you 
> can bild a matrix showing the chance of mopving from p correct to q 
> correct.
> 
> EG  searching for '00'
> 
> M = [1/2 1/2 0  ]      (with fixed-width fonts)
>     [1/2 0   1/2]
>     [0   0   1  ]
> 
> where M(p,q) (p = row, q = column) = the probability of transitioning 
> between p correct and q correct.

Okay, I'm with you. q = p+delta, and you're calculating the probability 
that you don't "go off track", so to speak, on the way from one to the 
other. But if I've understood you correctly and if you're right (and I 
lack the mathematical sophistication to allow me to determine whether or 
not I have and whether or not you are), we run into a problem. 
Shakespeare is around 40 Megabits; how big is the transition matrix 
going to be? If I understand you correctly, it will be 40,000,000 by 
40,000,000 entries. Just storing a matrix that size is going to be a 
huge problem, and multiplying two MxM matrices requires MxM 
multiplication operations. We're going to be talking bignums to keep any 
degree of precision, and at the bignum level multiplication is O(xy) 
where x and y are the digits in the two multiplicands. So, for one 
transition, we have to perform 1,600,000,000,000,000 O(xy) operations. 
Then, for 40,000,000 transitions, we have to raise that whole mess to 
the 40,000,000th power. Using your method, the whole question changes to 
whether your suggested program will run to completion before the monkeys 
either hit the jackpot or give up in disgust. Whether it does in fact 
yield the required order of magnitude error reduction seems, on the face 
of it, to be a side issue.

-- 
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
0
Reply rjh (10789) 3/15/2010 12:14:38 AM

In article <Pt6dnb2Q14hw5gDWnZ2dnUVZ8rKdnZ2d@bt.com>, 
rjh@see.sig.invalid says...
> mike wrote:
> > In article <EL2dnXUKpJB6PgHWnZ2dnUVZ7qydnZ2d@bt.com>, 
> > rjh@see.sig.invalid says...
> >> mike wrote:
> 
> <snip>
> 
> >>> and the random monkey has typed three symbols. If the condensed string 
> >>> is '00' then there is a 5/8 chance that it will not appear in a random 
> >>> three character string, but if the string is '01' then there is a 1/2 
> >>> chance it will not be found in a random three character string.
> >> ITYM 6/8 rather than 5/8? But anyway, yes, fair enough. Let me, then, 
> >> ask the question a different way:
> > 
> > Presumably you missed one of '000' or '001' or '010'.
> 
> No, because '010' doesn't contain '00' as a substring. Actually, it was 
> '100' that I missed! But yes, I missed one, obviously.
> 
> >> My calculation, albeit flawed, might reasonably be said to give a 
> >> ballpark figure. Is it possible to come up with a reasonably simple 
> >> calculation that gives at least an order of magnitude reduction in the 
> >> error level?
> >>
>
[ snipped bit ]
> You lost me already. Given n characters, x of them are correct, with 0 
> <= x <= n. If p != q, then either p == x or q == x or (p != x && q != 
> x). You don't get to switch from p to q correct by some probability. I 
> am guessing that you meant to write something different, but (if so) I 
> can't glean your intent from what you've actually written. But I'll keep 
> reading in case it becomes clearer...

Maybe I shoulkd have mentioned that for _this_particular_ Markov 
process, q <= p+1. For example, if you are searching for '000000' then 
either q = p+1 or q = 0. But if you are searching for '001000' then 
while p=3 could only transition to q=4 or q=0, p=5 could transition to 
q=6 (and you have found the string) or to q=3.  
> 
> > As a simple example, looking at the two letter binary Shakespeare you 
> > can bild a matrix showing the chance of mopving from p correct to q 
> > correct.
> > 
> > EG  searching for '00'
> > 
> > M = [1/2 1/2 0  ]      (with fixed-width fonts)
> >     [1/2 0   1/2]
> >     [0   0   1  ]
> > 
> > where M(p,q) (p = row, q = column) = the probability of transitioning 
> > between p correct and q correct.

So I lost you and then found you again.
> 
> Okay, I'm with you. q = p+delta, and you're calculating the probability 
> that you don't "go off track", so to speak, on the way from one to the 
> other. But if I've understood you correctly and if you're right (and I 
> lack the mathematical sophistication to allow me to determine whether or 
> not I have and whether or not you are), we run into a problem. 
> Shakespeare is around 40 Megabits; how big is the transition matrix 
> going to be? If I understand you correctly, it will be 40,000,000 by 
> 40,000,000 entries. Just storing a matrix that size is going to be a 
> huge problem, and multiplying two MxM matrices requires MxM 
> multiplication operations. We're going to be talking bignums to keep any 
> degree of precision, and at the bignum level multiplication is O(xy) 
> where x and y are the digits in the two multiplicands. So, for one 
> transition, we have to perform 1,600,000,000,000,000 O(xy) operations. 
> Then, for 40,000,000 transitions, we have to raise that whole mess to 
> the 40,000,000th power. Using your method, the whole question changes to 
> whether your suggested program will run to completion before the monkeys 
> either hit the jackpot or give up in disgust. Whether it does in fact 
> yield the required order of magnitude error reduction seems, on the face 
> of it, to be a side issue.
> 
Well you asked for a reasonably simple calculation that could 
significantly reduce the error in your estimate. While my explanation 
may not have been simple (I wasn't going to waste a lot of time editing 
my post) the calculation is very simple (what could be simpler than 
matrix multiplication?) and reduces the error to zero. 

I never claimed that it was going to be practical*...

Comparisons with slightly longer binary strings shows that your estimate 
could be very badly out. For example your calculation would predict that 
the probability of not finding a 4 digit binary string in 128 random 
digits would be 0.0003136 = (15/16)^(128-3), whereas the actual 
probability could be as low as 0.0001806 (for string '0110') or as high 
as 0.009712 for string '0000'. 

* It is probably worth noting that the initial transition matrix is very 
sparse. I would guess that for the complete works (in 7-bit ascii and 
assuming 40 M characters), it would consist of a 40,000,001 by 
40,000,001 element matrix with the top 40,000,000 entries of the first 
column equal to 1/128, the diagonal (1,0), (2,1),(3,2)...
(40,000,001,40,000,000) also set to 1/128, and element 
(40,000,001,40,000,001) set to 1. All the rest of the matrix would be 0, 
so think of the memory you could save (for the first few multiplications 
at least).

Mike
0
Reply m.fee (39) 3/15/2010 1:02:49 AM

mike wrote:
<snip>

>> Using your method, the whole question changes to 
>> whether your suggested program will run to completion before the monkeys 
>> either hit the jackpot or give up in disgust. Whether it does in fact 
>> yield the required order of magnitude error reduction seems, on the face 
>> of it, to be a side issue.
>>
> Well you asked for a reasonably simple calculation that could 
> significantly reduce the error in your estimate. While my explanation 
> may not have been simple (I wasn't going to waste a lot of time editing 
> my post)

Sure.

> the calculation is very simple (what could be simpler than 
> matrix multiplication?) and reduces the error to zero. 

And what could be more complicated than working out the transition 
matrix for a string of 40 million bits? (Well, okay, lots of things 
could be more complicated, but it's up there with the leaders...)

> I never claimed that it was going to be practical*...

<grin> True enough. So we can refine the challenge to that of coming up 
with a *practical* way of finding a closer answer than my method gave.

> Comparisons with slightly longer binary strings shows that your estimate 
> could be very badly out. For example your calculation would predict that 
> the probability of not finding a 4 digit binary string in 128 random 
> digits would be 0.0003136 = (15/16)^(128-3), whereas the actual 
> probability could be as low as 0.0001806 (for string '0110') or as high 
> as 0.009712 for string '0000'. 

Well, I could live with 200% (in the specific case of Monkeys vs 
Shakespeare, I hasten to add!), but of course the error would be much 
higher for longer needle strings, especially with the colossal haystacks 
that we would require. So I guess I have to scratch that method completely.

One reasonable compromise might be to use increasingly long subsets of 
Shakespeare to find the number of monkey keypress events required to 
give us, say, a 99% probability of duplicating those subsets, using your 
method of calculation, up until the calculations start to become 
impractical, and produce a graph which we could then extrapolate to find 
a ballpark figure for the complete Shakespeherian canon. It wouldn't be 
as accurate as using your method exhaustively, but I suppose it would be 
a lot more accurate than my method, and has the added benefit that it 
could actually be completed within a reasonable time.

-- 
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
0
Reply rjh (10789) 3/15/2010 2:49:54 AM

In article <GuudnSVUk7vPPQDWnZ2dnUVZ7sydnZ2d@bt.com>, 
rjh@see.sig.invalid says...
> mike wrote:
> <snip>
> 
> >> Using your method, the whole question changes to 
> >> whether your suggested program will run to completion before the monkeys 
> >> either hit the jackpot or give up in disgust. Whether it does in fact 
> >> yield the required order of magnitude error reduction seems, on the face 
> >> of it, to be a side issue.
> >>
> > Well you asked for a reasonably simple calculation that could 
> > significantly reduce the error in your estimate. While my explanation 
> > may not have been simple (I wasn't going to waste a lot of time editing 
> > my post)
> 
> Sure.
> 
> > the calculation is very simple (what could be simpler than 
> > matrix multiplication?) and reduces the error to zero. 
> 
> And what could be more complicated than working out the transition 
> matrix for a string of 40 million bits? (Well, okay, lots of things 
> could be more complicated, but it's up there with the leaders...)
> 
> > I never claimed that it was going to be practical*...
> 
> <grin> True enough. So we can refine the challenge to that of coming up 
> with a *practical* way of finding a closer answer than my method gave.
> 
> > Comparisons with slightly longer binary strings shows that your estimate 
> > could be very badly out. For example your calculation would predict that 
> > the probability of not finding a 4 digit binary string in 128 random 
> > digits would be 0.0003136 = (15/16)^(128-3), whereas the actual 
> > probability could be as low as 0.0001806 (for string '0110') or as high 
> > as 0.009712 for string '0000'. 
> 
> Well, I could live with 200% (in the specific case of Monkeys vs 
> Shakespeare, I hasten to add!), but of course the error would be much 
> higher for longer needle strings, especially with the colossal haystacks 
> that we would require. So I guess I have to scratch that method completely.
> 
> One reasonable compromise might be to use increasingly long subsets of 
> Shakespeare to find the number of monkey keypress events required to 
> give us, say, a 99% probability of duplicating those subsets, using your 
> method of calculation, up until the calculations start to become 
> impractical, and produce a graph which we could then extrapolate to find 
> a ballpark figure for the complete Shakespeherian canon. It wouldn't be 
> as accurate as using your method exhaustively, but I suppose it would be 
> a lot more accurate than my method, and has the added benefit that it 
> could actually be completed within a reasonable time.
> 
I think that as a reasonable compromise I am willing to admit that, for 
any moderately long search string that does not have a lot of copies of 
the first n letters of the string scattered through the rest of the 
string, your estimate is probably as exact as anyone would care to 
require. My slightly unfair examples only emphasised the difference 
between your prediction and reality because they were fairly short and 
the 'pattern' in the strings influenced the probability.

Mike
0
Reply m.fee (39) 3/15/2010 10:47:17 PM

mike wrote:
<snip>

> I think that as a reasonable compromise I am willing to admit that, for 
> any moderately long search string that does not have a lot of copies of 
> the first n letters of the string scattered through the rest of the 
> string, your estimate is probably as exact as anyone would care to 
> require. My slightly unfair examples only emphasised the difference 
> between your prediction and reality because they were fairly short and 
> the 'pattern' in the strings influenced the probability.

So it's reasonable compromises now, is it?

Usenet is going to the dogs.

-- 
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
0
Reply rjh (10789) 3/16/2010 12:34:47 AM

In article <OKCdnXnE3JeATwPWnZ2dnUVZ8jVi4p2d@bt.com>, 
rjh@see.sig.invalid says...
> mike wrote:
> <snip>
> 
> > I think that as a reasonable compromise I am willing to admit that, for 
> > any moderately long search string that does not have a lot of copies of 
> > the first n letters of the string scattered through the rest of the 
> > string, your estimate is probably as exact as anyone would care to 
> > require. My slightly unfair examples only emphasised the difference 
> > between your prediction and reality because they were fairly short and 
> > the 'pattern' in the strings influenced the probability.
> 
> So it's reasonable compromises now, is it?
> 
I never said it wasn't.

If you remember:

1) Someone else mentioned the probability of hitting the right text.
2) You provided a formula to calculate what that probability was.
3) Someone else pointed out that your formula was incorrect (and why).
4) You admitted the fact and asked for a better formula.
5) I provided an exact solution...
6) ...which you suggested was computationally difficult, and asked for a 
compromise solution.
7) I pointed out that your initial solution would be 'good enough' in 
normal circumstances.

At no point did I suggest that your formula was not a reasonable 
compromise. All I did was provide you with what you requested and, for 
illustrative purposes, describe some circumstances where your solution 
would be inadequate.

Cheers,
Mike
0
Reply m.fee (39) 3/16/2010 2:51:15 AM

mike wrote:
> In article <OKCdnXnE3JeATwPWnZ2dnUVZ8jVi4p2d@bt.com>, 
> rjh@see.sig.invalid says...
>> mike wrote:
>> <snip>
>>
>>> I think that as a reasonable compromise I am willing to admit that, for 
>>> any moderately long search string that does not have a lot of copies of 
>>> the first n letters of the string scattered through the rest of the 
>>> string, your estimate is probably as exact as anyone would care to 
>>> require. My slightly unfair examples only emphasised the difference 
>>> between your prediction and reality because they were fairly short and 
>>> the 'pattern' in the strings influenced the probability.
>> So it's reasonable compromises now, is it?
>>
> I never said it wasn't.
> 
> If you remember:
> 
> 1) Someone else mentioned the probability of hitting the right text.
> 2) You provided a formula to calculate what that probability was.
> 3) Someone else pointed out that your formula was incorrect (and why).
> 4) You admitted the fact and asked for a better formula.
> 5) I provided an exact solution...
> 6) ...which you suggested was computationally difficult, and asked for a 
> compromise solution.
> 7) I pointed out that your initial solution would be 'good enough' in 
> normal circumstances.
> 
> At no point did I suggest that your formula was not a reasonable 
> compromise. All I did was provide you with what you requested and, for 
> illustrative purposes, describe some circumstances where your solution 
> would be inadequate.


Ah - we have here a light-hearted all-Usenauts-together reply taken far 
too literally, and thoroughly but unnecessarily rebuffed; folks, things 
were touch and go there for a while, but Usenet is back to normal again!

:-)

(Sorry for the late reply - I've been kinda busy.)

-- 
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
0
Reply Richard 3/21/2010 7:28:39 AM

In article <He-dnXLmZ6o_VzjWnZ2dnUVZ8l2dnZ2d@bt.com>, 
rjh@see.sig.invalid says...
> mike wrote:
> > In article <OKCdnXnE3JeATwPWnZ2dnUVZ8jVi4p2d@bt.com>, 
> > rjh@see.sig.invalid says...
> >> mike wrote:
> >> <snip>
> >>
> >>> I think that as a reasonable compromise I am willing to admit that, for 
> >>> any moderately long search string that does not have a lot of copies of 
> >>> the first n letters of the string scattered through the rest of the 
> >>> string, your estimate is probably as exact as anyone would care to 
> >>> require. My slightly unfair examples only emphasised the difference 
> >>> between your prediction and reality because they were fairly short and 
> >>> the 'pattern' in the strings influenced the probability.
> >> So it's reasonable compromises now, is it?
> >>
> > I never said it wasn't.
> > 
> > If you remember:
> > 
> > 1) Someone else mentioned the probability of hitting the right text.
> > 2) You provided a formula to calculate what that probability was.
> > 3) Someone else pointed out that your formula was incorrect (and why).
> > 4) You admitted the fact and asked for a better formula.
> > 5) I provided an exact solution...
> > 6) ...which you suggested was computationally difficult, and asked for a 
> > compromise solution.
> > 7) I pointed out that your initial solution would be 'good enough' in 
> > normal circumstances.
> > 
> > At no point did I suggest that your formula was not a reasonable 
> > compromise. All I did was provide you with what you requested and, for 
> > illustrative purposes, describe some circumstances where your solution 
> > would be inadequate.
> 
> 
> Ah - we have here a light-hearted all-Usenauts-together reply taken far 
> too literally, and thoroughly but unnecessarily rebuffed; folks, things 
> were touch and go there for a while, but Usenet is back to normal again!
> 
> :-)
> 
> (Sorry for the late reply - I've been kinda busy.)
> 
And sorry if I appeared to overreact - the coffee machine was empty.

Mike
0
Reply mike 3/21/2010 10:34:37 PM

mike wrote:
> In article <He-dnXLmZ6o_VzjWnZ2dnUVZ8l2dnZ2d@bt.com>, 
> rjh@see.sig.invalid says...
<snip>

>> (Sorry for the late reply - I've been kinda busy.)
>>
> And sorry if I appeared to overreact - the coffee machine was empty.

I feel your pain; so, as soon as I've posted this article, I'm going to 
fax some coffee to you, to tide you over until the vendor's next delivery.

-- 
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
0
Reply Richard 3/22/2010 6:50:33 AM

In article <ndydnWHHL76kjjrWnZ2dnUVZ8kti4p2d@bt.com>, 
rjh@see.sig.invalid says...
> mike wrote:
> > In article <He-dnXLmZ6o_VzjWnZ2dnUVZ8l2dnZ2d@bt.com>, 
> > rjh@see.sig.invalid says...
> <snip>
> 
> >> (Sorry for the late reply - I've been kinda busy.)
> >>
> > And sorry if I appeared to overreact - the coffee machine was empty.
> 
> I feel your pain; so, as soon as I've posted this article, I'm going to 
> fax some coffee to you, to tide you over until the vendor's next delivery.
> 
I have invested in a portable filter cup and a supply of fresh grounds 
now - so I can avoid recurence of the above. But do look forwards to 
your fax. 

Further to the original problem though, I believe (but have not taken 
the time to prove) that in practice, if we were looking for an 
uncompressed ascii subtext in a large random (monkey generated) string, 
then the small variations in probability from your approximation would 
be due to repetitions of the first few characters of the subtest - and 
only the first few repetitions would make any significant difference. So 
I believe it might be practical to take just the first few characters 
(maybe in the range of 100-1000) of the subtext, determine the influence 
of pattern on the probability of finding that text and then extrapolate 
with your approximation for the rest of the subtext. So, for example if 
we chose the first 1000 out of a 10000 character substring, then it 
would be a simple process (and a relatively modest amunt of processing)
to repeatedly square the 1000x1000 array a few dozen times to determine 
the probability of finding those 1000 characters within a 10^14-10^15 
character monkey string (for values of 'few' approximately equal to 4). 
Then we just scale that probability by your approximation of finding the 
remaining 9000 characters. Some care might be needed to ensure that the 
array elements were floating point numbers with sufficient resolution to 
avoid round-off errors during the process.

Mike
0
Reply mike 3/22/2010 10:56:53 PM

42 Replies
59 Views

(page loaded in 0.471 seconds)

Similiar Articles:





7/16/2012 10:26:09 PM


Reply: