COMPGROUPS.NET

### binary search with arbitrary or random split


```1. If I do a binary search in an array of length n, splitting always at 1/4 (instead of 1/2) of the array length, is this still O(log n)?
I think it is, as that would only change the base of the logarithm, right?

2. If I take the splitting point each time as a uniform random variable (say between 1 and the array length), would the search still be O(log n)? I think so; how can I justify the answer in the simplest way?

Thanks,

Pam
```
Reply pamelafluente (240) 7/22/2012 3:52:56 PM
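
Pam's question 1 can be checked empirically. Below is a sketch in Python (the `ratio_search` name and the probe-clamping details are mine, not from the thread) of a binary search that probes at a fixed fraction `r` of the remaining range; with r = 1/4 the worst case keeps at most roughly 3/4 of the range per probe, so the probe count is logarithmic with base 4/3 instead of base 2.

```python
def ratio_search(a, key, r=0.25):
    """Search sorted list `a` for `key`, probing at fraction `r` of the
    remaining range instead of at the midpoint.  Returns (index, probes);
    index is -1 when the key is absent."""
    lo, hi = 0, len(a) - 1
    probes = 0
    while lo <= hi:
        # Probe r of the way into the current range, clamped inside it.
        mid = lo + min(hi - lo, int(r * (hi - lo + 1)))
        probes += 1
        if a[mid] == key:
            return mid, probes
        elif a[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes

a = list(range(1000))
worst_quarter = max(ratio_search(a, k, 0.25)[1] for k in a)
worst_half = max(ratio_search(a, k, 0.5)[1] for k in a)
# Both worst cases are small and logarithmic in n; the 1/4 split
# differs from the 1/2 split only by a constant factor (the log base).
```

With r = 0.5 this is the textbook binary search, so comparing `worst_quarter` to `worst_half` shows the constant-factor gap that a change of logarithm base predicts.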

```pam <pamelafluente@libero.it> writes:

> 1. If i do a binary search in an array of length n, splitting always
> at 1/4 (instead of 1/2) of the array length, is this still O log(n) ?
> I think it is, as that would only change the base of the logarithm,
> right ?

Pretty much, yes.

> 2. If i take the splitting point each time as a uniform random
> variable (say within 1 and the array length), would the search still
> be O log(n) ? I think so; how can i justify the answer in the simplest
> way ?

That's more complex.  Unadorned, the Big O notation is usually taken to
be an upper bound on some measure of an algorithm's work (with binary
search, the measure most often used is comparisons).  With a random
split, the asymptotic upper bound is no longer logarithmic, but you can
recover an O(log(n)) answer by doing a probabilistic analysis.

The gist of this is to say that, given some distribution of inputs
(uniform is common), the expectation of the cost will be O(log(n)).  To
do this in detail, though, can be very complex.

It was at one time a rich and active area of theory.  It may still be,
but I've been out of that loop for some time.

--
Ben.
```
Reply ben.usenet (6516) 7/22/2012 4:36:20 PM
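
Ben's distinction between the worst case and the expectation can be illustrated with a quick Monte Carlo sketch (Python; the function names and trial counts are mine, and the "on the order of 2 ln n" remark is the standard random-binary-search-tree depth heuristic, not something stated in the thread).

```python
import random

def random_split_comparisons(a, key, rng):
    """Search sorted list `a` for `key`, probing a uniformly random index
    of the current range; returns the number of comparisons used."""
    lo, hi = 0, len(a) - 1
    comparisons = 0
    while lo <= hi:
        mid = rng.randint(lo, hi)
        comparisons += 1
        if a[mid] == key:
            return comparisons
        elif a[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return comparisons

def expected_comparisons(n, trials, rng):
    """Monte Carlo estimate of the expected comparison count when the
    target is a uniformly random element of a sorted n-element array."""
    a = list(range(n))
    total = sum(random_split_comparisons(a, rng.randrange(n), rng)
                for _ in range(trials))
    return total / trials

rng = random.Random(42)
# The estimate grows on the order of 2*ln(n), i.e. logarithmically,
# even though the worst case of any single run is linear.
est = expected_comparisons(100_000, 2_000, rng)
```

Growing n by a factor of 1000 roughly doubles the estimate, which is the signature of logarithmic (not linear) expected cost.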

```On 22 Lug, 18:36, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
<snip>

Thank you Ben.

Actually I was wondering why most explanations of binary search start
with something like:
"In computer science, a binary search or half-interval search
algorithm finds the position  ..."   or   "the algorithm compares the
input key value with the key value of the middle element of the
array..." (this is Wikipedia)

It seems to me that the essence of the procedure is the splitting
itself. Where this occurs should not matter at all, as in fact we don't
care what the actual numbers within the array are.

Depending on the number, each split can remove more or fewer numbers,
but this depends entirely on what the numbers are.
But the complexity analysis does not depend on the numbers contained
in the array. So viewed from this perspective a random split should
change nothing (under a uniform distribution). Or is this intuition
wrong?

Pam

```
Reply pamelafluente (240) 7/22/2012 5:58:54 PM

```On 7/22/2012 1:58 PM, pam wrote:
> Actually i was wondering why most explanations of binary search start
> with something like:
> "In computer science, a binary search or half-interval search
> algorithm finds the position  ..."   or   "the algorithm compares the
> input key value with the key value of the middle element of the
> array..." (this is wikipedia)
>
> It seems to me that the essence of the procedure is the splitting
> itself. Where this occur should not matter at all, as in fact we don't
> care of what the actual numbers are within the array.

Asymptotically, there isn't much difference... in practice, there are
some good reasons to prefer a straight binary split:
1. The calculation of the index is generally two one-cycle instructions
(a subtraction and a shift). More complex split points require more
complex logic. Sure, it's on the order of "only" a few more cycles, but
in very hot code, "only" matters a lot.
2. A binary split guarantees that you eliminate exactly half the array
(give or take one element) each time. Non-binary splits mean that
sometimes you eliminate the smaller part and other times the larger part.
3. Binary split is easily analyzable in terms of complexity. If you go
to randomized analysis, you first have to build up knowledge on how to
analyze randomized algorithms, which is generally beyond most
introductory algorithm courses.

> But the complexity analysis does not depend on the numbers contained
> in the array. So viewed in this perspective a random split should
> change nothing (under uniform distribution).  Or is this intuition
> wrong ?

Randomized analysis is much more complicated, and it depends on several
qualifiers that tend to be brushed under the rug in everyday thinking.
While I believe offhand that the asymptotic ratio remains the same, it
could be that the expected value makes it take (lg n)^2 time or
something in the average case.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
```
Reply Pidgeot18 (1399) 7/22/2012 6:45:53 PM
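
The index calculation Joshua mentions in point 1 is, in most implementations, the midpoint computed with a subtraction and a shift. A one-line sketch (Python, where the overflow caveat in the comment applies to fixed-width-integer languages rather than to Python itself):

```python
def midpoint(lo, hi):
    """Midpoint of the inclusive range [lo, hi], via a subtraction and a
    shift.  In fixed-width-integer languages this form also avoids the
    overflow that (lo + hi) // 2 can hit when lo + hi exceeds the
    integer maximum."""
    return lo + ((hi - lo) >> 1)
```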

```pam <pamelafluente@libero.it> writes:

> On 22 Lug, 18:36, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>> pam <pamelaflue...@libero.it> writes:
>> > 1. If i do a binary search in an array of length n, splitting always
>> > at 1/4 (instead of 1/2) of the array length, is this still O log(n) ?
>> > I think it is, as that would only change the base of the logarithm,
>> > right ?
>>
>> Pretty much, yes.
>>
>> > 2. If i take the splitting point each time as a uniform random
>> > variable (say within 1 and the array length), would the search still
>> > be O log(n) ? I think so; how can i justify the answer in the simplest
>> > way ?
>>
>> That's more complex.  Unadorned, the Big O notation is usually taken to
>> be a upper bound on some measure of an algorithms work (with binary
>> search, the measure most often used is comparisons).  With a random
>> split, the asymptotic upper bound is no longer logarithmic, but you can
>> recover an O(log(n)) answer by doing a probabilistic analysis.
>>
>> The gist of this is to say that, given some distribution of inputs
>> (uniform is common), the expectation of the cost will be O(log(n)).  To
>> do this in detail, though, can be very complex.
>>
>> It was at one time a rich and active area of theory.  It may still be,
>> but I've been out of that loop for some time.
<snip>
> Thank you Ben.
>
> Actually i was wondering why most explanations of binary search start
> with something like:
> "In computer science, a binary search or half-interval search
> algorithm finds the position  ..."   or   "the algorithm compares the
> input key value with the key value of the middle element of the
> array..." (this is wikipedia)

I don't understand this at all.  These are not two alternatives.  The
article has both, one after the other, as an explanation of what the
algorithm does:

"In computer science, a binary search or half-interval search
algorithm finds the position of a specified value (the input 'key')
within a sorted array. In each step, the algorithm compares the input
key value with the key value of the middle element of the array."

Why are you wondering about that?  It seems like a reasonable first
sentence.

> It seems to me that the essence of the procedure is the splitting
> itself. Where this occur should not matter at all, as in fact we don't
> care of what the actual numbers are within the array.

I don't get this either.  Yes, the essence is the splitting, but where
it occurs matters a great deal.  Why do you say it should not matter at
all?  Yes, splitting the array 1/1000th of the way along produces a
logarithmic big O bound, but it makes a big difference in practice.
Other splitting rules produce other bounds -- in particular, a random
split produces a linear upper bound because we can't rule out a
pathologically bad choice at every split.  That's the nature of upper
bounds.

> Depending on the number, each split can remove more or less numbers,
> but this depends entirely on what the numbers are.
> But the complexity analysis does not depend on the numbers contained
> in the array. So viewed in this perspective a random split should
> change nothing (under uniform distribution).  Or is this intuition
> wrong ?

Again, I am a little confused.  Are you using the term "numbers" in
place of elements?  Splitting at the midpoint eliminates half the
elements.  If the elements are numbers, with many repeats, such a split
might not remove half the distinct numbers, but that does not affect
the analysis.

If the bottom line is, "why does a random choice of split make the
algorithm O(n)?" then I think I've outlined the reason two paragraphs
up.

--
Ben.
```
Reply ben.usenet (6516) 7/22/2012 9:01:31 PM
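
Ben's point that a worst-case bound cannot rule out pathologically bad splits can be made concrete. In the sketch below (Python; the skeleton and names are mine), the probe index is supplied by a callback, and a deliberately bad sequence of "random" choices forces exactly n comparisons:

```python
def search_with_splitter(a, key, pick):
    """Binary-search skeleton whose probe index is chosen by pick(lo, hi).
    Returns the number of comparisons performed."""
    lo, hi = 0, len(a) - 1
    comparisons = 0
    while lo <= hi:
        mid = pick(lo, hi)
        comparisons += 1
        if a[mid] == key:
            return comparisons
        elif a[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return comparisons

n = 1000
a = list(range(n))
# Pathologically bad "random" choices: always probe the leftmost index
# while searching for the largest key, so each probe discards one element.
worst = search_with_splitter(a, n - 1, lambda lo, hi: lo)   # worst == n
```

Plugging in `lambda lo, hi: (lo + hi) // 2` instead recovers the usual logarithmic midpoint search, which is why the worst-case gap between the two split rules is linear versus logarithmic.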

```On 22 Lug, 23:01, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
<snip>

Thank you Ben,

Yes, I have in mind an array which contains n "numbers" as elements
(clearly ordered).

(I am most interested in the asymptotic behavior.)

Well, it seems to me that it does not really matter if one splits at
1/2 or 1/4 of the array length. The complexity should always be
O(log n), I believe. Is this wrong?

Splitting at 1/4 will mean that, depending on the actual numbers
contained in the array, we may sometimes remove 1/4 and sometimes
remove 3/4. So I would suspect that the asymptotic complexity is the
same.

For the random split (uniform distribution) case, I would be inclined
to think intuitively that the asymptotic complexity is still O(log n),
although actually I don't know for sure, and that is why I am asking
you experts.

(This sounds like a "classic" problem, probably well known. Is there
any place (article, book, or site) where I can read on how to
determine the complexity in this case?)

Pam

```
Reply pamelafluente (240) 7/22/2012 11:13:04 PM

```On 7/22/2012 7:13 PM, pam wrote:
> Well it seems to me that it does not really matter if one splits at
> 1/2 or 1/4 of the array length.
> The complexity should always be O(logn), i believe. Is this wrong ?

The actual complexity analysis of this case is fairly trivial, and the
Master Theorem should spit the answer out nicely.

> For the random split (uniform distribution) case, i would be inclined
> to think intuitively that the asymptotic complexity
> is still O(logn), although actually i don't know for sure, and that is
> why i am asking you experts.

Doing some rough calculations, assuming a uniform distribution of the
index chosen in each "round" and a uniform distribution of the element
being searched for, I arrive at an expected number of elements left to
search in the next round of about (2/3)n [1]. Probabilistic analysis
is not my strong suit, but I believe that each "round" is independent,
so we can conclude that the number of rounds we'd expect it to take is
logarithmic in n.

> (This sounds like a "classic" problem, probably well known. Is there
> any place (article or book or site) where i can read on
> how determine the complexity in this case ? )

The classic problems for probabilistic analysis are the coupon
collector's problem and the secretary problem. It turns out that
Wikipedia has articles on both of these problems that are decent.

[1] (2/3)n - 1/2 + (5/6)(1/n) is the exact formula I get, if anyone
wants to check my work.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
```
Reply Pidgeot18 (1399) 7/22/2012 11:56:09 PM
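
Joshua's rough figure can be sanity-checked by simulation. The sketch below (Python; names are mine) uses one possible reading of the model (probe index and target position both independent and uniform over the n positions), so it should reproduce the leading (2/3)n term, though not necessarily the exact footnote formula:

```python
import random

def expected_remaining_fraction(n, trials, rng):
    """Estimate E[remaining]/n after one probe when probe index i and
    target position t are independent and uniform on 0..n-1: if t > i
    the right part of n-1-i elements survives; if t < i, the left part
    of i elements; if t == i the search ends with nothing remaining."""
    total = 0
    for _ in range(trials):
        i = rng.randrange(n)
        t = rng.randrange(n)
        if t > i:
            total += n - 1 - i
        elif t < i:
            total += i
    return total / (trials * n)

rng = random.Random(0)
frac = expected_remaining_fraction(1_001, 50_000, rng)
# Under this model the exact expectation is (n-1)(2n-1)/(3n^2) of the
# array, whose leading term is 2/3, matching the rough figure above.
```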

```pam <pamelafluente@libero.it> writes:
<snip>
> Thank you Ben,
>
> Yes i have in mind an array which contains n "numbers" as elements
> (clearly ordered).

Just a suggestion: if you interleave your reply with my post, I'd know
what this "yes" refers to.  I think I know, but interleaving would make
it abundantly clear.

> (I am most interested in the asymptotic behavior.)
>
> Well it seems to me that it does not really matter if one splits at
> 1/2 or 1/4 of the array length.
> The complexity should always be O(logn), i believe. Is this wrong ?
>
> Splitting at 1/4 will cause that, depending, on the actual numbers
> contained in the array,
> we may sometimes remove 1/4 or sometimes remove 3/4. So i would
> suspect that the
> asymptotic complexity is the same.
>
> For the random split (uniform distribution) case, i would be inclined
> to think intuitively that the asymptotic complexity
> is still O(logn), although actually i don't know for sure, and that is
> why i am asking you experts.

Well, I answered, and I included what I thought was a reasonable
explanation of why you get no better than an O(n) asymptotic upper
bound.  Maybe you could say what seems wrong (or indeed what *is*
wrong!) about my explanation?  Otherwise, I'll just be stuck repeating
it.

> (This sounds like a "classic" problem, probably well known. Is there
> any place (article or book or site) where i can read on
> how determine the complexity in this case ? )

I think Joshua has given you some pointers, but I found this text:

Micha Hofri: Probabilistic analysis of algorithms - on computing
methodologies for computer algorithms performance evaluation. Springer
1987

to be very helpful (though, if I recall correctly, rather dense).
Probably not in print anymore but some countries still have good
libraries...

--
Ben.
```
Reply ben.usenet (6516) 7/23/2012 1:13:02 AM

```pam <pamelafluente@libero.it> wrote:
> It seems to me that the essence of the procedure
> is the splitting itself.

Sure, it's an example of divide-and-conquer, e.g.,
http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm

> Where this occur should not matter at all,

"not at all" might be too strong a statement, as per other
followups, but it's the same algorithmic strategy regardless.
In fact, if you happened to know something about the distribution
of values in your sorted list (that the values aren't uniformly
distributed), you might even be able to choose the split

> Pam
--
John Forkosh  ( mailto:  j@f.com  where j=john and f=forkosh )
```
Reply john42 (311) 7/23/2012 4:45:44 AM

```On 7/22/2012 10:45 PM, JohnF wrote:
<snip>
> "not at all" might be too strong a statement, as per other
> followups, but it's the same algorithmic strategy regardless.
> In fact, if you happened to know something about the distribution
> of values in your sorted list (that the values aren't uniformly
> distributed), you might even be able to choose the split

That's an interesting conjecture. However, if we disallow the
probabilities that constitute the distribution from having values 0 or
1, I don't think I buy it. Since the general complexity is worst case,
not average, you will in some cases make non-optimal moves - perhaps
lots of them - and that will make a worse worst case. (There must be a
better way to articulate this but I'll stand pat.) If one has 0 or 1
probabilities, you might know "the answer" without search/computation of
any sort at compile time.
-- Jeff Barnett
```
Reply jbbrus (25) 7/23/2012 5:05:40 AM

```Jeff Barnett <jbbrus@comcast.net> wrote:
> On 7/22/2012 10:45 PM, JohnF wrote:
>> <snip>
>
> That's an interesting conjecture. However, if we disallow the
> probabilities that constitute the distribution from having
> values 0 or 1, I don't think I buy it.

Well, tough, refunds not available, but I will send over
the waitress with a free round of drinks for the table.
Anyway, I did say "might". And, while I didn't work it
out thoroughly, I did have a rigorous-sounding supporting
argument, presented below yours...

> Since the general complexity is worst case,
> not average, you will in some cases make non-optimal moves - perhaps
> lots of them - and that will make a worse worst case. (There must be a
> better way to articulate this but I'll stand pat.) If one has 0 or 1
> probabilities, you might know "the answer" without search/computation of
> any sort at compile time.
> -- Jeff Barnett

...my off-the-cuff thinking, maybe wrong, was based on
entropy/information considerations. You want to gain maximum additional
information from each list comparison operation, as illustrated by the
following example. Suppose (I believe without loss of generality) the
space of list values is the integers.
For our non-uniform value distribution, suppose the distribution
is logarithmic, i.e., at the midpoint of the list, number-of-elements-wise,
nine-tenths of the value range is to your left. That is, if the list has
a thousand elements, with values zero to a million, then the value of
the 500th element is roughly 900,000.
Now, suppose you're given some random value to look up in this list.
If you compare it with the 500th element, your probability of being too
far right is 0.9, and of being too far left is 0.1, so the information
gained is H = -(.9log.9 + .1log.1) = 0.47 bits (log base 2). But if
you compare with the middle-value list element (whatever its position
in the list), the info gained is H = -(.5log.5 + .5log.5) = 1.0 bit.
And that's maximal, as easily verified in the usual way.
So, at any iteration of the divide-and-conquer binary search,
your best strategy is to compare with the list element that has
the middle value in the remaining range of values, regardless of
its position in the list. That'll get you one additional bit of info each
time, which gives the usual result for the expected number of iterations.
Of course, you must have enough info about the list's value distribution
to know (to easily calculate) the expected position of the middle-value
element at each iteration. (Or maybe I'm totally wrong about all of this.)
--
John Forkosh  ( mailto:  j@f.com  where j=john and f=forkosh )
```
Reply john42 (311) 7/23/2012 9:04:37 AM
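
The arithmetic in JohnF's example checks out; here is a small sketch (Python; the `split_entropy` name is mine) of the information gained per comparison as a function of the probability of branching one way:

```python
import math

def split_entropy(p):
    """Information in bits gained from a comparison that branches one way
    with probability p and the other way with probability 1 - p."""
    if p in (0.0, 1.0):
        return 0.0          # outcome already known: no information gained
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))

h_skewed = split_entropy(0.9)   # ~0.47 bits, as computed in the post
h_median = split_entropy(0.5)   # exactly 1.0 bit, the maximum
```

The 0.5 case maximizing the entropy is what makes probing the median *value* the information-optimal strategy JohnF describes.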

```This discussion is most interesting.

> Well, I answered, and I included what I thought was a reasonable
> explanation of why you get no better than an O(n) asymptotic upper
> bound.  Maybe you could say what seems wrong (or indeed what *is*
> wrong!) about my explanation?  Otherwise, I'll just be stuck
> repeating it.

Well, let me try, Ben. Clearly, just an intuitive argument attempt.
(Sorry for the strange line breaking, but it's Google.)
If I did not miss more, the argument was:

"Other splitting rules produce other bounds -- in particular, a random
split produces a linear upper bound because we can't rule out a
pathologically bad choice at every split.  That's the nature of upper
bounds."

Let's say that we assume a (continuous) uniform distribution for the
elements and a discrete uniform for the splitting point (and that we
rule out degenerate or trivial cases).

A "pathologically bad choice at every split" should be an event which,
in the asymptotic analysis, has probability 0, I think, so this might
suggest an argument against the "O(n) asymptotic upper bound", as we
are interested in bounds holding with probability 1 ("almost
certainly").

Now if we agree that the splitting place "does not matter" in the
"deterministic" analysis as far as the asymptotic complexity is
concerned (and that is what I gathered so far), and we have O(log n)
for any splitting place, I would think that this might be enough to
suggest that even in the random-split case we still have O(log n).

In fact we could imagine the complexity of the random case to be
bounded, step by step, below and above by two extreme "deterministic"
cases.

Let's call R(n) the asymptotic complexity of the random case and
S(j,n) the complexity of splitting at j, and let's take:

L = j such that S(j,n) is minimum,  U = j such that S(j,n) is maximum
(at least from a given n on).

Since at each step the random splitting must use one of the split
places j used by one of the deterministic procedures, we might have
S(L,n) <= R(n) <= S(U,n).

This is just an intuitive suggestion, but perhaps you experts can turn
this into a more solid argument (or destroy it).

> I arrive at an expected number of elements in the
> array to search in the next round of about (2/3)n [1].
> Probabilistic analysis is not my strong suit, but I believe that each
> "round" is independent, so that we can conclude that the number of
> rounds we'd expect it to take is logarithmic in n.
>
> [1] (2/3)n - 1/2 + (5/6)(1/n) is the exact formula I get, if anyone
> wants to check my work.

Joshua, can you share some more details of your derivation and how you
arrived at those expressions?

```
Reply pamelafluente (240) 7/23/2012 1:05:48 PM

```On 7/23/2012 9:05 AM, pam wrote:
> "pathologically bad choice at every split" should be an event which,
> in the asymtotic analysis, has
> probability 0 to occur, i think, so this might suggest an argument
> against the "O(n) asymptotic upper
> bound", as we are interested in bounds holding with probability 1
> ("almost certain").

No. There are different kinds of asymptotic analysis, but the two that
are generally the most interesting are "worst-case" analysis and
"average-case" analysis. In worst-case analysis, the probability of the
worst case occurring can be vanishingly small; as long as it is not
exactly 0, it still counts. Another way to think of it is that the
worst case is the one that someone who wants to break your algorithm
can construct to make it take as long as possible.
--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
```
Reply Pidgeot18 (1399) 7/23/2012 1:47:26 PM

```pam <pamelafluente@libero.it> writes:

> This discussion is most interesting.
>
>> Well, I answered, and I included what I though was a reasonable
>> explanation of why you get no better than an O(n) asymptotic upper
>> bound.  Maybe you could say what seems wrong (or indeed what *is*
>> wrong!) about my explanation?  Otherwise, I'll just be stuck
>> repeating it.
>
>
> Well let me try, Ben. Clearly, just an intuitive argument attempt.
> (sorry for strange line breaking, but it's google)
> If i did not miss more, the argument was:
>
> "Other splitting rules produce other bounds -- in particular, a random
> split produces a linear upper bound because we can't rule out a
> pathologically bad choice at every split.  That's the nature of upper
> bounds."
>
> Let's say that we assume a (continuous) uniform for the elements
> and a discrete uniform for the splitting point (and that we rule out
> degenerate or trivial cases).
>
> "pathologically bad choice at every split" should be an event which,
> in the asymtotic analysis, has
> probability 0 to occur, i think, so this might suggest an argument
> against the "O(n) asymptotic upper
> bound", as we are interested in bounds holding with probability 1
> ("almost certain").

I may see the problem.  That notion, of the asymptotic probability of
some event, has no place in the normal worst-case analysis.  The
argument goes, "can such and such ever happen?"  If so, and it's worse
than other cases considered so far, it forms a new upper bound.  Whilst
the asymptotic probability of a pathological choice of split is indeed
zero, the probability of it happening in any particular case is always
non-zero (because all cases are finite).

That's how I interpret worst-case in the light of a probabilistic choice
but, being a little more practical, perhaps it's better to say that
random algorithms should not be analysed for conventional worst-case
bounds.

I think the rest is a reply to Joshua's probabilistic analysis so I'll
leave that to him.

<snip>
--
Ben.
```
Reply ben.usenet (6516) 7/23/2012 2:45:11 PM

```On 7/23/2012 3:04 AM, JohnF wrote:
> <snip>

John, surely you must know that virtually all entropy arguments have to
do with expected values and behaviors and are, therefore, dedicated to
average-case analysis, not worst-case analysis. Anytime you are talking
about average search times, just remember that the item you are
searching for is usually drawn from the same distribution (again,
unless you specify differently) as the previously drawn samples that
make up the search "space".

So the way you need to formulate your argument is as follows: 1) state
the distribution, 2) select the number n of selections that comprise the
search space - you will increase this number analytically to develop
asymptotic estimates, 3) enumerate (analytically) each of the possible
selections of n-element subsets and the probability of that set, 4) now
go through each possible element with its probability of being selected
as the search element, 5) calculate the search time for the element
selected in 4 in the set selected in 3, 6) sum up the product of the
probabilities determined in 3 and 4 and the search time calculated in 5;
that sum is the expected (average) search time given n and the
distributions you have chosen. BTW, the distributions used in 3 and 4
could be different but that usually makes the analysis, which you must
do, harder.

One of the beauties of binary search is you get an upper bound and a
good approximation of the average behavior without needing to specify
distributions at all. Your arguments will certainly be strengthened if
you make up a case and show all of us the math.
--
Jeff Barnett

```
 0
Reply jbbrus (25) 7/23/2012 6:19:53 PM
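[Editorial note: Jeff's expectation summation can actually be carried out exactly for the random-split search itself. The sketch below is my own illustration, not from the thread; it assumes distinct keys, a probe position uniform on the current range, and a search target uniform over the n keys. Under those assumptions the expected comparison count obeys a quicksort-style recurrence.]

```python
def expected_comparisons(n):
    """E[k] = expected comparisons to find a uniformly chosen target among
    k keys when each step probes a uniformly random position of the range.
    Conditioning on the probe landing at 1-based position j gives
      E[k] = 1 + (1/k) * sum_j [ (j-1)/k * E[j-1] + (k-j)/k * E[k-j] ].
    """
    E = [0.0] * (n + 1)
    for k in range(1, n + 1):
        s = sum((j - 1) * E[j - 1] + (k - j) * E[k - j]
                for j in range(1, k + 1))
        E[k] = 1.0 + s / (k * k)
    return E[n]
```

For n = 1000 this comes out near 2*ln(1000), i.e. logarithmic on average, even though (as discussed later in the thread) the worst case is linear.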

```On 7/23/2012 6:05 AM, pam wrote:
....
> "pathologically bad choice at every split" should be an event which,
> in the asymptotic analysis, has
> probability 0 to occur, i think, so this might suggest an argument
> against the "O(n) asymptotic upper
> bound", as we are interested in bounds holding with probability 1
> ("almost certain").
....

I think this is a confusion caused by merging two steps:

1. Identify some function f:N->N, such as "f(n) is the
maximum number of comparisons for this algorithm with input size n".

2. Calculate an asymptotic upper bound for the function f.

The algorithm analysis that defines the function to be bounded is always
analysis of the algorithm with a finite input size. If, for every finite
n, there is a non-zero probability of taking n comparisons, the
objective is worst case analysis, and no size n input takes more than n
comparisons, then f(n)=n.

Calculating asymptotic bounds of f(n)=n is trivial.

As a practical matter, we can often calculate asymptotic bounds for an
algorithm without being able to write down a closed form formula for the
function being bounded. We are still calculating bounds on a formula
whose domain is finite problem sizes.

Patricia
```
 0
Reply pats (3215) 7/23/2012 11:10:36 PM

```On 24 Lug, 01:10, Patricia Shanahan <p...@acm.org> wrote:
> On 7/23/2012 6:05 AM, pam wrote:
> ...> "pathologically bad choice at every split" should be an event which,
> > in the asymptotic analysis, has
> > probability 0 to occur, i think, so this might suggest an argument
> > against the "O(n) asymptotic upper
> > bound", as we are interested in bounds holding with probability 1
> > ("almost certain").
>
> ...
>
> I think this is a confusion caused by merging two steps:
>
> 1. Identify some function f:N->N, such as "f(n) is the
> maximum number of comparisons for this algorithm with input size n".
>
> 2. Calculate an asymptotic upper bound for the function f.
>
> The algorithm analysis that defines the function to be bounded is always
> analysis of the algorithm with a finite input size. If, for every finite
> n, there is a non-zero probability of taking n comparisons, the
> objective is worst case analysis, and no size n input takes more than n
> comparisons, then f(n)=n.
>
> Calculating asymptotic bounds of f(n)=n is trivial.
>
> As a practical matter, we can often calculate asymptotic bounds for an
> algorithm without being able to write down a closed form formula for the
> function being bounded. We are still calculating bounds on a formula
> whose domain is finite problem sizes.
>
> Patricia

Well, which would be, in particular, those "pathologically bad
choices at every split" which would occur with a random splitting,
and *NOT* with a fixed splitting?

Assuming the *continuous* uniform for the values U(A, B) (and
independence), i can't readily see anything which has positive
*probability* (not density).

Further, assuming a discrete uniform for the random splitting
U(1, ..., n), on average we will be splitting in the middle, so i have
a hard time imagining that random splitting could damage the
computational complexity so badly, from O(log n) to "no better than an
O(n) asymptotic upper bound"... So what am i missing here?  :-)

```
 0
Reply pamelafluente (240) 7/24/2012 12:03:10 AM

```On 7/23/2012 5:03 PM, pam wrote:
> On 24 Lug, 01:10, Patricia Shanahan <p...@acm.org> wrote:
>> On 7/23/2012 6:05 AM, pam wrote:
>> ...> "pathologically bad choice at every split" should be an event which,
>>> in the asymptotic analysis, has
>>> probability 0 to occur, i think, so this might suggest an argument
>>> against the "O(n) asymptotic upper
>>> bound", as we are interested in bounds holding with probability 1
>>> ("almost certain").
>>
>> ...
>>
>> I think this is a confusion caused by merging two steps:
>>
>> 1. Identify some function f:N->N, such as "f(n) is the
>> maximum number of comparisons for this algorithm with input size n".
>>
>> 2. Calculate an asymptotic upper bound for the function f.
>>
>> The algorithm analysis that defines the function to be bounded is always
>> analysis of the algorithm with a finite input size. If, for every finite
>> n, there is a non-zero probability of taking n comparisons, the
>> objective is worst case analysis, and no size n input takes more than n
>> comparisons, then f(n)=n.
>>
>> Calculating asymptotic bounds of f(n)=n is trivial.
>>
>> As a practical matter, we can often calculate asymptotic bounds for an
>> algorithm without being able to write down a closed form formula for the
>> function being bounded. We are still calculating bounds on a formula
>> whose domain is finite problem sizes.
>>
>> Patricia
>
> Well, which would be, in particular, those "pathologically bad choices
> at every split" which
> would occur with a random splitting, and *NOT* with a fixed
> splitting?

The bad case is splitting each size k subproblem into two sub-problems
of size 1 and size k-1, such that the target is in the larger
sub-problem. That can happen with a random split.

If you use the conventional binary search that cannot happen, because it
always divides into two sub-problems that are as close as possible to
equal size, limiting the depth of the decision tree.

Of course, one could construct a fixed size split that always splits off
a single element at each step. That is, in effect, what linear search does.

Patricia

```
 0
Reply pats (3215) 7/24/2012 12:27:27 AM
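[Editorial note: Patricia's degenerate case is easy to exhibit concretely. A minimal sketch of my own, not from the thread: a search routine that takes its probe index from a pluggable chooser, plus a chooser that always picks the leftmost index, forcing the size-1 / size-(k-1) split at every step.]

```python
def split_search(a, t, pick):
    """Search sorted list a for t; pick(lo, hi) chooses the probe index.
    Returns (index or None, number of comparisons made)."""
    lo, hi, comps = 0, len(a) - 1, 0
    while lo <= hi:
        m = pick(lo, hi)
        comps += 1
        if a[m] == t:
            return m, comps
        elif a[m] > t:
            hi = m - 1
        else:
            lo = m + 1
    return None, comps

# Worst case: every split peels off one element and the target is at the
# far end, so the "search" degenerates into a linear scan.
a = list(range(1000))
idx, comps = split_search(a, 999, lambda lo, hi: lo)
# comps == len(a)
```

With a uniformly random `pick` this exact sequence of probes is still possible, which is the point Patricia is making about the worst case.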

```Jeff Barnett <jbbrus@comcast.net> wrote:
> JohnF wrote:
>> Jeff Barnett <jbbrus@comcast.net> wrote:
>>> JohnF wrote:
>>>> pam <pamelafluente@libero.it> wrote:
>>>>> It seems to me that the essence of the procedure
>>>>> is the splitting itself.
>>>>
>>>> Sure, it's an example of divide-and-conquer, e.g.,
>>>>     http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm
>>>>
>>>>> Where this occur should not matter at all,
>>>>
>>>> "not at all" might be too strong a statement, as per other
>>>> followups, but it's the same algorithmic strategy regardless.
>>>> In fact, if you happened to know something about the distribution
>>>> of values in your sorted list (that the values aren't uniformly
>>>> distributed), you might even be able to choose the split
>>>>
>>>>> Pam
>>>
>>> That's an interesting conjecture. However, if we disallow the
>>> probabilities that constitute the distribution from having
>>> values 0 or 1, I don't think I buy it.
>>
>> Well, tough, refunds not available, but I will send over
>> the waitress with a free round of drinks for the table.
>>     Anyway, I did say "might". And, while I didn't work it
>> out thoroughly, I did have a rigorous-sounding supporting
>> argument, presented below yours...
>>
>>> Since the general complexity is worst case,
>>> not average, you will in some cases make non-optimal moves - perhaps
>>> lots of them - and that will make a worse worst case. (There must be a
>>> better way to articulate this but I'll stand pat.) If one has 0 or 1
>>> probabilities, you might know "the answer" without search/computation of
>>> any sort at compile time.
>>> -- Jeff Barnett
>>
>> ...my off-the-cuff thinking, maybe wrong, was based on
>> entropy/information considerations. You want to gain maximum additional
>> information from each list comparison operation, as illustrated by the
>> following example. Suppose (I believe without loss of generality) the
>> space of list values is the integers.
>>     For our non-uniform value distribution, suppose the distribution
>> is logarithmic, i.e., at the midpoint of the list, number-of-elements-wise,
>> nine-tenths of the value range is to your left. That is, if the list has
>> a thousand elements, with values zero to a million, then the value of
>> the 500th element is roughly 900,000.
>>     Now, suppose you're given some random value to look up in this list.
>> If you compare it with the 500th element, your probability of being too
>> far right is 0.9, and of being too far left is 0.1, so the information
>> gained is H = -(.9log.9 + .1log.1) = 0.47 bits (log base 2). But if
>> you compare with the middle-value list element (whatever its position
>> in the list), the info gained is H = -(.5log.5 + .5log.5) = 1.0 bits.
>> And that's maximal, as easily verified in the usual way.
>>     So, at any iteration of the divide-and-conquer binary search,
>> your best strategy is to compare with the list element that has
>> the middle value in the remaining range of values, regardless of
>> its position in the list. That'll get you one additional bit of info each
>> time, which gives the usual result for the expected number of iterations.
>> Of course, you must have enough info about the list's value distribution
>> to know (to easily calculate) the expected position of the middle-value
>> element at each iteration. (Or maybe I'm totally wrong about all of this.)
>
> John, surely you must know that virtually all entropy arguments have to
> do with expected values and behaviors and are, therefore, dedicated to
> average case analysis. Not worst case analysis which is in play anytime
> you are talking about average search times.

Right, average is indeed what I was talking about. Sorry I didn't
make that clear with my original remark, above, "...you might even
be able to choose the split advantageously compared to 50-50."
(At the time, I didn't anticipate a lengthy followup conversation:)
But, as Patricia Shanahan pointed out, the worst binary search case
degenerates to a linear search, which isn't exponentially bad news,
so an average case improvement (accepting an occasional worst case
situation that doesn't go off the "deep end") seems (to me) like
the more interesting question.
But, to tell you the truth, the few times I was looking for
a binary search speedup in practice, I just took the first "split"
for the current search at the point where the preceding search
succeeded (if it did succeed). That strategy was just based on
knowing the nature of the particular problem; no analysis whatsoever.

> Just remember, though, that
> the item you are searching for is usually drawn from the same
> distribution (again unless you specify differently) as the previously
> drawn samples that make up the search "space".

That seems to be a good thing with respect to this "middle-value"
strategy, no? Am I missing your point?

> So the way you need to formulate your argument is as follows: 1) state
> the distribution, 2) select the number n of selections that comprise the
> search space - you will increase this number analytically to develop
> asymptotic estimates,

That's what I kind of did, example-wise, above.

> 3) enumerate (analytically) each of the possible
> selections of n element subsets and the probability of that set,
> 4) now go through each possible element with its probability
> of being selected as the search element,

Too much work: that's why I chose integers. The subsets are just
intervals [i,j] = {n|i<=n<=j}, and the probability is just the
measure of that interval p([i,j]) = (j-i)/"max-range".
Please remember, again, I wasn't trying to illustrate an average
improvement in >>every<< possible case, just the >>existence<< of
a case exhibiting improvement.

> 5) calculate the search time for the element
> selected in 4 in the set selected in 3,

As always, that's just assumed proportional to the number of
iterations required to zero-in on the value you're searching for
in the list.

> 6) sum up the product of the
> probabilities determined in 3 and 4 and the search time calculated in 5;
> that sum is the expected (average) search time given n and the
> distributions you have chosen.

I kind of hand-waved that for the example, based on "interval measure
probability" and on entropy arguments, as outlined above.

> BTW, the distributions used in 3 and 4
> could be different but that usually makes the analysis, which you must
> do, harder.

To demonstrate existence, just one case is needed, the simpler the better.

> One of the beauties of binary search is you get an upper bound and a
> good approximation of the average behavior without needing to specify
> distributions at all.

As we've discussed, the usual argument ignoring the distribution might
fail in particular cases where the distribution is "pathological"
(unless my existence demonstration is wrong).

> Your arguments will certainly be strengthened if
> you make up a case and show all of us the math.

Not really interested enough to pursue it further.
--
John Forkosh  ( mailto:  j@f.com  where j=john and f=forkosh )
```
 0
Reply john42 (311) 7/24/2012 3:59:58 AM
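[Editorial note: JohnF's two entropy figures are easy to check numerically. A quick sketch of my own, just reproducing his arithmetic.]

```python
import math

def H(p):
    """Entropy in bits of a binary outcome with probabilities p and 1-p."""
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Probing the positional midpoint of his skewed list: H(0.9) ~ 0.47 bits.
# Probing the value midpoint: H(0.5) = 1.0 bit, the maximum.
```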

```On Jul 24, 2:27 am, Patricia Shanahan <p...@acm.org> wrote:
> On 7/23/2012 5:03 PM, pam wrote:
> > On 24 Lug, 01:10, Patricia Shanahan <p...@acm.org> wrote:
> >> On 7/23/2012 6:05 AM, pam wrote:
> >> ...> "pathologically bad choice at every split" should be an event which,
> >>> in the asymptotic analysis, has
> >>> probability 0 to occur, i think, so this might suggest an argument
> >>> against the "O(n) asymptotic upper
> >>> bound", as we are interested in bounds holding with probability 1
> >>> ("almost certain").
>
> >> ...
>
> >> I think this is a confusion caused by merging two steps:
>
> >> 1. Identify some function f:N->N, such as "f(n) is the
> >> maximum number of comparisons for this algorithm with input size n".
>
> >> 2. Calculate an asymptotic upper bound for the function f.
>
> >> The algorithm analysis that defines the function to be bounded is always
> >> analysis of the algorithm with a finite input size. If, for every finite
> >> n, there is a non-zero probability of taking n comparisons, the
> >> objective is worst case analysis, and no size n input takes more than n
> >> comparisons, then f(n)=n.
>
> >> Calculating asymptotic bounds of f(n)=n is trivial.
>
> >> As a practical matter, we can often calculate asymptotic bounds for an
> >> algorithm without being able to write down a closed form formula for the
> >> function being bounded. We are still calculating bounds on a formula
> >> whose domain is finite problem sizes.
>
> >> Patricia
>
> > Well, which would be, in particular, those "pathologically bad choices
> > at every split" which
> > would occur with a random splitting, and *NOT* with a fixed
> > splitting?
>
> The bad case is splitting each size k subproblem into two sub-problems
> of size 1 and size k-1, such that the target is in the larger
> sub-problem. That can happen with a random split.
>
> If you use the conventional binary search that cannot happen, because it
> always divides into two sub-problems that are as close as possible to
> equal size, limiting the depth of the decision tree.
>
> Of course, one could construct a fixed size split that always splits off
> a single element at each step. That is, in effect, what linear search does.
>
> Patricia

I see. What about we assume the Uniform (2, ..., n-1) for the random
split?

(1 and n can somehow be considered "degenerate" cases. This would in
fact affect the "deterministic" case too.)

Do we still have the issue?

ps.
(Note that even in the case split = 1 you would need an arbitrarily
large sequence of splits occurring at 1. Which has probability 0.)

```
 0
Reply pamelafluente (240) 7/24/2012 8:24:18 AM

```.... And actually splitting deterministically at 1 (or n) we would
somehow be out of a "divide and conquer" scenario
(as indicated above by John), as in effect, as Patricia noted, we are
examining the elements one by one.

So i think this can be legitimately ruled out, as a real "degenerate"
case. No? This for the deterministic split.

For the random split case, i would guess that the complexity analysis
cannot include stuff which has probability 0. Just a guess ...
what would be the precise definition in this case ?
```
 0
Reply pamelafluente (240) 7/24/2012 8:43:27 AM

```pam <pamelafluente@libero.it> writes:

> ... And actually splitting deterministically at 1 (or n) we would
> somehow be out of a "divide and conquer" scenario
> (as indicated above by John), as in effect, as Patricia noted, we are
> examining the elements one by one.

It's not "splitting deterministically at 1 or n"; it's splitting that
happens, by outrageous fortune, at 1 or n all n-1 times.  If that can
happen, it forms the worst case for random splitting.

> So i think this can be legitimately ruled out, as a real "degenerate"
> case. No? This for the deterministic split.

Now I can answer your split at 2 or n-1 question.  That also gives O(n)
search so you'd have to rule that case out as well.  In fact, splitting
at k and n-k for any constant k (with some other rule when n <= k) gives
O(n) search so they all have to be ruled out, too!

> For the random split case, i would guess that the complexity analysis
> cannot include stuff which has probability 0. Just a guess ...
> what would be the precise definition in this case ?

I may have misunderstood the previous remarks because you are now
talking about "the random split case".  Is that not what's always been
under discussion?  But to answer...  Yes, anything with probability zero
is not considered because it can't happen.

The asymptotic bound is not what happens when n gets very large.  It's a
way to characterise the cost function.  You calculate the cost function
for an arbitrary n, and then use big O notation to summarise that
finding -- dropping any part of the formula that has no effect "in the
tail".  This is a re-wording of what Patricia wrote (or at least I
intended it to be).  You might want to go back to that post because I
thought at the time it was very clearly put.

--
Ben.
```
 0
Reply ben.usenet (6516) 7/24/2012 11:05:50 AM

```On Jul 24, 1:05 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
> pam <pamelaflue...@libero.it> writes:
>
> > ... And actually splitting deterministically at 1 (or n) we would
> > somehow be out of a "divide and conquer" scenario
> > (as indicated above by John), as in effect, as Patricia noted, we are
> > examining the elements one by one.
>
> It's not "splitting deterministically at 1 or n"; it's splitting that
> happens, by outrageous fortune, at 1 or n all n-1 times.  If that can
> happen, it forms the worst case for random splitting.

I agree with that. I have a problem with the probability being 0, in
the asymptotic analysis.

> > So i think this can be legitimately ruled out, as a real "degenerate"
> > case. No? This for the deterministic split.
>
> Now I can answer your split at 2 or n-1 question.  That also gives O(n)
> search so you'd have to rule that case out as well.  In fact, splitting
> at k and n-k for any constant k (with some other rule when n <= k) gives
> O(n) search so they all have to be ruled out, too!

Here i am getting lost. Assume that M is the middle index in an
arbitrarily large array.
Are you saying that if i split, for instance, deterministically at M-1,
i suddenly get an O(n) search instead of an O(log n) one?

> But to answer...  Yes, anything with probability zero
> is not considered because it can't happen.

I thought so.

> The asymptotic bound is not what happens when n gets very large...

Well, it seems to me that there is a limit in the definition.

So, to recap, what would be the answer? That only a deterministic
search using exactly the middle point is O(log n), while any other
deterministic and the random-split search is O(n)?
```
 0
Reply pamelafluente (240) 7/24/2012 11:39:32 AM

```On 7/24/2012 7:39 AM, pam wrote:
> On Jul 24, 1:05 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>> pam <pamelaflue...@libero.it> writes:
>
>>> ... And actually splitting deterministically at 1 (or n) we would
>>> somehow be out of a "divide and conquer" scenario
>>> (as indicated above by John), as in effect, as Patricia noted, we are
>>> examining the elements one by one.
>>
>> It's not "splitting deterministically at 1 or n"; it's splitting that
>> happens, by outrageous fortune, at 1 or n all n-1 times.  If that can
>> happen, it forms the worst case for random splitting.
>
> I agree with that. I have a problem with the probability being 0, in
> the asymptotic analysis.

If we're discussing *worst-case* analysis, it *doesn't matter* what the
probability of the worst-case is, as long as it is not exactly 0. It is
possible, trivial even, to craft an example that forces the worst-case
in this example and see that the worst case achieves linear time. That
the worst case is vanishingly rare is immaterial; that it can occur at
all is what matters.

>> Now I can answer your split at 2 or n-1 question.  That also gives O(n)
>> search so you'd have to rule that case out as well.  In fact, splitting
>> at k and n-k for any constant k (with some other rule when n <= k) gives
>> O(n) search so they all have to be ruled out, too!
>
>
> Here i am getting lost. Assume that  M is the middle index in an
> arbitrarily large array.
> Are you saying that if i split, for instance, deterministically at M-1
> i suddenly get a O(n) search
> instead of  O(log n)  ?

The split point is determined by some function f(n), where n is the
length of the array. If f(n) = k*n + b and k is 0 or 1, then the runtime
is linear. If 0 < k < 1, then the runtime is logarithmic.

> So, to recap, what would be the answer? That only a deterministic
> search
> using exactly the middle point is O(logn), while any other
> deterministic and the random-split search
> is O(n)  ?

Are you asking for the worst-case running time or for the "average-case"
running time? There is a difference between those two things, which you
seem to be persistently ignoring.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
```
 0
Reply Pidgeot18 (1399) 7/24/2012 1:21:41 PM

```>
> Are you asking for the worst-case running time or for the "average-case"
> running time? There is a difference between those two things, which you
> seem to be persistently ignoring.
>

The initial intention was to focus on worst case.

I think that when you talk of the computational complexity of an algo
you normally refer to that. No?

Anyway, the other analyses are most interesting as well.

> If we're discussing *worst-case* analysis, it *doesn't matter* what the
> probability of the worst-case is, as long as it is not exactly 0. It is
> possible, trivial even, to craft an example that forces the worst-case
> in this example and see that the worst case achieves linear time. That
> the worst case is vanishingly rare is immaterial; that it can occur at
> all is what matters.

So are you saying here that the asymptotic complexity of the
random-split search is O(n)? (I am asking because i have a hard time
believing that.)

I don't think one can invoke a sequence of splits occurring at 1 or n,
because we are talking of an event with 0 probability as far as
asymptotics are concerned.

Further, it would be quite weird that simply changing the split point
distribution from U(1, ..., n) to U(2, ..., n-1) you raise the
complexity to O(n). It just sounds plain wrong.

```
 0
Reply pamelafluente (240) 7/24/2012 1:47:09 PM

```> That the worst case is vanishingly rare is immaterial; that it can occur at
all is what matters

Why is that so? I think we need to prove it by using the definitions
of complexity (in the probabilistic context).

If i want to pick a normally distributed real on the line, i can say
that any of them "can occur", but still each one of them has
probability 0.

So the "can occur" argument seems weak.
```
 0
Reply pamelafluente (240) 7/24/2012 1:59:22 PM

```pam <pamelafluente@libero.it> writes:

> On Jul 24, 1:05 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>> pam <pamelaflue...@libero.it> writes:
>
>> > ... And actually splitting deterministically at 1 (or n) we would
>> > somehow be out of a "divide and conquer" scenario
>> > (as indicated above by John), as in effect, as Patricia noted, we are
>> > examining the elements one by one.
>>
>> It's not "splitting deterministically at 1 or n"; it's splitting that
>> happens, by outrageous fortune, at 1 or n all n-1 times.  If that can
>> happen, it forms the worst case for random splitting.
>
> I agree with that. I have a problem with the probability being 0, in
> the asymptotic analysis.

Where, in the standard definitions of worst-case complexity, does the
limit of some probability get used?  As far as I know, it does not.

Here's an example worked out in full:

// search for t in d[1]..d[n]:
binary-search(t, d, n)
{
    l := 1; h := n
    while h >= l do
        m := (l + h) / 2
        if d[m] = t then
            return m
        else if d[m] > t then
            h := m - 1
        else
            l := m + 1
    done
    return false
}

unless I've got something wrong (I probably have[1]), the number of
comparisons done for input size n is bounded above by log2(n)+1.  This is
an exact and tight bound.  No case ever takes more, and there is always
some input of size n which takes this number of compares.

Here there is no need to use big O, but we might as well.  The algorithm
is O(log(n)) because the base of the log function and the +1 are lost.

What changes if we write m := uniform(l, h)?  What is the upper bound on
the cost now?  It's n -- again exactly and tightly.  There is always
some execution that takes n compares, and the probability that this
occurs is non-zero -- for all n.  We never consider an "infinite" case
when the probability is zero.

[1] See Bentley, Jon, 1986. "Programming Pearls", Addison-Wesley.  Lots
of professional programmers got some detail of binary search wrong when
set as an exercise.

>> > So i think this can be legitimately ruled out, as a real "degenerate"
>> > case. No? This for the deterministic split.
>>
>> Now I can answer your split at 2 or n-1 question.  That also gives O(n)
>> search so you'd have to rule that case out as well.  In fact, splitting
>> at k and n-k for any constant k (with some other rule when n <= k) gives
>> O(n) search so they all have to be ruled out, too!
>
>
> Here i am getting lost. Assume that  M is the middle index in an
> arbitrarily large array.
> Are you saying that if i split, for instance, deterministically at M-1
> i suddenly get a O(n) search
> instead of  O(log n)  ?

No.  Splitting at M+k for some (possibly negative) constant k as well as
splitting at M*k for some constant k in (0,1) all produce logarithmic
bounds.

I was talking about splitting at k and/or n-k.  That's generalising from
the pathological case of splitting at 1 and/or n.  Eliminating only one
element at a time is not, asymptotically, worse than eliminating 2 or
3 or 4 or...  They are all O(n).

>> But to answer...  Yes, anything with probability zero
>> is not considered because it can't happen.
>
> I thought so.

Just be sure: because a zero probability means that it can't happen so
it has no place in a worst case analysis.

>> The asymptotic bound is not what happens when n gets very large...
>
> well it seems to me that there is a limit in the definition.

You can define O(f(x)) in terms of limits but that's got nothing to do
with any algorithm.  Remember, we work out the worst-case cost first as
a function of n.  We may then choose to summarise that by using big O
notation.  If we do, we don't re-visit the code to see what happens "in
the limit".  We already know, exactly, the cost of the algorithm for all
n and the big O step is a technical re-writing of this function in a
simpler form.

In my examples above, the worst-case costs are log2(n) + 1 and n --
exactly.  No need to use big O, or take any limits.  One is logarithmic
and the other is linear.

> So, to recap, what would be the answer? That only a deterministic
> search
> using exactly the middle point is O(logn), while any other
> deterministic and the random-split search
> is O(n)  ?

No.  For example, a random split based on

m := (l+h)/uniform(2,5)

would be logarithmic.  The key distinction is between reducing the
range by a factor of n or by a fixed amount independent of n.

--
Ben.
```
 0
Reply ben.usenet (6516) 7/24/2012 2:27:05 PM
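[Editorial note: Ben's distinction (reducing the range by a factor of n versus by an amount independent of n) can be checked with a small sketch of my own. The loop below follows the longest root-to-leaf path for a given split rule, i.e. the adversary always keeps the target in the larger side.]

```python
def worst_case_comparisons(n, split):
    """Longest possible comparison count when a range of size k is always
    probed at 1-based position split(k)."""
    w = 0
    while n > 0:
        s = split(n)
        n = max(s - 1, n - s)  # adversary keeps the target in the larger side
        w += 1
    return w

n = 1_000_000
mid     = worst_case_comparisons(n, lambda k: (k + 1) // 2)    # classic midpoint
quarter = worst_case_comparisons(n, lambda k: max(1, k // 4))  # split at 1/4
edge    = worst_case_comparisons(n, lambda k: 1)               # constant offset
```

`mid` and `quarter` both come out logarithmic (roughly log2(n) and log4/3(n) steps), while `edge` is exactly n: splitting a constant distance from the end never divides anything.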

```On 7/24/2012 9:59 AM, pam wrote:
>> That the worst case is vanishingly rare is immaterial; that it can occur at
> all is what matters
>
> Why is so ? I think we needy to prove it by using the definitions of
> complexity
> (in the probabilistic context).

Let me resort to a more standard analogue: quicksort. Most complexity
analysts would agree that quicksort is an O(n^2) sort in the worst case,
despite the fact that its performance would rather closely match your
randomized split in binary search (indeed, the analysis of how the split
ought to be subdivided is exactly the same in both cases). The fact that
quicksort can achieve O(n^2) performance can be a security problem--on
the order of million-element arrays or more, you're talking about the
difference between milliseconds and seconds. An attacker who can force
the worst case can craft a denial-of-service attack.

Probabilistic analysis requires assumption of distributions; in terms of
what input you can get, I would not expect the uniform distribution to
be representative of what actually happens: nearly-sorted, sorted, and
reverse-sorted lists are probably much more likely to come up in
practice, and specially-crafted sequences to break algorithms probably
has a slightly higher chance than the rest of the background. Arguing
that a case is unlikely to come up in the uniform distribution is
pointless since we don't see a uniform distribution in practice.

> If i want to pick up a normally distributed real on the line  i can
> say that any of them "can occur"
> but still each one of them has probability 0.
>
> So the "can occur" argument seems weak.

We're dealing with discrete probability domains here, not continuous, so
every individual outcome has non-zero probability.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
```
 0
Reply Pidgeot18 (1399) 7/24/2012 3:14:44 PM
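[Editorial note: Joshua's quicksort analogy is easy to demonstrate. A sketch of my own, counting comparisons for a quicksort that always pivots on the first element; already-sorted input forces the quadratic case.]

```python
import random

def quicksort_comparisons(a):
    """Comparisons made by quicksort with a first-element pivot (sketch)."""
    if len(a) <= 1:
        return 0
    pivot, rest = a[0], a[1:]
    left  = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return len(rest) + quicksort_comparisons(left) + quicksort_comparisons(right)

shuffled = list(range(500))
random.Random(1).shuffle(shuffled)
typical     = quicksort_comparisons(shuffled)          # roughly 2*n*ln(n)
adversarial = quicksort_comparisons(list(range(500)))  # n*(n-1)/2 == 124750
```

The adversarial (sorted) input needs exactly n*(n-1)/2 comparisons, which is why an attacker who controls the input can force the quadratic worst case.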

```On Jul 24, 4:27=A0pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

>
> would be logarithmic. =A0The key distinction is between reducing the
> range by a factor of n or by a fixed amount independent of n.
>
> --
> Ben.

Very interesting Ben (i like the point you made on the dependence on
n).
After all if there were no dependence on n, we would not be really
"dividing" anything  ;-)

So, as far as the splitting point is some fraction of n, say n/k,
where
1 < n/k < n we are within O(logn). Power of "divide and conquer".  ;-)

The only point which remains now a little unclear to me
is just the case (random splitting) where we allow n/k = 1, n.

Now i see your argument. In practice you say, since for each
array of size n, i "could" get to pick up always 1 (or n),
then i have to say that the search could take O(n).

(Clearly, we could easily fix that by saying "dont pick 1 or n"
in the random splitting.)

That is fine. However, we must also observe that in the asymptotic
analysis as n tends to infinity there is a region, in this case, say
intuitively
between 2 bounds O(logn) and O(n), whose probability can be made
arbitrarily small.

So, in essence, any experiment done with any kind of data * provided n
is large
enough * will eventually lead to observe a complexity of O(logn), and
not O(n)

So it becomes probably a matter of how complexity is defined within
a probabilistic context.

I think that including events with 0 probability might be rather
misleading.

Take for instance the case where one switches for instance from
1 < n/k < n to the condition 1 <=  n/k  < n.

Imagine you are applying the procedure to factorize a large modulus or
something. Now would you really
support that in practice you go from O(log n) to O(n) ?

It would be rather far from being meaningful, as to how one normally
understands complexity. No ?

```
 0
Reply pamelafluente (240) 7/24/2012 3:15:54 PM

```On 7/24/2012 8:15 AM, pam wrote:
....
> That is fine. However, we must also observe that in the asymptotic
> analysis as n tends to infinity there is a region, in this case, say
> intuitively
> between 2 bounds O(logn) and O(n), whose probability can be made
> arbitrarily small.

You are still confusing the algorithm analysis, whose result should be
the worst case performance as a function of finite problem size, with
setting an asymptotic bound on that function.

It may be worth going back to first principles. Suppose we are looking
for a worst case upper bound. That is, we want to be able to say the
worst case performance of the algorithm is O(g(n)) for some function g.

Let f(n) be the maximum number of comparisons over all inputs of size n
and all possible random sequences.

We need to show that there exist constants c and N such that, for all
n>N, f(n) <= c*g(n).

If there existed a finite integer M such that, for all finite problem
sizes greater than M, the probability of a particular case is actually
zero, then we can eliminate that case from the analysis, by using a
value of N that is greater than M.

However, for the algorithms we are talking about, the probability of a
split sequence that will result in n comparisons is strictly positive
for all finite problem sizes. There is no finite N that is so big we can
eliminate that case for all values of n greater than N.

The algorithm analysis is always in terms of finite problem sizes. The
limit-like step is in mapping the worst case performance as a function
of problem size to a bounding function, and even that step requires us
to say something about performance for all finite n greater than N.

Patricia
```
 0
Reply pats (3215) 7/24/2012 4:25:35 PM
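Patricia's f(n) can be exhibited concretely: drive the randomized search with an adversarial choice of split points and the cost reaches n. A minimal Python sketch (the function and parameter names are mine, for illustration):

```python
def random_split_search(a, target, pick):
    """Search a sorted list, letting pick(lo, hi) choose each probe index.
    Returns the number of probes (three-way comparisons) performed."""
    lo, hi = 0, len(a) - 1
    probes = 0
    while lo <= hi:
        mid = pick(lo, hi)
        probes += 1
        if a[mid] == target:
            return probes
        elif a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return probes

n = 1000
a = list(range(n))
# One split sequence with strictly positive probability: always probe
# the left end.  Searching for the last element then costs n probes.
worst = random_split_search(a, n - 1, lambda lo, hi: lo)
print(worst)  # n, i.e. linear, exactly as in Patricia's argument
```

The split sequence used here has strictly positive probability under uniform random splitting for every finite n, which is exactly why it cannot be excluded from the worst-case analysis.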

```pam <pamelafluente@libero.it> writes:

> On Jul 24, 4:27 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
>>
>> would be logarithmic.  The key distinction is between reducing the
>> range by a factor of n or by a fixed amount independent of n.
>>
>> --
>> Ben.
>
>
> Very interesting Ben (i like the point you made on the dependence on
> n).
> After all if there were no dependence on n, we would not be really
> "dividing" anything  ;-)
>
> So, as far as the splitting point is some fraction of n, say n/k,
> where
> 1 < n/k < n we are within O(logn). Power of "divide and conquer".  ;-)
>
>
> The only point which remains now a little unclear to me
> is just the case (random splitting) where we allow n/k = 1, n.
>
>
> Now i see your argument. In practice you say, since for each
> array of size n, i "could" get to pick up always 1 (or n),
> then i have to say that the search could take O(n).

I'd say "in theory" rather than "in practice".  In practical terms, the
worst case of a randomly chosen split is of little importance.  Its
significance is theoretical.

> (Clearly, we could easily fix that by saying "dont pick 1 or n"
> in the random splitting.)

What would that fix?  uniform(2, n-1) is as bad as uniform(1, n).

> That is fine. However, we must also observe that in the asymptotic
> analysis as n tends to infinity there is a region, in this case, say
> intuitively
> between 2 bounds O(logn) and O(n), whose probability can be made
> arbitrarily small.

I don't understand this, but it seems to lead to something I agree with
so there's probably little point in explaining it further.

> So, in essence, any experiment done with any kind of data * provided n
> is large
> enough * will eventually lead to observe a complexity of O(logn), and
> not O(n)

Yes, experiments will usually find the average-case performance.

However: You can't observe worst-case complexity as this example shows.
You simply won't see that uniform splitting is, theoretically, worse than
logarithmic because experiments will give you the average.

By the way, your use of big O is getting out of hand (we all do it
because it's become a short-hand for something else).  You can't say
that something has a complexity of O(log(n)) and not O(n).  Proper
binary search is O(n), it just happens to be O(log(n)) as well!  If this
surprises you, check the definition.

> So it becomes probably a matter of how complexity is defined within
> a probabilistic context.

Hmmm.  To what does "it" refer?  If you mean "the answer" then of
course, but that's just a tautology.  What the square root of 9 is
depends on how we define the square root in the context of comp.theory.

"Complexity", unadorned, usually means worst-case complexity.

> I think that including events with 0 probability might be rather
> misleading.

It would be wrong, not misleading.  That you say this, suggests that you did
not agree with my rather detailed previous message.  Which bit suggests
that something with zero probability is being considered?

> Take for instance the case where one switches for instance from
> 1 < n/k < n to the condition 1 <=  n/k  < n.

I don't understand this.

> Imagine you are applying the procedure to factorize a large modulus or
> something. Now would you really
> support that in practice you go from O(log n) to O(n) ?

I think you are saying that some change (characterised by the two
inequalities above) to some as yet unspecified algorithm for factorising
numbers would not alter its complexity bound from log(n) to n "in
practice".

If that's right, I can't say.  I don't understand the two inequalities
and I have no idea what factoring algorithm you have in mind.  Also, I
don't know what "in practice" means.  I know what complexity means "in
theory".  In practice, I measure run-times, but that's not complexity
analysis.  Another meaning of "in practice" could be that you want to do
"expected average-case cost" rather than "worst-case cost" analysis
since that's a measure that's often more useful for practical purposes.

> It would be rather far from being meaningful, as to how one normally
> understands complexity. No ?

Well it might be, but I just don't understand the set-up.

--
Ben.
```
 0
Reply ben.usenet (6516) 7/24/2012 5:10:23 PM

```On Jul 24, 7:10 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

>
> What would that fix?  uniform(2, n-1) is as bad as uniform(1, n).
....
>
> By the way, your use of big O is getting out of hand (we all do it
> because it's become a short-hand for something else).  You can't say
> that something has a complexity of O(log(n)) and not O(n).  Proper
> binary search is O(n), it just happens to be O(log(n)) as well!  If this
> surprises you, check the definition.
....
>
> "Complexity", unadorned, usually means worst-case complexity.

Well i understand we want an upper bounding function. But i guess you
normally refer to the "smallest"
(or "tightest" one) or else one could always place some enormous
function f(n) in O(f(n)).

However, in this particular instance i think that Ben's observation is
really key.
In fact i believe that talking of splitting at "given place"s, makes
the problem "ill
defined" and allows making formal remarks which do not attack the
essence of the matter under discussion.

What we should really do, i think, is saying that the splitting will
occur at place n/k,  with k a continuous Uniform in (1, n)

This way, i think, we are defining a proper generalization of the,
say, "mid point" binary search. A divide and conquer procedure.

Viewed under this perspective, suggested by the observation made by
Ben, I think we can solve various problems.

> > between 2 bounds O(logn) and O(n), whose probability can be made
> > arbitrarily small.
>
> I don't understand this, but it seems to lead to something I agree with
> so there's probably little point in explaining it further.

....
>
> > I think that including events with 0 probability might be rather
> > misleading.
>
> It would be wrong, not misleading.  That you say this, suggests that you did
> not agree with my rather detailed previous message.  Which bit suggests
> that something with zero probability is being considered?
>

You seem to be saying that vanishing events should not be included.

But this seems in contrast with the post from Patricia:

> However, for the algorithms we are talking about, the probability of a
> split sequence that will result in n comparisons is strictly positive
> for all finite problem sizes. There is no finite N that is so big we can
> eliminate that case for all values of n greater than N.

So having reformulated the problem assigning a distribution to the
split point n/k  (or its "floor")
(as a function of n) the question still remains: are you saying that
this is no better than O(n) or can have the better
bound O(logn)  ?
This is a huge difference as it can mean seconds instead of millions
of years
if n is large, for some important problems.

```
 0
Reply pamelafluente (240) 7/24/2012 7:45:12 PM

```On 7/24/2012 1:45 PM, pam wrote:
> Well i understand we want an upper bounding function. But i guess you
> normally refer to the "smallest"
> (or "tightest" one) or else one could always place some enormous
> function f(n) in O(f(n)).

I don't want to deflect this thread but your last sentence above must be
addressed. The tightest/smallest upper bound would be an EXACT analysis of
complexity. That would require an EXACT definition of all of the
computational operators. Unfortunately, that would vary from formalism
to formalism and as you have probably observed in this thread nobody
bothers with such trivia as formalisms. The analyses in Knuth's books
are the rare, rare exception. He even defined MIX for that purpose.

The general goal of complexity analysis is to first classify a PROBLEM
as being in the sub-linear hierarchy, linear, polynomial, somewhere in
the NP jungle, multi-exponential for sure, or not Turing computable.
After that detail is dealt with (where the O notation is just fine) you
can get down to details of the problem and calculate bounds in the
proper hierarchy.

You can also analyze particular ALGORITHMS (this is really what Knuth
does with MIX). The goal here is to compare a proposed algorithm with
the problem complexity. But here again, one needs to pin down a
formalism and big O is sometimes good enough. Remember, when you present
an algorithm you present it in a particular language. The compiler for
that language (if you assume there may be one) might change the
complexity of what you have written and analyzed, e.g., it might
increase or decrease the polynomial degree if the algorithm and problem
are inherently polynomial time. So once again, a big O estimate that is
in the proper part of the complexity jungle is usually good information.
The amount of work and intellectual capital to do much better is
typically enormous.

My suggestion for most of you contributing to this thread is to flip
open a random volume of Knuth's Art of Computer Programming at a random
place. Random from any distribution you like. (If you can arrange it to
pick an analysis that you already understand, you don't need this
exercise.) It's a good bet that you can't read through the complexity
analysis that you randomly picked let alone duplicate it on another
problem. The point of this probably hostile-sounding comment is that it
would be enlightening for you all to back off and work through one
example at least.

Some of the things you all are saying might be true and novel but you'll
convince nobody without presenting some of the deeper details. Hand
waving is fun but it's bad mathematics and hasn't and isn't going to
convince anyone. Even if the conclusion is right!

```
 0
Reply jbbrus (25) 7/24/2012 9:27:42 PM

```On 7/24/2012 2:27 PM, Jeff Barnett wrote:
....
> My suggestion for most of you contributing to this thread is to flip
> open a random volume of Knuth's Art of Computer Programming at a random
> place. Random from any distribution you like. (If you can arrange it to
> pick an analysis that you already understand, you don't need this
> exercise.) It's a good bet that you can't read through the complexity
> analysis that you randomly picked let alone duplicate it on another
> problem. The point of this probably hostile-sounding comment is that it
> would be enlightening for you all to back off and work through one
> example at least.

Although I prefer books that present their algorithms in higher level
pseudo-code rather than MIX, I've frequently used my copy of Knuth over
the decades since I bought it, and have not experienced any special
difficulty understanding his proofs.

I've also taken courses in subjects such as algorithm analysis, so there
were phases in my life when I was working through several complexity
analysis examples per week.

Writing style should change to match the context and audience. It would
be far easier, but less useful, for me to communicate what I'm trying to
say in mathematical notation.

Patricia

```
 0
Reply pats (3215) 7/24/2012 10:01:54 PM

```pam <pamelafluente@libero.it> writes:

> On Jul 24, 7:10 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
>>
>> What would that fix?  uniform(2, n-1) is as bad as uniform(1, n).
> ...
>>
>> By the way, your use of big O is getting out of hand (we all do it
>> because it's become a short-hand for something else).  You can't say
>> that something has a complexity of O(log(n)) and not O(n).  Proper
>> binary search is O(n), it just happens to be O(log(n)) as well!  If this
>> surprises you, check the definition.
> ...
>>
>> "Complexity", unadorned, usually means worst-case complexity.
>
>
> Well i understand we want an upper bounding function. But i guess you
> normally refer to the "smallest"
> (or "tightest" one) or else one could always place some enormous
> function f(n) in O(f(n)).

There is other notation for other bounds. Big O is just an upper bound
so all functions that are O(log(x)) are also O(x) and O(x^2).  If you
want a tighter bound you write Theta(f(x)).

> However, in this particular instance i think that Ben's observation is
> really key.
> In fact i believe that talking of splitting at "given place"s, makes
> the problem "ill
> defined" and allows making formal remarks which do not attack the
> essence of the matter under discussion.

I don't know what observation of mine leads to this, but I did not
suggest that anything discussed so far leads to ill defined problems.

> What we should really do, i think, is saying that the splitting will
> occur at place n/k,  with k a continuous Uniform in (1, n)

Why should we do this?  To what end?  n/uniform(1,n) has the same range
as uniform(1,n), the only difference is the distribution.

> This way, i think, we are defining a proper generalization of the,
> say, "mid point" binary search. A divide and conquer procedure.

Except that it does not, in the worst case, divide.  The effect is still
an O(n) algorithm.

> Viewed under this perspective, suggested by the observation made by
> Ben, I think we can solve various problems.

I can't see how you get here from anything I've said.  You must have
misunderstood what I was saying.

<snip>
>> > I think that including events with 0 probability might be rather
>> > misleading.
>>
>> It would be wrong, not misleading.  That you say this, suggests that you did
>> not agree with my rather detailed previous message.  Which bit suggests
>> that something with zero probability is being considered?
>
> You seem to be saying that vanishing events should not be included.

If it can happen, include it; if it can't, ignore it.  It seems so clear
to me, but you keep coming back with slightly vague phrasing ("vanishing
events" -- what are they?) as if you understand what I'm saying but
would rather it were not so.

> But this seems in contrast with the post from Patricia:
>
>> However, for the algorithms we are talking about, the probability of a
>> split sequence that will result in n comparisons is strictly positive
>> for all finite problem sizes. There is no finite N that is so big we can
>> eliminate that case for all values of n greater than N.

Patricia and I are in agreement here.  No contrast at all that I can see.

> So having reformulated the problem assigning a distribution to the
> split point n/k  (or its "floor")
> (as a function of n) the question still remains: are you saying that
> this is no better than O(n) or can have the better
> bound O(logn)  ?
> This is a huge difference as it can mean seconds instead of millions
> of years
> if n is large, for some important problems.

Of course we can get better than O(n) -- just use k=2.  If you insist on
a random algorithm with better than O(n) worst-case bounds, split at
n/uniform(2, 2.00001).

What's going on here?  If there were any down side to using n/2 I can
see why one might consider alternatives -- even random ones (some people
do for quicksort, partitioning for example) --- but since n/2 is simple,
easy to calculate and optimal you can't be proposing an alternative.

--
Ben.
```
 0
Reply ben.usenet (6516) 7/24/2012 10:02:38 PM

```On 7/24/2012 3:45 PM, pam wrote:
> What we should really do, i think, is saying that the splitting will
> occur at place n/k,  with k a continuous Uniform in (1, n)

I'm not sure what you mean here that is different other than possibly
excluding the endpoints of the range. Just excluding the endpoints
doesn't guarantee you anything: instead of picking n each time, you can
pick n - 1 and still get the linear performance.

If you want to guarantee that the algorithm is logarithmic in the worst
case, you have to bound the split points by fractions of n, not numbers,
e.g., k uniform in [n/4, 3n/4].

>> It would be wrong, not misleading.  That you say this, suggests that did
>> not agree with my rather detailed previous message.  Which bit suggests
>> that something with zero probability is being considered?
>
> You seem to be saying that vanishing events should not be included.

You seem to be confused about limits at infinity. Just because the limit
of the probability of "killer" sequences is 0 does not mean that the
probability of the killer sequences, for any N, is 0.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
```
 0
Reply Pidgeot18 (1399) 7/24/2012 10:06:30 PM
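Joshua's suggested fix can be sketched as code: if each probe is confined to the middle half of the current range, the range shrinks by a factor of at least 3/4 per probe, so the worst case is bounded by roughly log base 4/3 of n probes. A Python sketch (assumed names; not an implementation from the thread):

```python
import math
import random

def middle_half_search(a, target):
    """Random-split search with the probe restricted to the middle half
    of the current range, so the range shrinks by a factor >= 3/4.
    Returns the number of probes performed."""
    lo, hi = 0, len(a) - 1
    probes = 0
    while lo <= hi:
        span = hi - lo + 1
        # probe index somewhere in [lo + span//4, lo + 3*span//4]
        mid = lo + random.randint(span // 4, (3 * span) // 4)
        probes += 1
        if a[mid] == target:
            return probes
        elif a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return probes

random.seed(1)
n = 1_000_000
a = list(range(n))
bound = math.log(n) / math.log(4 / 3) + 2   # ~50 for n = 10^6
worst_seen = max(middle_half_search(a, random.randrange(n)) for _ in range(200))
print(worst_seen <= bound)  # True: worst case is logarithmic by construction
```

This is the essential difference from k uniform in (1, n): here the surviving sub-range is at most 3/4 of the old one no matter how unlucky the draws are, so the worst case, not just the average, is logarithmic.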

```On 7/24/2012 3:02 PM, Ben Bacarisse wrote:
....
> What's going on here?  If there were any down side at to using n/2 I can
> see why one might consider alternatives -- even random ones (some people
> do for quicksort, partitioning for example) --- but since n/2 is simple,
> easy to calculate and optimal you can't be proposing an alternative.
>

Patricia

```
 0
Reply pats (3215) 7/24/2012 10:10:45 PM

```Jeff Barnett <jbbrus@comcast.net> writes:
<snip>
> My suggestion for most of you contributing to this thread is to flip
> open a random volume of Knuth's Art of Computer Programming at a
> random place. Random from any distribution you like. (If you can
> arrange it to pick an analysis that you already understand, you don't
> need this exercise.) It's a good bet that you can't read through the
> complexity analysis that you randomly picked let alone duplicate it on
> another problem. The point of this probably hostile-sounding comment
> is that it would be enlightening for you all to back off and work
> through one example at least.

And my suggestion to you is that you sdkj/,  Hey!  Why did you do that?
I was typing.  I need to finish this post.  No it can't wait.  Oh, I
suppose so.  Alright, I'll come to bed, maybe I'll have calmed down in
the morning.

--
Ben.
```
 0
Reply ben.usenet (6516) 7/24/2012 10:18:11 PM

```On Jul 25, 12:06 am, Joshua Cranmer <Pidgeo...@verizon.invalid> wrote:
> On 7/24/2012 3:45 PM, pam wrote:

> I'm not sure what you mean here that is different other than possibly
> excluding the endpoints of the range

No.

The reason why i am formulating the problem as
a binary search made through random split at

n/k

with k being a uniform continuous random variable (or any other you
like), does not
lie in the inclusion or exclusion of endpoints of the range.

(The variable being continuous, that does not even make a difference.)

I am doing this to make the split point explicitly dependent on n
or else we have experts like Ben coming out and saying that it's
O(n) for reason due to the split point being independent of n   ;-)

This was always my intention, in fact i initially talked about n/2, n/4
and then went on to the randomized version, clearly having in mind n/k
(with n/k random).

Now, having (hopefully) corrected the problem formulation, Can we say
that this is O(logn)
or we cannot do better than O(n) ? This is the question.

The whole point is that i have a feeling that this is O(logn), but you
guys have been
making a point about occurrences like: 1 1 1 1 ... infinite times
which would cause the complexity to be not better than O(n).

I have been arguing that these cases have limit probability 0 and
should not be taken into account.

Patricia was saying that they have to be taken into account in any
case.

Ben seemed to agree at one point that vanishing events should not be
considered ("vanishing" meaning
with a limiting probability 0), but then said he is in agreement with
Patricia.

So at the moment i have still not understood what is the answer  ;-)

Replying to Joshua: the case 1 1 1 1 1 ...., where the number of 1s is
larger than any given number,
has a probability that can be made arbitrarily close to 0. This is why
i believe it should not
be considered.

Note that this is the only case which would "abruptly" lead to not
better than O(n) from O(logn). I don't think that is right
to say that the procedure is O(n) just for such impossible case.  It
would not match with any real
"experience" of the algorithm which  to any purpose would be perceived
as O(logn).

That is, a program dealing with a large n would always terminate in
seconds (or minutes),
and not in millennia (or millions years).

Well in any case i might say that is O(logn) "almost certain". Can we
agree on that ? Or no ?
```
 0
Reply pamelafluente (240) 7/24/2012 11:05:54 PM

```On 7/24/2012 7:05 PM, pam wrote:
> I am doing this to make the split point explicitly dependent on n
> or else we have experts like Ben coming out and saying that it's
> O(n) for reason due to the split point being independent of n   ;-)

You make k uniform from (1, n). This permits n / (n - 1), which is
"effectively" the same as a split point of 1. Or n / (n - 1000), which
is "effectively" the same as 1000. Both cases are still worst-case linear.

> This was always my intention, in fact i initially talked about n/2, n/4
> and then went on to the randomized version, clearly having in mind n/k
> (with n/k random).
>
>
> Now, having (hopefully) corrected the problem formulation, Can we say
> that this is O(logn)
> or we cannot do better than O(n) ? This is the question.

If you want it to be logarithmic, you have to guarantee that the
worst-case split points are a constant fraction of n, like k being
uniform from (1/4, 3/4). Since one of your endpoints is a linear
function in n, your worst-case split point becomes a non-constant
fraction of n, so you get the linear worst-case scenario.

> The whole point is that i have a feeling that this is O(logn), but you
> guys have been
> making a point about occurrences like: 1 1 1 1 ... infinite times
> which would cause the complexity to be not better than O(n).
>
> I have been arguing that these cases have limit probability 0 and
> should not be taken into account.

If you are talking about worst-case runtime, you NEED to take those
cases into account.

> Replying to Joshua: the case 1 1 1 1 1 ...., where the number of 1s is
> larger than any given number,
> has a probability that can be made arbitrarily close to 0. This is why
> i believe it should not
> be considered.

Arbitrarily close but never exactly equal to 0. That is the key: there
is always a way for someone to construct a killer sequence.

> Note that this is the only case which would "abruptly" lead to not
> better than O(n) from O(logn). I don't think that is right
> to say that the procedure is O(n) just for such impossible case.  It
> would not match with any real
> "experience" of the algorithm which  to any purpose would be perceived
> as O(logn).

In algorithms like these where the runtime is heavily dependent on what
kind of input you get, people introduce the notion of "average" runtime
to better capture what typically happens on most input. The average
runtime of your algorithm is indeed logarithmic, assuming that the
probabilities of the "rounds" are independent.

But note that the worst-case runtime is still linear, and this can
matter in cases where degraded performance is absolutely intolerable.

> Well in any case i might say that is O(logn) "almost certain". Can we
> agree on that ? Or no ?

If the probability of a bad sequence is nonzero in theory, it can be
much, much higher in practice. The fact that it is "vanishingly small"
under a uniform distribution is completely meaningless, since the real
world almost never gives you a uniform distribution of inputs.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
```
 0
Reply Pidgeot18 (1399) 7/24/2012 11:28:36 PM
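The average/worst-case split Joshua describes shows up clearly in a quick simulation: with a uniformly chosen probe, the expected number of probes is logarithmic (about 2 ln n, the same constant as in quicksort's average case), even though the worst case is linear. A hedged Python sketch (names are illustrative):

```python
import random

def uniform_split_search(a, target):
    """Search a sorted list with the probe chosen uniformly at random
    in the current range; returns the number of probes performed."""
    lo, hi = 0, len(a) - 1
    probes = 0
    while lo <= hi:
        mid = random.randint(lo, hi)
        probes += 1
        if a[mid] == target:
            return probes
        elif a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return probes

random.seed(2)
n = 1_000_000
a = list(range(n))
trials = [uniform_split_search(a, random.randrange(n)) for _ in range(500)]
avg = sum(trials) / len(trials)
print(avg)  # typically a few dozen probes for n = 10^6, nowhere near n
```

Experiments like this always find the average, which is why, as Ben noted earlier in the thread, you cannot observe worst-case complexity empirically.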

```pam <pamelafluente@libero.it> writes:
<snip>
> I am doing this to make the split point explicitly dependent on n
> or else we have experts like Ben coming out and saying that it's
> O(n) for reason due to the split point being independent of n   ;-)

What does the wink signify to you?  To me it means, "I know this is not
what Ben said but I hope I get a rise from him by saying it".  If that's
what you were going for, you nailed it; if not, you may want to re-think
your use of smilies.

<snip>
> Ben seemed to agree at one point that vanishing events should not be
> considered ("vanishing" meaning
> with a limiting probability 0), but then said he is in agreement with
> Patricia.
>
> So at the moment i have still not understood what is the answer  ;-)

That's a flat-out lie.  I said I did not know what a "vanishing event"
was, so how could you read that as agreeing to anything about them at
all?  I don't think you are confused about what I said, I think you want
to misrepresent it for your own purposes.

<snip>
> Well in any case i might say that is O(logn) "almost certain". Can we
> agree on that ? Or no ?

Why do you care?  No one would use a random split when a binary split is
simple and efficient.  What's the actual algorithm you have in mind
whose worst-case cost you are so keen to be logarithmic?  There's more
to this that you are letting on, and it's starting to have a hint of
obsession about it.  Spill the beans.

--
Ben.
```
 0
Reply ben.usenet (6516) 7/25/2012 1:26:05 AM

```On Jul 25, 3:26 am, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
> pam <pamelaflue...@libero.it> writes:
>
> <snip>
>
> > I am doing this to make the split point explicitly dependent on n
> > or else we have experts like Ben coming out and saying that it's
> > O(n) for reason due to the split point being independent of n   ;-)
>
> What does the wink signify to you?  To me it means, "I know this is not
> what Ben said but I hope I get a rise from him by saying it".  If that's
> what you were going for, you nailed it; if not, you may want to re-think
> your use of smilies.
>
> <snip>
>
> > Ben seemed to agree at one point that vanishing events should not be
> > considered ("vanishing" meaning
> > with a limiting probability 0), but then said he is in agreement with
> > Patricia.
>
> > So at the moment i have still not understood what is the answer  ;-)
>
> That's a flat-out lie.  I said I did not know what a "vanishing event"
> was, so how could you read that as agreeing to anything about them at
> all?  I don't think you are confused about what I said, I think you want
> to misrepresent it for your own purposes.
>
> <snip>
>
> > Well in any case i might say that is O(logn) "almost certain". Can we
> > agree on that ? Or no ?
>
> Why do you care?  No one would use a random split when a binary split is
> simple and efficient.  What's the actual algorithm you have in mind
> whose worst-case cost you are so keen to be logarithmic?  There's more
> to this that you are letting on, and it's starting to have a hint of
> obsession about it.  Spill the beans.
>
> --
> Ben.

I think that you care more about appearing right at any time, than
really caring for the
real essence of the discussion. In fact i have not yet understood what
your answer really is,
but mostly seen sort of "self defensive" formal remarks, or
aggressiveness when in difficulty.
We are not here to fight: just to discuss. I am not afraid of being
wrong: it's what
allows me to understand better.

> Why do you care?

If one said that for any new question there would be no progress.
You may never know what kind of implication may stem from new
developments.

Further, if for each question being asked, one replied with "why do
you care"
there would not be much point having a discussion group.  ;-)

About smilies: i use them normally as a sign of friendship, because
written communication does not allow one to see people's faces, so
sometimes one might wonder about the nature of a remark.

So a smile is just a way to say i am smiling at you. I am thankful,
and so on,
and not an enemy or an obsessed  ;-))

```
 0
Reply pamelafluente (240) 7/25/2012 9:39:19 AM

```pam <pamelafluente@libero.it> writes:
<snip>
> I think that you care more about appearing right at any time, than
> really caring for the
> real essence of the discussion.

Then you've not seen any of the posts where I am wrong with depressing
frequency.  I don't recall ever being wrong and not saying so when it's
been pointed out.

> In fact i have not yet understood what your answer really is,
> but mostly seen sort of "self defensive" formal remarks, or
> aggressiveness when in difficulty.

No, you can't be in any doubt about what my answer is.  I said it's linear
and not logarithmic -- O(n) and not O(log(n)) -- and I've said it more
than once.  I don't think you have any trouble understanding that this is
my answer.

You might have trouble understanding why this is my answer.  Here I can
only say that I've been as clear as I can be.  If you could say what
parts are not clear to you, I could say more.

> We are not here to fight: just to discuss. I am not afraid of being
> wrong: it's what
> allows me to understand better.

If you want a discussion, can I suggest you answer requests for
clarification?

>> Why do you care?
>
> If one said that for any new question there would be no progress.
> You may never know what kind of implication may stem from new
> developments.
>
> Further, if for each question being asked, one replied with "why do
> you care"
> there would not be much point having a discussion group.  ;-)

If every question got the detailed and helpful discussion that this one
has had before someone asked "why do you care?", then Usenet would be a
healthy place indeed.

I was not dismissive of the original question (any idea how much time my
detailed answers have taken me?); it was a genuine request to know why
you care.  I went on to say why it's hard to see why this matters to you,
since both a practical alternative to the analysis (an average-case
analysis, which weights costs by likelihood) and an optimal alternative
algorithm (binary search) have been discussed.  There is something else
going on here which you are not talking about.

> About smiling i use normally as a sign of friendship, because since
> the written communication does not allow to see people in the face,
> sometime one might wonder about the nature of a remark.
>
> So a smile is just a way to say i am smiling at you. I am thankful,
> and so on,
> and not an enemy or an obsessed  ;-))

Then all I can say is take care.  ;-) is a wink not a smile.  If I say
"you agree that you are not telling us the whole story", that might just
be a misunderstanding, but if I add a wink to that, I'm being
deliberately provocative.

--
Ben.
```
 0
Reply ben.usenet (6516) 7/25/2012 2:08:59 PM

```On Jul 25, 4:08 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

>
> No, you can't be in any doubt about what my answer is.  I said it's linear
> and not logarithmic -- O(n) and not O(log(n)) -- and I've said it more
> than once.  I don't think you have any trouble understanding that this is
> my answer.
>
> You might have trouble understanding why this is my answer.  Here I can
> only say that I've been as clear as I can be.  If you could say what
> parts are not clear to you, I could say more.

Well, here the question is a different one.

When you answered O(n) you were considering a setting
where the split point does not depend on n.

Now we have "refined" the formulation making clear the dependence
("divide and conquer").

So, now the question is (and it has always been, in essence):

Should one singularity, which causes the least upper bound to rise
from O(logn) to "no better than O(n)", be considered in the
computational complexity assessment, given that its probability tends
to 0 ?

Patricia was saying yes.

You were saying, at one point, that stuff with probability 0 should
not be considered in the asymptotic analysis.

So, in this sense, i was saying that you have not answered the
corrected formulation of the problem.

Now, as Patricia recalled, we say that the complexity C(n) is O(logn)
if:

limit for n to infinity  of  C(n) / logn is less than some
constant, say K.

Now in a probabilistic setting, limits are taken "in probability":
what we are interested in is to see if:

Prob [ limit for n to infinity  of  C(n) / logn  is less than K ] = 1

(this is "almost sure" convergence (a.s.), as opposed to the "sure"
convergence used in the nonprobabilistic setting)

or if, at least, the above probability is converging to 1 (a.a.s.,
asymptotically almost surely).

In our case, from what i gathered so far, we have a generally O(logn)
problem which only on singularities might blow up to O(n).

So when we consider C(n) above we can split it into 2 parts: one of
order O(logn) and another ("atomic") of order O(n). But the second
part, as n goes to infinity, has a probability going (very quickly)
to 0.

So we have that the above limit holds with probability 1.

In probabilistic contexts i have always seen forms of convergence
a.s. or a.a.s. being considered, or anyway forms involving
probability.

Also, from an intuitive point of view, it would be ridiculous, imho,
to consider the problem O(n). Just think: imagine n as big as an RSA
modulus. The probability of the "singularity" causing O(n) is a number
so inconceivably small that no computer on earth could even represent
it.

This was in essence my argument. Clearly, i don't expect to be right,
and actually i would like to be corrected.

```
 0
Reply pamelafluente (240) 7/25/2012 4:25:12 PM
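The algorithm under discussion is easy to simulate. The sketch below is my own illustration (no poster in the thread supplied code): it implements the uniform random split and shows that typical costs sit near 2·ln(n), even though a run of worst-possible splits would cost n comparisons.

```python
import random

def random_split_search(a, target, rng):
    """Search sorted list `a` for `target`, probing at a uniformly
    random index within the current range (the rule pam proposes).
    Returns (index, number of comparisons)."""
    lo, hi = 0, len(a) - 1
    comparisons = 0
    while lo <= hi:
        mid = rng.randint(lo, hi)      # uniform random split point
        comparisons += 1
        if a[mid] == target:
            return mid, comparisons
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, comparisons

rng = random.Random(42)
n = 1_000_000
a = list(range(n))
counts = [random_split_search(a, rng.randrange(n), rng)[1]
          for _ in range(200)]
# Typical costs are a few dozen for n = 10**6 (around 2*ln(n) ~ 28),
# while the worst conceivable run would take n comparisons.
print(sorted(counts)[len(counts) // 2], max(counts))
```

The simulation only illustrates the typical case; it says nothing about the worst-case class, which is the point of contention in the thread.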

```On 7/25/2012 9:25 AM, pam wrote:
....
> Should one singularity which causes the least upper bound to raise
> from O(logn)
> to "no better than O(n)" be considered in the computational complexity
> assessment,
> considered that its probability tends to 0 ?
>
> Patricia was saying yes.
>
> You were saying, at one point, that stuff with probability 0 should
> not be considered in the asymptotic analysis.
....

Ben, do you agree with the following?

A case can be ignored for calculating worst case complexity if, and only
if, there exists a natural number N such that the probability of that
case is zero if the input size is greater than N.

Not tending to zero, not approximately zero, not zero for some large
inputs, but zero for *all* inputs larger than some finite size.

Patricia
```
 0
Reply pats (3215) 7/25/2012 5:18:48 PM

```On 7/25/2012 12:25 PM, pam wrote:
> Should one singularity which causes the least upper bound to raise
> from O(logn)
> to "no better than O(n)" be considered in the computational complexity
> assessment,
> considered that its probability tends to 0 ?
>
> Patricia was saying yes.
>
> You were saying, at one point, that stuff with probability 0 should
> not be considered in the asymptotic analysis.

Those two statements are not irreconcilable. That the probability tends
to 0 does not mean that the probability is 0.

> Now, as Patricia recalled, we say that the complexity C(n) is O(logn)
> if:
>
>
>     limit for n to infinity  of  C(n) / logn is less than some
> constant, say K.

I'd quibble a bit on the definition, but this is essentially correct.

> Now in a probabilistic setting, limits are taken "in probability":
> What we are interested in
> is to see if:
>
>
>        Prob [ limit for n to infinity  of  C(n) / logn is less than
> K  ] = 1

Eeeehhh, no. The problem you appear to have is you don't understand what
the C(n) metric is actually measuring.

The number of comparisons that needs to be done on the input amounts to
a probability-distribution function, call it pdf_C(n, x) (i.e., the
probability that on an input size of n we need x comparisons). When we
actually do the complexity analysis for a given function C(n), we need
to drop the x value somehow. The three easiest ways to do it are
basically to do maximum (i.e., C(n) = max x : pdf_C(n, x) > 0), minimum,
or expected value. These correspond to worst-case, best-case, and
average-case runtimes.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
```
 0
Reply Pidgeot18 (1399) 7/25/2012 5:34:42 PM
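For this particular algorithm, the distribution Joshua calls pdf_C(n, x) can be computed exactly. The recursion below is my own sketch, assuming the uniform-split model with the target uniformly placed in the range; the maximum, minimum, and expectation of the support recover the worst-, best-, and average-case figures he lists. With probability 1/m the first probe hits; otherwise the range shrinks to size s with probability 2s/m².

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def cost_dist(m):
    """pdf_C(m, x) for random-split search over a range of size m:
    returns {x: P(exactly x comparisons)}."""
    dist = {1: 1.0 / m}                      # first probe hits the target
    for s in range(1, m):
        w = 2.0 * s / (m * m)                # range shrinks to size s
        for x, p in cost_dist(s).items():
            dist[x + 1] = dist.get(x + 1, 0.0) + w * p
    return dist

d = cost_dist(64)
worst = max(d)                               # max of the support: 64, i.e. linear
best = min(d)                                # 1 comparison
mean = sum(x * p for x, p in d.items())      # expectation: O(log n) territory
print(worst, best, round(mean, 2))
```

Note that the worst case (64 comparisons for a 64-element range) carries a minuscule but strictly positive probability, which is exactly the situation the thread is arguing about.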

```On 7/25/2012 10:34 AM, Joshua Cranmer wrote:
> On 7/25/2012 12:25 PM, pam wrote:
....
>> Now, as Patricia recalled, we say that the complexity C(n) is O(logn)
>> if:
>>
>>
>>     limit for n to infinity  of  C(n) / logn is less than some
>> constant, say K.
>
> I'd quibble a bit on the definition, but this is essentially correct.

I prefer a more direct definition, but this one works as long as one is
very careful about what is meant by "limit".

Patricia

```
 0
Reply pats (3215) 7/25/2012 6:27:06 PM

```On Jul 25, 7:34 pm, Joshua Cranmer <Pidgeo...@verizon.invalid> wrote:

> I'd quibble a bit on the definition, but this is essentially correct.

What is making you quibble (just out of curiosity)? The uninhibited
use of the word "infinity" ?   ;-)  Sorry ;-))

Indulge me, i am just sketching ideas; details can be worked out once
the general ideas are acceptable.

>
> Eeeehhh, no. The problem you appear to have is you don't understand what
> the C(n) metric is actually measuring.
>
> The number of comparisons that needs to be done on the input amounts to
> a probability-distribution function, call it pdf_C(n, x) (i.e., the
> probability that on an input size of n we need x comparisons). When we
> actually do the complexity analysis for a given function C(n), we need
> to drop the x value somehow. The three easiest ways to do it are
> basically to do maximum (i.e., C(n) = max x : pdf_C(n, x) > 0), minimum,
> or expected value. These correspond to worst-case, best-case, and
> average-case runtimes.
>
> --

I am already assuming the worst case. We are talking of a least
upper bound for a worst case.

All things that can happen can always be partitioned into 2 disjoint
cases:

those requiring O(logn) and those requiring O(n)  [these are rather
"singularities", i'd say]. Say: Lo(n) and Li(n).

It seems legitimate to split C(n) this way. The cases requiring O(n)
are asymptotically negligible in probability.

A point could rather be in the formulation (just sketching ideas,
feel free to correct)

prob [    lim  (  Lo(n)  + Li(n)  ) / g(n)   ]

to take out the limit

lim  ( prob [  Lo(n)  ]  +  prob [  Li(n)  ] )   /   g(n)

but we may just need some continuity of the probability measure, i
think.

" there is no payoff in probability theory by using sure convergence
compared to using almost sure convergence.
The difference between the two only exists on sets with probability
zero.
This is why the concept of sure convergence of random variables is
very rarely used."

http://en.wikipedia.org/wiki/Convergence_of_random_variables
```
 0
Reply pamelafluente (240) 7/25/2012 6:48:45 PM

```On 7/25/2012 2:48 PM, pam wrote:
>
> On Jul 25, 7:34 pm, Joshua Cranmer <Pidgeo...@verizon.invalid> wrote:
>
>> I'd quibble a bit on the definition, but this is essentially correct.
>
> What is making you quibble (just out of curiosity), the uninhibited
> use of the word "infinity" ?   ;-)  Sorry ;-))

I defer to Wikipedia's precise definition:
f(x) = O(g(x)) iff lim sup x->inf |f(x)/g(x)| < inf

>    It seems legitimate to split C(n) this way. The cases requiring O(n)
> are asymptotically negligible in probability.

The problem you run in here is where do you draw the line. If n = 1M,
then using 20 comparisons is "obviously" O(lg n) and using 1M is
"obviously" O(n), but what if you used 1000 comparisons?

>   A point could rather be in the formulation (just sketching ideas,
> feel free to correct)
>
>      prob [    lim  (  Lo(n)  + Li(n)  ) / g(n)   ]
>
>   to take out the limit
>
>     lim  ( prob [  Lo(n)  ]  +  prob [  Li(n)  ] )   /   g(n)

Your formulas are not even well-formed, so I'm struggling to figure out
what you mean in the first place.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
```
 0
Reply Pidgeot18 (1399) 7/25/2012 7:05:26 PM

```On 7/25/2012 11:48 AM, pam wrote:
....
> What is making you quibble (just out of curiosity), the uninhibited
> use of the word "infinity" ?   ;-)  Sorry ;-))
....

It is possible to construct all sorts of plausible-sounding but
fallacious arguments using infinity. The key to getting it right is
precision in definitions and in applying the definitions to specific cases.

Patricia

```
 0
Reply pats (3215) 7/25/2012 10:29:06 PM

```Life intervened, and I had to postpone posting this after I'd already
spent some time on it.  I see now that there are more messages so the
debate may have moved on, but I don't want to lose the time invested so
I post this with apologies if its content duplicates other people's
contributions.

pam <pamelafluente@libero.it> writes:

> On Jul 25, 4:08 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
>>
>> No, you can't be in any doubt about what my answer is.  I said it's linear
>> and not logarithmic -- O(n) and not O(log(n)) -- and I've said it more
>> than once.  I don't think you have any trouble understanding that this is
>> my answer.
>>
>> You might have trouble understanding why this is my answer.  Here I can
>> only say that I've been as clear as I can be.  If you could say what
>> parts are not clear to you, I could say more.
>
> Well here the question is another.
>
> When you answered O(n) you were considering a setting
> where the split point does not depend on n.

I didn't make any general remark like that.  I made a very specific one:
"the key distinction is between reducing the range by a *factor of n* or
by a *fixed amount* independent of n" (emphasis added).  It's about only
two cases, and it's about the reduction in the range, not the formula
for the split point.

> Now we have "refined" the formulation making clear the dependence
> ("divide and conquer").

You once proposed splitting at n/uniform(1, n) but that does not always
reduce the range by anything more than a constant.

> So, now the question is (and it has always been in essence):
>
> Should one singularity which causes the least upper bound to raise
> from O(logn)
> to "no better than O(n)" be considered in the computational complexity
> assessment,
> considered that its probability tends to 0 ?
>
> Patricia was saying yes.

And I agree.  If you don't follow what I say below in clarification,
just ignore it and assume that I'm a terrible communicator.  You haven't
been objecting to Patricia's argument so I presume you accept it.  She's
right, and if I seem to be saying anything else, it's my bad writing
skills.

> You were saying, at one point, that stuff with probability 0 should
> not be considered in the asymptotic analysis.

Every case that can occur must be considered in a worst-case analysis.
There is no n so large that the probability of a sequence of bad splits
is zero.  These cases must be included.

> So, in this sense, i was saying that you have not answered to the
> corrected formulation of the problem.

I hope I have now.  The cost for all n-sized cases can be calculated and
the worst ones, taking linear time, all have non-zero probabilities.

If you disagree, what is the worst-case cost function?  I don't mean
what is it asymptotically similar to (i.e. I don't want the big O
"summary", I want the actual cost in comparisons of a search of n
elements).

> Now, as Patricia recalled, we say that the complexity C(n) is O(logn)
> if:
>
>
>    limit for n to infinity  of  C(n) / logn is less than some
> constant, say K.
>

But you've forgotten something else she said (twice).  You don't find an
O(f(x)) formula for C(n) until you have found C(n).  The big O stuff is
throwing you right off -- it's a way of simplifying your result, not the
result itself.  Forget it for the moment: what is C(n) for your
algorithm?  Do you agree that it's just n?

> Now in a probabilistic setting, limits are taken "in probability":
> What we are interested in
> is to see if:
>
>
>       Prob [ limit for n to infinity  of  C(n) / logn is less than
> K  ] = 1
>
> (this is an "almost sure" convergence (a.s.), as opposed to the "sure"
> convergence
> used in nonprobabilistic setting)

Well, that's just not how complexity is defined.  You can't use big
O to mean something new without alerting everyone to this fact.

As for the probabilistic convergence, I have to leave that to you.  The
formula you have inside Prob[...] does not look like the kind of thing
I've seen before in such contexts, but since I won't be able to help, it
hardly matters that I don't understand your formula.

> or if, at least, the above probability is converging to 1 (a.a.s.,
> asymptotically almost sure).
>
> In our case, from what i gathered so far we have a generally O(logn)
> problem which only on singularities
> might blow up to O(n).

The *problem* is O(log(n)) because there is an O(log(n)) worst-case
solution.  Your *algorithm* (random splitting) is O(n) in the worst case.
I have no idea if these worst cases can be described as singularities
but I think (talking off the top of my head) you believe that cost is
greater than k*log(n) only on a set of cases with measure zero, and that
you can formalise this with one or other of the standard forms of
probabilistic convergence.  Obviously I can't say if this is true or
not.

> So when we consider C(n) above we can split it into 2 parts: one of
> order O(logn) and another ("atomic")
> of order O(n). But the second part, as n goes to infinity has a
> probability going (very quickly) to 0.

I don't think it's that simple.  You can't describe cases as being
O(log(n)) or O(n).  If you run your algorithm with n=42 and it takes 36
compares, is that an O(log(n)) or an O(n) case?  The probabilistic
analysis must take the distribution of splits (and inputs if need be)
into account to give a random variable in place of the cost function.
You may then be able to determine that this variable has a limit as
n->oo using one of the limit definitions ("almost surely" is only one).

Alternatively, you might be able to derive the expectation etc. of this
variable to give yet another measure of the algorithm's performance.

This is not something I know much about, but it does seem clear that you
can't partition the cases into two.  Every set of split points
determines a cost and has a distinct probability of occurring.  They
must all be considered.

> So we have that the above limit holds with probability 1.

That needs to be established.  It does not seem unlikely but
probabilities are notoriously counter-intuitive.

> In probabilistic contexts i have always seen considering forms of
> convergence a.s. or a.a.s., or anyway involving
> probability.

Right, but in case it has not yet been said enough, in complexity
theory, you don't have to take any limits.  If the worst-case cost
function is n, you can call it O(n) without using a limit argument and
without using any probability theory.  Whatever measure you end up
deciding to use, you can't call it worst-case complexity.

> Also, from an intuitive point of view, it would be ridiculous, imho,
> to consider the problem O(n). Just think:
> imagine n is as big as an RSA modulus. The probability of the
> "singularity" causing O(n)
> is a number inconceivably small, that no computer on earth could even
> ever represent.

I can't stop you thinking it ridiculous, but complexity classes are
defined a particular way and you can't alter that on your own.  You can
choose to use what you consider a more intuitive measure, but an O(n)
algorithm will remain an O(n) algorithm.

> This was in essence my argument. Clearly, i dont expect to be right,
> and actually i would like to be corrected.
>

--
Ben.
```
 0
Reply ben.usenet (6516) 7/25/2012 10:40:47 PM
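Ben's worst case is easy to exhibit concretely. The sketch below (mine, not from the post) feeds the algorithm a split chooser that always lands at the low edge of the range while searching for the largest element: the cost function hits exactly n comparisons, whereas the ordinary midpoint rule stays logarithmic.

```python
def search_with_splitter(a, target, pick):
    """Search sorted list `a` for `target`; `pick(lo, hi)` supplies the
    probe index.  Returns the number of comparisons used."""
    lo, hi = 0, len(a) - 1
    comparisons = 0
    while lo <= hi:
        mid = pick(lo, hi)
        comparisons += 1
        if a[mid] == target:
            return comparisons
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return comparisons

n = 1000
a = list(range(n))
# Every split lands at the low end while we search for the largest
# element: the range shrinks by one element per probe.
worst = search_with_splitter(a, n - 1, lambda lo, hi: lo)
# The same search with the usual midpoint split, for contrast.
mid_cost = search_with_splitter(a, n - 1, lambda lo, hi: (lo + hi) // 2)
print(worst, mid_cost)
```

Under the random-split model this edge-hugging sequence has nonzero probability for every n, which is why the worst-case cost function is n.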

```Patricia Shanahan <pats@acm.org> writes:

> On 7/25/2012 9:25 AM, pam wrote:
> ...
>> Should one singularity which causes the least upper bound to raise
>> from O(logn)
>> to "no better than O(n)" be considered in the computational complexity
>> assessment,
>> considered that its probability tends to 0 ?
>>
>> Patricia was saying yes.
>>
>> You were saying, at one point, that stuff with probability 0 should
>> not be considered in the asymptotic analysis.
> ...
>
> Ben, do you agree with the following?
>
> A case can be ignored for calculating worst case complexity if, and only
> if, there exists a natural number N such that the probability of that
> case is zero if the input size is greater than N.
>
> Not tending to zero, not approximately zero, not zero for some large
> inputs, but zero for *all* inputs larger than some finite size.

It sounds right but you seem to be using "case" in a way that does
not match my usage.  A "case", for me, is one execution of the
algorithm, and the cost for some size is just the maximum cost of all
the possible executions with that input size.

You seem to use "case" to mean a particular class of executions (maybe
characterised by some behaviour of the algorithm or some property of the
inputs) rather than single executions.  With that meaning, yes, I agree.

--
Ben.
```
 0
Reply ben.usenet (6516) 7/25/2012 11:06:09 PM

```On 7/25/2012 4:06 PM, Ben Bacarisse wrote:
> Patricia Shanahan <pats@acm.org> writes:
>
>> On 7/25/2012 9:25 AM, pam wrote:
>> ...
>>> Should one singularity which causes the least upper bound to raise
>>> from O(logn)
>>> to "no better than O(n)" be considered in the computational complexity
>>> assessment,
>>> considered that its probability tends to 0 ?
>>>
>>> Patricia was saying yes.
>>>
>>> You were saying, at one point, that stuff with probability 0 should
>>> not be considered in the asymptotic analysis.
>> ...
>>
>> Ben, do you agree with the following?
>>
>> A case can be ignored for calculating worst case complexity if, and only
>> if, there exists a natural number N such that the probability of that
>> case is zero if the input size is greater than N.
>>
>> Not tending to zero, not approximately zero, not zero for some large
>> inputs, but zero for *all* inputs larger than some finite size.
>
> It sounds right but you seem to be using "case" in a way that does
> not match my usage.  A "case", for me, is one execution of the
> algorithm, and the cost for some size is just the maximum cost of all
> the possible executions with that input size.
>
> You seem to use "case" to mean a particular class of executions (maybe
> characterised by some behaviour of the algorithm or some property of the
> inputs) rather than single executions.  With that meaning, yes, I agree.
>

Can you suggest a better term for a related set of executions, such as
those that are equivalent to a linear scan?

Patricia

```
 0
Reply pats (3215) 7/25/2012 11:25:37 PM

```Patricia Shanahan <pats@acm.org> writes:

> On 7/25/2012 4:06 PM, Ben Bacarisse wrote:
>> Patricia Shanahan <pats@acm.org> writes:
>>
>>> On 7/25/2012 9:25 AM, pam wrote:
>>> ...
>>>> Should one singularity which causes the least upper bound to raise
>>>> from O(logn)
>>>> to "no better than O(n)" be considered in the computational complexity
>>>> assessment,
>>>> considered that its probability tends to 0 ?
>>>>
>>>> Patricia was saying yes.
>>>>
>>>> You were saying, at one point, that stuff with probability 0 should
>>>> not be considered in the asymptotic analysis.
>>> ...
>>>
>>> Ben, do you agree with the following?
>>>
>>> A case can be ignored for calculating worst case complexity if, and only
>>> if, there exists a natural number N such that the probability of that
>>> case is zero if the input size is greater than N.
>>>
>>> Not tending to zero, not approximately zero, not zero for some large
>>> inputs, but zero for *all* inputs larger than some finite size.
>>
>> It sounds right but you seem to be using "case" in a way that does
>> not match my usage.  A "case", for me, is one execution of the
>> algorithm, and the cost for some size is just the maximum cost of all
>> the possible executions with that input size.
>>
>> You seem to use "case" to mean a particular class of executions (maybe
>> characterised by some behaviour of the algorithm or some property of the
>> inputs) rather than single executions.  With that meaning, yes, I agree.
>>
>
> Can you suggest a better term for a related set of executions, such as
> those that are equivalent to a linear scan?

No, not off hand, but I don't think it matters.  "Case" is a very
general word and makes sense being used for either.  In fact, your
meaning subsumes mine since any execution can be seen as a case with one
member.  I don't see much scope for confusion -- the context will
probably make it clear if a case is a class of executions or a single one.

Gosh that's wordy.  So, yes, I agree.

--
Ben.
```
 0
Reply ben.usenet (6516) 7/26/2012 12:00:43 AM

```On 25/07/2012 17:25, pam wrote:
> On Jul 25, 4:08 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
>>
>> No, you can't be in any doubt about what my answer is.  I said it's linear
>> and not logarithmic -- O(n) and not O(log(n)) -- and I've said it more
>> than once.  I don't think you have any trouble understanding that this is
>> my answer.
>>
>> You might have trouble understanding why this is my answer.  Here I can
>> only say that I've been as clear as I can be.  If you could say what
>> parts are not clear to you, I could say more.
>
> In our case, from what i gathered so far we have a generally O(logn)
> problem which only on singularities
> might blow up to O(n).

There are plenty of options worse than O(logN) and better than O(N).
I have chosen a natural split at sqrt(N) to prove you wrong.

> So when we consider C(n) above we can split it into 2 parts: one of
> order O(logn) and another ("atomic")
> of order O(n). But the second part, as n goes to infinity has a
> probability going (very quickly) to 0.

But it quite obviously doesn't as several people now have tried very
hard to explain to you. Taking your original formalism of a split

N/k : N.(1-1/k)

1 < k < N for a fixed k and initial value N

Mathematicians please excuse the lack of formality here, and ignore the
extra work that has to be done once the search width N < k. The
worst-case search is down the longest path and takes y iterations s.t.

N(1-1/k)^y <= 1

log(N) + y.log(1-1/k) <= 0

log(1 - 1/k) ~ -1/k for the leading term (*)

so we have to first order in k

y >= k.log(N)

It is clear therefore that your claim is invalidated for all k > sqrt(N)
and there are a *lot* of numbers between sqrt(N) and N as N -> infinity.

Put another way, O(log(N)) can still be a lousy algorithm if the
leading constant coefficient is far too large. The search algorithm you
propose is only O(log(N)) for k << sqrt(N), and k=2 is provably optimum.

Even for a modest k=20 you are already an order of magnitude slower!

Your random k split proposal is worse than O(sqrt(N)) for the bulk of
the range of possible choices of k for a given N.

--
Regards,
Martin Brown

(*) A more accurate approximation for small k is -(6k-1)/(6k-4)/k
```
 0
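Martin's estimate for the 1/k : (1-1/k) split is easy to check numerically. The sketch below is my own, with integer flooring standing in for his continuous approximation, and (like him) ignoring the end effects once the range drops below k. It counts the iterations down the long side and prints them next to his leading-order estimate y >= k*log(N).

```python
import math

def long_side_steps(n, k):
    """Worst-case iterations when each step keeps the long side,
    i.e. the range shrinks by a factor of roughly (1 - 1/k)."""
    steps = 0
    size = n
    while size > 1:
        size -= max(1, size // k)   # drop the short side (~size/k elements)
        steps += 1
    return steps

n = 1_000_000
for k in (2, 10, 100):
    # k=2 (binary search) is the optimum; larger k is strictly worse.
    print(k, long_side_steps(n, k), round(k * math.log(n)))
```

For k=2 this gives the familiar ceil(log2 n) = 20 steps at n = 10^6, and the count grows with k, which is the substance of Martin's point.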

```Thank you Ben, for the complete clarification. I think i agree with it
all.

If the definition does not involve probability, then i have no choice
and i can only agree with you.

(Clearly maintaining a reservation on the "intuitive" aspects, in case
the events leading to O(n) can be proven disjoint and to have a
probability converging to 0. There i think that in a "probabilistic
context" they should be dropped for a "useful" assessment (see my RSA
example). So while i can accept the complexity class "definition", i
may also argue it does not seem to make much sense in the
probabilistic case.)

On Jul 26, 10:14 am, Martin Brown <|||newspam...@nezumi.demon.co.uk>
wrote:

> It is clear therefore that your claim is invalidated for all k > sqrt(N)
> and there are a *lot* of numbers between sqrt(N) and N as N -> infinity.
>
> The search algorithm you propose
> is only O(log(N)) for k << sqrt(N) and k=2 is provably optimum.

Well, i started by asking about the complexity of the random search. I
did not make claims about it. (At most i just expressed a feeling.)

Previous contributors have said this blows up to O(n) due to the
splitting occurring always at 1 (or n). While for the rest (other
cases of divide and conquer), everyone seemed to agree on O(logn)
[unless i missed something].

Now you are saying something different, i think.

Leaving apart the cases which cause a linear search (already
outlined), are you saying that it is instead no better than O(sqrt(n))
for k > sqrt(n) ? Or am i getting the real meaning of your post
wrong ?
```
 0
Reply pamelafluente (240) 7/26/2012 9:55:48 AM

```Another quick remark is about the

" but in case it has not yet been said enough, in complexity
theory, you don't have to take any limits."

i so far understood that the O() notation describes instead an
*asymptotic* behavior.

So i thought the limit not only matters, but is actually at the core of
the notion.

We don't care what is happening at n=10, 1000, 1000000000, or 10!^10! :
this is describing an asymptotic behavior. That is why i thought that
events with limiting probability 0 should not affect it ...

Is this idea wrong ? If yes, i am interested in correcting my
understanding.

```
 0
Reply pamelafluente (240) 7/26/2012 10:09:31 AM

```On 7/26/2012 5:55 AM, pam wrote:
> Previous contributors have said this blows up to O(n) due to the
> splitting occurring always
> at 1 (or n). While for the rest (other cases of divide and conquer),
> everyone seemed to agree on O(logn) [unless i missed something].

Your second statement is a complete misrepresentation of our positions.
The most we have said is either that the average case (expected value)
is O(lg n) or that your algorithm would be O(lg n) if you bounded the
split to occur between constant fractions of n (e.g., 1/4 and 3/4).

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
```
 0
Reply Pidgeot18 (1399) 7/26/2012 11:33:41 AM
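Joshua's bounded variant is worth making concrete. If the probe is confined to the middle half of the range, the surviving side never exceeds 3/4 of it, so even the worst case is capped near log base 4/3 of n. The sketch below is my own, under that assumption (the thread contains no code):

```python
import math
import random

def bounded_split_search(a, target, rng):
    """Random-split search with the probe restricted to the middle half
    of the range: each step discards at least a quarter of the range,
    capping the worst case at about log_{4/3}(n) + 1 comparisons."""
    lo, hi = 0, len(a) - 1
    comparisons = 0
    while lo <= hi:
        m = hi - lo + 1
        mid = rng.randint(lo + m // 4, lo + (3 * m) // 4)  # middle half
        comparisons += 1
        if a[mid] == target:
            return comparisons
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return comparisons

rng = random.Random(1)
n = 1_000_000
a = list(range(n))
counts = [bounded_split_search(a, rng.randrange(n), rng) for _ in range(200)]
bound = math.log(n) / math.log(4 / 3) + 1   # deterministic worst-case cap
print(max(counts), round(bound))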

```On 26/07/2012 10:55, pam wrote:
> Thank you Ben, for the complete clarification. I think i agree in all.
>
> If the definition does not involve probability, than i have no choices
> and i can only agree with you.

It is the worst case upper bound which is clearly O(N).

> (Clearly maintaining a reservation on the "intuitive" aspects, in case
> the
> events leading to O(n) can be proven disjoint and have a probability
> converging to 0.
> Where i think that in a "probabilistic context" they should be dropped
> for a "useful" assessment (see my RSA example).
> So while i can accept the complexity class "definition", i may also
> argue it does not seem to make much sense in the probabilistic case.)

This is meaningless gibberish.

> On Jul 26, 10:14 am, Martin Brown <|||newspam...@nezumi.demon.co.uk>
> wrote:
>
>> It is clear therefore that your claim is invalidated for all k > sqrt(N)
>> and there are a *lot* of numbers between sqrt(N) and N as N -> infinity.
>>
>> The search algorithm you propose
>> is only O(log(N)) for k << sqrt(N) and k=2 is provably optimum.
>
> Well i started asking about the complexity of the random search. I did
> not make
> claims about it. (At most I just expressed a feeling.)

Your "feeling" is hopelessly wrong. The cases where the search is
actually O(logN) are a very small subset of the possible values of k,
at most k < sqrt(N), and in practice even fewer than that.

> Previous contributors have said this blows up to O(n) due to the
> splitting occurring always
> at 1 (or n). While for the rest (other cases of divide and conquer),
> everyone seemed to agree on O(logn) [unless i missed something].

You are misrepresenting what they have said. It is O(N).
It doesn't matter where the split occurs provided that k << sqrt(N).

Analysing the linear fixed split case :

k  :  N-k

1 < k <= N/2

It is obvious, ignoring end effects and regardless of which end the k
elements are nibbled off, that the search completes in the worst case
when the number of iterations y satisfies N = yk

Hence the maximum number of iterations required is:

y = N/k

For k << sqrt(N) this is O(N)

For k=1 it  takes N steps, k=2 takes N/2 steps, k=3 takes N/3
All of these are O(N).

But for k = sqrt(N) this is also O(sqrt(N))

And for k=N/2 it takes one step and returns you a range of half the
previous length (leading to O(logN) dependence again).

> Now you are saying something different, i think.

No I am merely trying to show you algebraically how the performance of
the algorithm varies with the choice of k, N.
>
> Leaving apart the cases which causes a linear search (already
> outlined),
> are you saying that it is instead no better than O(sqrt(n))  for k >
> sqrt(n)   ? Or i am getting wrong
> the real meaning of your post ?

Why did you delete the algebra that makes my meaning clear?

It seems to me that you want to use big O notation for bogus handwaving
arguments invoking infinity that have no foundation in reality.

--
Regards,
Martin Brown
```
 0
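Martin's y = N/k for the fixed split k : N-k can be checked directly. A trivial sketch (mine) of reduction by a constant amount rather than a constant factor:

```python
def fixed_amount_steps(n, k):
    """Iterations until an n-element range is exhausted when every
    probe removes exactly k elements (the split k : N-k)."""
    steps = 0
    while n > 0:
        n -= k
        steps += 1
    return steps

# y = N/k: linear in N for any fixed k, but O(sqrt(N)) when k = sqrt(N).
print(fixed_amount_steps(1_000_000, 1),
      fixed_amount_steps(1_000_000, 2),
      fixed_amount_steps(1_000_000, 1000))
```

This is the distinction Ben drew earlier in the thread: reduction by a fixed amount gives a linear count, while reduction by a factor of n gives a logarithmic one.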

```On Jul 26, 1:33 pm, Joshua Cranmer <Pidgeo...@verizon.invalid> wrote:

> Your second statement is a complete misrepresentation of our positions.
> The most we have said is either that the average case (expected value)
> is O(lg n) or that your algorithm would be O(lg n) if you bounded the
> split to occur between constant fractions of n (e.g., 1/4 and 3/4).

Nope. Note that i have excluded the cases which give rise to a linear
search
(which were firstly pointed out by Patricia, if i remember well)

"Leaving apart the cases which causes a linear search (already
outlined)"

Now you are saying something which is still different from Martin:
"constant fractions "  (?,  remember that n/k is a random variable) .

while he is calling in sqrt(n) which is even more confusing (to me).

I am getting a little confused about what your statement is.

Can we just ignore for one moment the "linear search" cases which are
obviously O(n)
and focus on something which is a real "divide and conquer" ?

Also, let's only talk about (a least upper bound for the) asymptotic
worst case (and not average or so on)
or else it gets very confusing.

What is your precise statement about that (leave out the linear
search: that is clear) ?

```
 0
Reply pamelafluente (240) 7/26/2012 1:13:04 PM

```On Jul 26, 3:06 pm, Martin Brown <|||newspam...@nezumi.demon.co.uk>
wrote:

>
> Analysing the linear fixed split case :
>
>   k  :  N-k
>
> 1 < k <= N/2
>

We are now dealing with a random splitting which is a function of n.
(This has also an intuitive meaning, as if it did not depend on n, we
would not be "dividing" anything.)

>
> Why did you delete the algebra that makes my meaning clear?

First of all, i can't delete your posts, and everyone can see them.
Second you have a rather peculiar idea of what "clear" means.

```
 0
Reply pamelafluente (240) 7/26/2012 1:27:29 PM

```pam <pamelafluente@libero.it> writes:

> Another quick remark is about the
>
> " but in case it has not yet been said enough, in complexity
> theory, you don't have to take any limits."
>
>
> i so far understood that the O() notation describes instead an
> *asymptotic* behavior.
>
> So i though the limit not only matters, but is actually at the core of
> the notion.

Two things: if I can get a formula for the cost, I'm done.  I've got
something much better than a big O class.  Knuth does this all the
time -- he gives exact formulae for some algorithms, not just the big O
class these cost functions belong to.

Obviously people want to compare algorithms so complexity classes are
vital, but you can determine the class without taking any limits.
Asymptotic behaviour is very similar to limiting behaviour, but it's not
the same.  This looks like splitting hairs (especially since the big O
class can be defined in terms of limits) and I would not have mentioned
it except that you said that when probabilities are concerned,
limits mean something else.

> We don't care what is happening at n=10, 1000, 1000000000,  or 10! ^
> 10!
>
this is describing an asymptotic behavior. That is why i thought that
> events with
> limiting probability 0 should not affect it ...
>
> Is this idea wrong ? If yes i am interested to correct my
> understanding.

Yes, that's wrong (about limiting probabilities).  You can determine the
cost function and big O function class to which it belongs with no
limits involved.  You can use limits to get the complexity class if you
like, but it's an argument purely about classes of function.

You did not answer my key question (maybe I forgot to ask it!). Do you
agree that the cost function (in terms of compares) of your uniform
splitting algorithm is C(n) = n?  I can say that this is not in
O(log(n)) (but it is in O(n)) without taking any limits, and without any
reference to probability.

--
Ben.
```
 0
Reply ben.usenet (6516) 7/26/2012 1:55:48 PM
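The no-limits argument above, written out from the standard definition
(nothing here beyond textbook material):

```latex
% Definition (no limits required):
f \in O(g) \iff \exists\, c > 0,\ n_0 \;:\; f(n) \le c\, g(n)
  \quad \text{for all } n \ge n_0.

% For the cost function C(n) = n:
C \in O(n) \quad (\text{take } c = 1,\ n_0 = 1), \qquad
C \notin O(\log n) \quad (\text{for any fixed } c,\ n \le c \log n
  \text{ fails for all sufficiently large } n).
```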

```pam <pamelafluente@libero.it> writes:

> On Jul 26, 3:06 pm, Martin Brown <|||newspam...@nezumi.demon.co.uk>
> wrote:
>
>>
>> Analysing the linear fixed split case :
>>
>>   k  :  N-k
>>
>> 1 < k <= N/2
>>
>
> This has already been promptly corrected several posts ago. Ben made
> an insightful remark about that.
> We are now dealing with a random splitting which is a function of n.
> (This has also an intuitive meaning, as if it did not depend on n, we
> would not be "dividing" anything.)

This issue is not whether splitting "is a function of n" or not.  Please
read what I said again -- I repeated it when you made a similar claim
only a few posts ago.  Splitting at, say, n-2 is clearly a function of n
but it reduces the search range by a constant at each step.

Also, Martin Brown is not saying something that contradicts what's been
said.  There are splits that give neither linear nor logarithmic costs,
and his example is only one of an infinite number of other costs.
Trying to categorise them one by one is hopeless.

<snip>
--
Ben.
```
 0
Reply ben.usenet (6516) 7/26/2012 2:09:01 PM

```pam <pamelafluente@libero.it> writes:

> Thank you Ben, for the complete clarification. I think i agree in all.
>
> If the definition does not involve probability, than i have no choices
> and i can only agree with you.

Well, only one case left (uniform splitting) and we are in total
agreement!

You still have not answered the "why do you care" question.  It really
does matter.  If your interests are practical, a probabilistic measure
of complexity is a good way to go.  If your interest is in relating some
work of your own to existing work on complexity theory, you have to
accept the standard meaning of a complexity class.

> (Clearly maintaining a reservation on the "intuitive" aspects, in case
> the
> events leading to O(n) can be proven disjoint and have a probability
> converging to 0.
> Where i think that in a "probabilistic context" they should be dropped
> for a "useful" assessment (see my RSA example).
> So while i can accept the complexity class "definition", i may also
> argue it does not seem to make much sense in the probabilistic case.)

You are free to use some other measure of complexity that you find more
intuitive and useful.  I am glad you accept the standard definition.
Can I suggest you choose a notation for your probabilistic measure?  I
don't think there is a standard one.

<snip>
--
Ben.
```
 0
Reply ben.usenet (6516) 7/26/2012 2:21:20 PM

```On Jul 26, 3:55 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

> You did not answer my key question (maybe I forgot to ask it!). Do you
> agree that the cost function (in terms of compares) of your uniform
> splitting algorithm is C(n) = n?  I can say that this is not in
> O(log(n)) (but it is in O(n)) without taking any limits, and without any
> reference to probability.

Sure it is.
You have understood perfectly my point, though.

Considering the formulation with random splitting point n/k   ( not n-
k ! )  (again: "divide", not subtract!)

IF the only cases where i get cost n are those "degenerate", where by
mere (insignificant, for n large) chance
the binary search happens to be a linear search, i would still object
that in a probabilistic
setting, using sure convergence has little meaning.

So considering that O() is defined using limits, as far i can see
everywhere,
*in a probabilistic setting * (such is what we are considering here)
i would consider *probabilistic forms* of convergence.

You say that this is not the way this is done in CS and they always
consider "sure" convergence.
I can and have to accept that for sure. But i also feel to say it
makes no sense, in a probabilistic setting, nor intuitively.

I have in mind large problems, like n bigger than 5000 digits. In this
case i only care of asymptotics
and to me if the cost n reduces to O(logn) asymptotically, because the
events causing cost n are
practically impossible, i think it's right to see it as O(logn), no
matter what happens for small n.

Clearly, IF there are other situations which cause the cost n (apart
from the linear searches) having a probability which does not tend to
0, then ok, i could not, obviously, object to anything.

> O(log(n)) (but it is in O(n)) without taking any limits

This part is not completely clear to me yet. I always understood O()
as a short hand
for an event involving a limit. A description of an asymptotic
behavior.
But maybe you are referring to complexity classes definition and they
don't use limits  (?).

>
> --
> Ben.

```
 0
Reply pamelafluente (240) 7/26/2012 3:32:29 PM

```On 7/26/2012 8:32 AM, pam wrote:
> On Jul 26, 3:55 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
>> You did not answer my key question (maybe I forgot to ask it!). Do you
>> agree that the cost function (in terms of compares) of your uniform
>> splitting algorithm is C(n) = n?  I can say that this is not in
>> O(log(n)) (but it is in O(n)) without taking any limits, and without any
>> reference to probability.
>
>
> Sure it is.
> You have understood perfectly my point, though.
>
> Considering the formulation with random splitting point n/k   ( not n-
> k ! )  (again: "divide", not subtract!)
>
> IF the only cases where i get cost n are those "degenerate", where by
> mere (insignificant, for n large) chance
> the binary search happens to be a linear search, i would still object
> that in a probabilistic
> setting, using sure convergence has little meaning.

There is nothing wrong with saying you think worst-case analysis is
inappropriate for your situation and purposes. However, if you think
that, you should not claim to be doing worst-case analysis.

Conversely, if you do want to present a worst-case complexity for your
algorithm, you need to look for the worst case, not the best or the
expected.

Patricia

```
 0
Reply pats (3215) 7/26/2012 4:02:29 PM

```pam <pamelafluente@libero.it> writes:

> On Jul 26, 3:55 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
>> You did not answer my key question (maybe I forgot to ask it!). Do you
>> agree that the cost function (in terms of compares) of your uniform
>> splitting algorithm is C(n) = n?  I can say that this is not in
>> O(log(n)) (but it is in O(n)) without taking any limits, and without any
>> reference to probability.
>
>
> Sure it is.
> You have understood perfectly my point, though.

Did you mean "not understood"?

> Considering the formulation with random splitting point n/k   ( not n-
> k ! )  (again: "divide", not subtract!)

Do you see why that makes no difference?  uniform(1,n), n-uniform(1,n)
and n/uniform(1,n) all produce splits in [1,n] but with different
distributions.

> IF the only cases where i get cost n are those "degenerate", where by
> mere (insignificant, for n large) chance
> the binary search happens to be a linear search, i would still object
> that in a probabilistic
> setting, using sure convergence has little meaning.

What would you object to?  The (worst case) cost function is n.  That n
occurs rarely does not alter the big O class to which this function
belongs.

Look at it another way, do you agree that "f(x) = x is in O(n)" is a
theorem of mathematics?  Do you think the truth or falsehood of this
theorem depends on where f(x) comes from -- if it's the distance
travelled by a car it's true but, it if it's the cost of an algorithm
with some random component it's false?

If you mean that the complexity class of this algorithm is not
interesting to you, that you find other measures more intuitive and
practical, then we can stop this discussion now.  I can't object to
such a preference, but you seem to agree with everything except this
last point.  Unfortunately it's not a matter of opinion or intuition;
the identity function is O(n).

> So considering that O() is defined using limits, as far i can see
> everywhere,
>  *in a probabilistic setting * (such is what we are considering here)
> i would consider *probabilistic forms* of convergence.

O() can be defined in terms of limits but it usually is not.  Even when
it is defined in terms of limits, these are the limits from analysis not
from probability theory.  Membership of the equivalence class O(n) does
not depend on the frequency with which a function takes certain values,
but on whether there is a point beyond which f(n) <= k*n for some
constant k.

> You say that this is not the way this is done in CS and they always
> consider "sure" convergence.
> I can and have to accept that for sure. But i also feel to say it
> makes no sense, in a probabilistic setting, nor intuitively.

We may be nearly done here.  I don't think there is any point in trying
to change your mind about what makes sense.  It makes sense to me because
I want an algorithm whose cost function is n to be in O(n) regardless of
any further knowledge of the algorithm, but we can safely disagree about
that.

> I have in mind large problem, like n bigger than 5000 digits. In this
> case i only care of asymptotics
> and to me if the cost n reduces to O(logn) asymptotically, because the
> events causing cost n are
> practically impossible, i think it's right to see it as O(logn), no
> matter what happens for small n.

All fine except for the notation.  You can't "see it as O(log(n))" if it
isn't.  You can see it as OPr(log(n)) for some new measure, OPr, that
uses some kind of almost sure convergence.  You'd still have to define
OPr, and prove that your chosen algorithm is in OPr(log(n)) though.

> Clearly, IF there are other situations which causes the cost n (apart
> the linear searches and having
> a probability which does not tend to 0 then ok,  i could not,
> obviously, object anything).

It's so much more complicated than that.  The relationship between split
choices and cost is an almost smooth one.  Choosing n (always) *reduces*
the cost in many cases but increases it in others.  Capturing that in
your new OPr measure will be hard (but I am sure it's possible).

The trouble with thinking about "cases" is that most cases have a
probability that tends to zero.  For example, the set of executions
where the random algorithm choose n/2 every time is vanishingly small in
the long run.  There are also an unbounded number of cases that give
pathological results (always choosing 1, always choosing 2 etc...).
Don't get me wrong, I know that probability theory can resolve all these
issues but I think the details will be hard for an actual algorithm.

>> O(log(n)) (but it is in O(n)) without taking any limits
>
> This part is not completely clear to me yet. I always understood O()
> as a short hand
> for an event involving a limit. A description of an asymptotic
> behavior.
> But maybe you are referring to complexity classes definition and they
> don't use limits  (?).

Discussed above.  You agree that the cost function is n.  n is in O(n)
and not in O(log(n)); these are theorems of mathematics.  In those cases
where O() is defined using limits, these are not probabilistic limits,
they are normal analytic limits.

--
Ben.
```
 0
Reply ben.usenet (6516) 7/26/2012 4:56:23 PM
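A small simulation makes both halves of the point above concrete: the
worst case of the uniform-split search is n - 1 comparisons, while the
typical cost grows only logarithmically. This is my own sketch, which
pessimistically charges the search with always continuing into the
larger side:

```python
import random

def uniform_split_cost(n, rng):
    """Comparisons used when every probe position is uniform on the
    current range and the search always continues in the larger side."""
    cost = 0
    while n > 1:
        p = rng.randint(1, n)      # uniform split point in [1, n]
        n = max(p - 1, n - p)      # pessimistic: keep the bigger side
        cost += 1
    return cost

rng = random.Random(42)
n = 1_000_000
costs = [uniform_split_cost(n, rng) for _ in range(1000)]
print(max(costs), sum(costs) / len(costs))   # worst seen vs. average
```

The average comes out around c*log(n) for a modest constant c, while
the n - 1 worst case, although it decides the big O class, is
effectively never observed in a run of this size.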

```On Jul 26, 6:56 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

> Discussed above.  You agree that the cost function is n.  n is in
> O(n)
> and not in O(log(n)); these are theorems of mathematics.  In those cases
> where O() is defined using limits, these are not probabilistic limits,
> they are normal analytic limits.

Thank you Ben.

I think we agree in all. Even about what we can "safely
disagree"   ;-)

[btw, i normally use ;-) as a form of smile because is just under my
finger, so don't see in it any malice
but just a smile  ;-)  ]

About the "why i care", is simply because i am curious. Instead of
being curious about the latest gossips
i sometimes get curious about these questions  ;-)

I can see that you have come to see O() no more in its "original"
meaning, but almost as something
else, a symbol denoting a "class".

Well i can understand that, I don't have a CS background, so to me
that does not sound quite familiar.
When i see O() i understand it uniquely in the sense involving a
limit: i don't see any "class".

A little search reveals also this interesting article:

http://en.wikipedia.org/wiki/Big_O_in_probability_notation

And if we are in a probabilistic setting, i am still of the opinion,
we need to use the probabilistic definition.
Not the definition suitable for deterministic contexts.

It seems it was Knuth who popularized the notation
among the CS world  ...  ;-)    I am actually watching one of his
videos right now ...

I fully understand your point of view and i am very happy to have
learned  a lot so far in this discussion.

```
 0
Reply pamelafluente (240) 7/26/2012 7:50:45 PM

```pam <pamelafluente@libero.it> writes:

> On Jul 26, 6:56 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
>  > Discussed above.  You agree that the cost function is n.  n is in
> O(n)
>> and not in O(log(n)); these are theorems of mathematics.  In those cases
>> where O() is defined using limits, these are not probabilistic limits,
>> they are normal analytic limits.
>
>
> Thank you Ben.
>
> I think we agree in all. Even about what we can "safely
> disagree"   ;-)
>
> [btw, i normally use ;-) as a form of smile because is just under my
> finger, so don't see in it any malice
> but just a smile  ;-)  ]

As I said before, take care.  It takes longer to explain your particular
usage than to use the normal one, so you'll probably use it
unannounced.  People will read it for what it is, not for what you
intend.[1]

I'd always prefer the cost of the extra key stroke over the possibility
of being misunderstood.

> About the "why i care", is simply because i am curious. Instead of
> being curious about the latest gossips
> i sometimes get curious about these questions  ;-)
>
> I can see that you have come to see O() no more in its "original"
> meaning, but almost as something
> else, a symbol denoting a "class".
>
> Well i can understand that, I don't have a CS background, so to me
> that does not sound quite familiar.
> When i see O() i understand it uniquely in the sense involving a
> limit: i don't see any "class".

Writing f = O(g) establishes a relation between f and g, and it turns
out that this slightly odd looking = (there is no literal equality
there) has the properties of an equivalence relation.  It therefore
partitions the set of functions into equivalence classes.  I used class
as a shorthand, because equivalence class is such a mouthful.

There's no change in meaning, as I'm sure you know.

> A little search reveals also this interesting article:
>
> http://en.wikipedia.org/wiki/Big_O_in_probability_notation
>
> And if we are in a probabilistic setting, i am still of the opinion,
> we need to use the probabilistic definition.
> Not the definition suitable for deterministic contexts.

Does this need extend to input distributions?  There are polynomial SAT
solvers that work for "most" inputs, but we still say that SAT is not
(apparently) in P.  Some quicksorts are only O(n^2) for vanishingly
rare inputs.  Would you use a probabilistic definition of asymptotic
equivalence for these sorts of probability?

I think these examples make it clear why you need a new notation.  You
can't re-define O(f) with a probabilistic meaning without there being
way too much scope for confusion.

<snip>

[1] I always think of this sketch when this sort of thing comes up:

--
Ben.
```
 0
Reply ben.usenet (6516) 7/26/2012 11:29:24 PM

```On Jul 27, 1:29 am, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
> pam <pamelaflue...@libero.it> writes:

> [1] I always think of this sketch when this sort of thing comes up:

nice sketch   :-)

> I think these examples make it clear why you need a new notation.  You
> can't re-define O(f) with a probabilistic meaning without there being
> way too much scope for confusion.
>

Alas, it looks like it has already been invented a long time ago.  So i
arrived late  :-(

Making a quick check, it looks like that (while someone here was
accusing me of "gibberish"  :-)
using the big O in the probabilistic sense is the norm even in
official CS literature (SIAM) when
dealing with probabilistic formulations.

See for instance:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.30.2557

You can read at pag 4, under 1.3 just the same things i have been
saying here ;-)

"Just like big-O function serves to represent the complexity bounds of
deterministic al-
e
gorithms, we employ O to represent complexity bounds of randomized
algorithms. We say
e
a randomized algorithm has resource (like time, space, etc.) bound
O(g(n)) if there is a
constant c such that the amount of resource used by the algorithm (on
any input of size n)
is no more than c g(n) with probability 1 1=3Dn for any > 1  ... "

(i found similar stuff in several other papers)   Hmm, it's time i
begin to read some CS article too ... :-)
```
 0
Reply pamelafluente (240) 7/27/2012 12:22:08 AM

```pam <pamelafluente@libero.it> writes:

> On Jul 27, 1:29 am, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
<snip>
>> I think these examples make it clear why you need a new notation.  You
>> can't re-define O(f) with a probabilistic meaning without there being
>> way too much scope for confusion.
>
> Alas, it looks like it has already invented since long time.  So i
> arrived late  :-(

Except that the paper you cite does not seem to use the same definition
that you were proposing.  Of course, if you like this definition, go
ahead and use their notation -- no need to proliferate symbols -- but
you'll need a way to write it for Usenet posts as your quote below shows
so clearly.

> Making a quick check, it looks like that (while someone here was
> accusing me of "gibberish"  :-)

It's possible that what you wrote was gibberish (I'm sorry but I don't
recall).  What's clear is that the paper you cite does not use the same
reasoning you do.  It uses what looks like a significantly different
definition.  In particular it does not rely on "almost sure"
convergence.  It's possible, of course, that their definition is the
same as yours.  It would make an interesting exercise to see if that is
the case.

> using the big O in the probabilistic sense is the norm even in
> official CS literature (SIAM) when
> dealing with probabilistic formulations.

That's not a clear way to report it.  The paper you cite uses big O as
everyone else does.  They define another relation (with its own
notation) that relies on probabilities.  I'd need a few more references
to know if either this new notation or the specific definition they use
is anything like "the norm".

What is still not in doubt is that big O is used, even in the context of
probabilistic algorithms, to mean what it has always meant.  The
textbook I cited earlier does so, and so does the paper you've cited.
(You worded the above remark in such a way that I felt it needed to be
said again.)

> See for instance:
>
> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.30.2557
>
> You can read at pag 4, under 1.3 just the same things i have been
> saying here ;-)

Well, not "just the same"; "similar" is the best I can do right now.
They don't use almost sure convergence and it's not clear how their
definition relates to your algorithm.  Have you been able to show that
your uniform splitting search is OPr(log(n))?  (I'm using my suggested
OPr because the notation they use does not obviously translate to
ASCII.)

> "Just like big-O function serves to represent the complexity bounds of
> deterministic algorithms, we employ O to represent complexity bounds
> of randomized algorithms. We say a randomized algorithm has resource
> (like time, space, etc.) bound O(g(n)) if there is a constant c such
> that the amount of resource used by the algorithm (on any input of
> size n) is no more than c g(n) with probability 1 1=n for any > 1
> ... "

That went wrong.  For everyone else (is anyone else reading this?) the
paper uses O with a tilde over it for this probabilistic bound, and the
definition is that

We say a randomized algorithm has resource [...]  bound O(g(n)) if
there is a constant c such that the amount of resource used by the
algorithm (on any input of size n) is no more than c a g(n) with
probability >= 1 - 1/(n^a) for any a > 1.

> (i found similar stuff in several other papers)   Hmm, it's time i
> begin to read some CS article too ... :-)

Do they all use this definition?  Any agreement between them on the
notation to be used?

--
Ben.
```
 0
Reply ben.usenet (6516) 7/27/2012 2:02:12 AM
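For reference, Ben's corrected definition in symbols (writing
\tilde{O} for the O-with-a-tilde that does not survive ASCII):

```latex
\text{A randomized algorithm has resource bound } \tilde{O}(g(n))
\text{ if } \exists\, c \;:\;
\Pr\bigl[\text{resource used on any input of size } n
  \le c\,\alpha\, g(n)\bigr] \;\ge\; 1 - n^{-\alpha}
\quad \text{for any } \alpha > 1.
```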

```On Jul 27, 4:02 am, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

>
> It's possible that what you wrote was gibberish (I'm sorry but I don't
> recall).  What's clear is that the paper you cite does not use the same
> reasoning you do.  It uses what looks like a significantly different
> definition.  In particular it does not rely on "almost sure"
> convergence.  It's possible, of course, that their definition is the
> same as yours.  It would make an interesting exercise to see if that is
> the case.

Well it's also possible that other people here write gibberish too,
pretending to be masters
and not even knowing some basics of their own main discipline  :-)
just an hypothesis  :-))

>
> Well, not "just the same"; "similar" is the best I can do right now.
> They don't use almost sure convergence and it's not clear how their
> definition relates to your algorithm.  Have you been able to show that
> your uniform splitting search is OPr(log(n))?  (I'm using my suggested
> OPr because the notation they use does not obviously translate to
> ASCII.)

I did not have time to read through the paper and think, but at first
sight i'd say
it's a form of convergence in probability (implying converg. in
distribution), which clearly
is implied by the stronger form (a.s.).

(they introduce some additional constants which don't seem to matter
much in the asymptotics
apart from changing the speed of convergence: not sure why, maybe
because of the structure of the proof,
or they are "taking" from some other paper which did so.)

Well, as to the uniform splitting search i have been confused by
different inputs here, so i would need to work it out [ my first
intuition is that it's O(logn) a.s.
for a large class of distributions (or at least a.a.s.),  or
OPr(log(n)) as you call it. ]

What i find a little weird is that such a seemingly basic problem is
not analyzed in any textbook of CS (i don't know, obviously; you
experts would know). It's both interesting and
instructive and i find it really hard to imagine
it has not been examined in detail in a number of textbooks.

I should probably write to Don and get one of his checks  :-))

>
> Do they all use this definition?  Any agreement between them on the
> notation to be used?
>

I did not have time to examine. On the other hand, the ideas are
pretty classic:
apart small possible variations: one usually wants a limit to hold
with probability 1 (a.s),
or converging to 1 (a.a.s) or can use convergence in r-th mean or in
distribution, etc.

```
 0
Reply pamelafluente (240) 7/27/2012 3:35:01 AM

```On 26/07/2012 14:13, pam wrote:
> On Jul 26, 1:33 pm, Joshua Cranmer <Pidgeo...@verizon.invalid> wrote:
>
>> Your second statement is a complete misrepresentation of our positions.
>> The most we have said is either that the average case (expected value)
>> is O(lg n) or that your algorithm would be O(lg n) if you bounded the
>> split to occur between constant fractions of n (e.g., 1/4 and 3/4).
>
> Nope. Note that i have excluded the cases which gives rise to a linear
> search
> (which were firstly pointed out by Patricia, if i remember well)
>
> "Leaving apart the cases which causes a linear search (already
> outlined)"

But you *cannot* ignore them; they are an important part of the problem.
>
> Now you are saying something which is still different from Martin:
> "constant fractions "  (?,  remember that n/k is a random variable) .

No he isn't. We are saying the same things in different ways. He is just
picking a concrete example of a safe range with a small value of k in your
notation. 4 is << sqrt(N) for most N. There is no disagreement.

For the purpose of the analysis we pick a given k and decide how the
algorithm would behave; you can then average over all values of k later.

If you want to force this pathetic approach to have something vaguely
like probabilistic O(logN) behaviour choose P(k) = 1/k and give or take
some convergence issues that I can't be bothered to work out it will be.

P(k) = 1/k^2 should certainly force good behaviour to prob-O(logN).
(runtime will be terrible but that is a different matter)
>
> while he is calling in sqrt(n) which is even more confusing (to me).

That is because you refuse to look at the *mathematics*.

O(sqrt(N)) or worse is the dominant behaviour for the bulk of all
possible choices of k and it is only at the two extremes that it is
O(logN) or O(N). And the O(N) behaviour would totally dominate runtime.
>
> I am getting a little confused about what your statement is.
>
> Can we just ignore for one moment the "linear search" cases which are
> obviously O(n)
> and focus on something which is a real "divide and conquer" ?

The only bit which is "divide and conquer" in your  N/k : N(1-1/k) split
is when k << sqrt(N) a hard constraint. I thought you might eventually
understand but it is clear I am wasting my breath now.

There may be some mileage in certain types of algorithm where properties
of specific small values of k and the structure of the problem can be
exploited but there is no merit in going too far away from the optimum
since computation *time* gets worse by a factor of k/2 even though it
remains O(logN) provided that k << sqrt(N).

> Also, let's only talk about (a least upper bound for the) asymptotic
> worst case (and not average or so on)
> or else it gets very confusing.
>
> What is your precise statement about that (leave out the linear
> search: that is clear) ?

Mine is that for the range of possible k values the behaviour is roughly

k << sqrt(N)  			O(logN)
transition region
k ~ sqrt(N) 			O(sqrt(N))
transition region		NB this is the vast bulk of all cases
k >> N-sqrt(N)			O(N)

There are roughly speaking only as many favourable O(logN) cases around
k=2 ... sqrt(N) as there are damaging O(N) cases at the other extreme
k=N-sqrt(N) ... N all the rest are approximately O(sqrt(N)).

The vast bulk of all other choices of k behave worse than O(sqrt(N)).

But you obviously don't believe in numerical or any other sort of
mathematical analysis so the whole thing is pointless to continue.

Why don't you try your algorithm seeking a 64 bit number and see what
happens as you vary k by powers of 2? That way you will perhaps gain
some insight into this problem instead of posting random musings.

Be aware that hell will freeze over before the thing terminates for
k > 2^40 (actually runtime will exceed a year at that point).
k=2^37 will take about a day and k=2 will take about 64ns.

Or try it on 32 bit numbers where you will only have a couple of seconds
to wait even for a linear O(N) search.

--
Regards,
Martin Brown
```
 0
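For anyone who wants to run the experiment Martin suggests, a minimal
harness (my own sketch: it counts worst-case probes for a split that
always lands at n/k and always continues in the larger side, rather
than timing a real search):

```python
def worst_steps_divisor(n, k):
    """Worst-case probes when the split lands at n // k each time and
    the search continues in the larger remaining side."""
    steps = 0
    while n > 1:
        n -= max(1, n // k)   # the smaller side, discarded by this probe
        steps += 1
    return steps

for k in (2, 2**8, 2**16):
    print(k, worst_steps_divisor(2**20, k))   # cost climbs steeply with k
```

Varying k by powers of two, as suggested, shows the step count climbing
from log2(N) at k = 2 towards N as k approaches N.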

```On Jul 27, 9:50 am, Martin Brown <|||newspam...@nezumi.demon.co.uk>
wrote:

> For the purpose of the analysis we pick a given k and decide how the
> algorithm would behave you can then average over all values of k later.
>
> If you want to force this pathetic approach to have something vaguely
> like probabilistic O(logN) behaviour choose P(k) = 1/k and give or take
> some convergence issues that I can't be bothered to work out it will be.

I said that the random variable is n/k.

Is that really so hard to understand ? That makes a world of
difference,
in case you have not got it yet.

As to the probability space, i see 2 main approaches

- "conditional" (respect to the data array)
- unconditional

In the first case, you would study the problem considering as random
variable n/k  alone
in the second case you would study the problem assuming the array x
also to have some joint distribution.

so say

- P ( s | x )
-  P ( s ,  x )

they are both meaningful and interesting.

( Other interesting formalizations, may be considering fuzzy data
instead of crisp, and even probabilistic fuzzy. Clearly one would
introduce membership functions. )

```
 0
Reply pamelafluente (240) 7/27/2012 10:38:36 AM

```On 07/24/2012 06:02 PM, Ben Bacarisse wrote:
> What's going on here?  If there were any down side at all to using n/2 I can
> see why one might consider alternatives -- even random ones (some people
> do for quicksort, partitioning for example) --- but since n/2 is simple,
> easy to calculate and optimal you can't be proposing an alternative.

For searching a sorted list, I agree it doesn't make much sense, but
consider sorting.

Quicksort is a simple algorithm, and is close to O(n log(n)) for most
inputs, but you can't use it when the input is chosen by an adversary
because in worst case it is O(n^2).

It is reasonable to use quicksort with each pivot point chosen randomly
from a uniform distribution over the list. Unlike search, there is no
consistently good choice, and any deterministic rule could be exploited
to give O(n^2) cost *every* time.

Worst case is *still* O(n^2), but that's ok. What you really care about
is that the average over your random choices is O(n log(n)), even for
the worst possible input.

I think you can actually say something slightly stronger than that, and
show that the probability of very bad cases is small.

It's OK to assume that your random choices follow a known distribution
-- you can make them so -- but it can be a disaster to assume that the
input array is anything but worst case.

There are sorting algorithms with better worst case, but they are often
not worth the trouble.

None of this is *remotely* new.

Ralph Hartley

```
 0
Reply hartley (156) 8/7/2012 9:58:26 PM
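[Editorial note: Ralph's randomized quicksort can be sketched like this (Python, illustrative code not from the thread). The only difference from textbook quicksort is that the pivot index is drawn uniformly at random, so no fixed adversarial input triggers the O(n^2) behaviour every time.]

```python
import random

def randomized_quicksort(a, rng=None):
    """Quicksort with a uniformly random pivot: expected O(n log n)
    comparisons for *every* input; the worst case over the random
    choices is still O(n^2), but no input forces it deterministically."""
    rng = rng or random.Random()
    if len(a) <= 1:
        return list(a)
    pivot = a[rng.randrange(len(a))]  # the random choice
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return randomized_quicksort(less, rng) + equal + randomized_quicksort(greater, rng)

print(randomized_quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```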

```[Followups-to set since I don't think sci.math.num-analysis is topical
anymore.]

Ralph Hartley <hartley@aic.nrl.navy.mil> writes:

> On 07/24/2012 06:02 PM, Ben Bacarisse wrote:
>> What's going on here?  If there were any down side at all to using n/2
>> I can see why one might consider alternatives -- even random ones (some
>> people do for quicksort partitioning, for example) -- but since n/2 is
>> simple, easy to calculate and optimal you can't be proposing an
>> alternative.
>
> For searching a sorted list, I agree it doesn't make much sense, but
> consider sorting.
>
> Quicksort is a simple algorithm, and is close to O(n log(n)) for most
> inputs, but you can't use it when the input is chosen by an adversary
> because in worst case it is O(n^2).
>
> It is reasonable to use quicksort with each pivot point chosen
> randomly from a uniform distribution over the list. Unlike search,
> there is no consistently good choice, and any deterministic rule could
> be exploited to give O(n^2) cost *every* time.

That's not true.  You can pick the median in O(n) time and that means
(if you're careful about a few other details) that you can have a
deterministic quicksort that is O(n log(n)) in the worst case.  It's not
really practical, but that's not the point.

> Worst case is *still* O(n^2), but that's ok. What you really care
> about is that the average over your random choices is O(n log(n)),
> even for the worst possible input.

Depends on who "you" refers to!  People interested in complexity theory
(this is comp.theory after all) are more likely to be interested in
worst-case analysis.  This is not to deny the large body of theory on
probabilistic analysis -- it's just not the majority concern for
complexity theorists.

> I think you can actually say something slightly stronger than that,
> and show that the probability of very bad cases is small.

Yes, I think so too, but I don't know if there is an agreed measure for
"bad cases are sufficiently rare".  The one paper cited in this thread
proposed one, but I had trouble getting me head around it.

> It's OK to assume that your random choices follow a known
> distribution, you can make them so, but it can be a disaster to assume
> that the input array is anything but worst case.
>
> There are sorting algorithms with better worst case, but they are
> often not worth the trouble.
>
> None of this is *remotely* new.

No, but probabilistic measures of "almost always" O(f) are newer (if not
actually new).  I'd be interested to hear from anyone who know what
recent thinking is about such things.

--
Ben.
```
 0
Reply ben.usenet (6516) 8/8/2012 12:22:53 AM
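[Editorial note: the O(n) deterministic pivot Ben alludes to is the classic median-of-medians selection rule. A sketch in Python (my own illustrative code, not from the thread):]

```python
def median_of_medians(a, k):
    """Select the k-th smallest element of `a` (0-indexed) in worst-case
    O(n) time.  Using it to pick the true median as quicksort's pivot
    gives a deterministic O(n log n) quicksort -- impractical, but it
    bounds the worst case."""
    if len(a) <= 5:
        return sorted(a)[k]
    # Pivot = median of the medians of groups of five.
    groups = [a[i:i + 5] for i in range(0, len(a), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    pivot = median_of_medians(medians, len(medians) // 2)
    # Three-way partition around the pivot, then recurse on one side.
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    if k < len(less):
        return median_of_medians(less, k)
    if k < len(less) + len(equal):
        return pivot
    return median_of_medians(greater, k - len(less) - len(equal))

print(median_of_medians([7, 2, 9, 4, 1, 8, 3], 3))  # 4th smallest: prints 4
```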

```On 08/07/2012 08:22 PM, Ben Bacarisse wrote:
> Ralph Hartley <hartley@aic.nrl.navy.mil> writes:
>> Quicksort is a simple algorithm, and is close to O(n log(n)) for most
>> inputs, but you can't use it when the input is chosen by an adversary
>> because in worst case it is O(n^2).
>>
>> It is reasonable to use quicksort with each pivot point chosen
>> randomly from a uniform distribution over the list. Unlike search,
>> there is no consistently good choice, and any deterministic rule could
>> be exploited to give O(n^2) cost *every* time.
>
> That's not true.  You can pick the median in O(n) time and that means
> (if you're careful about a few other details) that you can have a
> deterministic quicksort that is O(n log(n)) in the worst case.  It's not
> really practical, but that's not the point.

OK, I should have said "any *simple* deterministic rule".

>> Worst case is *still* O(n^2), but that's ok. What you really care
>> about is that the average over your random choices is O(n log(n)),
>> even for the worst possible input.
>
> Depends on who "you" refers to!

I was referring to someone who cares how well the program really works.
I would hope theorists would at least pay lip service to that.

> People interested in complexity theory
> (this is comp.theory after all) are more likely to be interested in
> worst-case analysis.  This is not to deny the large body of theory on
> probabilistic analysis

Huge, I would say. As for the majority, we would have to hire a pollster.

> it's just not the majority concern for
> complexity theorists.

That just means (if true) that a majority of complexity theorists are
not interested in this sort of algorithm. That may be so, but when they
*do* analyze it, they should use the appropriate measure, not just the
one they "always" use.

For this sort of randomized algorithm (look up Las Vegas algorithms), the
*natural* measure is the average (or other statistical properties) over
the random choices, and worst case over inputs (ordered so that the
input can't depend on the random choices).

Talking about the worst case complexity of this algorithm doesn't make
*sense*, because the algorithm doesn't make sense in terms of worst case.

Worst case is a natural measure for deterministic algorithms, but it
shouldn't be used (even by theorists :-) ) where it isn't appropriate.
That would be just as bad as an economist who insists that all
computational costs be expressed in dollars.

Note that we are still taking the worst case over the inputs. For
deterministic algorithms that's all there is. It doesn't matter how you
handle "arbitrary" choices if you don't make any.

If you do, there are several ways to handle them, and they give
different models of computation. Probabilistic choice is one way. It is
important, because it is easy to implement: we know we *can* make random
choices with a known distribution.

Of course there are other ways. Best case (for example) gives
non-deterministic computation (note again that it is still worst case
with respect to input). Whether it can be implemented efficiently is the
biggest open question in CS (and arguably in all of mathematics).

>> I think you can actually say something slightly stronger than that,
>> and show that the probability of very bad cases is small.
>
> Yes, I think so too, but I don't know if there is an agreed measure for
> "bad cases are sufficiently rare".  The one paper cited in this thread,
> proposed one, but I had trouble getting me head around it.

There are several definitions; one is "O in probability". For a sequence
of random variables X(n) and a function a(n),

X(n) = O_p(a(n))

iff there exists a finite constant c such that

lim_{n->infinity} P( X(n) > c*a(n) ) = 0

(http://en.wikipedia.org/wiki/Big_O_in_probability_notation)
(as always, check this if you use it for anything)

> No, but probabilistic measures of "almost always" O(f) are newer (if not
> actually new).  I'd be interested to hear from anyone who know what
> recent thinking is about such things.

Perhaps, because randomized algorithms are (slightly) newer than
deterministic (also they are usually taught later).

In some fields they are pretty common, though I don't think there is a
standard measure or notation everyone uses (is there ever?).

In machine learning, for instance, worst case bounds are impossible, so
some form of statement about probability is all there *ever* is. (Look
up PAC learning).

Ralph Hartley

```
 0
Reply hartley (156) 8/9/2012 3:40:46 PM
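[Editorial note: a rough numerical companion to the O_p definition Ralph quotes (Python, my own sketch, not from the thread). Take X(n) to be the comparison count of a random-split search and a(n) = log2 n; for a suitable constant c, the estimated tail probability P(X(n) > c*a(n)) should shrink toward 0 as n grows.]

```python
import math
import random

def random_split_comparisons(n, rng):
    """Comparisons used searching for a random target in the sorted
    range [0, n), splitting at a uniformly random index each step."""
    target = rng.randrange(n)
    lo, hi, comps = 0, n - 1, 0
    while lo <= hi:
        mid = rng.randint(lo, hi)
        comps += 1
        if mid == target:
            break
        elif mid < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return comps

rng = random.Random(0)
c = 6.0
for exp in (8, 12, 16, 20):
    n = 2 ** exp
    trials = 2000
    bad = sum(random_split_comparisons(n, rng) > c * math.log2(n)
              for _ in range(trials))
    print(f"n=2^{exp}: estimated P(X(n) > {c}*log2(n)) = {bad / trials:.3f}")
```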

76 Replies
56 Views
