Why is recursion useful?

  • Follow


Hi;

I was reading a digg article the other day.  It was buy some uber geek
who has been interviewing people for jobs and who has started to feel
as if programmers don't have the level of skill that they used to
have.

One of his examples was that many applicants didn't know what
recursion was or how to solve a problem with it.

I guess I am in the middle between the "real programmers" of
yesteryear and the glorified web developers applying for entry level
positions.

Why is recursion useful,  why do you need it instead of using an
ordinary loop inside of a function?

Thanks  much in advance for any thoughts.

0
Reply tinker123 (141) 3/15/2007 12:11:53 AM

Steve wrote:
> Why is recursion useful,  why do you need it instead of using an
> ordinary loop inside of a function?

Recursion is useful for two reasons:
(1) it often makes the code much easier to understand, and
(2) not all cases are as simple as linearly iterating through a
     list of things; sometimes when you see one thing, that means
     you need to iterate over several things.

An example of #2 is traversing a directory tree.  You have, say,
this structure:

	foo/
	   1/
	      a
	      b
	   2/
	      c
	   3/
	      d
	      e
	      f

When you're done looking at foo, you need to look at 1 and all its
children, then when you're done with that, you need to look at 2
and all its children, and so on.  This type of thing is just a
zillion times easier and clearer when expressed recursively.  Here's
some example pseudo code:

	traverse_directory(file f) {
	    if (f.is_dir()) {
	        print "I am in directory " + f;

	        foreach subdir in f.list_dir() {
	            traverse_directory (subdir);
	        }
	    }

	    if (f.is_plain_file()) {
	        print "I see file " + f;
	    }
	}

It is certainly not impossible to do this iteratively, but it's
not easy, especially when compared to the above.

   - Logan
0
Reply Logan 3/15/2007 12:57:21 AM


Steve wrote:

> Why is recursion useful,  why do you need it instead of using an
> ordinary loop inside of a function?

Recurse when you need a LIFO stack, and borrowing the system stack makes 
your methods more elegant.

A Last In First Out stack works like one of those springy plate dispensers 
in cafeteria. (The dispenser is springy, not the plates.) You put a plate on 
top, and it goes down; you take a plate off the top, and the stack comes up. 
You can't easily get to the plates at the bottom when the dispenser is full.

This is exactly how variables in methods work. The more deeply you call 
methods,

So here's a little tree:

     <
   <
  /   <
<
  \   <
   <
     <

If you have a Composite Pattern in memory, you can "walk" the object model 
by calling your own walk function twice, once for the left node and again 
for the right node. You could do that without recursion, by storing the list 
of parent nodes between your current point and the root. Recursion 
encapsulates that list, making your code cleaner.

Now bone up on Design Patterns, to become a Real Software Engineer!

-- 
  Phlip
  http://www.greencheese.us/ZeekLand <-- NOT a blog!!!


0
Reply Phlip 3/15/2007 2:21:41 AM

In article <1173917513.890226.177220@y80g2000hsf.googlegroups.com>,
 "Steve" <tinker123@gmail.com> wrote:
> 
> I was reading a digg article the other day.  It was buy some uber geek
> who has been interviewing people for jobs and who has started to feel
> as if programmers don't have the level of skill that they used to
> have.
> 
> One of his examples was that many applicants didn't know what
> recursion was or how to solve a problem with it.
> 
> I guess I am in the middle between the "real programmers" of
> yesteryear and the glorified web developers applying for entry level
> positions.
> 
> Why is recursion useful,  why do you need it instead of using an
> ordinary loop inside of a function?
> 
> Thanks  much in advance for any thoughts.

There are several reasons why recursion is an extremely useful tool for 
constructing programs.  One is that recursive program structures echo 
the elegant simplicity of inductive definitions, and that makes it much 
easier to write programs and reason about their behaviour.  A second is 
that recursion is a more general control mechanism than a loop, in that 
recursion semantics can model loop semantics without additional 
structure -- this permits a very powerful language to be constructed 
without a large number of primitive abstractions.

As a consequence, recursion provides a very expressive and rather 
beautiful basic tool -- and clearly one that nearly every modern 
programming language supports to some degree.  And I contend that even 
if you need a more practical reason to make an effort to understand the 
subtleties of recursion, its ubiquity alone ought to provide sufficient 
argument.

That said, I think it would be a mistake to treat any single 
abstraction, even one as powerful as recursion, as a silver bullet.  In 
the words of the late Alan Perlis, "beware the Turing tar pit, in which 
everything is possible, but nothing of interest is easy."  Well, with 
recursion, many interesting things ARE easy, but of course it helps to 
have a few other program structuring tools as well, as part of a 
complete language.

Cheers,
-M

-- 
Michael J. Fromberger             | Lecturer, Dept. of Computer Science
http://www.dartmouth.edu/~sting/  | Dartmouth College, Hanover, NH, USA
0
Reply Michael 3/15/2007 3:52:39 AM

On Mar 14, 5:11 pm, "Steve" <tinker...@gmail.com> wrote:
> Hi;
>
> I was reading a digg article the other day.  It was buy some uber geek
> who has been interviewing people for jobs and who has started to feel
> as if programmers don't have the level of skill that they used to
> have.
>
> One of his examples was that many applicants didn't know what
> recursion was or how to solve a problem with it.
>
> I guess I am in the middle between the "real programmers" of
> yesteryear and the glorified web developers applying for entry level
> positions.
>
> Why is recursion useful,  why do you need it instead of using an
> ordinary loop inside of a function?
>
> Thanks  much in advance for any thoughts.

There are lots of problems for which a 'divide and conquer' strategy
is useful.
For these problems, recursion is an elegant and natural fit.

Often, the teaching examples (e.g. 'use recursion to calculate the
factorial') are actually the wrong way to solve it.  In these
instances, iteration is really superior.  But you may have a problem
which will have a much better solution implemented recursively.

Mathematical recurrence relations are a simple example of something
that can be solved with recursion and the solution will be easily
understandable.  For example, many orthogonal polynomial sets are
defined by recurrence relations.

The quicksort algorithm uses the divide and conquor strategy.  We
recursively divide the data set into partitions (<,=,> pivot) until
the partitions are of size zero.

Sometimes it is worthwhile to transform a recursive solution into an
iterative one, and sometimes it is more bother than it is worth (e.g
the transformed solution may not be much faster and may be far more
complex to maintain).  That's the real art of it (knowing when
transformation to iteration is a good idea or not).

0
Reply user923005 3/15/2007 8:51:41 PM

user923005 wrote:
 
> The quicksort algorithm uses the divide and conquor strategy.  We
> recursively divide the data set into partitions (<,=,> pivot) until
> the partitions are of size zero.
> 
> Sometimes it is worthwhile to transform a recursive solution into an
> iterative one, and sometimes it is more bother than it is worth (e.g
> the transformed solution may not be much faster and may be far more
> complex to maintain).  That's the real art of it (knowing when
> transformation to iteration is a good idea or not).

I've never seen or heard of an iterative version of mergesort.
I've thought about attempting it a couple of times 
but I drew a blank.

-- 
pete
0
Reply pete 3/16/2007 1:24:07 AM

On Mar 15, 6:24 pm, pete <pfil...@mindspring.com> wrote:
> user923005 wrote:
> > The quicksort algorithm uses the divide and conquor strategy.  We
> > recursively divide the data set into partitions (<,=,> pivot) until
> > the partitions are of size zero.
>
> > Sometimes it is worthwhile to transform a recursive solution into an
> > iterative one, and sometimes it is more bother than it is worth (e.g
> > the transformed solution may not be much faster and may be far more
> > complex to maintain).  That's the real art of it (knowing when
> > transformation to iteration is a good idea or not).
>
> I've never seen or heard of an iterative version of mergesort.
> I've thought about attempting it a couple of times
> but I drew a blank.
>
> --
> pete

http://penguin.ewu.edu/~trolfe/NaturalMerge/index.html
http://ds4beginners.wordpress.com/2006/09/22/merge-sort-iteration/
http://www.cosc.canterbury.ac.nz/tad.takaoka/cosc229/imerge.c
http://p2p.wrox.com/TopicIndex/17558.htm

0
Reply user923005 3/16/2007 1:43:06 AM

pete <pfiland@mindspring.com> writes:
> I've never seen or heard of an iterative version of mergesort.

Really?  All of my linked-list merge sorts are iterative.
-- 
Ben Pfaff 
blp@cs.stanford.edu
http://benpfaff.org
0
Reply Ben 3/16/2007 1:45:40 AM

pete wrote in message

> I've never seen or heard of an iterative version of mergesort.
> I've thought about attempting it a couple of times
> but I drew a blank.

That's what I meant by recursion borrows the system stack for a data stack. 
If you write a quicksort with an explicit data stack, then your function 
might technically not call itself, yet your algorithm is still recursive - 
defined in terms of itself.

-- 
  Phlip
  http://www.greencheese.us/ZeekLand <-- NOT a blog!!! 


0
Reply Phlip 3/16/2007 2:02:05 AM

pete wrote:
> 
.... snip ...
> 
> I've never seen or heard of an iterative version of mergesort.
> I've thought about attempting it a couple of times
> but I drew a blank.

ANY recursive algorithm can be converted into iteration by use of
an auxiliary stack.  Very straightforward process.

-- 
Chuck F (cbfalconer at maineline dot net)
   Available for consulting/temporary embedded and systems.
   <http://cbfalconer.home.att.net>



-- 
Posted via a free Usenet account from http://www.teranews.com

0
Reply CBFalconer 3/16/2007 3:28:14 AM

pete wrote:
> I've never seen or heard of an iterative version of mergesort.
> I've thought about attempting it a couple of times 
> but I drew a blank.

The how and why of an iterative mergesort dawned on me one time
when I was considering how one would use mergesort to sort data
using only tapes (i.e. when there is not enough RAM to store the
whole data set in memory).

My first thought was that you would want to write a single 1-element
list to each of two tapes.  Then sequentially read both tapes and
merge those two onto a third.  Then you'd have a tape with a sorted
2-element list.  If you then had another tape with another sorted
2-element list, you could merge those two 2-element lists onto another
tape and get a 4-element list.

Then it occurred to me that in order not to have an arbitrarily
large number of tapes (one for each item in the list, since you
are starting with trivially-sorted 1-element lists), you could
write a series of lists to each tape, putting some kind of marker
between the lists.  (Many tape drives support such markers in
hardware.)  Thus, the sorting process would consist of reading
the initial unsorted items off a single tape and writing half
of them to one tape ('tape A') and half to another ('tape B').
Now, if you had N items to start, tape A doesn't have a single
list with N/2 items in it; instead, it has a N/2 sorted lists of
length 1.  Same thing for tape B.  Now you do N/2 merge operations.
You merge the first list from tape A with the first list from
tape B and write the result to tape C.  Then you merge the next
list from tape A and the next list from tape B and write the
result to tape D.  Then you merge another two and write to C,
and you keep alternating between C and D.

Now tapes C and D each have N/4 sorted lists of size 2.  You
repeat the process but bounce it back the other direction
so that you're reading from C and D and writing to A and B.
When you're done, A and B have N/8 sorted lists of size 4.
You can see where it's going.  And it's definitely not recursive.
It's not even random access, for crying out loud!

   - Logan
0
Reply Logan 3/16/2007 5:11:36 AM

On Fri, 16 Mar 2007 01:24:07 GMT, pete <pfiland@mindspring.com> wrote:

>user923005 wrote:
> 
>> The quicksort algorithm uses the divide and conquor strategy.  We
>> recursively divide the data set into partitions (<,=,> pivot) until
>> the partitions are of size zero.
>> 
>> Sometimes it is worthwhile to transform a recursive solution into an
>> iterative one, and sometimes it is more bother than it is worth (e.g
>> the transformed solution may not be much faster and may be far more
>> complex to maintain).  That's the real art of it (knowing when
>> transformation to iteration is a good idea or not).
>
>I've never seen or heard of an iterative version of mergesort.
>I've thought about attempting it a couple of times 
>but I drew a blank.

The iterative version of mergesort does the merging explicitly bottom
up.  You have an outer loop that goes from n/2 to 1 dividing by 2 in
each step, and an inner merge loop of length (n/2)/outer_index. You have
to do some fiddles if n is not a power of two.  Details are left to the
student.

 
0
Reply cri 3/16/2007 5:27:22 AM

Logan Shaw wrote:
> 
.... snip ...
> 
> Now tapes C and D each have N/4 sorted lists of size 2.  You
> repeat the process but bounce it back the other direction
> so that you're reading from C and D and writing to A and B.
> When you're done, A and B have N/8 sorted lists of size 4.
> You can see where it's going.  And it's definitely not recursive.
> It's not even random access, for crying out loud!

Look up polyphase sort.  Wirth has a good description in
"Algorithms + Data Structures = Programs".

-- 
Chuck F (cbfalconer at maineline dot net)
   Available for consulting/temporary embedded and systems.
   <http://cbfalconer.home.att.net>


-- 
Posted via a free Usenet account from http://www.teranews.com

0
Reply CBFalconer 3/16/2007 7:29:44 AM

On Mar 16, 1:36 pm, c...@tiac.net (Richard Harter) wrote:
[snip]
> Looking up polyphase sort is always a good thing to do.  However it is
> only tangentially relevent to the issue at hand.

Though it (polyphase sort/merge) is used all over the place [e.g
PostgreSQL], polyphase sort/merge is obsolete.  So is replacement-
selection (for that matter).

There is no need to perform log2(subfiles) passes over the data, as it
is a simple matter to merge the entire sequence of subfiles in a
single pass.  Simply place the heads of the subfiles in a priority
queue that allows change of priority and empty all of the lists.

Consider 1024 subfiles.  That would require 10 + 1 = 11 read/write
passes using polyphase sort/merge.
Using a priority queue cuts it down to 1 + 1 = 2 read/write passes.
Consider 2 subfiles.  Polyphase needs 3 passes, and priority queue
requires 2.  This is the minimum advantage for priority queue based
merge.

So looking up polyphase sort/merge is useful and interesting for
historical purposes.  Just don't implement it in modern software
unless you have a striking and powerful reason to do so.

0
Reply user923005 3/16/2007 8:29:17 PM

On Fri, 16 Mar 2007 02:29:44 -0500, CBFalconer <cbfalconer@yahoo.com>
wrote:

>Logan Shaw wrote:
>> 
>... snip ...
>> 
>> Now tapes C and D each have N/4 sorted lists of size 2.  You
>> repeat the process but bounce it back the other direction
>> so that you're reading from C and D and writing to A and B.
>> When you're done, A and B have N/8 sorted lists of size 4.
>> You can see where it's going.  And it's definitely not recursive.
>> It's not even random access, for crying out loud!
>
>Look up polyphase sort.  Wirth has a good description in
>"Algorithms + Data Structures = Programs".

Looking up polyphase sort is always a good thing to do.  However it is
only tangentially relevent to the issue at hand.

 
0
Reply cri 3/16/2007 8:36:35 PM

user923005 wrote:
> 
> Often, the teaching examples (e.g. 'use recursion to calculate the
> factorial') are actually the wrong way to solve it.  In these
> instances, iteration is really superior.  But you may have a problem
> which will have a much better solution implemented recursively.
> 

I struggled mightily with recursion in freshman CS.  Part of it was my 
own stupidity, part was the teaching method, and part was the tools 
available.  CS was in its infancy; K&R C hadn't even been published.

Years later, when I taught CS, I used natural examples and fractals 
(patterns in a leaf, doodles on a sheet of paper, and so on) and my 
students caught on very quickly.

Recursion is very natural - so natural that it is counterintuitive. 
Most people think it is something esoteric, so they don't want to 
comprehend just how easy it really is.
0
Reply CptDondo 3/16/2007 9:33:05 PM

Richard Harter wrote:
> CBFalconer <cbfalconer@yahoo.com> wrote:
>> Logan Shaw wrote:
>>
>>... snip ...
>>>
>>> Now tapes C and D each have N/4 sorted lists of size 2.  You
>>> repeat the process but bounce it back the other direction
>>> so that you're reading from C and D and writing to A and B.
>>> When you're done, A and B have N/8 sorted lists of size 4.
>>> You can see where it's going.  And it's definitely not recursive.
>>> It's not even random access, for crying out loud!
>>
>> Look up polyphase sort.  Wirth has a good description in
>> "Algorithms + Data Structures = Programs".
> 
> Looking up polyphase sort is always a good thing to do.  However
> it is only tangentially relevent to the issue at hand.

Why do you say that?  It is a mergesort optimized to minimize i/o.

-- 
Chuck F (cbfalconer at maineline dot net)
   Available for consulting/temporary embedded and systems.
   <http://cbfalconer.home.att.net>



-- 
Posted via a free Usenet account from http://www.teranews.com

0
Reply CBFalconer 3/16/2007 10:20:22 PM

"user923005" writes:

>> Looking up polyphase sort is always a good thing to do.  However it is
>> only tangentially relevent to the issue at hand.
>
> Though it (polyphase sort/merge) is used all over the place [e.g
> PostgreSQL], polyphase sort/merge is obsolete.  So is replacement-
> selection (for that matter).
>
> There is no need to perform log2(subfiles) passes over the data, as it
> is a simple matter to merge the entire sequence of subfiles in a
> single pass.  Simply place the heads of the subfiles in a priority
> queue that allows change of priority and empty all of the lists.
>
> Consider 1024 subfiles.  That would require 10 + 1 = 11 read/write
> passes using polyphase sort/merge.
> Using a priority queue cuts it down to 1 + 1 = 2 read/write passes.
> Consider 2 subfiles.  Polyphase needs 3 passes, and priority queue
> requires 2.  This is the minimum advantage for priority queue based
> merge.
>
> So looking up polyphase sort/merge is useful and interesting for
> historical purposes.  Just don't implement it in modern software
> unless you have a striking and powerful reason to do so.

Most sites didn't have 1024 tape units. 


0
Reply osmium 3/17/2007 1:13:04 AM

On Mar 16, 6:13 pm, "osmium" <r124c4u...@comcast.net> wrote:
[snip]
> Most sites didn't have 1024 tape units.

Yes, but 1024 subfiles is possible on many modern disk subsystems.
Generally, sorting is done on disk now-days, even in mainframe shops.
Disk is cheaper and faster.  Tape drives today are only used for
backup.
Look at what a terrabyte of disk costs today:
http://www.nextag.com/LaCie-Big-Disk-Extreme-77082608/prices-html

0
Reply user923005 3/17/2007 5:49:31 AM

On Mar 16, 3:20 pm, CBFalconer <cbfalco...@yahoo.com> wrote:
> Richard Harter wrote:
> > CBFalconer <cbfalco...@yahoo.com> wrote:
> >> Logan Shaw wrote:
>
> >>... snip ...
>
> >>> Now tapes C and D each have N/4 sorted lists of size 2.  You
> >>> repeat the process but bounce it back the other direction
> >>> so that you're reading from C and D and writing to A and B.
> >>> When you're done, A and B have N/8 sorted lists of size 4.
> >>> You can see where it's going.  And it's definitely not recursive.
> >>> It's not even random access, for crying out loud!
>
> >> Look up polyphase sort.  Wirth has a good description in
> >> "Algorithms + Data Structures = Programs".
>
> > Looking up polyphase sort is always a good thing to do.  However
> > it is only tangentially relevent to the issue at hand.
>
> Why do you say that?  It is a mergesort optimized to minimize i/o.

However, it fails to accomplish that goal.  There are clearly better
ways.
It is (nevertheless) quite an interesting study.


0
Reply user923005 3/17/2007 5:50:41 AM

On Mar 16, 3:29 pm, "user923005" <dcor...@connx.com> wrote:
> There is no need to perform log2(subfiles) passes over the data, as it
> is a simple matter to merge the entire sequence of subfiles in a
> single pass.  Simply place the heads of the subfiles in a priority
> queue that allows change of priority and empty all of the lists.
>
> Consider 1024 subfiles.  That would require 10 + 1 = 11 read/write
> passes using polyphase sort/merge.
> Using a priority queue cuts it down to 1 + 1 = 2 read/write passes.
> Consider 2 subfiles.  Polyphase needs 3 passes, and priority queue
> requires 2.  This is the minimum advantage for priority queue based
> merge.


Treating disk like a random access device for sorting purposes is a
classic beginner mistake which leads to horrible sort performance.
The number of subfiles you can usefully merge at once is limited by
the amount of memory you have available to do large sequential reads
(large enough to minimize the seek penalty) of each subfile.
Obviously reducing the sort order increases the number of merge
passes.

Somewhat more concretely, doing a 32-way merge instead of 1024-way
merge allows you 32 times bigger input buffers.  If those 32 fold
bigger input buffers allow you to double or triple the subfile read
throughput, you'll sort faster by doing two 32-way merges rather than
a single 1024 merge.  If, for example, you move from 4K buffers to
128K buffers (4-8MB total input buffer space), you'll probably get a
15-20x improvement in throughput.  OTOH, moving from 1MB to 32MB
buffers (1-2GB total) will have only a modest impact.

Of course all that assumes that you've taken some care in disk layout
as regards your sort work areas.

0
Reply robertwessel2 3/17/2007 6:31:24 AM

On Mar 16, 11:31 pm, "robertwess...@yahoo.com"
<robertwess...@yahoo.com> wrote:
> On Mar 16, 3:29 pm, "user923005" <dcor...@connx.com> wrote:
>
> > There is no need to perform log2(subfiles) passes over the data, as it
> > is a simple matter to merge the entire sequence of subfiles in a
> > single pass.  Simply place the heads of the subfiles in a priority
> > queue that allows change of priority and empty all of the lists.
>
> > Consider 1024 subfiles.  That would require 10 + 1 = 11 read/write
> > passes using polyphase sort/merge.
> > Using a priority queue cuts it down to 1 + 1 = 2 read/write passes.
> > Consider 2 subfiles.  Polyphase needs 3 passes, and priority queue
> > requires 2.  This is the minimum advantage for priority queue based
> > merge.
>
> Treating disk like a random access device for sorting purposes is a
> classic beginner mistake which leads to horrible sort performance.
> The number of subfiles you can usefully merge at once is limited by
> the amount of memory you have available to do large sequential reads
> (large enough to minimize the seek penalty) of each subfile.
> Obviously reducing the sort order increases the number of merge
> passes.
>
> Somewhat more concretely, doing a 32-way merge instead of 1024-way
> merge allows you 32 times bigger input buffers.  If those 32 fold
> bigger input buffers allow you to double or triple the subfile read
> throughput, you'll sort faster by doing two 32-way merges rather than
> a single 1024 merge.  If, for example, you move from 4K buffers to
> 128K buffers (4-8MB total input buffer space), you'll probably get a
> 15-20x improvement in throughput.  OTOH, moving from 1MB to 32MB
> buffers (1-2GB total) will have only a modest impact.
>
> Of course all that assumes that you've taken some care in disk layout
> as regards your sort work areas.

1024 buffers of 1MB each is 1 GB.  That's about $100.
http://www.dealtime.com/xPO-Kingston-1GB-PC2-4300U-C4-N-ECC-2-BANK-1GB-PC2-4300U-C4-N-ECC-2-BANK

I have 2GB on this machine, and could easily perform a sort with 1024
megabyte sized buffers.

Now that we have 64 bit CPUs going mainstream, and soon to follow 64
bit operating systems, it's time to rethink some fundamental ways of
doing business.

0
Reply user923005 3/17/2007 7:17:47 AM

On 17 Mar 2007 00:17:47 -0700
"user923005" <dcorbit@connx.com> wrote:

> Now that we have 64 bit CPUs going mainstream, and soon to follow 64
> bit operating systems,

	Huh there have been 64 bit operating systems for many years now,
even for commodity (PC) hardware.

-- 
C:>WIN                                      |   Directable Mirror Arrays
The computer obeys and wins.                | A better way to focus the sun
You lose and Bill collects.                 |    licences available see
                                            |    http://www.sohara.org/
0
Reply Steve 3/17/2007 12:48:17 PM

Steve O'Hara-Smith wrote:
> On 17 Mar 2007 00:17:47 -0700
> "user923005" <dcorbit@connx.com> wrote:
> 
>> Now that we have 64 bit CPUs going mainstream, and soon to follow 64
>> bit operating systems,
> 
> 	Huh there have been 64 bit operating systems for many years now,
> even for commodity (PC) hardware.

I read that as:

	Now that we have 64 bit CPUs going mainstream, and soon to
	follow 64 bit operating systems also going mainstream, . . .

   - Logan
0
Reply Logan 3/17/2007 3:35:01 PM

robertwessel2@yahoo.com wrote:

> Treating disk like a random access device for sorting purposes is a
> classic beginner mistake which leads to horrible sort performance.

I've been waiting for someone to point that out.  I have seen a couple
of orders of magnitude difference (hours vs. minutes) made by
designing/choosing algorithms on the basis that seeks were likely to
dominate execution time.  Not at all a trivial or even just secondary
issue.

     -- chris
0
Reply Chris 3/17/2007 5:54:30 PM

user923005 wrote:
> 
> On Mar 15, 6:24 pm, pete <pfil...@mindspring.com> wrote:

> > I've never seen or heard of an iterative version of mergesort.
> > I've thought about attempting it a couple of times
> > but I drew a blank.

> http://penguin.ewu.edu/~trolfe/NaturalMerge/index.html
> http://ds4beginners.wordpress.com/2006/09/22/merge-sort-iteration/
> http://www.cosc.canterbury.ac.nz/tad.takaoka/cosc229/imerge.c
> http://p2p.wrox.com/TopicIndex/17558.htm

Thank you.

-- 
pete
0
Reply pete 3/18/2007 2:24:37 AM

On Sat, 17 Mar 2007 10:35:01 -0500
Logan Shaw <lshaw-usenet@austin.rr.com> wrote:

> Steve O'Hara-Smith wrote:
> > On 17 Mar 2007 00:17:47 -0700
> > "user923005" <dcorbit@connx.com> wrote:
> > 
> >> Now that we have 64 bit CPUs going mainstream, and soon to follow 64
> >> bit operating systems,
> > 
> > 	Huh there have been 64 bit operating systems for many years now,
> > even for commodity (PC) hardware.
> 
> I read that as:
> 
> 	Now that we have 64 bit CPUs going mainstream, and soon to

	I thought 64 bit CPUs had been mainstream since the AMD64 started
mass marketing. I've been seeing them in supermarkets for some time now.

-- 
C:>WIN                                      |   Directable Mirror Arrays
The computer obeys and wins.                | A better way to focus the sun
You lose and Bill collects.                 |    licences available see
                                            |    http://www.sohara.org/
0
Reply Steve 3/18/2007 6:34:26 AM

On Mar 15, 12:11 am, "Steve" <tinker...@gmail.com> wrote:
> Hi;
>
> I was reading a digg article the other day.  It was buy some uber geek
> who has been interviewing people for jobs and who has started to feel
> as if programmers don't have the level of skill that they used to
> have.
>
> One of his examples was that many applicants didn't know what
> recursion was or how to solve a problem with it.
>
> I guess I am in the middle between the "real programmers" of
> yesteryear and the glorified web developers applying for entry level
> positions.
>
> Why is recursion useful,  why do you need it instead of using an
> ordinary loop inside of a function?
>

The answer is strongly related to the answer to "why is induction
useful?". A Google on <induction recursion> will give examples.

Jon C.

0
Reply jg 3/18/2007 12:03:57 PM

Steve O'Hara-Smith wrote:
> On Sat, 17 Mar 2007 10:35:01 -0500
> Logan Shaw <lshaw-usenet@austin.rr.com> wrote:
> 
>> Steve O'Hara-Smith wrote:
>>> On 17 Mar 2007 00:17:47 -0700
>>> "user923005" <dcorbit@connx.com> wrote:
>>>
>>>> Now that we have 64 bit CPUs going mainstream, and soon to follow 64
>>>> bit operating systems,
>>> 	Huh there have been 64 bit operating systems for many years now,
>>> even for commodity (PC) hardware.
>> I read that as:
>>
>> 	Now that we have 64 bit CPUs going mainstream, and soon to
> 
> 	I thought 64 bit CPUs had been mainstream since the AMD64 started
> mass marketing. I've been seeing them in supermarkets for some time now.

Well, I guess what it depends on what you mean by mainstream.  If you mean
"you can buy them in any regular store", then 64-bit CPUs have been
mainstream for a few years now.  But if you mean "most people have a 64-bit
CPU in their machine, and you can assume the process is 64-bit when you
write your software", they are only just starting to be mainstream.

   - Logan
0
Reply Logan 3/18/2007 3:29:08 PM

On Mar 17, 2:17 am, "user923005" <dcor...@connx.com> wrote:
> On Mar 16, 11:31 pm, "robertwess...@yahoo.com"
> > Treating disk like a random access device for sorting purposes is a
> > classic beginner mistake which leads to horrible sort performance.
> > The number of subfiles you can usefully merge at once is limited by
> > the amount of memory you have available to do large sequential reads
> > (large enough to minimize the seek penalty) of each subfile.
> > Obviously reducing the sort order increases the number of merge
> > passes.
>
> > Somewhat more concretely, doing a 32-way merge instead of 1024-way
> > merge allows you 32 times bigger input buffers.  If those 32 fold
> > bigger input buffers allow you to double or triple the subfile read
> > throughput, you'll sort faster by doing two 32-way merges rather than
> > a single 1024 merge.  If, for example, you move from 4K buffers to
> > 128K buffers (4-8MB total input buffer space), you'll probably get a
> > 15-20x improvement in throughput.  OTOH, moving from 1MB to 32MB
> > buffers (1-2GB total) will have only a modest impact.
>
> 1024 buffers of 1MB each is 1 GB.  That's about $100.http://www.dealtime.com/xPO-Kingston-1GB-PC2-4300U-C4-N-ECC-2-BANK-1G...
>
> I have 2GB on this machine, and could easily perform a sort with 1024
> megabyte sized buffers.


The numbers I provided were indented to illustrate the effect of
excessively small buffers.  Trivially if you have enough memory to
have a large number of large buffers it would be silly to reduce your
merge order.   That's been understood since the early sixties.  You're
also assuming that just because a machine has multiple GB of memory
available you'll be able to dedicate that much to your sort process.

Or just pick a bigger dataset to sort.  Back in the sixties, a one GB
dataset was a really, really big external sort (which would have
required something on the order of 50 2314 disk drives just for sort
work space), now I'd often expect that to be an internal sort.  Things
change, datasets get bigger.  Saying that all merges should be order N
for an N-run sort is as silly now as it would have been in 1965.
Obviously increasing main memory sizes allow for bigger runs and
higher merge orders, but it certainly doesn't eliminate the issue
entirely.

0
Reply robertwessel2 3/19/2007 9:54:47 PM

pete wrote:
> 
> user923005 wrote:
> >
> > On Mar 15, 6:24 pm, pete <pfil...@mindspring.com> wrote:
> 
> > > I've never seen or heard of an iterative version of mergesort.
> > > I've thought about attempting it a couple of times
> > > but I drew a blank.

> > http://www.cosc.canterbury.ac.nz/tad.takaoka/cosc229/imerge.c

> Thank you.

I translated imerge.c to an e_type interface.
It's about 2.5 times as slow on arrays of integers
as the recursive mergesort that I already have , 
but I'm not done rewriting it yet.
I think it might have potential.

void imsort(e_type *array, size_t nmemb)
{ 
    size_t i, j, k, n_1, k2;
    e_type *b;
    
    if (nmemb == 0) {
        return;
    }
    b = malloc(nmemb * sizeof *b);
    if (b == NULL) {
        exit(EXIT_FAILURE);
    }
    n_1 = nmemb - 1;
    for (k = 1; n_1 > k; k = k2) {
        k2 = k * 2;
        for (i = 0; nmemb > i + k; i += k2) {
            j = i + k2;
            if (j > nmemb) {
                j = nmemb;
            }
            i_merge(array, b, i, i + k, j);
        }
    }
    free(b);
}

static void 
i_merge(e_type *a, e_type *b, size_t L, size_t r, size_t u)
{ 
    size_t i, j, k;

    k = i = L; 
    j = r;
    while (r > i && u > j) { 
        b[k++] = a[(*(GT(a + i, a + j) ? &j : &i))++];
    }
    while (r > i) { 
        b[k++] = a[i++]; 
    }
    while (u > j) { 
        b[k++] = a[j++];
    }
    for (k = L; u > k; ++k) { 
        a[k] = b[k]; 
    }
}

-- 
pete
0
Reply pete 3/22/2007 8:53:12 PM

pete wrote:
> 
> pete wrote:
> >
> > user923005 wrote:
> > >
> > > On Mar 15, 6:24 pm, pete <pfil...@mindspring.com> wrote:
> >
> > > > I've never seen or heard of an iterative version of mergesort.
> > > > I've thought about attempting it a couple of times
> > > > but I drew a blank.
> 
> > > http://www.cosc.canterbury.ac.nz/tad.takaoka/cosc229/imerge.c
> 
> > Thank you.
> 
> I translated imerge.c to an e_type interface.
> It's about 2.5 times as slow on arrays of integers
> as the recursive mergesort that I already have ,
> but I'm not done rewriting it yet.
> I think it might have potential.
> 
> void imsort(e_type *array, size_t nmemb)
> {
>     size_t i, j, k, n_1, k2;
>     e_type *b;
> 
>     if (nmemb == 0) {
>         return;
>     }
>     b = malloc(nmemb * sizeof *b);
>     if (b == NULL) {
>         exit(EXIT_FAILURE);
>     }
>     n_1 = nmemb - 1;
>     for (k = 1; n_1 > k; k = k2) {
>         k2 = k * 2;
>         for (i = 0; nmemb > i + k; i += k2) {
>             j = i + k2;
>             if (j > nmemb) {
>                 j = nmemb;
>             }
>             i_merge(array, b, i, i + k, j);
>         }
>     }
>     free(b);
> }

The n_1 variable should be removed from the function and 
    for (k = 1; n_1 > k; k = k2) {
should be
    for (k = 1; nmemb > k; k = k2) {
instead.

-- 
pete
0
Reply pete 3/23/2007 12:21:19 AM

  But humans are not supposed to understand
 the code !

  A few good programmers code about
 a megabyte Kernel and then

 no human ever uses at low level ,

 they become computer operators ,
 they never code ....

  They will never need to know Recursion,
 but a picture of it in their head would be useful .
    So they can use the s/w ..

  Why make EVERYONE a programmer ?

  Why not make people users ?

   If you use C++ , you will force the User
 to dis-assemble your source ...

  Thats not productive .

   Lets program once , then make everyone
  in to Users .....


   This requires putting down bloaters
  and trade unionists .

  Every time you see/hear a person
 who wants to program , old fashioned

    SHOUT  THEM DOWN !
  They are taking your taxes !
  They are violating the public trust .
     Trade Unionists ...


 They only want to belabor programming
  for job security ..
   C++ is the most time comsuming method
 to create low level code .


   Low level code must be created ONCE .

0
Reply Werty 3/27/2007 5:32:19 PM

32 Replies
169 Views

(page loaded in 0.25 seconds)


Reply: