Better Random benchmark - need help with profiling

  • Follow


I recently started to learn Ruby. I thought that the obviously flawed  
Random
benchmark at http://shootout.alioth.debian.org would make a good first
project.

I need some help with profiling as my system is a lot different from the
shootout setup. My system is a iBook G4 with Mac OS 10.3 and Ruby 1.8. They
use Linux on AMD with Ruby 1.9.

The original implementation have two eyecatching flaws:
  1) It use Floats instead of Fixnum
  2) It uses a really costly way to get the first program parameter.

After I corrected these two misstakes it used 30% less time and
considerably less memory (at least half as much, hard to measure).

Then I started from scratch and wrote my own version, doing a lot of
experimentation. This is the result:

# Code start

module Kernel
   $srandom = 42
   def random(max)
     (max * ($srandom = ($srandom * 3877 + 29573) % 139968)).quo(139968)
   end
end

($*.first.to_i-1).times do
   random(100)
end

printf "%.9f\n", random(100)

# Code end


The most important qualities is (in order of saved time):
* Use Fixnum for integer arithmetics. Much faster and with better  
precision.
* Use ARGV.first instead of ARGV.shift to get first parameter
* Use literals instead of constants. Ruby constants is no real constants,
   they are global variables with A LOT of overhead. The specification
   C-program use literals. An alternative would be to use
   local variables.
* Use $* instead of ARGV. Sigh. Ruby constants are really, reeeally  
slooooow.
* Use times as iterator. As do the original solution.
* Use a global variable to hold the seed. As do the original solution.
   I can't find a sane way to keep the seed between method calls without
   consuming a lot of precious time.
* Wraped the function in module Kernel. For some odd reason this makes the
   program somewhat faster. I really don't like it, but...

I've also tried a nice Simula flavoured version an, not so nice,  
"oo"-version and a lot of really wacky versions.
They are a lot faster then the original code, but slower then the above  
version.
0
Reply martialis (8) 9/10/2005 6:14:39 PM

On 10/09/05, Martin Jansson <martialis@bigfoot.com> wrote:
>=20
> I recently started to learn Ruby. I thought that the obviously flawed
> Random
> benchmark at http://shootout.alioth.debian.org would make a good first
> project.
>=20
> I need some help with profiling as my system is a lot different from the
> shootout setup. My system is a iBook G4 with Mac OS 10.3 and Ruby 1.8. Th=
ey
> use Linux on AMD with Ruby 1.9.
>=20
> The original implementation have two eyecatching flaws:
>   1) It use Floats instead of Fixnum
>   2) It uses a really costly way to get the first program parameter.
>=20
> After I corrected these two misstakes it used 30% less time and
> considerably less memory (at least half as much, hard to measure).
>=20
> Then I started from scratch and wrote my own version, doing a lot of
> experimentation. This is the result:
>=20
> # Code start
>=20
> module Kernel
>    $srandom =3D 42
>    def random(max)
>      (max * ($srandom =3D ($srandom * 3877 + 29573) % 139968)).quo(139968=
)
>    end
> end
>=20
> ($*.first.to_i-1).times do
>    random(100)
> end
>=20
> printf "%.9f\n", random(100)
>=20
> # Code end
>=20
>=20
> The most important qualities is (in order of saved time):
> * Use Fixnum for integer arithmetics. Much faster and with better
> precision.
> * Use ARGV.first instead of ARGV.shift to get first parameter
> * Use literals instead of constants. Ruby constants is no real constants,
>    they are global variables with A LOT of overhead. The specification
>    C-program use literals. An alternative would be to use
>    local variables.
> * Use $* instead of ARGV. Sigh. Ruby constants are really, reeeally
> slooooow.
> * Use times as iterator. As do the original solution.
> * Use a global variable to hold the seed. As do the original solution.
>    I can't find a sane way to keep the seed between method calls without
>    consuming a lot of precious time.
> * Wraped the function in module Kernel. For some odd reason this makes th=
e
>    program somewhat faster. I really don't like it, but...
>=20
> I've also tried a nice Simula flavoured version an, not so nice,
> "oo"-version and a lot of really wacky versions.
> They are a lot faster then the original code, but slower then the above
> version.
>=20
>=20

Risking to be burned because I'm somehow helping with the aliot things
here are my 2cents

bschroed@black:~/svn/projekte/random$ cat random.rb=20
module Kernel
  $srandom =3D 42
  def random(max)
    (max * ($srandom =3D ($srandom * 3877 + 29573) % 139968)).quo(139968)
  end
end

($*.first.to_i-1).times do
  random(100)
end

printf "%.9f\n", random(100)
bschroed@black:~/svn/projekte/random$ cat random1.rb=20
class RandomNumberGenerator
  def initialize(seed =3D 42)
    @seed =3D seed
  end
 =20
  def random(max)
    (max * (@seed =3D (@seed * 3877 + 29573) % 139968)).quo(139968)
  end
end

rng =3D RandomNumberGenerator.new

(ARGV[0].to_i-1).times do
  rng.random(100)
end

puts "%.9f" % rng.random(100)
bschroed@black:~/svn/projekte/random$ cat benchmark-random.rb=20
require 'benchmark'

Benchmark.bm(30) do | b |
  [["random.rb", "Original"],
  ["random1.rb", "Rubiesque"]].each do | file, title |
    b.report(title) do
      200.times do system "ruby #{file} 5000 > /dev/null" end
    end
  end
end
bschroed@black:~/svn/projekte/random$ ruby benchmark-random.rb=20
                                    user     system      total        real
Original                        0.020000   0.050000   9.620000 (  9.583356)
Rubiesque                       0.010000   0.060000   9.820000 (  9.792614)

As you see, it is quite cheap to write prettier code. And the modified
version even keeps the state in a class-variable, so you can have
multiple rng's with differen states around.

hope to help,

Brian

--=20
http://ruby.brian-schroeder.de/

Stringed instrument chords: http://chordlist.brian-schroeder.de/


0
Reply ruby.brian (377) 9/10/2005 7:34:44 PM


On Sat, 10 Sep 2005 21:34:44 +0200, Brian Schr�der <ruby.brian@gmail.com>  
wrote:

> class RandomNumberGenerator
>   def initialize(seed = 42)
>     @seed = seed
>   end
>  def random(max)
>     (max * (@seed = (@seed * 3877 + 29573) % 139968)).quo(139968)
>   end
> end
> rng = RandomNumberGenerator.new
> (ARGV[0].to_i-1).times do
>   rng.random(100)
> end
> puts "%.9f" % rng.random(100)


> bschroed@black:~/svn/projekte/random$ ruby benchmark-random.rb
                                     user     system      total        real
> Original                        0.020000   0.050000   9.620000 (   
> 9.583356)
> Rubiesque                       0.010000   0.060000   9.820000 (   
> 9.792614)
>As you see, it is quite cheap to write prettier code. And the modified
> version even keeps the state in a class-variable, so you can have
> multiple rng's with differen states around.

Shroeder! What version of Ruby do you use?

I did something similar in one of my many versions. And it is the oo thing  
to do.
(I haven't actually benchmarked puts vs printf, I will do that ;)

It's not faster on my system. That may be proof that profiling is a bad  
thing and that we all should prove our code mathematically (shudder).

martinjansson$ time ruby "random_shroeder.rb" 900000
75.544410151

real    0m10.296s
user    0m9.760s
sys     0m0.060s

martinjansson$ time ruby "random.rb" 900000
75.544410151

real    0m9.984s
user    0m9.230s
sys     0m0.020s

martinjansson$ ruby "random_benchmark.rb" 200 5000
                                     user     system      total        real
Ugly                            0.100000   0.380000  24.770000 ( 26.368260)
Rubiesque                       0.130000   0.370000  25.580000 ( 27.071112)

martinjansson$ ruby "random_benchmark.rb" 10 900000
                                     user     system      total        real
Ugly                            0.000000   0.010000  85.770000 ( 98.821162)
Rubiesque                       0.000000   0.020000  99.810000 (105.068127)


Here is the benchmark code:

#random_benchmark.rb

require 'benchmark'
i = ARGV[0].to_i
j = ARGV[1].to_i

Benchmark.bm(30) do | b |
   [["random.rb", "Ugly"],
   ["random_shroeder.rb", "Rubiesque"]].each do | file, title |
     b.report(title) do
     i.times do system "ruby #{file} #{j} > /dev/null" end
     end
   end
end

I have a lot of white noise on my system, so I haven't used Rubys  
benchmark thingie much. It would be more useful to me if it could do  
matched pair sampling (is that the correct English term?) and/or throw  
away extreme values.

0
Reply martialis (8) 9/11/2005 7:36:50 AM

On 11/09/05, Martin Jansson <martialis@bigfoot.com> wrote:
> On Sat, 10 Sep 2005 21:34:44 +0200, Brian Schr=F6der <ruby.brian@gmail.co=
m>
> [snip]
>=20
> Shroeder! What version of Ruby do you use?
>=20

$ ruby -v
ruby 1.8.3 (2005-06-23) [i486-linux]


> I did something similar in one of my many versions. And it is the oo thin=
g
> to do.
> (I haven't actually benchmarked puts vs printf, I will do that ;)
>=20
> It's not faster on my system. That may be proof that profiling is a bad
> thing and that we all should prove our code mathematically (shudder).

If you look at the benchmark I provided, it is a bit slower on my
system too. But it is a lot prettier IMHO. I just tried rewriting the
thing in nice ruby, because thats what ruby is for. Otherwise just
write a c extension and do

require 'random.so"
or even better just use the builtin mersenne twister (Kernel#rand)



> [snip]
>=20
> I have a lot of white noise on my system, so I haven't used Rubys
> benchmark thingie much. It would be more useful to me if it could do
> matched pair sampling (is that the correct English term?) and/or throw
> away extreme values.
>=20

If the noise is really white, you just need to sample often enough. It
will average out.

hope to help,

Brian

--=20
http://ruby.brian-schroeder.de/

Stringed instrument chords: http://chordlist.brian-schroeder.de/


0
Reply ruby.brian (377) 9/11/2005 8:07:54 AM

You convinced me. I will make a nice looking version. It will still be  
faster then the originally submited code. Beating PHP and TCL isn't worth  
it, because it will still be much slower then Python ;-). The code should  
prove where Rubys strength is.

martinjansson$ ruby "random_benchmark.rb" 10 900000
                                     user     system      total        real
Original                        0.000000   0.020000 132.660000 (134.971980)
Rubiesque                       0.000000   0.010000  96.310000 ( 97.665882)
Ugly                            0.000000   0.010000  90.470000 ( 92.571550)

As for a better benchmarking tool. It might be a good next project for me.  
It shouldn't be that hard to take an array with pairs of textstrings and  
lambdas. Iterate over the array. time the lambdas. Iterate a couple of  
more times. Sum up the used time for each lambda and make a nice  
presentation.



On Sun, 11 Sep 2005 10:07:54 +0200, Brian Schr�der <ruby.brian@gmail.com>  
wrote:

> On 11/09/05, Martin Jansson <martialis@bigfoot.com> wrote:
>> On Sat, 10 Sep 2005 21:34:44 +0200, Brian Schr�der  
>> <ruby.brian@gmail.com>
>> [snip]
>>
>> Shroeder! What version of Ruby do you use?
>>
>
> $ ruby -v
> ruby 1.8.3 (2005-06-23) [i486-linux]
>
>
>> I did something similar in one of my many versions. And it is the oo  
>> thing
>> to do.
>> (I haven't actually benchmarked puts vs printf, I will do that ;)
>>
>> It's not faster on my system. That may be proof that profiling is a bad
>> thing and that we all should prove our code mathematically (shudder).
>
> If you look at the benchmark I provided, it is a bit slower on my
> system too. But it is a lot prettier IMHO. I just tried rewriting the
> thing in nice ruby, because thats what ruby is for. Otherwise just
> write a c extension and do
>
> require 'random.so"
> or even better just use the builtin mersenne twister (Kernel#rand)
>
>
>
>> [snip]
>>
>> I have a lot of white noise on my system, so I haven't used Rubys
>> benchmark thingie much. It would be more useful to me if it could do
>> matched pair sampling (is that the correct English term?) and/or throw
>> away extreme values.
>>
>
> If the noise is really white, you just need to sample often enough. It
> will average out.
>
> hope to help,
>
> Brian
>

0
Reply martialis (8) 9/11/2005 5:38:12 PM

Martin Jansson wrote:
> I recently started to learn Ruby. I thought that the obviously flawed
> Random
> benchmark at http://shootout.alioth.debian.org would make a good first
> project.
>
> I need some help with profiling as my system is a lot different from the
> shootout setup. My system is a iBook G4 with Mac OS 10.3 and Ruby 1.8. They
> use Linux on AMD with Ruby 1.9.
>
> The original implementation have two eyecatching flaws:
>   1) It use Floats instead of Fixnum
>   2) It uses a really costly way to get the first program parameter.
>
> After I corrected these two misstakes it used 30% less time and
> considerably less memory (at least half as much, hard to measure).
>
> Then I started from scratch and wrote my own version, doing a lot of
> experimentation. This is the result:
>
> # Code start
>
> module Kernel
>    $srandom = 42
>    def random(max)
>      (max * ($srandom = ($srandom * 3877 + 29573) % 139968)).quo(139968)
>    end
> end
>
> ($*.first.to_i-1).times do
>    random(100)
> end
>
> printf "%.9f\n", random(100)
>
> # Code end
>
>
> The most important qualities is (in order of saved time):
> * Use Fixnum for integer arithmetics. Much faster and with better
> precision.
> * Use ARGV.first instead of ARGV.shift to get first parameter
> * Use literals instead of constants. Ruby constants is no real constants,
>    they are global variables with A LOT of overhead. The specification
>    C-program use literals. An alternative would be to use
>    local variables.


"Each program should use symbolic constants (or whatever is closest) to
define the A, C, and M constants in the algorithm, not literal
constants."

http://shootout.alioth.debian.org/benchmark.php?test=random&lang=all&sort=fullcpu#about


> * Use $* instead of ARGV. Sigh. Ruby constants are really, reeeally
> slooooow.
> * Use times as iterator. As do the original solution.
> * Use a global variable to hold the seed. As do the original solution.
>    I can't find a sane way to keep the seed between method calls without
>    consuming a lot of precious time.
> * Wraped the function in module Kernel. For some odd reason this makes the
>    program somewhat faster. I really don't like it, but...
>
> I've also tried a nice Simula flavoured version an, not so nice,
> "oo"-version and a lot of really wacky versions.
> They are a lot faster then the original code, but slower then the above
> version.

0
Reply igouy (1005) 9/12/2005 5:53:11 AM

5 Replies
24 Views

(page loaded in 0.106 seconds)


Reply: