Pleased to announce the initial release of a "google_hash" gem.
Its goal. To boldly be faster than any hash hash before (cue star trek
TNG theme).
Or basically a better hash, either one that is faster or more space
efficient than ruby's default. To attempt this we wrap the google
sparse and dense hashes [1].
Speed results (populating/iterating over 500000 integers):
1.9.1p376 (mingw):
Hash (Ruby default)
0.359375 (populate)
1.1875 (each)
GoogleHashDense
0.1875 (populate)
0.078125 (each)
GoogleHashSparse
0.53125 (populate)
0.078125 (each)
Usage:
a = GoogleHashSparse.new
b = GoogleHashDense.new # or just GoogleHash.new
a[3] = 4
b[4] = 'abc'
a['abc'.hash] = 'some complex object'
# it only accepts int's currently--only because I'm too lazy to add more
types yet.
a.each{|k, v| }
Installation:
gem install google_hash (if on doze, you'll need the devkit installed)
Both these classes are currently more space efficient than a hash,
because they store keys as "native" ints, so the keys no longer affect
GC time, as well as only use 4 bytes instead of 20 (or 8 instead of 40,
on 64 bit). This should release some stress on the GC. In terms of
total memory usage, GoogleHashDense uses more (more buckets), and is
more speedy, and GoogleHashSparse uses less space, and is much more
memory efficient (2 bits per entry, or so I'm told).
This is meant to be one more tool in the rubyists toolbelt when trying
to optimize speed-wise, and plans to expand to more types, but at least
with this release it has a #each method.
If you have a desired use case let me know and I might well be able to
code it up for you.
Enjoy.
-r
[1] http://code.google.com/p/google-sparsehash
--
Posted via http://www.ruby-forum.com/.
|
|
0
|
|
|
|
Reply
|
rogerpack2005 (1307)
|
12/15/2009 11:47:26 PM |
|
Roger Pack wrote:
> Pleased to announce the initial release of a "google_hash" gem.
>
>
> Both these classes are currently more space efficient than a hash,
> because they store keys as "native" ints, so the keys no longer affect
> GC time, as well as only use 4 bytes instead of 20 (or 8 instead of 40,
> on 64 bit). This should release some stress on the GC. In terms of
> total memory usage, GoogleHashDense uses more (more buckets), and is
> more speedy, and GoogleHashSparse uses less space, and is much more
> memory efficient (2 bits per entry, or so I'm told).
>
>
> [1] http://code.google.com/p/google-sparsehash
Neat. Does it keep the order in which the elements were entered? Is it
sortable?
--
Posted via http://www.ruby-forum.com/.
|
|
0
|
|
|
|
Reply
|
Aldric
|
12/16/2009 4:02:24 PM
|
|
Aldric Giacomoni wrote:
> Roger Pack wrote:
>> Pleased to announce the initial release of a "google_hash" gem.
> Neat. Does it keep the order in which the elements were entered? Is it
> sortable?
Nope. What would be the requirements for something you're looking for?
Shot wrote:
> I have a big need for a fast Set of integers/bignums
You just need fast iteration? Fast insertion?
Yeah it should be doable.
The google hash lib itself has both Hashes and Sets--I just have focused
on the Hash stuff as of yet, but hope to do them all (with more types)
eventually.
With just int (or double) you could reasonably easily store your keys
and/oor values as native.
With Bignum I'm not totally sure how they're stored in memory, but with
some hacking it is probably possible to "store them away" (i.e. store a
copy of them on entry so that Ruby is no longer memory tracking them).
What would the ideal be in terms of requirements? (oh and feel free to
checkout the code/fork/file issues -- http://github.com/rdp/google_hash
)
Enjoy.
-r
--
Posted via http://www.ruby-forum.com/.
|
|
0
|
|
|
|
Reply
|
Roger
|
12/16/2009 11:53:28 PM
|
|
Hi,
I'm not sure, but Judy (http://judy.sourceforge.net/) and it's Ruby
bindings (http://rjudy.sourceforge.net/) seem to be another candidate
for efficient hash implementations. I tried purple
(http://purple.rubyforge.org/), which is built upon Judy, too.
hth
ralf
|
|
0
|
|
|
|
Reply
|
Ralf
|
12/17/2009 7:09:14 AM
|
|
Roger Pack wrote:
> Aldric Giacomoni wrote:
>> Roger Pack wrote:
>>> Pleased to announce the initial release of a "google_hash" gem.
>
>> Neat. Does it keep the order in which the elements were entered? Is it
>> sortable?
>
> Nope. What would be the requirements for something you're looking for?
>
In Ruby 1.9, hashes remember in which order the data was entered (as
opposed to being "somehow sorted". Basically, a data structure with the
capabilities of the Dictionary in the Facets gem.
--
Posted via http://www.ruby-forum.com/.
|
|
0
|
|
|
|
Reply
|
Aldric
|
12/17/2009 1:52:54 PM
|
|
> Or basically a better hash, either one that is faster or more space
> efficient than ruby's default. To attempt this we wrap the google
> sparse and dense hashes [1].
As a note, also released 0.2.1 recently.
Changes:
Added RubyToRuby Hash [drop in replacement for the default hash].
added #keys and #values and #keys_combination_2 methods.
In general the :int => Ruby hashes are much faster than Ruby's hash
lookups.
The RubyToRuby Hash is a bit slower for insertion/lookup and far
faster for #each than the default hash.
new usage:
a = GoogleHashDenseRubyToRuby.new # or GoogleHash.new
b = GoogleHashDenseLongToRuby.new # a hash that is only :int => Ruby
b = GoogleHashSparseLongToRuby.new # a hash that is only :int => Ruby,
and uses less memory [Sparse]
a[3] = 4
b[4] = 'abc'
b['abc'.hash] = 'some complex object'
a.each{|k, v| ... }
a.keys => Array
a.values => Array
speed comparison:
1.9 mingw results
http://pastie.org/752318
1.9.2 linux:
http://pastie.org/752333
Enjoy.
-r
--
Posted via http://www.ruby-forum.com/.
|
|
0
|
|
|
|
Reply
|
Roger
|
12/23/2009 12:40:34 AM
|
|
> Pleased to announce the initial release of a "google_hash" gem.
> Speed results (populating/iterating over 500000 integers):
>
> 1.9.1p376 (mingw):
>
> Hash (Ruby default)
> 0.359375 (populate)
> 1.1875 (each)
>
> GoogleHashDense
> 0.1875 (populate)
> 0.078125 (each)
Released another update..
v 0.3.0
ChangeLog:
Now has several new types of hashes, ex:
GoogleHashDenseRubyToRuby # just like a normal Hash--you can store Ruby
values and Ruby keys.
GoogleHashDenseIntToInt # :int => :int only -- this one stores all its
keys and values a native, so ruby's GC doesn't ever touch them!
also has some slight speed improvements.
See http://github.com/rdp/google_hash
for more information.
Merry Christmas.
-r
--
Posted via http://www.ruby-forum.com/.
|
|
0
|
|
|
|
Reply
|
Roger
|
12/27/2009 1:05:58 AM
|
|
Roger Pack <rogerpack2005@gmail.com> wrote:
> Pleased to announce the initial release of a "google_hash" gem.
>
> Its goal. To boldly be faster than any hash hash before (cue star trek
> TNG theme).
>
>
> Or basically a better hash, either one that is faster or more space
> efficient than ruby's default. To attempt this we wrap the google
> sparse and dense hashes [1].
<snip>
> Both these classes are currently more space efficient than a hash,
> because they store keys as "native" ints, so the keys no longer affect
> GC time, as well as only use 4 bytes instead of 20 (or 8 instead of 40,
> on 64 bit). This should release some stress on the GC. In terms of
> total memory usage, GoogleHashDense uses more (more buckets), and is
> more speedy, and GoogleHashSparse uses less space, and is much more
> memory efficient (2 bits per entry, or so I'm told).
Hi Roger,
Any chance of having one of these can become the default MRI Hash
implementation, or even replace most MRI-internal uses of st.c?
Much of the st.c stuff inside MRI uses numeric keys.
--
Eric Wong
|
|
0
|
|
|
|
Reply
|
Eric
|
12/27/2009 2:25:47 AM
|
|
> Hi Roger,
>
> Any chance of having one of these can become the default MRI Hash
> implementation, or even replace most MRI-internal uses of st.c?
> Much of the st.c stuff inside MRI uses numeric keys.
I'm sure it would be possible--and possibly not even terribly hard--the
only concerns would be licensing, and the fact that it's written in C++
instead of C.
That being said, I could create a wrapper for it that takes a file and
converts its hashes "automagically" into GoogleHash.new's...
if desired.
-r
--
Posted via http://www.ruby-forum.com/.
|
|
0
|
|
|
|
Reply
|
Roger
|
12/27/2009 3:05:19 AM
|
|
|
8 Replies
139 Views
(page loaded in 0.204 seconds)
Similiar Articles: 3D model of banana - comp.soft-sys.matlab... if nargin<4 || isempty(ct) ct=-1+0*0.5*r; end %Collapse points within 0.5 r of each other npoints=1 ... code for 20 questions game powered by ANN ... Neural network using matlab - comp.ai.neural-nets... would have a input vector of size 50? with a 1 to represent each existing symptoms and a 0 to ... project on face recognition using artificial neural network - comp ... dear ... To display a 3D skeleton, what do I need to learn? - comp.graphics ...... anne (263) 1/20/2006 4:51:59 PM ... at the same time when I had more of a grounding in each. ... 1. showing objects with opengl, which I can figure ... tanh or logistic - comp.ai.neural-nets... is the MSE obtained when each output is fixed at it's prior probability value. This yields R^2 = 1-MSE/MSE00 >= 0.99. ... What is regression in Artificial Neural Network (ANN ... Submitting parallel jobs to local scheduler - comp.soft-sys.matlab ...The problem here is that each of those image arrays that gets passed off to the ... job = createJob(); > > for i=1:nChunks > tasks{i} = createTask(@reconChunk, 0 ... kohonen matlab - comp.soft-sys.matlab... random training sample of 100 records where each ... slider to fix the threshold value from 0 to 255 ... What is regression in Artificial Neural Network (ANN) ? - comp ... ... print setup saves the printer - comp.databases.filemaker ...In article <dm4955$jph$1@nwrdmz03.dmz.ncs.ea.ibs-infra ... rnmenegaux@free.fr> wrote in message news:4386dd2d$0$ ... behaviour and do i have to make a setup script for each ... comp.lang.ruby - page 23attr_accessor calls method_added twice for each method in 1.8 0 5 (7/18/2003 4:41:29 ... Re: [Ann] OSSL 0.2.0-pre3 #2 1 6 (7/17/2003 7:28:09 AM) Hello, Gotou-san could you ... Nueral Net for image classification. - comp.soft-sys.matlab ...... NN to output a vector target of [0 0 0 0 0 1 0 0 ... 199 for training) of image 20x50 pixels for each ... using neural ... input.I am going to use Artificial neural network for ... Caller ID Feature 811 not working - comp.dcom.sys.nortel ...... want to make sure that the Call Log has been turned on and the space allocated for each ... You must have a Rls. 1.0 through Rls. 7 ... [ANN] Youseful TAP PLUGIN - comp ... 0.999... - Wikipedia, the free encyclopediaMultiplication of 9 times 1 produces 9 in each digit, so 9 × 0.111… equals 0.999… and 9 ... A sequence (x 0, x 1, x 2, ...) has a limit x if the distance |x − x n | becomes ... Fibonacci number - Wikipedia, the free encyclopediaBy definition, the first two numbers in the Fibonacci sequence are 0 and 1, and each ... If one traces the ancestry of any male bee (1 bee), he has 1 parent (1 bee), 2 ... 7/23/2012 3:04:10 AM
|