binning data using a table



I have a given input and need to know which "bin" of a table it fits into. For example, if I had a table [0 1 2 3 4 5 6 7 8 9] and my input was 3.14, the function would return 4 because it would be in the 4th bin (between the 4th and 5th table entries).

Does a function like this exist in MATLAB, or do I need to write one? I am hoping for a built-in function because I will be performing this sort of lookup on very large tables with very large datasets, so I want to avoid looping through the table each time.

Thanks!
Reply Alex 3/18/2010 8:01:21 PM

Yes. HISTC
Reply Matt 3/18/2010 8:10:21 PM


I see, so I can use max like:

 [a b]=max(histc(val,table))

and then b is the value that I actually want. 

OK, Thanks!
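
A note on the snippet above: the max(histc(...)) approach works when val is a single scalar, since exactly one bin then has a count of 1. For many inputs at once, the second output of histc gives the bin index of each element directly. A minimal sketch, with the variable names (edges, x, bin) chosen here only for illustration:

```matlab
edges = [0 1 2 3 4 5 6 7 8 9];   % sorted table of bin edges
x     = [3.14 0.5 7.0 12];       % several inputs at once

% Second output of histc: bin(k) is the index i such that
% edges(i) <= x(k) < edges(i+1); values outside the table get 0.
[counts, bin] = histc(x, edges);
% bin is [4 1 8 0]
```

In newer MATLAB releases the documentation lists histc as not recommended; discretize(x, edges) returns a similar index, with NaN instead of 0 for out-of-range values.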

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <hnu1bd$qem$1@fred.mathworks.com>...
> Yes. HISTC
Reply Alex 3/18/2010 8:26:04 PM

Alex wrote:
> I have a given input and need to know which "bin" of a table it fits 
> into. For example, if I had a table [0 1 2 3 4 5 6 7 8 9] and my input 
> was 3.14, the function would return 4 because it would be in the 4th bin 
> (between the 4th and 5th table entries).

> Does a function like this exist in matlab or do I need to write one? I 
> am hoping for a built in function because I will be performing this sort 
> of lookup on very large tables with very large datasets so I want to 
> avoid any sort of looping through the table each time.

A combination of ismember() and interp1 with the 'nearest' option might do the
trick for you. You would want to run timing tests to see whether that is
faster than histc().

In the example you show, the bins are equally spaced apart. Will that be the 
general case? If so then there are faster methods.

Also, is the table in sorted order? You sort of imply it is, but you don't 
actually say: you talk about the value being "between" the bin values, and 
that's something that can be true even if the values are not in sorted order.
Reply Walter 3/18/2010 9:24:47 PM

Walter Roberson <roberson@hushmail.com> wrote in message <hnu5n2$1j0$1@canopus.cc.umanitoba.ca>...
>
> 
> a combination of ismember() and interp1 with the 'nearest' option might do the 
>   trick for you. You would want to do timing tests to see whether that works 
> faster than histc() .

No way. HISTC is carefully implemented by TMW as one of the fastest commands. INTERP1 is anything but fast.

Bruno
Reply Bruno 3/18/2010 9:36:04 PM

The table will be sorted, but not linearly spaced. Most likely it will be generated using logspace. Is that a special case that has a fast method?

Reply Alex 3/19/2010 12:29:26 PM

"Alex" <abarrie@meicompany.com> wrote in message <hnvqn6$c47$1@fred.mathworks.com>...
> The table will be sorted, but not linearly spaced. Most likely it will be generated using logspace. is that a special case that has a fast method?

Sorted -> HISTC
Linearly spaced -> use FLOOR on the rescaled data, then ACCUMARRAY
Geometrically spaced -> take the log, which reduces it to the previous case

Bruno
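
The FLOOR plus ACCUMARRAY route for linearly spaced edges might look roughly like this. It is only a sketch: the names lo, w and nbins are made up for the example, and every input is assumed to lie inside [lo, lo + nbins*w).

```matlab
% Linearly spaced edges: lo, lo+w, lo+2*w, ...
lo = 0;  w = 1;  nbins = 9;
x  = [3.14 0.5 7.0 3.9];

% Rescale to bin units, then floor: 1-based bin index per element.
idx = floor((x - lo) / w) + 1;               % [4 1 8 4]

% Count how many inputs fell into each bin.
counts = accumarray(idx(:), 1, [nbins 1]);   % [1 0 0 2 0 0 0 1 0]'
```

For geometrically spaced edges the same two lines apply after replacing x, lo and w with their logarithmic counterparts.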
Reply Bruno 3/19/2010 12:47:04 PM

Alex wrote:
> The table will be sorted, but not linearly spaced. Most likely it will 
> be generated using logspace. is that a special case that has a fast method?

Previous replies about histc and accumarray have assumed that you want 
to count the number that fall into each bin -- which might be true but 
is not something that you have indicated in your problem description.

If you had a linear range, then you would take the value, subtract the 
minimum value, divide that result by the width of the bin, floor, and 
add 1 if you want the first bin to be numbered 1 instead of 0.

When you have values generated by logspace, then the logs of the values 
are evenly spaced, so you would take the log of the value, subtract the 
log of the minimum value, divide that result by the log-width of the 
bin, floor, and add 1 if you want the first bin to be numbered 1 instead 
of 0.
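
The two recipes above can be sketched with one small helper. The name linbin is hypothetical (not a MATLAB built-in), and the example assumes every input lies within the range of the table:

```matlab
% Hypothetical helper: 1-based bin index for evenly spaced edges.
% Subtract the minimum, divide by the bin width, floor, add 1.
linbin = @(x, lo, w) floor((x - lo) ./ w) + 1;

% Linear table 0:9 -> minimum 0, bin width 1.
b1 = linbin(3.14, 0, 1);                        % 4

% logspace table 1, 10, 100, 1000 -> apply the same steps to the logs.
edges = logspace(0, 3, 4);
L     = log10(edges);                           % evenly spaced: 0 1 2 3
x     = [2 50 999];
b2    = linbin(log10(x), L(1), L(2) - L(1));    % [1 2 3]
```

One caveat with the log version: an input that sits exactly on an edge may land on either side because of floating-point rounding in the logarithm, whereas histc compares against the edges directly.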
Reply Walter 3/19/2010 2:35:33 PM
