Find and count frequencies of sequences

Hello all.

I have a column of data, several hundred elements long, consisting of 1s and 4s, for example:

a = 1
4
4
4
4
1
1
4
1
4
....

I need to count how often each of several sequences of 1s and 4s appears in the column, i.e. how many times the sequences "1, 4", "1, 1, 4", "1, 1, 1, 4", and so on, occur.

Any suggestions? Thanks in advance.
Iain
Reply Iain 2/4/2011 11:29:07 AM

Hi, if the set of sequences you want to check is finite, try this; you can adapt the idea to other patterns. Good luck:
a=[
4
4
4
4
1
1
4
1
4];
count = 0;
for i = 3:numel(a)
    % compare each 3-element window with the target column [1; 1; 4]
    if isequal(a(i-2:i), [1; 1; 4])
        count = count + 1;
    end
end
% "count" now holds the number of times [1 1 4]' occurs in a
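The same sliding-window idea can be written for a pattern of any length. This is only a minimal sketch: the variable names `pattern` and `m` are illustrative, and the data are the nine sample values from above.

```matlab
a = [4 4 4 4 1 1 4 1 4].';      % sample data from above (column vector)
pattern = [1 1 4].';            % illustrative pattern, same orientation as a
m = numel(pattern);             % window length

count = 0;
for i = 1:numel(a)-m+1
    % isequal is true only when every element of the window matches
    if isequal(a(i:i+m-1), pattern)
        count = count + 1;
    end
end
% count is the number of (possibly overlapping) occurrences of pattern in a
```

Changing `pattern` is then the only edit needed to count a different sequence.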
Reply stu 2/4/2011 8:14:05 AM


Iain:
I don't think you really want to do this.  Think about it.  The number
of possible patterns grows exponentially with length (2^L patterns of
length L), so over all lengths from 2 to N (the number of elements)
there are far too many to enumerate.  I don't think you'd really want
them all even if you could get them from some brute force algorithm
(which is possible though).  If you could narrow it down to a finite
number of predefined sequences then you could do it, for example
sequences of only up to 5 digits long.  You could then use filter or
imfilter with the pattern you're looking for and then look for a
particular value, i.e. sum(pattern .* pattern), in the filtered output.
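A rough sketch of the filtering idea follows, with two changes that are my own assumptions rather than part of the suggestion above. First, it uses conv with a flipped pattern, because convolution reverses the kernel and a correlation is what is wanted. Second, it maps the values 1 and 4 to -1 and +1 before correlating: on the raw values a non-matching window can reach the same sum as a match (pattern [1 1 1 1 4] and window [4 4 4 4 1] both give 20), whereas with -1/+1 the correlation equals the pattern length only for an exact match. It assumes the data contain nothing but 1s and 4s.

```matlab
a = [1 4 4 4 4 1 1 4 1 4].';    % sample data from the original post
pattern = [1 1 4].';            % illustrative pattern

% map the two symbols {1, 4} to {-1, +1} (assumes no other values occur)
sa = 2*(a == 4) - 1;
sp = 2*(pattern == 4) - 1;

% sliding correlation: convolve with the flipped pattern, keeping only
% the positions where the pattern lies fully inside the data
r = conv(sa, flipud(sp), 'valid');

% an exact match contributes +1 at every position of the window
count = sum(r == numel(pattern));
```

The entries of `r` are small integers, so the equality test is exact.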

Reply ImageAnalyst 2/4/2011 12:54:17 PM

Thanks for these suggestions. 

Apologies, ImageAnalyst, for the lack of clarity in my original post. I didn't mean to imply sequence patterns of every length from 2 to N; the maximum sequence length will likely be around 10.

stu - that idea is working very well (so far!), many thanks.
Reply Iain 2/4/2011 3:36:03 PM

Take a look at STRFIND and CELLFUN

C = [1 1 4 1 1 4 1 1  4 1 4 4 4 1 4 1 4 1 1].' ; % column vector
seq = {[1 4 1],[1 4],[4 1 1 4],[9 9 9]} % sequences
N = cellfun(@(x) numel(strfind(C(:).',x(:).')), seq) 

Note that strfind requires row vectors.
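For the family of patterns in the original question ("1, 4", "1, 1, 4", and so on, up to a length of about 10), the patterns can be generated in a loop. This is a sketch only: `maxLen` and `counts` are illustrative names, and it relies on strfind accepting numeric row vectors, as in the lines above.

```matlab
a = [1 4 4 4 4 1 1 4 1 4].';    % sample data from the original post
maxLen = 10;                    % assumed maximum pattern length

counts = zeros(1, maxLen-1);
for n = 1:maxLen-1
    pat = [ones(1, n) 4];                    % n ones followed by a single 4
    counts(n) = numel(strfind(a(:).', pat)); % strfind needs row vectors
end
% counts(n) is the number of occurrences of n ones followed by a 4
```

Note that shorter patterns are also counted inside longer ones: the "1, 4" at the end of a "1, 1, 4" contributes to counts(1) as well as counts(2).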

hth
Jos
Reply Jos 2/4/2011 4:02:03 PM
