Strread, I can't get it to work


I'm trying to read in data in the form of a BIG cell array of strings (2 million lines).
A few lines as an example:

'"7";"1";"01-01-2008 07:00:00";"0,0000"'
'"8";"1";"01-01-2008 08:00:00";"0,0000"'
'"9";"1";"01-01-2008 09:00:00";"8639898,0000"'
'"10";"1";"01-01-2008 10:00:00";"10113328,6400"'

If I use strtok, it takes about 10 minutes. Is there a faster way to do this? I read here that strread might be the solution, but I cannot get it to work properly.
My best shot is something like this:

[nIndex,nStreet,sDate{:},nEnergy1,nEnergy2] = strread(a{:},'"%u" "%u" %q "%d,%u"',10,'delimiter',';');

What I would like to have:
nIndex : integer array of index values
nStreet: integer array of street values
sDate: datenumbers, but cell array of date-strings is also OK
nEnergy: energy value. Note that in Dutch we use the comma as the decimal separator.

As for the dates, 23 of the 24 hourly values per day always look like this:
datenum(sDate{i}, 'dd-mm-yyyy HH:MM:SS')
and the 24th one always looks like this:
datenum(sDate{i}, 'dd-mm-yyyy')

Most importantly, I'd like strread to work.
Reply Lennart 12/3/2009 12:15:24 PM

Here is a suggestion:

% your cell array of strings
C= {'"7";"1";"01-01-2008 07:00:00";"0,0000"' ;
'"8";"1";"01-01-2008 08:00:00";"0,0000"' ;
'"9";"1";"01-01-2008 09:00:00";"8639898,0000"' ;
'"10";"1";"01-01-2008 10:00:00";"10113328,6400"'}

% make into a single long string
C2 = sprintf('%s;',C{:}) ; 
% replace commas by dots 
C2 = strrep(C2,',','.') ;
% read, using the appropriate symbols as delimiter and whitespace
[nIndex, nStreet,sDate,nEnergy] = strread(C2,'%d%d%s%f','delimiter',';','whitespace',' "')
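
The sDate output above is a cell array of date strings. Turning it into datenumbers, given the two date layouts described in the question, could be sketched roughly as follows. The sample strings are made up for illustration, and the length test assumes that entries without a time are exactly the 10 characters of 'dd-mm-yyyy'.

```matlab
% Hypothetical sample: two entries with a time, one without
sDate = {'01-01-2008 23:00:00'; '02-01-2008'; '02-01-2008 01:00:00'};

% Assumption: date-only entries are exactly 10 characters long
hasTime = cellfun(@numel, sDate) > 10;

% Convert each group with its own format, in one vectorised call per group
dn = zeros(size(sDate));
if any(hasTime)
    dn(hasTime) = datenum(sDate(hasTime), 'dd-mm-yyyy HH:MM:SS');
end
if any(~hasTime)
    dn(~hasTime) = datenum(sDate(~hasTime), 'dd-mm-yyyy');
end
```

Calling datenum once per group with an explicit format avoids a per-line loop and per-line format guessing, which matters with 2 million lines.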

hth
Jos
Reply Jos 12/3/2009 12:45:08 PM


Thanks for the tip,

but this gives me an Out Of Memory error on a laptop with Windows XP and 4GB of RAM, even after I use the 'pack' command.

The procedure seems to work for smaller sets of data.
Any other ideas?
Reply Lennart 12/3/2009 2:18:20 PM

If you read the help for strread, you will notice that the textscan function is intended to replace strread (and textread). One of the issues with strread is memory use, and another is that it is slow.
Try using the textscan function instead.
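
A minimal sketch of how that might look, assuming the data is already in a cell array C as in the question. Processing the lines in blocks is my own addition to keep the joined string small, and the block size is an arbitrary guess. The %q specifier reads a double-quoted field and strips the quotes, so every field comes back as a string and the numbers are converted afterwards.

```matlab
% Sample data in the layout from the question
C = {'"7";"1";"01-01-2008 07:00:00";"0,0000"'
     '"8";"1";"01-01-2008 08:00:00";"0,0000"'
     '"9";"1";"01-01-2008 09:00:00";"8639898,0000"'
     '"10";"1";"01-01-2008 10:00:00";"10113328,6400"'};

nLines = numel(C);
chunk  = 100000;   % assumed block size; tune to the available memory

% Preallocate the outputs once
nIndex  = zeros(nLines, 1);
nStreet = zeros(nLines, 1);
nEnergy = zeros(nLines, 1);
sDate   = cell(nLines, 1);

for first = 1:chunk:nLines
    rows = first:min(first + chunk - 1, nLines);

    % Join this block into one string, one record per line
    txt = sprintf('%s\n', C{rows});
    % Decimal comma -> decimal point (the dates contain no commas)
    txt = strrep(txt, ',', '.');

    % Four quoted fields per record, separated by semicolons
    parts = textscan(txt, '%q%q%q%q', 'Delimiter', ';');

    nIndex(rows)  = str2double(parts{1});
    nStreet(rows) = str2double(parts{2});
    sDate(rows)   = parts{3};
    nEnergy(rows) = str2double(parts{4});
end
```

With this layout only one block of text is held as a single string at any time, which may help with the Out Of Memory error; whether it is fast enough for 2 million lines would need to be measured.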

Branko
Reply Branko 12/3/2009 3:13:04 PM
