Reading ASCII text file with variable number of columns

Hi,

I have an all-string, tab-delimited ASCII text file of the form:
% Begin File %
Xaxis1            Xaxis2       XaxisN
Name1           Name2      NameN
File1.txt          File2.txt     FileN.txt
File1a.txt        NaN           FileNa.txt
NaN               NaN           FileNn.txt
% End File %

Where:
Xaxis1, 2, N = strings that will be the X-axis labels
Name1, 2, N = strings giving the overall "name" of each plot
File1, 2, N.txt = names of the files holding the actual numerical data
File1a, 2a, Nn.txt = names of additional files holding more of the
numerical data
NaN = just a "NaN" placeholder in spots where there are no more data
file names

I'd like to read this file into MATLAB.  Keep in mind that the number
of columns can vary (there will be "N" number of columns).

Likewise, the number of data file names can vary (some Xaxis columns
might have only 1 data file name, others might have "n" data file
names).

I set it up such that when the columns have different numbers of rows,
the blank spots are filled in with NaN.

How can I read this into MATLAB when I don't know how many columns
there will be ahead of time?

Also, once it is read in, I would like to set "Xaxis" equal to a
variable representing the Xaxis of the plot.  I would also like to go
out & open each data file (because this is where I actually get the
numerical data to make each plot).

Any help would be appreciated.

Thanks,
Rob
Reply robert.p.brookover (41) 9/30/2010 3:01:22 AM


P.S.  I was looking at some of the commands, for example "fscanf" -
and I can't use that because it expects one to know the format of the
file.  Since I can have a variable set of columns, and a variable set
of rows (some Xaxis might have more than 1 data file name) - I cannot
know ahead of time the exact format of the file.

Please point me in the direction of the proper command to use, and
I'll try to adapt it to my specific needs.

Thanks,
Rob
Reply robert.p.brookover (41) 9/30/2010 1:55:28 PM


I developed this function, and it pretty much does what I want, but
the question is, how do I store ALL the lines of text into
"data" (right now, it only stores the last line of text into the
variable "data")??

function data = text_file_read(filename)

fid = fopen(filename)
  if fid == -1
    error(['Error Opening ',filename]);
  end

  TextLine = fgetl(fid)
  while ~feof(fid)

    % Set Line of Text = "data" variable
    data = TextLine

    % get next line of text
    TextLine = fgetl(fid)
  end

  fclose(fid);

Reply Rob 9/30/2010 5:58:03 PM

Rob wrote:
> I developed this function, and it pretty much does what I want, but
> the question is, how do I store ALL the lines of text into
> "data" (right now, it only stores the last line of text into the
> variable "data")??
> 
> function data = text_file_read(filename)

   idx=0;

> fid = fopen(filename)
>   if fid == -1
>     error(['Error Opening ',filename]);
>   end

>   while ~feof(fid)
       idx=idx+1;
       data{idx} = fgetl(fid);
>   end
>   fclose(fid);
> 

data will be cell array...
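
Pulling the quoted pieces together, a minimal sketch of the whole read loop might look like the following. It is written as a script that first creates its own small demo file (the name plot_list_demo.txt is made up for illustration) so that it runs on its own. It tests the result of fgetl() instead of feof(), which is one common way to avoid storing a stray -1 if the file ends in an empty line.

```matlab
% Write a small demo file so the sketch runs on its own.
% The file name is hypothetical; substitute the real list file.
filename = fullfile(tempdir, 'plot_list_demo.txt');
fid = fopen(filename, 'w');
fprintf(fid, 'Xaxis1\tXaxis2\tXaxisN\n');
fprintf(fid, 'Name1\tName2\tNameN\n');
fprintf(fid, 'File1.txt\tFile2.txt\tFileN.txt\n');
fclose(fid);

% Read the file back, one line per cell.
fid = fopen(filename, 'r');
if fid == -1
    error('Error opening %s', filename);
end
data = {};
textLine = fgetl(fid);
while ischar(textLine)          % fgetl returns -1 once the file is exhausted
    data{end+1} = textLine;     % append each line instead of overwriting
    textLine = fgetl(fid);
end
fclose(fid);
```

After this, data{1} holds the first line of the file as a single string, data{2} the second, and so on.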

--
Reply dpb 9/30/2010 6:17:42 PM

Dpb - thank you - that does indeed store the data into a cell array.

However, when I type:  data{1} or data{1,1} it produces:

Xaxis1            Xaxis2       XaxisN

all as one "entity" (cell)?

How can I access each one individually (e.g, if I want "Xaxis1" or
"XaxisN")?

Thank  you again!
Rob

Reply robert.p.brookover (41) 9/30/2010 8:15:59 PM

Rob wrote:
....top posting corrected--please don't; makes hard follow conversation to...

>>> function data = text_file_read(filename)
>>    idx=0;
>>
>>> fid = fopen(filename)
>>>   if fid == -1
>>>     error(['Error Opening ',filename]);
>>>   end
>>>   while ~feof(fid)
>>        idx=idx+1;
>>        data{idx} = fgetl(fid);
>>
>>>   end
>>>   fclose(fid);
>> data will be cell array...
>>

 > Dpb - thank you - that does indeed store the data into a cell array.
 >
 > However, when I type:  data{1} or data{1,1} it produces:
 >
 > Xaxis1            Xaxis2       XaxisN
 >
 > all as one "entity" (cell)?
 >
 > How can I access each one individually (e.g, if I want "Xaxis1" or
 > "XaxisN")?
....

That's the result of using fgetl() and reading each line independently.

You'll have to parse the fields either during the initial read or from 
the cell array.  (The old saw "Either pay me now or pay me later" comes 
to mind... :) )

For ASCII strings there's the simple-minded solution

 >> d{1}='Xaxis1            Xaxis2       XaxisN';
 >> words=tokens(d{1})
words =
Xaxis1
Xaxis2
XaxisN
 >>

You can also parse w/ sscanf() or regexp or other tools; what, 
specifically, depends on what you want or need to do w/ the results.

--
Reply dpb 9/30/2010 8:35:25 PM

dpb wrote:
....

> You can also parse w/ sscanf() or regexp or other tools; what, 
> specifically, depends on what you want or need to do w/ the results.

But, forgot the one you'll perhaps find most useful, at least for the 
nonnumeric entries...

 >> c=strread(d{1},'%s')
c =
     'Xaxis1'
     'Xaxis2'
     'XaxisN'
 >> c{1}
ans =
Xaxis1
 >>
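
The regexp route mentioned earlier can be sketched as follows, in case strread() is not wanted (later MATLAB documentation marks it as not recommended in favor of textscan). This assumes the fields themselves contain no blanks; if they might, split on '\t' alone instead of '\s+'.

```matlab
d = {'Xaxis1            Xaxis2       XaxisN'};

% Split on runs of whitespace (tabs or spaces). strtrim() guards against
% empty fields caused by leading or trailing blanks.
c = regexp(strtrim(d{1}), '\s+', 'split');

first = c{1};      % 'Xaxis1'
last  = c{end};    % 'XaxisN'
```

The result c is a 1-by-3 cell array of strings, so the number of columns never has to be known in advance; numel(c) reports it.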

--
Reply dpb 9/30/2010 9:01:50 PM


OK, thank you.

So, that's the thing - I haven't used MATLAB in about 5 years, so I'm
a bit out of the loop when it comes to remembering what's the "best"
way.  I've even seen that there's OO in MATLAB (I think it was just
gaining popularity when I got away from it).

At any rate, in the data file example I gave, Row 1 (ie, Xaxis1,
Xaxis2, XaxisN) will be the label of the X axis for each plot ("N"
number of plots, depending on however many columns there are).  Row 2
(ie, Name1, Name2, NameN) will be the "title" of each of those plots.
Rows 3 - N (File1.txt, File2.txt, FileN.txt, File1a.txt, FileNa.txt,
FileNn.txt) are single column, numeric data ASCII files.  Ultimately,
I'd like to be able to access the numeric data in each of the
"FileX.txt" files.

For column 1, I would like to plot that data against a predetermined
Xaxis (this is given to me in Row 1 for that column).  In the Column 1
example, I have 2 data files (File1.txt, File1a.txt).  I then want to
make a subplot and plot the numeric data residing in File1a.txt on the
same plot.

For column 2, this would be a completely new plot, and it would only
have data from File2.txt.

Finally, in my example, column 3 is the last column.  However, Column
3 has 3 data files (FileN.txt, FileNa.txt, FileNn.txt).  So, I would
have 1 main figure with 3 subplots for this column.

Now, I can have the person who is populating this file for me just
list everything in 1 column, separating the information for the next
plot by a space (that's how it was originally given to me, but I asked
him to change it to a columnar format, and to fill any blank spaces
where there are uneven columns with "NaN").  I thought that it would
be easier in MATLAB to try to read this file in as a columnar, tab
delimited file.

So, what is the best way to be able to read this file, then access the
numeric data which resides in each of the "FileX.txt" files & plot it
accordingly?

Again, I'm sorry if I'm asking rudimentary questions, but as I said,
I'm trying to jump back on the bicycle of MATLAB after having been
away from it for several years, so please bear with me.

Thanks again,
Rob
Reply robert.p.brookover (41) 9/30/2010 9:18:19 PM

Rob wrote:
....

> Now, I can have the person who is populating this file for me to just
> list everything in 1 column, separating the information for the next
> plot by a space (that's how it was originally given to me, but I asked
> him to change it to a columnar format, and to fill any blank spaces
> where there are uneven columns with "NaN").  I thought that it would
> be easier in MATLAB to try to read this file in as a columnar, tab
> delimited file.
....

That'll work, I'd ask for CSV instead of tab-delimited if it's all the 
same, though...

If you do that then textread (older versions, deprecated) or textscan 
should work well.
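
A rough sketch of the textscan route when the number of columns is not known ahead of time: read the first line, count its fields, and build the format string from that count. The demo file name and the variable names (entries, xLabels, titles, files) are made up for illustration, and the delimiter is the tab of the original file; for a CSV file it would be ',' in both places.

```matlab
% Write a demo file in the layout from the original post (hypothetical name).
filename = fullfile(tempdir, 'plot_table_demo.txt');
fid = fopen(filename, 'w');
fprintf(fid, 'Xaxis1\tXaxis2\tXaxisN\n');
fprintf(fid, 'Name1\tName2\tNameN\n');
fprintf(fid, 'File1.txt\tFile2.txt\tFileN.txt\n');
fprintf(fid, 'File1a.txt\tNaN\tFileNa.txt\n');
fprintf(fid, 'NaN\tNaN\tFileNn.txt\n');
fclose(fid);

fid = fopen(filename, 'r');
if fid == -1
    error('Error opening %s', filename);
end

% Count the fields in the first line, then rewind to read the whole file.
firstLine = fgetl(fid);
nCols = numel(regexp(firstLine, '\t', 'split'));
frewind(fid);

% One %s per column; every field is read as a string, including 'NaN'.
fmt  = repmat('%s', 1, nCols);
cols = textscan(fid, fmt, 'Delimiter', '\t');
fclose(fid);

entries = [cols{:}];          % nRows-by-nCols cell array of strings
xLabels = entries(1, :);      % row 1: X-axis labels
titles  = entries(2, :);      % row 2: plot names
files   = entries(3:end, :);  % remaining rows: data file names or 'NaN'
```

This relies on the file being "square", i.e. every line having the same number of tab-separated fields.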

Whichever way you go, whether you choose NaN for missing values or 
simply empty fields, I'd strongly recommend building the file as a 
square, regular file rather than w/ "ragged" records.  Those are by far 
the hardest to deal with w/ Matlab and whatever you do, do _NOT_ use 
fixed column, no delimiter if there's even a hint of a missing value.
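
With a square file and "NaN" placeholders, the plotting step could be sketched roughly as below. The layout (one figure per column, one subplot per data file) is only one reading of the description, and the demo_*.txt names, workDir and the random data are made up so that the script runs on its own; in practice xLabels, titles and files would come from parsing the list file.

```matlab
% Hypothetical inputs standing in for the parsed list file.
workDir = tempdir;
xLabels = {'Xaxis1', 'Xaxis2'};
titles  = {'Name1', 'Name2'};
files   = {'demo_File1.txt',  'demo_File2.txt'; ...
           'demo_File1a.txt', 'NaN'};

% Write small single-column demo data files (random numbers).
names = files(~strcmp(files, 'NaN'));
for k = 1:numel(names)
    y = rand(10, 1);
    save(fullfile(workDir, names{k}), 'y', '-ascii');
end

% One figure per column, one subplot per real data file in that column.
for col = 1:size(files, 2)
    used = files(~strcmp(files(:, col), 'NaN'), col);   % drop placeholders
    figure;
    for k = 1:numel(used)
        y = load(fullfile(workDir, used{k}));   % single-column numeric ASCII
        subplot(numel(used), 1, k);
        plot(y);
        xlabel(xLabels{col});
        title(sprintf('%s (%s)', titles{col}, used{k}), 'Interpreter', 'none');
    end
end
```

The strcmp() test is what makes the placeholders harmless: a column with a single data file simply gets a figure with one subplot.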

--
Reply dpb 9/30/2010 9:49:00 PM

On Sep 30, 2:49 pm, dpb <n...@non.net> wrote:
>
> Whichever way you go, whether you choose NaN for missing values or
> simply empty fields, I'd strongly recommend building the file as a
> square, regular file rather than w/ "ragged" records.  Those are by far
> the hardest to deal with w/ Matlab and whatever you do, do _NOT_ use
> fixed column, no delimiter if there's even a hint of a missing value.
>
> --

What do you mean by a "square, regular file" rather than one w/
"ragged" records?

Basically, I can have the info given to me in any format I want, as my
coworker is the one who is generating the file, and he's doing it from
Perl.

I want to clarify something though - what I meant was that originally
he gave me the information like this:

Xaxis1
Name1
File1.txt
File1a.txt

Xaxis2
Name2
File2.txt

XaxisN
NameN
FileN.txt
FileNa.txt
FileNn.txt

Where everything was listed in one long column.

Reply Rob 9/30/2010 10:00:41 PM

Rob wrote:
> On Sep 30, 2:49 pm, dpb <n...@non.net> wrote:
>> Whichever way you go, whether you choose NaN for missing values or
>> simply empty fields, I'd strongly recommend building the file as a
>> square, regular file rather than w/ "ragged" records.  Those are by far
>> the hardest to deal with w/ Matlab and whatever you do, do _NOT_ use
>> fixed column, no delimiter if there's even a hint of a missing value.
>>
>> --
> 
> what do you mean "square, regular file rather than w/ "ragged"
> records?"

Every record contains the same number of fields (or at least the file 
is a series of blocks of such records), not a variable number of 
fields/record.  Viewed in a fixed-width font, an evenly formatted 
regular file has lines of even length, whereas records w/ a variable 
number of fields give it a jagged appearance; hence, "ragged".

...

> Where everything was listed in one long column.

I grok that -- the regular file will be easier...

--
Reply dpb 9/30/2010 11:15:06 PM
