I want to read a file that uses semicolons (;) as the delimiter. The file has 4 columns; each row contains a variable name, value, units, and an in/out designation.
Example:
Variable.With.Various.Unique.Levels.name1;Value;Units;in
Variable.With.Various.Unique.Levels.name2;Value;Units;out
Value could be anything
1
10946.0386113404
19173.112909708834,0,-722.71884819229354
"2006.6,0; 2000.60,131.63; 1982.65,272.563;" <-- This one is my problem
or any string
Units will always be a string
The last column will always be a string, 'in' or 'out'
I would just use textscan with ; as the delimiter, but Value can sometimes contain semicolons between double quotes (" "), which makes textscan harder to use. I would like anything inside the double quotes to be read as a string to be processed later.
Ideas?
Please ask for clarification if needed.
Camron, 5/26/2010 7:21:04 PM

So why not read everything in as a string?
BTW, your examples are as clear as mud.
The numerical example does not seem to match:
Variable.With.Various.Unique.Levels.name1;Value;Units;in
Can you just show us a few lines from the file without editorial
comments?
TideMan, 5/26/2010 8:54:21 PM

The rows are formatted with:
variable name ; value ; units ; in/out
The data is a text file with rows like the following:
Aircraft.Mass.Calibration.Name1;1;;in
Aircraft.Mass.Calibration.Name2;1;;in
Aircraft.Mass.Calibration.Name3;12345.6789;lb;in
Aircraft.Mass.Certification.Name4;123456;lb;in
Aircraft.Mass.Certification.Name5;12345;lb;in
Aircraft.Mass.Design.Category1.Name6;12345.6789,0,-1234.92;;out
Aircraft.Mass.Design.Name7;12345.678;lb;out
Aircraft.Operations.Name8;1.234;;in
Aircraft.Operations.Name9;123;kts;in
Aircraft.Operations.Name10;1.2;;in
Aircraft.Components.Category2.Name11;7556.63837287839;gal;out
Aircraft.Components.Category3.Category4.Name12;"2006.6,0;2000.60,131.63; ";;out
FlightPerformance.DesignMission.Category5.Name13;True;;in
FlightPerformance.DesignMission.Category6.Name14;"10; 30; 50; 70; 90; 110; ";;in
I wondered if there is any way to get this data into the workspace without using fgetl in a loop and parsing each line individually. The final goal is to preserve the variable hierarchy in a cell array, or structure or dataset or something and be able to access the variable name, value, units, and in/out.
Camron, 5/26/2010 11:26:06 PM

Aah, I see the problem now.
The Value column is fairly chaotic, isn't it?
Sometimes it has a number - that's easy.
Sometimes it has text - that's easy.
But sometimes it is a string enclosed in double quotes, and if that is
the case the stuff inside the double quotes is delimited by either a
semicolon or a comma. So, textscan should ignore the semicolons if
they are inside double quotes.
I'm afraid this is so idiosyncratic that I don't see any alternative
to using fgetl in a loop and parsing line by line.
Not much help I'm afraid.
TideMan, 5/26/2010 11:48:50 PM

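As a possible alternative to parsing character by character, the quoted-semicolon rule described above can be sketched with a single regular expression. This is only a sketch: it assumes that double quotes are always balanced within a row and never escaped, and the variable names (lines, pat, fields) are illustrative. The sample rows are copied from the listing earlier in the thread; in practice the rows would first have to be read from the file into a cell array of strings.

```matlab
% Sample rows copied from the file listing earlier in the thread.
lines = { ...
    'Aircraft.Mass.Calibration.Name3;12345.6789;lb;in'; ...
    'Aircraft.Components.Category3.Category4.Name12;"2006.6,0;2000.60,131.63; ";;out'; ...
    'FlightPerformance.DesignMission.Category6.Name14;"10; 30; 50; 70; 90; 110; ";;in'};

% Split on a semicolon only when an even number of double quotes follows it,
% i.e. when the semicolon is not inside a quoted value.
% Assumption: quotes are balanced on every row and are never escaped.
pat = ';(?=(?:[^"]*"[^"]*")*[^"]*$)';

% regexp accepts a cell array of strings, so no explicit loop is needed here.
% Each element of fields is a 1x4 cell: name, value, units, in/out.
fields = regexp(lines, pat, 'split');

nPerRow = cellfun(@numel, fields);   % number of columns found in each row
quoted  = fields{2}{2};              % quoted value kept whole, quotes included
```

The quoted value keeps its surrounding double quotes, so it can be stripped and processed later. Empty columns such as the missing units come back as empty strings.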
If it were me, I would pre-process the input file into another form,
using perl or awk or sed or an editor like vi.
Walter, 5/27/2010 3:06:31 AM

That's fine. I can write the code to do it in a loop. So I would really like to know the best way to save the data and preserve its variable names and hierarchy.
How do you make nested structures programmatically if I want it to be organized like:
Aircraft.Mass.Calibration.Name1
so that I can type --> Aircraft.Mass.Calibration.Name1.value and get the value string or get the units by Aircraft.Mass.Calibration.Name1.units
or even have Name1 be a cell array with 4 entries. So as to access it by something like:
Aircraft.Mass.Calibration.Name1{1}
Would this type of data fit well into a dataset type?
Camron, 5/27/2010 3:43:21 PM

Yes, but even better, IMHO, is to use an array of structures.
There would be 4 fields corresponding to your 4 columns (name, value,
units, inout), and each record would be one element of the structure array.
So, for example, s(5).name would be
'Aircraft.Mass.Certification.Name5' (from your file listing above).
and s(5).units would be 'lb'.
You can generate this structure in the loop as you read the file in
line by line:
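fid = fopen('data.txt');            % placeholder file name
for id = 1:1000                     % upper bound on the number of rows
    line = fgetl(fid);
    if ~ischar(line), break, end    % fgetl returns -1 at EOF
    % Parse each line here, e.g. split on semicolons outside double quotes
    parts = regexp(line, ';(?=(?:[^"]*"[^"]*")*[^"]*$)', 'split');
    s(id).name  = parts{1};
    s(id).value = parts{2};
    s(id).units = parts{3};
    s(id).inout = parts{4};
end
fclose(fid);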
TideMan, 5/27/2010 8:09:40 PM

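For the nested layout asked about earlier (Aircraft.Mass.Calibration.Name1.value), one possible sketch uses setfield with the dotted name split into one key per level. It assumes that every level of the name is a valid MATLAB field name, as in the sample rows; the names data, rec and keys are illustrative and not from the thread.

```matlab
% Two parsed rows (name, value, units, in/out), taken from the file listing.
rows = { ...
    'Aircraft.Mass.Calibration.Name3',   '12345.6789', 'lb', 'in'; ...
    'Aircraft.Mass.Certification.Name4', '123456',     'lb', 'in'};

data = struct();                                  % root of the hierarchy
for k = 1:size(rows, 1)
    % One key per hierarchy level.
    % Assumption: each key is a valid MATLAB field name.
    keys = regexp(rows{k, 1}, '\.', 'split');
    rec.value = rows{k, 2};
    rec.units = rows{k, 3};
    rec.inout = rows{k, 4};
    data = setfield(data, keys{:}, rec);          % creates missing levels as needed
end

disp(data.Aircraft.Mass.Calibration.Name3.units)  % displays lb
v = getfield(data, keys{:}, 'value');             % value field of the last row read
```

Compared with the flat array of structures, this keeps the hierarchy navigable by typing, at the cost of requiring valid field names at every level. The two forms can coexist, since the flat array is easier to loop over.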