Did either F03 or F08 extend list-directed input to make tab-delimited
records Standard-conforming?
--
dpb | 12/22/2010 3:09:35 PM

In article <1jtwaiv.v8a5891heqt6N%nospam@see.signature>,
Richard Maine <nospam@see.signature> wrote:
>dpb <none@non.net> wrote:
>
>> Did either F03 or F08 extend list-directed input to make tab-delimited
>> records Standard-conforming?
>
>F03 - no. F08 - I didn't actually check, but I seriously doubt it.
>
>It is much more fundamental than tab delimiting. Tab is not in the
>Fortran character set at all. If the character is not standard
>conforming, then you aren't going to have standard-conforming uses of
>it.
Er, no, sorry. You can already read characters in that are not part
of the character set, so that's not critical. Also, enabling tabs
as delimiters in list-directed input can be and has been done without
touching the rest of the language. There's no difficulty, I assure
you.
I would have much preferred that Fortran 2008 had sorted out the
I/O mess (and it IS a mess) as far as would be possible for minimal
hassle to implementors. There are a huge number of unnecessary
restrictions, some of which are so historical that few vendors (let
alone users!) still know why they are there, and others were (ugh)
committee compromises, inserted to placate vendors whose imagination
of problems exceeded their imagination of solutions.
It's the only major area where I advise users that calling C is often
cleaner and simpler than trying to do the job in Fortran. Actually,
I advise the use of a Python intermediate stage, to convert arbitrary
input formats into a Fortran-acceptable form.
Regards,
Nick Maclaren.
nmm1 | 12/22/2010 5:08:53 PM

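Editor's sketch of the Python intermediate stage suggested above: a minimal example, assuming tab-delimited records whose fields are either plain numbers or free text. The name `to_list_directed` and the rule that anything Python's `float()` rejects is treated as a character field are assumptions made for illustration, not something taken from the thread. Character fields are written in double quotes (with embedded quotes doubled), so that blanks, commas, and slashes inside them are not taken as separators by list-directed input.

```python
def to_list_directed(line, delimiter="\t"):
    """Rewrite one delimited record in a form list-directed READ can accept.

    Hypothetical helper: numeric fields pass through unchanged, every other
    non-empty field becomes a quoted character constant.
    """
    out = []
    for field in line.rstrip("\r\n").split(delimiter):
        text = field.strip()
        if text == "":
            # Empty field: emit nothing, leaving two adjacent commas.
            out.append("")
            continue
        try:
            float(text)          # assumption: float() decides what is numeric
            out.append(text)
        except ValueError:
            # Quote the field and double any embedded double quotes.
            out.append('"' + text.replace('"', '""') + '"')
    return ",".join(out)


if __name__ == "__main__":
    print(to_list_directed("1.5\tSmith, J/K\t42\n"))
```

With this sketch the record `1.5<TAB>Smith, J/K<TAB>42` becomes `1.5,"Smith, J/K",42`. Fortran-style exponents such as `1d5` are not recognised by `float()` and would end up quoted, so a real converter would need its own test for numbers.
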
dpb <none@non.net> wrote:
> Did either F03 or F08 extend list-directed input to make tab-delimited
> records Standard-conforming?
F03 - no. F08 - I didn't actually check, but I seriously doubt it.
It is much more fundamental than tab delimiting. Tab is not in the
Fortran character set at all. If the character is not standard
conforming, then you aren't going to have standard-conforming uses of
it.
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam | 12/22/2010 5:35:39 PM

Richard Maine wrote:
> dpb <none@non.net> wrote:
>
>> Did either F03 or F08 extend list-directed input to make tab-delimited
>> records Standard-conforming?
>
> F03 - no. F08 - I didn't actually check, but I seriously doubt it.
>
> It is much more fundamental than tab delimiting. Tab is not in the
> Fortran character set at all. If the character is not standard
> conforming, then you aren't going to have standard-conforming uses of
> it.
Yes; implicit in the question was whether the character set had been
expanded to include the tab character.
I thought that was the answer, btw, but there was a thread in the Matlab
group regarding writing text files for later reading by a Fortran
program and wanted to see if there were any changes that would change
the advice I had given to not rely on vendor extensions to parse such
files as another respondent had suggested.
--
dpb | 12/22/2010 5:44:34 PM

Hello,
On 2010-12-22 12:08:53 -0500, nmm1@cam.ac.uk said:
>
> Er, no, sorry. You can already read characters in that are not part
> of the character set, so that's not critical.
Well, the standard does explicitly provide that a processor
may refuse to process control characters in formatted records.
See 9.2.2 of 10-007r1.
Aside, I tried to get a new intrinsic added last time 'round
that would indicate whether a character was acceptable
in a formatted record, but the idea failed for lack of resources
and interest.
I believe that a diligent programmer ought to be able to diagnose
processor dependent limitations, others believe DDT is good enough. :-(
--
Cheers!
Dan Nagle
Dan | 12/22/2010 9:40:32 PM

On Dec 23, 2:09 am, dpb <n...@non.net> wrote:
> Did either F03 or F08 extend list-directed input to make tab-delimited
> records Standard-conforming?
>
> --
Even if it were included as an internal possible process in formatted
reads, the problem is always 'what do you mean by tabbing?'. Yes, you
are at a certain point in the input character stream, but what does
the next tab character found imply? A jump of 8 spaces? Or a jump to
the next multiple of 8 position? Or something else? The original 8
parameter that I used here comes from 30 years ago and a CP/M or DOS
environment. Even that is changeable.
Terence | 12/22/2010 10:03:40 PM

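Editor's illustration of the two readings of a tab described above, as a small Python sketch. The sample record is made up; `str.expandtabs(8)` advances to the next column that is a multiple of 8, while a plain replacement always inserts exactly 8 spaces.

```python
record = "ab\tcd"

# Reading 1: the tab moves to the next tab stop (columns 8, 16, ...).
to_next_stop = record.expandtabs(8)

# Reading 2: the tab is a fixed jump of 8 spaces from wherever it occurs.
fixed_jump = record.replace("\t", " " * 8)

if __name__ == "__main__":
    print(repr(to_next_stop), to_next_stop.index("cd"))
    print(repr(fixed_jump), fixed_jump.index("cd"))
```

The text after the tab starts at offset 8 under the first reading and at offset 10 under the second, which is why the two readings disagree for fixed-column input.
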
Terence wrote:
> On Dec 23, 2:09 am, dpb <n...@non.net> wrote:
>> Did either F03 or F08 extend list-directed input to make tab-delimited
>> records Standard-conforming?
>>
>> --
>
> Even if it were included as an internal possible process in formatted
> reads, the problem is always 'what do you mean by tabbing?'. ...
For list-directed read (at least in the present context) it would simply
be an allowable field delimiter (as blank and comma).
--
dpb | 12/22/2010 10:52:31 PM

Terence <tbwright@cantv.net> wrote:
> On Dec 23, 2:09 am, dpb <n...@non.net> wrote:
>> Did either F03 or F08 extend list-directed input to make tab-delimited
>> records Standard-conforming?
> Even if it were included as an internal possible process in formatted
> reads, the problem is always 'what do you mean by tabbing?'.
For a numeric field separator in list-directed input it doesn't
matter so much as long as it counts as a field delimiter.
> Yes, you
> are at a certain point in the input character stream, but what does
> the next tab character found imply? A jump of 8 spaces? Or a jump to
> the next multiple of 8 position? Or something else? The original 8
> parameter that I used here comes from 30 years ago and a CP/M or DOS
> environment. Even that is changeable.
In the days before list-directed input, the DEC Fortran compilers
had some tricks to make terminal input easier, possibly including
the use of tabs. Otherwise, tab delimited input is nice when
fields can contain commas, or other delimiters.
-- glen
glen | 12/23/2010 4:02:40 AM

In article <ietbb5$4kq$1@gosset.csi.cam.ac.uk>, nmm1@cam.ac.uk
wrote:
> [...] inserted to placate vendors whose imagination
> of problems exceeded their imagination of solutions. [...]
That's a nice turn of a phrase, I need to remember that one.
What I would like in fortran is an easy way to exchange files (both
input and output) with microsoft excel and compatible utilities.
Part of this involves tab delimiters between the fields, but it also
needs more control over the record length. If I remember correctly,
multiple tabs should mean multiple blank fields, whereas the usual
tab-extended-list-directed-i/o convention is to treat multiple tabs
as a single delimiter. We fortran programmers have needed this
capability for 20 or 25 years now -- you have to wonder what the
heck is taking so long.
$.02 -Ron Shepard
Ron | 12/23/2010 6:38:17 AM

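Editor's illustration of the two conventions for repeated tabs mentioned above, as a short Python sketch with a made-up record. Splitting on the tab character keeps empty fields, as a spreadsheet would; splitting on runs of white space collapses them, which is how the post describes the usual tab-extended list-directed behaviour.

```python
record = "1.0\t\t3.0"

# Spreadsheet convention: every tab is a delimiter, so two tabs in a row
# enclose an empty field.
spreadsheet_fields = record.split("\t")   # ['1.0', '', '3.0']

# Collapsing convention: any run of white space is one delimiter.
collapsed_fields = record.split()         # ['1.0', '3.0']

if __name__ == "__main__":
    print(spreadsheet_fields)
    print(collapsed_fields)
```

The same record therefore has three fields under one convention and two under the other, so a reader has to know which convention the writer used.
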
On 23 dec, 07:38, Ron Shepard <ron-shep...@NOSPAM.comcast.net> wrote:
> In article <ietbb5$4k...@gosset.csi.cam.ac.uk>, n...@cam.ac.uk
> wrote:
>
> > [...] inserted to placate vendors whose imagination
> > of problems exceeded their imagination of solutions. [...]
>
> That's a nice turn of a phrase, I need to remember that one.
>
> What I would like in fortran is an easy way to exchange files (both
> input and output) with microsoft excel and compatible utilities.
> Part of this involves tab delimiters between the fields, but it also
> needs more control over the record length. If I remember correctly,
> multiple tabs should mean multiple blank fields, whereas the usual
> tab-extended-list-directed-i/o convention is to treat multiple tabs
> as a single delimiter. We fortran programmers have needed this
> capability for 20 or 25 years now -- you have to wonder what the
> heck is taking so long.
>
> $.02 -Ron Shepard
An alternative is to use "the" CSV format (there seems to be quite
a bit of confusion about what exactly constitutes this format, hence
the quotation marks).
Another way is to use an ODBC interface, like the one in my Flibs
project (http://flibs.sf.net).
Regards,
Arjen
Arjen | 12/23/2010 8:29:55 AM

In article <iev7g0$otj$1@news.eternal-september.org>,
Dan Nagle <dannagle@verizon.net> wrote:
>On 2010-12-23 01:38:17 -0500, Ron Shepard said:
>
>> That's a nice turn of a phrase, I need to remember that one.
>>
>> What I would like in fortran is an easy way to exchange files (both
>> input and output) with microsoft excel and compatible utilities.
Given that Microsoft Excel isn't compatible with itself, that's
a high hurdle to jump! But I take your point.
>> Part of this involves tab delimiters between the fields, but it also
>> needs more control over the record length. If I remember correctly,
>> multiple tabs should mean multiple blank fields, whereas the usual
>> tab-extended-list-directed-i/o convention is to treat multiple tabs
>> as a single delimiter. We fortran programmers have needed this
>> capability for 20 or 25 years now -- you have to wonder what the
>> heck is taking so long.
As Richard pointed out, tabs in fixed-format are a disaster area.
I disagree with Lamport[*] - I feel that capital punishment IS an
appropriate penalty. However, the one exception is when they are
equivalent to an indeterminate (but non-null) amount of white space,
which is what list-directed I/O would use. I have used a good many
compilers that would accept them in that.
>The combination of an unlimited format item
>and the additions to g0 format were intended to go some way
>towards allowing writing of CSV files. Unlimited format items
>are hidden in 10.4, but see note 10.7
Yes. Despite its ungainliness, Fortran is now adequate for writing
free-format output (including CSV etc.); there are still quite a
few unnecessary restrictions, but not crippling ones. Free-format
input is still virtually impossible, except by decoding it by hand.
[*] http://www.tex.ac.uk/tex-archive/digests/texhax/89/texhax.03.gz
Regards,
Nick Maclaren.
nmm1 | 12/23/2010 10:02:00 AM

Hello,
On 2010-12-23 01:38:17 -0500, Ron Shepard said:
> In article <ietbb5$4kq$1@gosset.csi.cam.ac.uk>, nmm1@cam.ac.uk
> wrote:
>
>> [...] inserted to placate vendors whose imagination
>> of problems exceeded their imagination of solutions. [...]
>
> That's a nice turn of a phrase, I need to remember that one.
>
> What I would like in fortran is an easy way to exchange files (both
> input and output) with microsoft excel and compatible utilities.
> Part of this involves tab delimiters between the fields, but it also
> needs more control over the record length. If I remember correctly,
> multiple tabs should mean multiple blank fields, whereas the usual
> tab-extended-list-directed-i/o convention is to treat multiple tabs
> as a single delimiter. We fortran programmers have needed this
> capability for 20 or 25 years now -- you have to wonder what the
> heck is taking so long.
The combination of an unlimited format item
and the additions to g0 format were intended to go some way
towards allowing writing of CSV files. Unlimited format items
are hidden in 10.4, but see note 10.7
The new g format stuff is described in the g format subsection.
It must be in-demand, vendors seem to be implementing it
fairly quickly.
--
Cheers!
Dan Nagle
Dan | 12/23/2010 10:15:28 AM

Arjen Markus <arjen.markus895@gmail.com> wrote:
(snip)
> An alternative is to use "the" CSV format (there seems to be quite
> a bit of confusion about what exactly constitutes this format, hence
> the quotation marks).
CSV is fine unless the fields can contain commas. If you have
a "last name, first name" field, or even someone puts a comma in
where they shouldn't, then you are stuck. Of course one could
also put in a tab, but somewhat less likely than comma.
-- glen
glen | 12/23/2010 12:27:06 PM

nmm1@cam.ac.uk wrote:
(snip)
> As Richard pointed out, tabs in fixed-format are a disaster area.
> I disagree with Lamport[*] - I feel that capital punishment IS an
> appropriate penalty. However, the one exception is when they are
> equivalent to an indeterminate (but non-null) amount of white space,
> which is what list-directed I/O would use. I have used a good many
> compilers that would accept them in that.
The one that I never liked, was the DEC Fortran IV compilers
that accept tabs in Fortran source, where the first tab went
to column 9, but the compilers count it as 7. You are allowed 66
characters of Fortran source after that. If the first character
after the tab is numeric, then it is treated as column 6, the
continuation column. Now, look at a printed listing where you
don't see the difference between tabs and spaces, and can't tell
which column anything is in!
My first use of tabs with computers was with WYLBUR, where tabs
are allowed on input to user specified columns, but not (normally)
stored in the file. It is convenient to set a tab at column 7
when entering Fortran statements, but the file still gets spaces.
(The IBM Fortran IV compilers don't accept tabs in source input.)
The IBM 2741 terminal has hardware (mechanical) tabs that can be
set at user definable positions. If enabled, WYLBUR will use them
in output, speeding up print outs.
-- glen
glen | 12/23/2010 12:53:38 PM

On 23 dec, 13:27, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> Arjen Markus <arjen.markus...@gmail.com> wrote:
>
> (snip)
>
> > An alternative is to use "the" CSV format (there seems to be quite
> > a bit of confusion about what exactly constitutes this format, hence
> > the quotation marks).
>
> CSV is fine unless the fields can contain commas. If you have
> a "last name, first name" field, or even someone puts a comma in
> where they shouldn't, then you are stuck. Of course one could
> also put in a tab, but somewhat less likely than comma.
>
> -- glen
I was referring to The Practice of Programming by Kernighan and Pike.
I do not remember the details, but I can imagine the following:
- Can a field contain an embedded newline?
- If a field contains a ", should that be doubled ("") or escaped
(\")?
Then there is of course the fact that some cultures use a decimal comma
instead of a decimal point, so that (for instance) MS Excel in the Dutch
locale uses a semicolon instead of a comma to separate fields.
And there are undoubtedly other issues and ambiguities.
Regards,
Arjen
Arjen | 12/23/2010 1:51:43 PM

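Editor's illustration of the CSV ambiguities listed above, using Python's standard `csv` module. The helper `parse` and the sample records are made up for the example; the point is only that each question (doubled or escaped quotes, semicolon separators with a decimal comma, embedded newlines) corresponds to a setting the reader has to be told about.

```python
import csv
import io


def parse(text, **options):
    """Hypothetical helper: parse CSV text with the given csv.reader options."""
    return list(csv.reader(io.StringIO(text), **options))


# A quote inside a field written as a doubled quote ("").
doubled = parse('1,"say ""hi"""\n')

# The same field written with a backslash escape (\") instead.
escaped = parse('1,"say \\"hi\\""\n', doublequote=False, escapechar="\\")

# Semicolon-separated record with a decimal comma in the first field.
semicolon = parse('1,5;"a;b"\n', delimiter=";")

# A quoted field that contains an embedded newline.
multiline = parse('1,"two\nlines"\n')

if __name__ == "__main__":
    for rows in (doubled, escaped, semicolon, multiline):
        print(rows)
```

Note that the first field of the semicolon record comes back as the string `1,5`; converting the decimal comma to a number is a separate step that the `csv` module does not do.
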
Hello,
On 2010-12-23 07:27:06 -0500, glen herrmannsfeldt said:
> Arjen Markus <arjen.markus895@gmail.com> wrote:
> (snip)
>
>> An alternative is to use "the" CSV format (there seems to be quite
>> a bit of confusion about what exactly constitutes this format, hence
>> the quotation marks).
>
> CSV is fine unless the fields can contain commas. If you have
> a "last name, first name" field, or even someone puts a comma in
> where they shouldn't, then you are stuck. Of course one could
> also put in a tab, but somewhat less likely than comma.
Which is why virtually every Import | CSV dialog I've seen
has a selection of which character to use for the separator.
If comma won't work, select semicolon, or whatever else fits
this time.
--
Cheers!
Dan Nagle
Dan | 12/23/2010 2:18:40 PM

In article <ievlo0$718$1@news.eternal-september.org>,
Dan Nagle <dannagle@verizon.net> wrote:
>On 2010-12-23 07:27:06 -0500, glen herrmannsfeldt said:
>> Arjen Markus <arjen.markus895@gmail.com> wrote:
>>
>>> An alternative is to use "the" CSV format (there seems to be quite
>>> a bit of confusion about what exactly constitutes this format, hence
>>> the quotation marks).
>>
>> CSV is fine unless the fields can contain commas. If you have
>> a "last name, first name" field, or even someone puts a comma in
>> where they shouldn't, then you are stuck. Of course one could
>> also put in a tab, but somewhat less likely than comma.
>
>Which is why virtually every Import | CSV dialog I've seen
>has a selection of which character to use for the separator.
>If comma won't work, select semicolon, or whatever else fits
>this time.
Any sane CSV-producer will quote such fields, and Fortran can do
that for list-directed output. However, that doesn't help for
input, where the problem characters include space, comma, slash,
asterisk, and all control characters.
Regards,
Nick Maclaren.
nmm1 | 12/23/2010 2:32:36 PM

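Editor's sketch of a check for the problem characters named above (space, comma, slash, asterisk, and control characters), written in Python. The names `HAZARDS` and `needs_quoting` are hypothetical, and the set covers only the characters listed in the post; a fuller check would probably also look at quote characters.

```python
# Characters named in the post as trouble for unquoted list-directed input.
HAZARDS = set(" ,/*")


def needs_quoting(field):
    """Return True if an unquoted character field contains a hazard.

    Control characters are taken here to be code points below 32 plus DEL,
    which is an assumption of this sketch.
    """
    return any(ch in HAZARDS or ord(ch) < 32 or ch == "\x7f" for ch in field)


if __name__ == "__main__":
    for sample in ["plain", "Smith, J", "km/h", "3*x", "a\tb"]:
        print(repr(sample), needs_quoting(sample))
```

A writer can use such a test to decide which fields to quote on output; on input the same characters are the ones that force hand decoding of unquoted data.
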
In message <ievf6p$4l3$1@news.eternal-september.org>, glen
herrmannsfeldt <gah@ugcs.caltech.edu> writes
>CSV is fine unless the fields can contain commas. If you have
>a "last name, first name" field, or even someone puts a comma in
>where they shouldn't, then you are stuck. Of course one could
>also put in a tab, but somewhat less likely than comma.
I've just been working on the transfer of data from a spreadsheet to
Fortran, and commas don't present a problem. I couldn't find this in
the documentation (does Excel have any documentation?) but by experiment
it seems that if a field contains a comma it is exported with a pair of
double quotes around it. One problem (for software that may be used by
a novice) is that CSV isn't one of the easiest export formats to find in
Excel, though it is there if you search hard enough. My real problem is
on the input side: if any field contains a slash then Fortran takes it
as an end-of-record marker, and there isn't any obvious way of disabling
that.
There is, of course, no established standard for CSV, so Fortran's
weirdly eccentric version is as valid as anyone's (unless you are trying
to use CSV as an *interchange* format, in which case it's truly
unusable). I ended up reading each record into a long string, and
coding up my own decoder just for this job. I expect everyone does
this.
Another problem is that Fortran doesn't have any easy way of writing
CSV, and if it did, no doubt it would allow slashes within fields, so it
wouldn't even be compatible with itself.
What a mess.
--
Clive Page
Clive | 12/23/2010 4:35:31 PM

On 12/23/2010 10:35 AM, Clive Page wrote:
> In message <ievf6p$4l3$1@news.eternal-september.org>, glen
> herrmannsfeldt <gah@ugcs.caltech.edu> writes
>> CSV is fine unless the fields can contain commas. If you have
>> a "last name, first name" field, or even someone puts a comma in
>> where they shouldn't, then you are stuck. Of course one could
>> also put in a tab, but somewhat less likely than comma.
>
> I've just been working on the transfer of data from a spreadsheet to
> Fortran, and commas don't present a problem. I couldn't find this in the
> documentation (does Excel have any documentation?) but by experiment it
> seems that if a field contains a comma it is exported with a pair of
> double quotes around it. One problem (for software that may be used by a
> novice) is that CSV isn't one of the easiest export formats to find in
> Excel, though it is there if you search hard enough. My real problem is
> on the input side: if any field contains a slash then Fortran takes it
> as an end-of-record marker, and there isn't any obvious way of disabling
> that.
>
> There is, of course, no established standard for CSV, so Fortran's
> weirdly eccentric version is as valid as anyone's (unless you are trying
> to use CSV as an *interchange* format, in which case it's truly
> unusable). I ended up reading each record into a long string, and coding
> up my own decoder just for this job. I expect everyone does this.
>
> Another problem is that Fortran doesn't have any easy way of writing
> CSV, and if it did, no doubt it would allow slashes within fields, so it
> wouldn't even be compatible with itself.
>
> What a mess.
>
I've been writing CSV for decades... I've never had a problem writing CSV
compatible with Excel... I'm not sure what the fuss is, but not
interested enough to read this series of posts to find out :(. CSV is
NOT a data interchange format. It isn't standardized, so it is not the
most valuable feature to be worked on for the Fortran standard.
Whereas threads, process initiate/terminate, and process priority control
would be more valuable. Another would be a well designed bit string
(for my definition of "well designed" :)
Gary | 12/23/2010 5:38:47 PM

On Dec 23, 9:02 pm, n...@cam.ac.uk wrote:
(edited)
> >The combination of an unlimited format item
> >and the additions to g0 format were intended to go some way
> >towards allowing writing of CSV files. Unlimited format items
> >are hidden in 10.4, but see note 10.7
>
> Yes. Despite its ungainliness, Fortran is now adequate for writing
> free-format output (including CSV etc.); there are still quite a
> few unnecessary restrictions, but not crippling ones. Free-format
> input is still virtually impossible, except by decoding it by hand.
>
> Nick Maclaren.
Yes. I wrote a Fortran subroutine to handle CSV input and output, in
the days when M.S. had only defined the use of tab, comma and quotes
(of either pairing). Then M.S. changed the rules to add the semicolon.
So, OK, I fixed it. But the problem is that CSV format still might be
a moving target.
And I appreciate well the observation about what multiple tabs might
mean. In some text, it's the old shorthand for more spaces; in other
newer interpretations, the first is a delimiter and any following ones
then indicate a blank field. So we need a switch to separate 'now tab
is a space measure' from 'now tab is a delimiter'. Hum! Backtab?
Then someone had to ask 'and what about a field which is "uncertain"
or "not measured", which would affect statistical interpretations of
data?'
Hey-ho!
Have a MERRY Xmas!
Terence | 12/23/2010 11:03:31 PM

On 23 dec, 17:35, Clive Page <j...@nospam.net> wrote:
> In message <ievf6p$4l...@news.eternal-september.org>, glen
> herrmannsfeldt <g...@ugcs.caltech.edu> writes
>
> >CSV is fine unless the fields can contain commas. If you have
> >a "last name, first name" field, or even someone puts a comma in
> >where they shouldn't, then you are stuck. Of course one could
> >also put in a tab, but somewhat less likely than comma.
>
> I've just been working on the transfer of data from a spreadsheet to
> Fortran, and commas don't present a problem. I couldn't find this in
> the documentation (does Excel have any documentation?) but by experiment
> it seems that if a field contains a comma it is exported with a pair of
> double quotes around it. One problem (for software that may be used by
> a novice) is that CSV isn't one of the easiest export formats to find in
> Excel, though it is there if you search hard enough. My real problem is
> on the input side: if any field contains a slash then Fortran takes it
> as an end-of-record marker, and there isn't any obvious way of disabling
> that.
>
> There is, of course, no established standard for CSV, so Fortran's
> weirdly eccentric version is as valid as anyone's (unless you are trying
> to use CSV as an *interchange* format, in which case it's truly
> unusable). I ended up reading each record into a long string, and
> coding up my own decoder just for this job. I expect everyone does
> this.
>
> Another problem is that Fortran doesn't have any easy way of writing
> CSV, and if it did, no doubt it would allow slashes within fields, so it
> wouldn't even be compatible with itself.
>
> What a mess.
>
> --
> Clive Page
I have written a small module to make that slightly easier - see
http://flibs.sf.net
Regards,
Arjen
Arjen | 12/24/2010 8:12:33 AM