Fortran I/O (long)

This note describes a textual representation
of Fortran sequential-access formatted files.
The representation includes visual indicators
for the file position and the left tab limit.

The positions in the file outside of a record
are represented by a sequence of five hyphens.

Two special kinds of records are represented
by names.  An endfile record is represented
by the name "endfile".  A record that does
not exist is represented by the name "null".

--------------------------------------------
Ex. 1  An empty file can be represented as

   -----
   null

There is only one position in an empty file.

Ex. 2  A file that contains only an endfile
record can be represented as

   -----
   endfile
   -----

or as

   -----
   endfile
   -----
   null

--------------------------------------------

A formatted record is represented by a
sequence of zero or more graphic characters.
The characters '>', '|', '#' and '\' shall
not be used to represent characters.  Spaces
can be used to separate characters, but are
not considered characters in the record.
The character '_' represents an instance of
the space character in the record.

The sequences

   endfile

and

   null

shall not represent formatted records.

-------------------------------------------
Ex. 3.  A formatted record consisting of the
sequence of nine characters "ABCD EFGH" can
be represented as

   ABCD_EFGH

It can also be represented as

   A B CD _   E FGH

Ex. 4.  A formatted record consisting of
the sequence of characters "endfile" can
be represented as

   end file

or as

   e n d f i l e

------------------------------------------

The character '>' represents the current
file position.  If it precedes the hyphens
that represent a position outside of a
record, the current file position is not
within a record and there is no current
record.

The character '>' shall not be within the
representation of an endfile record or a
record that does not exist.

If the character '>' appears within the
representation of a formatted record, the
current file position is within that
record and the record is the current
record.

------------------------------------------
Ex. 5.  The representation

   -----
   ABCDEFGH>IJKL
   -----
   endfile

indicates that the file is positioned
within the first record between the
characters 'H' and 'I'.  The
representation

   -----
   ABCDEFGH > IJKL
   -----
   endfile

is an equivalent representation.

-----------------------------------------

The character '|' represents the left
tab limit.  The left tab limit is
defined only when the file position is
within a record.  The left tab limit
cannot be to the right of the file
position.
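
The effect of the left tab limit can be seen in a small
program.  This is only a sketch: the program name and the
scratch file are assumptions made for illustration.  As I read
the standard's rules for T editing, the second WRITE starts
with the left tab limit after 'D', so T1 refers to the fifth
character position of the record and the record should end up
as ABCDEF.

```fortran
program left_tab_limit
  implicit none
  integer :: u
  character(len=16) :: line

  ! A scratch file keeps the sketch self-contained.
  open (newunit=u, status='scratch', form='formatted', action='readwrite')

  ! Leaves the file positioned within the record, after 'D'.
  write (u, '(A)', advance='no') 'ABCD'

  ! The left tab limit is now after 'D'; T1 is relative to it,
  ! so 'EF' should follow 'ABCD' rather than overwrite 'AB'.
  write (u, '(T1, A)') 'EF'

  rewind (u)
  read (u, '(A)') line
  print '(3A)', '[', trim(line), ']'
  close (u)
end program left_tab_limit
```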

------------------------------------------
Ex. 6.  The representation

   -----
   ABCD |> EFGH
   -----

indicates that both the left tab limit and
the file position are between the fourth
and fifth character positions within the
first record.

Ex. 7.  The representation

   -----
   ABCD | EFGH >
   -----

indicates that the left tab limit is between
the fourth and fifth character positions and
the file position is between the eighth and
ninth character positions of the current record.
------------------------------------------

The character '#' represents a character
position that has not had a character
written to it.  It is not an end-of-record
mark.  Each formatted record is associated
with an unbounded number of character
positions.

---------------------------------------------
Ex. 8.  After the statements

      REWIND (10)
      WRITE (10, 10, ADVANCE='NO') 'ABCD'
   10 FORMAT (A, 4X)

are executed, the file connected to unit 10
could be represented as

   -----
   ABCD####>

If the file is closed before any other
I/O operations to the file are executed,
the contents of the file after it is closed
could be represented as

   -----
   ABCD
   -----
   endfile

---------------------------------------------

The character '\' is used as an escape to
allow representation of characters that
are not represented by graphic characters.
Reply robert.corbett (96) 3/3/2012 7:35:40 AM

Here is an application of my notation for
representing sequential-access formatted files.

Consider the program

      PROGRAM MAIN
        CHARACTER STR2*2, STR4*4
        OPEN (10, FILE='YYY', POSITION='REWIND')
        WRITE (10, '(A, T1)', ADVANCE='NO') 'ABCD'
        READ (10, '(A)', ADVANCE='NO') STR2
        WRITE (10, '(A)') STR2
        REWIND (10)
        READ (10, '(A)') STR4
        CLOSE (10, STATUS='DELETE')
        PRINT '(A)', STR4
      END

After the first WRITE statement is executed, the
file can be represented as

   -----
   > ABCD

After the first READ statement is executed, the
variable STR2 has the value 'AB' and the file can
be represented as

   -----
   AB > CD

After the second WRITE statement is executed, the
file can be represented as

   -----
   ABAB
 > -----
   null

After the REWIND statement is executed, the file
can be represented as

 > -----
   ABAB
   -----
   endfile

After the second READ statement is executed, the
file can be represented as

   -----
   ABAB
 > -----
   endfile

After the CLOSE statement is executed, the file
no longer exists.
Reply robert.corbett (96) 3/3/2012 7:44:06 AM


At the risk of side-tracking this thread (and bringing the experts down 
on my head like a ton of bricks), I'd like to ask whether there is still 
any purpose in the Fortran standards of having the notion of endfile 
records?

I can just about remember when endfile records were actually written to 
magnetic tapes, and one had to cope with their existence.   But that was 
ages ago.  As far as I can see no modern file system actually writes an 
endfile record to a file on disc.  So the elaborate set of rules that 
Fortran has developed to describe the ENDFILE statement and the handling 
of (now virtual) endfile records is just an attempt to get around the 
inconvenient fact that they don't really exist.

What seems to happen in practice is that when you reach the end of a 
sequential (or I guess stream) file, the I/O system detects this, and 
any attempt to read further triggers the Fortran end-of-file condition, 
which a READ statement can detect.  There are some awkward cases 
concerning reading the last line of a text file where it does not have 
an end-of-record marker (like CR/LF), but it doesn't seem to me that the 
virtual endfile record is much help here.

In fact I cannot think of any situation in which the existence of 
virtual endfile record allows any behaviour in a Fortran system which 
would be different if we simply admitted that, like Father Christmas, 
endfile doesn't really exist.  Or have we got ourselves in the situation 
where they can't be abolished because this would alter the behaviour of 
existing code?


-- 
Clive Page
Reply usenet1820 (74) 3/3/2012 8:50:21 AM

Clive Page <usenet@page2.eu> wrote:
> At the risk of side-tracking this thread (and bringing the experts down 
> on my head like a ton of bricks), I'd like to ask whether there is still 
> any purpose in the Fortran standards of having the notion of endfile 
> records?

> I can just about remember when endfile records were actually written to 
> magnetic tapes, and one had to cope with their existence.   But that was 
> ages ago.  As far as I can see no modern file system actually writes an 
> endfile record to a file on disc.  

The file system used by z/OS, descended from OS/360, writes data
to disk pretty much the same as to tape, including an EOF mark.
(The distinction is important in partitioned data sets (PDSs),
and is noticeable if you open a data set (IBM speak for file),
never write an EOF, and start reading.)

> So the elaborate set of rules that Fortran has developed to 
> describe the ENDFILE statement and the handling of (now virtual) 
> endfile records is just an attempt to get around the 
> inconvenient fact that they don't really exist.

Despite my comment above, I probably agree. Years ago I wrote 
a program that wrote an unformatted file and, to be sure, used
ENDFILE when done. (There was no CLOSE in that compiler.)

> What seems to happen in practice is that when you reach the end of a 
> sequential (or I guess stream) file, the I/O system detects this, and 
> any attempt to read further triggers the Fortran end-of-file condition, 
> which a READ statement can detect.  There are some awkward cases 
> concerning reading the last line of a text file where it does not have 
> an end-of-record marker (like CR/LF), but it doesn't seem to me that the 
> virtual endfile record is much help here.

Some of those awkward cases are still there even without CRLF.

One has to do with when EOF is detected. Note that Fortran gives
no indication of EOF until you actually attempt to read past it,
where PASCAL requires one to test for EOF before reading.

> In fact I cannot think of any situation in which the existence of 
> virtual endfile record allows any behaviour in a Fortran system which 
> would be different if we simply admitted that, like Father Christmas, 
> endfile doesn't really exist.  Or have we got ourselves in the 
> situation where they can't be abolished because this would alter 
> the behaviour of existing code?

As far as I know, the latter. Even for OS/360, all the program sees
is a virtual EOF. User programs don't control tape drives at the
level that they could read or write a tape mark. OS/360 makes
it especially difficult for a Fortran program to write multiple
files onto a tape. Specifically, you must supply a separate DD
statement for each file, with sequentially increasing DD names.

For unix, reading and writing a multiple file tape, one has
to keep track of which side of the tape mark you are on.
The tape mark may or may not be physical, but in either case
the device driver keeps track of the tape position as if it were.
As you read a tape, after you have read the last record but
before you detect EOF, you are before the tape mark (virtual
or not). After you attempt to read more and the read(2) system
call returns 0 bytes, you are positioned after the tape mark.

I believe the OS/360 disks write a physical record with length
zero as the EOF mark. For 9-track tape, a special mark is written
that can be identified by the drive with the tape moving
at high speed. 

I suppose, though, that the documentation could be changed to
indicate that ENDFILE is more virtual than physical.

-- glen
Reply gah (12253) 3/3/2012 9:36:27 AM

On 3/3/2012 2:50 AM, Clive Page wrote:
> At the risk of side-tracking this thread (and bringing the experts down
> on my head like a ton of bricks), I'd like to ask whether there is still
> any purpose in the Fortran standards of having the notion of endfile
> records?
>
> I can just about remember when endfile records were actually written to
> magnetic tapes, and one had to cope with their existence. But that was
> ages ago. As far as I can see no modern file system actually writes an
> endfile record to a file on disc. So the elaborate set of rules that
> Fortran has developed to describe the ENDFILE statement and the handling
> of (now virtual) endfile records is just an attempt to get around the
> inconvenient fact that they don't really exist.
>
> What seems to happen in practice is that when you reach the end of a
> sequential (or I guess stream) file, the I/O system detects this, and
> any attempt to read further triggers the Fortran end-of-file condition,
> which a READ statement can detect. There are some awkward cases
> concerning reading the last line of a text file where it does not have
> an end-of-record marker (like CR/LF), but it doesn't seem to me that the
> virtual endfile record is much help here.
>
> In fact I cannot think of any situation in which the existence of
> virtual endfile record allows any behaviour in a Fortran system which
> would be different if we simply admitted that, like Father Christmas,
> endfile doesn't really exist. Or have we got ourselves in the situation
> where they can't be abolished because this would alter the behaviour of
> existing code?
>
If it is acceptable to ask whether endfile records have any purpose, it 
should also be acceptable to ask whether the very notion of a record has 
any purpose today. Let me state the "against" reasons.

If we look at the most popular operating systems, file systems and 
storage devices used today, the notion of a file as being made up of 
records may appear to be as natural as would wearing a whalebone corset 
to go jogging. We do not normally talk in terms of C-H-R-N (cylinder, 
head, sector, count), even though it may be a better match with currently 
used storage devices.

Fortran I/O is a difficult topic for beginners, to a large extent 
because of the artificially imposed distinctions between formatted and 
unformatted files, and having to simulate record-oriented I/O on devices 
and file systems that do not employ records.

I do not think that this discussion will go anywhere, however. The 
amount of inertia is too much to overcome.

-- mecej4

Reply mecej46801 (41) 3/3/2012 6:44:04 PM

mecej4 <mecej4@NOSPAM.operamail.com> wrote:

> If it is acceptable to ask whether endfile records have any purpose, it
> should also be acceptable to ask whether the very notion of a record has
> any purpose today. Let me state the "against" reasons.
> 
> If we look at the most popular operating systems, file systems and 
> storage devices used today, the notion of a file as being made up of 
> records may appear to be as natural as would wearing a whalebone corset
> to go jogging. We do not normally talk in terms of C-H-R-N (cylinder,
> head, sector, count), even though it may be a better match with currently
> used storage devices.

You are talking only about implementation details. That is not the only
driver. There are times when records are exactly what make sense for an
application. Doesn't have anything to do with the implementation, but
with the application. There are plenty of applications where, if the
language didn't do it for you, the application would have to build its
own record support.

I agree that there are also applications where that's not the case. I
think it is useful to have both options.

> Fortran I/O is a difficult topic for beginners, to a large extent 
> because of the artificially imposed distinctions between formatted and
> unformatted files, and having to simulate record-oriented I/O on devices
> and file systems that do not employ records.

So don't do record-oriented for applications that don't want it. That
doesn't sound like a reason to take it out of the language. Then you
just push simulating record I/O onto the application instead of the
language. That doesn't sound like an improvement to me.

Your argument sounds addressed at older versions of the language, where
record-oriented was the only choice. I strongly agree that there was a
need to provide an alternative. I was the one who wrote the formal
proposal to add stream I/O to the language. I'm even somewhat of the
opinion that much I/O would be better off switching to the stream model.

But now that we do have stream as an option, it sounds like you are
suggesting that we remove the option and make it mandatory. No, I don't
buy that at all. Even ignoring the way that it would break almost every
code in existence, I don't see it as a good thing to take away that
option.

And I 100% disagree that formatted versus unformatted is an artificial
distinction. I think it is an important, fundamental distinction that
will not go away in the foreseeable future.

Formatted is what you need for human consumption turned into character
form. I expect there to be humans around for a while yet, and I think we
have several revisions of the standard to go before we need to consider
humans that find it natural to get their input as a raw bit stream
instead of in words and numbers built up from characters. Letter-based
human languages certainly aren't the only possible form of
communication, but they have been with us a while, and I think they have
a while yet to go.

But character is not a good choice for all internal representations. If
you want to force all internal representations into a character-based
form, you need to be looking somewhere other than in the Fortran
standard. Go talk to the hardware folk, not just about adding decimal
float as an option, but about changing everything. I'll not hold my
breath waiting for you to come back with a "yes" answer to that.

You need a form of I/O that directly handles the internal
representation. Formatted I/O is more complicated than unformatted -
lots more complicated. My original proposal for stream didn't even
include formatted because I wanted to keep things simple. Just about all
the questions that come up about stream end up being about formatted.

-- 
Richard Maine                    | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle           |  -- Mark Twain
Reply nospam47 (9742) 3/3/2012 7:36:04 PM

mecej4 <mecej4@nospam.operamail.com> wrote:

(snip, someone wrote)
>> In fact I cannot think of any situation in which the existence of
>> virtual endfile record allows any behaviour in a Fortran system which
>> would be different if we simply admitted that, like Father Christmas,
>> endfile doesn't really exist. 

(snip)

> If it is acceptable to ask whether endfile records have any purpose, 
> it should also be acceptable to ask whether the very notion of 
> a record has any purpose today. Let me state the "against" reasons.

The idea of "logical record" is still important. You might call
it a "line" on printed output. It used to also be a "card" for
punched card input and output.

> If we look at the most popular operating systems, file systems and 
> storage devices used today, the notion of a file as being made up of 
> records may appear to be as natural as would wearing a whalebone corset 
> to go jogging. We do not normally talk in terms of C-H-R-N (cylinder, 
> head, sector, count), even though it may be a better match with currently 
> used storage devices.

When memory was more expensive, the distinction between physical
record and logical record was not needed. Keeping buffers small
was necessary. As memory became cheaper, record blocking allowed
for higher performance, and that is true independent of the 
underlying storage format.

> Fortran I/O is a difficult topic for beginners, to a large extent 
> because of the artificially imposed distinctions between formatted 
> and unformatted files, and having to simulate record-oriented 
> I/O on devices and file systems that do not employ records.

Well, records are fundamental for relational database systems.
They are logical, not physical records, but still important.
In the case of input text files, or printable output files,
record boundaries are line boundaries, a distinction that
is still useful.

> I do not think that this discussion will go anywhere, however. 
> The amount of inertia is too much to overcome.

For comparison, you can look at the PL/I output system.
There are STREAM files, which correspond to Fortran FORMATTED I/O.
Even though they were designed with a record-oriented system
in mind, they are stream oriented. In Fortran terms, 
"ADVANCE=NO" is the default, and you use SKIP to go to the
next record boundary. (Personally, I like that better than
the C system of losing a character code point.) If you are used
to either Fortran or C, it takes just a little getting used to
the difference(*).

Fairly often I/O is naturally record (line) oriented, but often
enough it is more convenient to process partial records.
Fortran ADVANCE=NO allows for that. 
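
As a sketch of what processing partial records looks like, the
program below reads one record in pieces of up to four
characters.  The program name, the chunk length, and the
scratch file are assumptions for illustration; SIZE= and the
IOSTAT_EOR constant are standard features.

```fortran
program partial_records
  use, intrinsic :: iso_fortran_env, only: iostat_eor
  implicit none
  integer :: u, ios, n
  character(len=4) :: chunk

  open (newunit=u, status='scratch', form='formatted', action='readwrite')
  write (u, '(A)') 'ABCDEFGHIJ'
  rewind (u)

  do
    ! Non-advancing read: the file stays positioned within the
    ! record unless the end of the record is reached.
    read (u, '(A)', advance='no', size=n, iostat=ios) chunk

    ! Anything other than success or end-of-record ends the loop.
    if (ios /= 0 .and. ios /= iostat_eor) exit

    ! n is the number of characters actually transferred.
    if (n > 0) print '(3A)', '[', chunk(1:n), ']'

    ! End-of-record condition: this record is finished.
    if (ios == iostat_eor) exit
  end do

  close (u)
end program partial_records
```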

The PL/I alternative is RECORD I/O, which corresponds to
Fortran UNFORMATTED. No conversion between internal representation
and human readable representation is done, and I/O is one record
per execution of a READ or WRITE statement. (I have passed files
between Fortran and PL/I this way.) On some systems, you might
be able to do character string I/O this way, to and from readable
text files, though that isn't the way it was meant to be used.

(*) PL/I has DATA directed, LIST directed, and EDIT directed 
STREAM I/O. DATA directed is similar to Fortran NAMELIST, and
EDIT directed is similar to explicit FORMAT. Unlike Fortran,
though, I/O stops after the last list item is transferred.
To generate a new record, a following PUT SKIP is required.

PUT EDIT(I)(F(10),SKIP);  

won't process the SKIP, though Fortran

      PRINT '(I10/)',I

does process the /.

-- glen
Reply gah (12253) 3/3/2012 8:15:49 PM

In article <jitopm$815$1@dont-email.me>,
 mecej4 <mecej4@NOSPAM.operamail.com> wrote:

> If it is acceptable to ask whether endfile records have any purpose, it 
> should also be acceptable to ask whether the very notion of a record has 
> any purpose today. Let me state the "against" reasons.

I guess you are asking if record-addressable direct access files are 
useful, if backspace is useful, and if

   read(unit)

with no I/O list is useful to skip over existing records.  To me 
they are.

If these capabilities were not built naturally into the language 
through the concept of records, then user-level programs would 
almost certainly need to be written in order to keep the same 
functionality.
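
For readers who have not used these features, here is a minimal
sketch of skipping a record with an empty READ and re-reading a
record after BACKSPACE.  The program name, the data values, and
the scratch file are assumptions for illustration.

```fortran
program skip_records
  implicit none
  integer :: u, i
  real :: x(3), first

  open (newunit=u, status='scratch', form='unformatted', action='readwrite')

  ! Write three unformatted records of three values each.
  do i = 1, 3
    write (u) real(i), real(10*i), real(100*i)
  end do

  rewind (u)
  read (u)          ! no I/O list: skips over record 1
  read (u) x        ! reads all of record 2

  backspace (u)     ! reposition before record 2
  read (u) first    ! a shorter list reads only the leading value

  print '(A,3F8.1)', 'record 2:', x
  print '(A,F8.1)', 'first value again:', first
  close (u)
end program skip_records
```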

For a significant period of time in the 1990s and early 2000s, record 
oriented I/O was a very nice feature indeed.  It allowed users to 
access files that were longer than 4GB in length, despite the 
inherent limitations of 32-bit integer addressing.  In a 
byte-addressable low-level i/o system, this would not have been 
possible.  However, now that 64-bit OS and compilers are available, 
this might not be such an obvious advantage.

$.02 -Ron Shepard
Reply ron-shepard (1197) 3/4/2012 12:53:50 AM

On Mar 3, 10:44 am, mecej4 <mec...@NOSPAM.operamail.com> wrote:
> On 3/3/2012 2:50 AM, Clive Page wrote:
>
>
>
> > At the risk of side-tracking this thread (and bringing the experts down
> > on my head like a ton of bricks), I'd like to ask whether there is still
> > any purpose in the Fortran standards of having the notion of endfile
> > records?
>
> > I can just about remember when endfile records were actually written to
> > magnetic tapes, and one had to cope with their existence. But that was
> > ages ago. As far as I can see no modern file system actually writes an
> > endfile record to a file on disc. So the elaborate set of rules that
> > Fortran has developed to describe the ENDFILE statement and the handling
> > of (now virtual) endfile records is just an attempt to get around the
> > inconvenient fact that they don't really exist.
>
> > What seems to happen in practice is that when you reach the end of a
> > sequential (or I guess stream) file, the I/O system detects this, and
> > any attempt to read further triggers the Fortran end-of-file condition,
> > which a READ statement can detect. There are some awkward cases
> > concerning reading the last line of a text file where it does not have
> > an end-of-record marker (like CR/LF), but it doesn't seem to me that the
> > virtual endfile record is much help here.
>
> > In fact I cannot think of any situation in which the existence of
> > virtual endfile record allows any behaviour in a Fortran system which
> > would be different if we simply admitted that, like Father Christmas,
> > endfile doesn't really exist. Or have we got ourselves in the situation
> > where they can't be abolished because this would alter the behaviour of
> > existing code?
>
> If it is acceptable to ask whether endfile records have any purpose, it
> should also be acceptable to ask whether the very notion of a record has
> any purpose today. Let me state the "against" reasons.
>
> If we look at the most popular operating systems, file systems and
> storage devices used today, the notion of a file as being made up of
> records may appear to be as natural as would wearing a whalebone corset
> to go jogging. We do not normally talk in terms of C-H-R-N (cylinder,
> head, sector, count), even though it may be a better match with currently
> used storage devices.
>
> Fortran I/O is a difficult topic for beginners, to a large extent
> because of the artificially imposed distinctions between formatted and
> unformatted files, and having to simulate record-oriented I/O on devices
> and file systems that do not employ records.
>
> I do not think that this discussion will go anywhere, however. The
> amount of inertia is too much to overcome.

Stream access was added to address this issue.  It is a better fit to
the idea of files as byte streams than the earlier access methods.
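
A minimal sketch of unformatted stream access follows.  The
program name and the scratch file are assumptions for
illustration.  Positions are counted in file storage units,
which are bytes on most processors, but that is
processor-dependent.

```fortran
program stream_demo
  implicit none
  integer :: u, p
  character(len=4) :: tag

  open (newunit=u, status='scratch', access='stream', &
        form='unformatted', action='readwrite')

  ! No record structure is added; only the eight characters are written.
  write (u) 'ABCDEFGH'

  ! POS= repositions the file; here to the fifth file storage unit.
  read (u, pos=5) tag

  ! INQUIRE reports the position of the next storage unit to be read.
  inquire (unit=u, pos=p)

  print '(2A)', 'tag = ', tag
  print '(A,I0)', 'next position = ', p
  close (u)
end program stream_demo
```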

I recall how horrid the idea of byte streams seemed when I first
encountered them.  Before I used UNIX, I used machines that had
nice record management systems (and some that had bad record
management systems).  The power of those systems was amazing.
Records could contain any sequence of bits.  Binary data could be
written and read back under an A edit descriptor without
corrupting the data.  Lots of programs did that, and they stopped
working when they were moved to UNIX.  Replacing a record in the
middle of a file with a record of a different size was a trivial
operation, which did not alter the other records in the file.
Direct-access files could be sparse over the index set, which did
not have to be integer.  The record management system provided
methods for traversing the records in the file.

I believe that the internet killed good record management systems.
Different record management systems for different machines were
incompatible.  That did not matter much when machines did not
try to converse with other machines.  When file exchange between
different machines became common, the formats of those files had
to be made simple.  Byte streams were about as simple as they
could be.

Neither records nor byte streams are a good fit to most hardware
peripherals.  I/O subsystems go to a lot of trouble to mask the
mismatch, usually not entirely effectively.  Some software I/O,
such as pipes, is well suited for byte-stream I/O.  The addition
of stream access and the ACTION= specifier makes it much easier
to deal with pipes directly from Fortran.

Bob Corbett

Reply robert.corbett (96) 3/4/2012 8:47:51 AM

robert.corbett@oracle.com wrote:

(snip)

> Neither records nor byte streams are a good fit to most hardware
> peripherals.  I/O subsystems go to a lot of trouble to mask the
> mismatch, usually not entirely effectively.  

The IBM S/360 hardware goes to a lot of work to keep records
visible. That was especially useful when core was expensive,
and main memories small. Programs could use small buffers,
with I/O direct from device to buffer. As core got cheaper,
blocked records were used to increase efficiency.

As memory got larger and cheaper, and processors faster,
the unix fixed-block 512-byte disk, with a large
disk cache in memory, makes more sense, though the larger
blocks of S/360 style disks are still useful.

One S/360 feature, though, rapidly became useless. 
The disk drives can do a hardware search for a record key
(or block for blocked records). As processors got faster,
the disk search speed didn't increase much at all. Hash tables
in memory are much better.

> Some software I/O,
> such as pipes, is well suited for byte-stream I/O.  The addition
> of stream access and the ACTION= specifier makes it much easier
> to deal with pipes directly from Fortran.

Now, how about something like the unix popen() for pipe
access from Fortran? I once had one that would work with
HP-UX, but it wasn't portable to other unix variants.

-- glen
Reply gah (12253) 3/4/2012 12:53:49 PM

On 3/4/2012 2:47 AM, robert.corbett@oracle.com wrote:
> I believe that the internet killed good record management systems.

That was probably necessary to make services such as FTP work. However, 
I believe that the killing started earlier. For example, in the early 
1970s I used a CDC-6400 system which supported timesharing under KRONOS. 
Since I/O was done remotely using Teletype terminals, it appeared to the 
user as if program source was in ASCII. I believe that the files were 
stored at the remote central site on disk drives in the CDC character 
set, and converted behind the scenes to and from ASCII. A user could 
function on this system without knowing a file to be  anything other 
than a byte stream.

A few months ago we saw a series of posts in CLF by someone who wanted 
to process FCC data. If I remember correctly, the problem was that the 
original files were RMS files on VAX/VMS. Each RMS file when downloaded 
by FTP using a utility resulted in two files: one file with the data, 
and another file with the metadata. The poster did not quite realize 
that the first file could not be correctly read without using the 
information in the second file.

-- mecej4
Reply mecej46801 (41) 3/4/2012 1:43:23 PM

On 2012-03-03, Clive Page <usenet@page2.eu> wrote:
> At the risk of side-tracking this thread (and bringing the experts down 
> on my head like a ton of bricks), I'd like to ask whether there is still 
> any purpose in the Fortran standards of having the notion of endfile 
> records?

Backwards compatibility.

> I can just about remember when endfile records were actually written to 
> magnetic tapes, and one had to cope with their existence.   But that was 
> ages ago.  As far as I can see no modern file system actually writes an 
> endfile record to a file on disc.  So the elaborate set of rules that 
> Fortran has developed to describe the ENDFILE statement and the handling 
> of (now virtual) endfile records is just an attempt to get around the 
> inconvenient fact that they don't really exist.
>
> What seems to happen in practice is that when you reach the end of a 
> sequential (or I guess stream) file, the I/O system detects this, and 
> any attempt to read further triggers the Fortran end-of-file condition, 
> which a READ statement can detect.  There are some awkward cases 
> concerning reading the last line of a text file where it does not have 
> an end-of-record marker (like CR/LF), but it doesn't seem to me that the 
> virtual endfile record is much help here.
>
> In fact I cannot think of any situation in which the existence of 
> virtual endfile record allows any behaviour in a Fortran system which 
> would be different if we simply admitted that, like Father Christmas, 
> endfile doesn't really exist.  Or have we got ourselves in the situation 
> where they can't be abolished because this would alter the behaviour of 
> existing code?

Unfortunately, yes. I'm not sure how much real code out there
would break, but it's easy to construct test cases that would behave
differently if the endfile record didn't exist.

Consider if we have read all "real" data in the file, and the file is
thus now positioned just before the endfile record. Now, if you try
one more READ statement, the endfile record will be read, and an
end-of-file condition will occur. Now the file is positioned just
AFTER the endfile record. If you try yet further READ statements at
this time, you won't get end-of-file conditions but rather error
conditions.

If you abolished endfile records, you'd keep getting end-of-file
conditions instead of error conditions, and thus the behavior of
existing standard-conforming code would change.

For more info, see F2008 9.11.
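
A test case of the kind described above might look like the
following sketch.  The program name and the scratch file are
assumptions for illustration.  What the reads after the first
end-of-file condition report depends on the processor: by the
rules above they should raise an error condition (positive
IOSTAT), but some processors may keep reporting end-of-file.

```fortran
program eof_twice
  use, intrinsic :: iso_fortran_env, only: iostat_end
  implicit none
  integer :: u, ios, i
  character(len=16) :: line

  open (newunit=u, status='scratch', form='formatted', action='readwrite')
  write (u, '(A)') 'first'
  write (u, '(A)') 'second'
  rewind (u)    ! an endfile record is written before repositioning

  ! Two reads for the data, one that reads the endfile record,
  ! and one more with the file positioned after the endfile record.
  do i = 1, 4
    read (u, '(A)', iostat=ios) line
    if (ios == 0) then
      print '(A,I0,2A)', 'read ', i, ': record = ', trim(line)
    else if (ios == iostat_end) then
      print '(A,I0,A)', 'read ', i, ': end-of-file condition'
    else
      print '(A,I0,A,I0)', 'read ', i, ': error condition, iostat = ', ios
    end if
  end do

  close (u)
end program eof_twice
```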

Similar considerations apply e.g. for the BACKSPACE statement, in that
if the file is positioned after the endfile record (e.g. a previous
read signaled an end-of-file condition), then executing a backspace
statement will only back over the endfile record, leaving the file
position between the last "real" record and the endfile record. Again,
if the concept of an endfile record were removed, this behavior
would change.
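
The BACKSPACE case is the basis of a common idiom for appending
to a sequential file, sketched below.  The program name and the
scratch file are assumptions for illustration.

```fortran
program backspace_over_endfile
  implicit none
  integer :: u, ios
  character(len=16) :: line

  open (newunit=u, status='scratch', form='formatted', action='readwrite')
  write (u, '(A)') 'first'
  rewind (u)

  ! Read until the end-of-file condition occurs; the file is then
  ! positioned after the endfile record.
  do
    read (u, '(A)', iostat=ios) line
    if (ios /= 0) exit
  end do

  ! Back over the endfile record only, so the next WRITE appends
  ! after the last data record.
  backspace (u)
  write (u, '(A)') 'second'

  ! Read the file back to show both records.
  rewind (u)
  do
    read (u, '(A)', iostat=ios) line
    if (ios /= 0) exit
    print '(A)', trim(line)
  end do

  close (u)
end program backspace_over_endfile
```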

-- 
JB
Reply foo33 (1360) 3/5/2012 9:38:59 AM

On 2012-03-04, Ron Shepard <ron-shepard@NOSPAM.comcast.net> wrote:
> In article <jitopm$815$1@dont-email.me>,
>  mecej4 <mecej4@NOSPAM.operamail.com> wrote:
>
>> If it is acceptable to ask whether endfile records have any purpose, it 
>> should also be acceptable to ask whether the very notion of a record has 
>> any purpose today. Let me state the "against" reasons.
>
> I guess you are asking if record-addressable direct access files are 
> useful, if backspace is useful, and if
>
>    read(unit)
>
> with no I/O list is useful to skip over existing records.  To me 
> they are.
>
> If these capabilities were not built naturally into the language 
> through the concept of records, then user-level programs would 
> almost certainly need to be written in order to keep the same 
> functionality.

Of course record-oriented IO is useful in some cases. The real
question, IMHO, is whether it's general-purpose enough to be worth
including at the language level. Naturally, for Fortran this is all
water under the bridge due to backwards compatibility requirements,
but the question stands for a hypothetical new language starting from
a blank sheet. It seems telling that the vast majority of languages
designed in the past 40+ years have gone for a simple stream IO model,
leaving various forms of record IO to libraries.

IMHO, for most simple use cases, stream IO is all that is required,
and for more complicated ones the relatively simple record-oriented IO
supported by Fortran isn't really enough either, so they have to roll
their own anyway. Consider, for instance, a database library such as
Berkeley DB or SQLite: you want to support variable-sized records, so
direct access is out. OTOH, you need fast (indexed) access to
arbitrary records, so sequential access is out as well. That leaves
you rolling your own record system on top of the basic stream IO
interface or, worse, having to implement your own record system on top
of the direct or sequential record systems.

> For a significant period of time in the 90's and early 2000's, record 
> oriented I/O was a very nice feature indeed.  It allowed users to 
> access files that were longer than 4GB in length, despite the 
> inherent limitations of 32-bit integer addressing.  In a 
> byte-addressable low-level i/o system, this would not have been 
> possible.  However, now that 64-bit OS and compilers are available, 
> this might not be such an obvious advantage.

Huh, there are plenty of 32-bit OSes that are perfectly capable of
handling files larger than 4 GB using byte offsets. They just use
64-bit integers for the offset, either native 64-bit integers, or by
emulating them.
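
For what it's worth, Fortran 2003 stream access exposes the same idea.
A minimal sketch, assuming a small scratch file with made-up contents:
the POS= specifier takes an integer of any kind, so a 64-bit kind such
as int64 can carry the offset. How large a file can actually get still
depends on the OS and the runtime library.

```fortran
program stream_pos_demo
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer :: u
  integer(int64) :: pos
  character(len=5) :: buf

  ! Unformatted stream file: no record structure at all.
  open(newunit=u, status='scratch', access='stream', form='unformatted')
  write(u) 'ABCDEFGHIJ'

  ! POS= is a 1-based offset in file storage units (usually bytes).
  pos = 6_int64
  read(u, pos=pos) buf
  print '(a)', buf

  close(u)
end program stream_pos_demo
```

The offset here is tiny, but nothing in the statement ties it to the
default integer kind.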


-- 
JB
Reply foo33 (1360) 3/5/2012 10:34:25 AM

On 3/5/2012 4:34 AM, JB wrote:
> for most simple use cases, stream IO is all that is required,
> and for more complicated ones the relatively simple record-oriented IO
> supported by Fortran isn't really enough either, so they have to roll
> their own anyway.

That is a compact and clear statement of what I had in mind. Except for 
Direct Access files, many of the I/O features of Fortran appear to best 
fit the characteristics of tape I/O.

Similar issues occur with regard to matching Fortran declaration 
statements with different types of RAM. We had Extended Core on CDC 
machines, we had EMM on MSDOS/Windows for a few years, and we have 
several levels of cache memory now.

Extended Core had to be accessed with special lines of code; in the 
main, Fortran has not needed any but the simplest model of memory. Can 
we extend the same simplicity to file I/O? If not, what is the I/O model 
that is as simple as can be for the 99 percenters? That is the challenge.

-- mecej4
Reply mecej46801 (41) 3/5/2012 1:53:29 PM

mecej4 <mecej4@nospam.operamail.com> wrote:
> On 3/5/2012 4:34 AM, JB wrote:
>> for most simple use cases, stream IO is all that is required,
>> and for more complicated ones the relatively simple record-oriented IO
>> supported by Fortran isn't really enough either, so they have to roll
>> their own anyway.

> That is a compact and clear statement of what I had in mind. Except for 
> Direct Access files, many of the I/O features of Fortran appear to best 
> fit the characteristics of tape I/O.

For a reasonable fraction of programs, the traditional record I/O
along with the Fortran list-directed and NAMELIST are fine.

I suppose there is one more thing that could be done: 
Add an OPEN option that makes ADVANCE=NO the default, unless
an ADVANCE=YES is included on the I/O statement. That, then,
is similar to Java's println, C's printf("\n") and PL/I PUT SKIP.
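
For comparison, this is roughly what has to be written today, with
ADVANCE='NO' spelled out on each statement. It is only a sketch; the
program name and values are made up.

```fortran
program advance_no_demo
  implicit none
  integer :: i

  ! Each non-advancing WRITE leaves the position inside the current
  ! record, much like printf without a trailing "\n".
  do i = 1, 3
    write(*, '(i0,a)', advance='no') i, ' '
  end do

  ! An ordinary (advancing) WRITE then completes the record.
  write(*, '(a)') 'done'
end program advance_no_demo
```

With the proposed OPEN option, the ADVANCE='NO' on each WRITE would
become the default and only the last statement would need an explicit
ADVANCE='YES'.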

> Similar issues occur with regard to matching Fortran declaration 
> statements with different types of RAM. We had Extended Core on CDC 
> machines, we had EMM on MSDOS/Windows for a few years, and we have 
> several levels of cache memory now.

and HIARCHY for OS/360.

-- glen
Reply gah (12253) 3/5/2012 2:32:10 PM

In article <slrnjl95lh.27s.foo@hugo.hut.fi>, JB <foo@bar.invalid> 
wrote:

> IMHO, for most simple use cases, stream IO is all that is required,
> and for more complicated ones the relatively simple record-oriented IO
> supported by Fortran isn't really enough either, so they have to roll
> their own anyway. Consider, for instance, a database library such as
> Berkeley DB or SQLite: you want to support variable-sized records, so
> direct access is out. OTOH, you need fast (indexed) access to
> arbitrary records, so sequential access is out as well. That leaves
> you rolling your own record system on top of the basic stream IO
> interface or, worse, having to implement your own record system on top
> of the direct or sequential record systems.

This was a major issue in the discussions in the 1980's regarding 
the next revision of the language.  In addition to features such as 
derived types and array syntax, which did make it into f90, there 
were two groups of people wanting to revise fortran I/O.  One group 
wanted to make all I/O look like a simple stream of bytes modeled 
after C I/O.  The other group wanted to transform fortran I/O into a 
high-level database access language.  In the end, neither faction 
won their argument, and fortran I/O remained largely the same as it 
was in f77.  Namelist I/O was added in a standardized form in f90, 
but other important features such as asynchronous I/O and internal 
list-directed I/O had to wait another decade to be added.  
Meanwhile, stream I/O was standardized separately in the fortran 
POSIX standard in 1991, but the high-level database programming 
models were never standardized, not by ISO, not by ANSI, and not by 
POSIX, so that issue remains a mess in fortran.  And the POSIX 
standard was not as popular as it should have been during the decade 
of the 90's, IMO.

> 
> > For a significant period of time in the 90's and early 2000's, record 
> > oriented I/O was a very nice feature indeed.  It allowed users to 
> > access files that were longer than 4GB in length, despite the 
> > inherent limitations of 32-bit integer addressing.  In a 
> > byte-addressable low-level i/o system, this would not have been 
> > possible.  However, now that 64-bit OS and compilers are available, 
> > this might not be such an obvious advantage.
> 
> Huh, there are plenty of 32-bit OSes that are perfectly capable of
> handling files larger than 4 GB using byte offsets. They just use
> 64-bit integers for the offset, either native 64-bit integers, or by
> emulating them.

Now perhaps, but not in the 1990's.  This was a major issue back 
then as disks were becoming larger and cheaper and as RAID 
functionality became available.  There were several fortran+hardware 
combinations that allowed I/O on large files while access in other 
languages was only through vendor-specific libraries.  The 
limitation in fortran usually was that each record had to be less 
than 2GB in length (or sometimes 4GB, or sometimes even a smaller 
limit), but the record access was through the default integer data 
type in the standard fortran way.

$.02 -Ron Shepard
Reply ron-shepard (1197) 3/5/2012 5:38:39 PM

On Sunday, March 4, 2012 5:43:23 AM UTC-8, mecej4 wrote:
[...]
 
> A few months ago we saw a series of posts in CLF by someone who wanted 
> to process FCC data. If I remember correctly, the problem was that the 
> original files were RMS files on VAX/VMS. Each RMS file when downloaded 
> by FTP using a utility resulted in two files: one file with the data, 
> and another file with the metadata. The poster did not quite realize 
> that the first file could not be correctly read without using the 
> information in the second file.

VMS supports a rich variety of record formats. :^) An RMS indexed
file is essentially a (poor man's, bare-bones) database.  Seems
to me risky business to think one can blindly FTP any database
file and expect to retain its integrity...although possibly higher
probability of success if resident on a unix-like file system.

I'd agree with Bob Corbett's comment that the internet had a
huge impact on record oriented file systems.  It was pretty 
difficult to deal with LF-delimited stream files on VMS during 
the 80's.  By the early 90's, VMS (RMS) fully supported
LF-delimited (unix), CR-delimited (Mac) and CRLF-delimited (DOS)
files, and conversions between them as well as native VMS 
record structures.

   -Ken
Reply Ken.Fairfield (491) 3/6/2012 10:25:46 PM
