Is there a portable way to detect the end of a direct access file
while at the same time distinguishing it from other kinds of read
errors? I can recall discussions from long ago about whether
reading past the end of a direct access file should be an error
condition or an end-of-file condition. As I recall, some F77
compilers would use a negative IOSTAT value when reading past the
end of a direct access file, just like for sequential access files,
while other compilers would use a positive value. For one specific
application, I would like to detect the end-of-file condition so
that a very specific action can be taken, one that I would not want
to take if other kinds of read errors occur. With CVF 6.6c, I'm
having it look for an IOSTAT value of 36, but that's not portable.
Is there a more portable way, possibly with some new intrinsic
introduced in later versions of the standard?
tholen | 3/27/2011 3:12:52 AM
On 27 mrt, 05:12, tho...@antispam.ham wrote:
> Is there a portable way to detect the end of a direct access file
> while at the same time distinguishing it from other kinds of read
> errors? I can recall discussions from long ago about whether
> reading past the end of a direct access file should be an error
> condition or an end-of-file condition. As I recall, some F77
> compilers would use a negative IOSTAT value when reading past the
> end of a direct access file, just like for sequential access files,
> while other compilers would use a positive value. For one specific
> application, I would like to detect the end-of-file condition so
> that a very specific action can be taken, one that I would not want
> to take if other kinds of read errors occur. With CVF 6.6c, I'm
> having it look for an IOSTAT value of 36, but that's not portable.
> Is there a more portable way, possibly with some new intrinsic
> introduced in later versions of the standard?
I am not sure there is - F2008 might define one - but why not do
the following:
1. Write a new (temporary) direct-access file with one or two records
2. Close it and open it, to mimic the situation you have as close
as possible
3. Read record no. 2 or 3 - just beyond the end of the file and
register the error code.
4. Remove the temporary file
5. Use this empirically found error - you should be independent
of compiler-specific values now.
Regards,
Arjen
Arjen | 3/28/2011 10:01:30 AM
Arjen Markus <arjen.markus895@gmail.com> wrote:
> On 27 mrt, 05:12, tho...@antispam.ham wrote:
> > Is there a portable way to detect the end of a direct access file
> > while at the same time distinguishing it from other kinds of read
> > errors?
No, there is not. In fact, there is no portable way to guarantee that
such a read will even get an error at all. It might just succeed and
return bogus data. It is easy to imagine implementations and situations
where that would happen; I believe such might even have existed.
(Picture implementations that allocate such files in blocks that could
be bigger than the record size).
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam | 3/28/2011 3:17:31 PM
tholen@antispam.ham wrote:
> Is there a portable way to detect the end of a direct access file
> while at the same time distinguishing it from other kinds of read
> errors? I can recall discussions from long ago about whether
> reading past the end of a direct access file should be an error
> condition or an end-of-file condition.
The usual way I have seen is to write the first record as a
special record containing, among other things, the number of records in
the file and, more generally, any other file-specific information
the reader would need to know. Then update it as appropriate.
That is pretty much what most file systems do to keep track
of how many blocks there are (though often not in the first
record).
You might also need to keep a free list of blocks that are available
but not yet used.
-- glen
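A minimal sketch of the header-record approach, written in Python only to show the file layout (in Fortran the count would live in the record read with REC=1). The record length RECLEN, the 8-byte little-endian count, and all function names are assumptions made for illustration; none of them come from the thread.

```python
import os
import struct
import tempfile

RECLEN = 16                    # assumed fixed record length in bytes
HEADER = struct.Struct("<q")   # assumed header layout: one 8-byte record count


def create(path):
    """Create the file with a header record that says "zero data records"."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(0).ljust(RECLEN, b"\0"))


def record_count(f):
    """Return the data-record count stored in the header record."""
    f.seek(0)
    return HEADER.unpack(f.read(HEADER.size))[0]


def append_record(path, payload):
    """Write payload as the next data record, then update the header."""
    if len(payload) > RECLEN:
        raise ValueError("payload longer than the record length")
    with open(path, "r+b") as f:
        n = record_count(f)
        f.seek((n + 1) * RECLEN)              # slot 0 is the header
        f.write(payload.ljust(RECLEN, b"\0"))
        f.seek(0)
        f.write(HEADER.pack(n + 1))


def read_record(path, k):
    """Return data record k (1-based), or None if k is past the recorded end."""
    with open(path, "rb") as f:
        if not 1 <= k <= record_count(f):
            return None                       # decided by the header, not by an I/O status
        f.seek(k * RECLEN)
        return f.read(RECLEN)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "da.bin")
        create(path)
        append_record(path, b"first")
        append_record(path, b"second")
        print(read_record(path, 2), read_record(path, 3))
```

The point of the design is that "past the end" is decided by data the program itself wrote, so it does not depend on which status value a given compiler or OS reports for an out-of-range read.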
glen | 3/28/2011 6:20:20 PM
In article <imqjh4$i8o$1@dont-email.me>,
glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
> tholen@antispam.ham wrote:
>
> > Is there a portable way to detect the end of a direct access file
> > while at the same time distinguishing it from other kinds of read
> > errors? I can recall discussions from long ago about whether
> > reading past the end of a direct access file should be an error
> > condition or an end-of-file condition.
>
> The usual way I have seen is to write the first record as a
> special record with, among others, the number of records in
> the file. More generally, with other file-specific information
> the reader would need to know. Then update as appropriate.
>
> That is pretty much what most file systems do to keep track
> of how many blocks there are, (though often not in the first
> record.)
>
> You might also need to keep a free list, of available but not
> used yet blocks.
It isn't really clear what an end of file would mean for a direct
access file is it? The records can be written or read in any order,
so the only thing that can really happen is that you try to read a
record that has not yet been written. But there could be records
after that one that have been created, so should that be an end of
file condition? Should it be treated differently to read an
uncreated record in the middle of the file somewhere than to try to
read an uncreated record after the last written record?
Maybe on some file systems, or with particular implementations of
direct access it might make sense, but it seems that within the
fortran model itself, it isn't clear what exactly an end of file
means in this case.
Is there anything new in the most recent Fortran standards that
clarifies this issue?
$.02 -Ron Shepard
ron-shepard | 3/29/2011 6:02:17 AM
Ron Shepard <ron-shepard@nospam.comcast.net> wrote:
(after someone wrote)
>> > Is there a portable way to detect the end of a direct access file
>> > while at the same time distinguishing it from other kinds of read
>> > errors?
(snip, then I wrote)
>> The usual way I have seen is to write the first record as a
>> special record with, among others, the number of records in
>> the file. More generally, with other file-specific information
>> the reader would need to know. Then update as appropriate.
(snip)
> It isn't really clear what an end of file would mean for a direct
> access file is it? The records can be written or read in any order,
> so the only thing that can really happen is that you try to read a
> record that has not yet been written. But there could be records
> after that one that have been created, so should that be an end of
> file condition? Should it be treated differently to read an
> uncreated record in the middle of the file somewhere than to try to
> read an uncreated record after the last written record?
I believe that is what the usual unix implementations will do
with the popular unix file systems. Many allow for sparse files,
where data can be written without all the bytes (or blocks) before
that point. Only blocks with data are stored. I believe the
system keeps track of the last byte written, so there could be
a test. If you read blocks from before the last written block
you get zeros. I am not sure what unix does if you read later bytes.
> Maybe on some file systems, or with particular implementations of
> direct access it might make sense, but it seems that within the
> fortran model itself, it isn't clear what exactly an end of file
> means in this case.
For z/OS BDAM (still around from OS/360 days), when the file is
first created, all the blocks have to be written sequentially,
and in contiguous disk space. It can be extended later only if
contiguous tracks are available. There will be an error if you
try to read past EOF, but the error might not be EOF.
-- glen
gah | 3/29/2011 7:17:21 AM
On 2011-03-29, glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
> Ron Shepard <ron-shepard@nospam.comcast.net> wrote:
>
> (after someone wrote)
>>> > Is there a portable way to detect the end of a direct access file
>>> > while at the same time distinguishing it from other kinds of read
>>> > errors?
>
> (snip, then I wrote)
>>> The usual way I have seen is to write the first record as a
>>> special record with, among others, the number of records in
>>> the file. More generally, with other file-specific information
>>> the reader would need to know. Then update as appropriate.
>
> (snip)
>> It isn't really clear what an end of file would mean for a direct
>> access file is it? The records can be written or read in any order,
>> so the only thing that can really happen is that you try to read a
>> record that has not yet been written. But there could be records
>> after that one that have been created, so should that be an end of
>> file condition? Should it be treated differently to read an
>> uncreated record in the middle of the file somewhere than to try to
>> read an uncreated record after the last written record?
>
> I believe that is what the usual unix implementations will do
> with the popular unix file systems. Many allow for sparse files,
> where data can be written without all the bytes (or blocks) before
> that point. Only blocks with data are stored. I believe the
> system keeps track of the last byte written, so there could be
> a test.
> If you read blocks from before the last written block
> you get zeros.
Bytes, not blocks. The file system does work with blocks, yes, but
that's an implementation detail not visible in the read/write syscall
semantics.
> I am not sure what unix does if you read later bytes.
If you try to read data starting from beyond the last written byte,
the read(2) function will return 0. Similarly, if the size of a file
is, say, 1000 bytes, and you try to read 2000 bytes from the
beginning, read(2) will return 1000. If an error occurs (end of
file is not considered an error), the return value is -1
and errno will be set. See
http://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html
It is up to the filesystem implementation whether holes in a file (aka
a sparse file) are implemented by writing zero-filled blocks, or
whether the fs is capable of keeping track of holes in some other
manner. In the latter case, a further question is whether space for
the hole(s) is preallocated or not; assuming that the holes will be
filled, preallocating ensures a more contiguous block layout. In either
case, none of this affects the semantics of the read(2) syscall.
As for what happens if one tries to read an unwritten direct
access record (before or after EOF), IIRC the Fortran standard says
this is implementation-defined behavior. In case Fortran wishes to
outlive POSIX and/or Windows, keeping it that way probably makes sense
lest it makes implementing direct access files with decent performance
impossible on some other system with different semantics.
--
JB
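The read(2) behavior described above can be observed from Python, whose os.read is a thin wrapper over the system call. This is a sketch under assumptions: a POSIX system, a local regular file, and hypothetical helper names (read_at, make_file_with_hole). It says nothing about what a Fortran runtime does on top of these calls.

```python
import os
import tempfile


def read_at(path, offset, nbytes):
    """Seek to offset, issue one read of up to nbytes, and return the bytes obtained."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        return os.read(fd, nbytes)
    finally:
        os.close(fd)


def make_file_with_hole(path):
    """Write 4 bytes at offset 0 and 4 bytes at offset 1000, leaving a gap between them."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, b"head")
        os.lseek(fd, 1000, os.SEEK_SET)
        os.write(fd, b"tail")                 # the file size is now 1004 bytes
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "hole.bin")
        make_file_with_hole(path)
        print(len(read_at(path, 0, 2000)))    # asks for more than the file holds
        print(read_at(path, 500, 8))          # bytes inside the never-written gap
        print(read_at(path, 5000, 8))         # starts beyond the last written byte
```

A read that starts at or beyond the file size returns an empty result rather than raising an error, and a read inside the gap returns zero bytes, which is why an unwritten record in the middle of a file cannot be told apart from a record that was written as zeros.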
foo33 | 3/29/2011 7:58:57 AM
Richard Maine writes:
>> Is there a portable way to detect the end of a direct access file
>> while at the same time distinguishing it from other kinds of read
>> errors?
> No, there is not. In fact, there is no portable way to guarantee that
> such a read will even get an error at all. It might just succeed and
> return bogus data. It is easy to imagine implementations and situations
> where that would happen; I believe such might even have existed.
> (Picture implementations that allocate such files in blocks that could
> be bigger than the record size).
But wouldn't that also be true of sequential access files? I'm
picturing file systems that allocate space in blocks, such that
a file containing only a single byte actually consumes something
like 4096 bytes of disk space.
tholen | 3/29/2011 10:04:17 PM
tholen@antispam.ham wrote:
> Richard Maine writes:
(snip)
>> No, there is not. In fact, there is no portable way to guarantee that
>> such a read will even get an error at all. It might just succeed and
>> return bogus data.
(snip)
> But wouldn't that also be true of sequential access files? I'm
> picturing file systems that allocate space in blocks, such that
> a file containing only a single byte actually consumes something
> like 4096 bytes of disk space.
It is true for CP/M. For text files, they put X'1A' (control-Z)
at the end so that the actual EOF could be found. For no good
reason, that convention went into MS-DOS and then Windows.
As the MS-DOS and Windows file systems know where the last
byte is, there is no need for the control-Z.
(That is separate from the control-z to indicate EOF when
reading from the terminal.)
-- glen
gah | 3/29/2011 11:23:40 PM
Arjen Markus <arjen.markus895@gmail.com> writes:
>> Is there a portable way to detect the end of a direct access file
>> while at the same time distinguishing it from other kinds of read
>> errors? I can recall discussions from long ago about whether
>> reading past the end of a direct access file should be an error
>> condition or an end-of-file condition. As I recall, some F77
>> compilers would use a negative IOSTAT value when reading past the
>> end of a direct access file, just like for sequential access files,
>> while other compilers would use a positive value. For one specific
>> application, I would like to detect the end-of-file condition so
>> that a very specific action can be taken, one that I would not want
>> to take if other kinds of read errors occur. With CVF 6.6c, I'm
>> having it look for an IOSTAT value of 36, but that's not portable.
>> Is there a more portable way, possibly with some new intrinsic
>> introduced in later versions of the standard?
> I am not sure there is - F2008 might define one - but why not do
> the following:
>
> 1. Write a new (temporary) direct-access file with one or two records
> 2. Close it and open it, to mimic the situation you have as close
> as possible
> 3. Read record no. 2 or 3 - just beyond the end of the file and
> register the error code.
> 4. Remove the temporary file
> 5. Use this empirically found error - you should be independent
> of compiler-specific values now.
Interesting idea. Makes me think of a physics professor from
35 years ago who, in evaluating problem set solutions, was
known for saying "very ponderous, very clumsy". But I can't
think of a reason why it wouldn't work. Thanks for the suggestion!
tholen | 3/30/2011 12:56:09 AM
<tholen@antispam.ham> wrote:
> Richard Maine writes:
>
> >> Is there a portable way to detect the end of a direct access file
> >> while at the same time distinguishing it from other kinds of read
> >> errors?
>
> > No, there is not. In fact, there is no portable way to guarantee that
> > such a read will even get an error at all. It might just succeed and
> > return bogus data. It is easy to imagine implementations and situations
> > where that would happen; I believe such might even have existed.
> > (Picture implementations that allocate such files in blocks that could
> > be bigger than the record size).
>
> But wouldn't that also be true of sequential access files? I'm
> picturing file systems that allocate space in blocks, such that
> a file containing only a single byte actually consumes something
> like 4096 bytes of disk space.
No.
Sequential access files are guaranteed by the standard to have a
detectable end-of-file. That can well be by some indication within the
file if the OS doesn't keep track of an exact file length for you. There
are ways to do that, and all standard-conforming compilers manage. It
can be by a special EOF character (such as control-Z) or by some
convention in the record headers.
Note that the standard does not guarantee that you can necessarily open
and read any random non-Fortran file. In general, pretty much all
operating systems have a "standard" (for that OS) text file format, so
you can count on being able to read text files from non-Fortran sources.
The Fortran standard doesn't guarantee anything like that, but in
practice it is always the case. Unformatted files are a very different
matter. Fortran unformatted files often have things like record headers
to support functionality required by the Fortran standard. Non-text
files created by non-Fortran applications typically don't have the same
header structure and are not generally readable by standard Fortran
means - at least prior to f2003's introduction of stream I/O. (Prior
compilers usually provided some similar mechanism, but not in a
portable way).
--
Richard Maine
email: last name at domain . net
domain: summer-triangle
nospam47 | 3/30/2011 2:17:01 AM
<tholen@antispam.ham> wrote:
> > 1. Write a new (temporary) direct-access file with one or two records
> > 2. Close it and open it, to mimic the situation you have as close
> > as possible
> > 3. Read record no. 2 or 3 - just beyond the end of the file and
> > register the error code.
> > 4. Remove the temporary file
> > 5. Use this empirically found error - you should be independent
> > of compiler-specific values now.
>
> Interesting idea. Makes me think of a physics professor from
> 35 years ago who, in evaluating problem set solutions, was
> known for saying "very ponderous, very clumsy". But I can't
> think of a reason why it wouldn't work.
I can. Plenty of them.
As noted in my other post, you might not get an error at all. Or you
might get one sometimes, but not other times. An example of that would
be an OS that allocated in blocks and gave you an error only if you
happened to go past the end of the last allocated block.
Or you might get one of multiple errors, depending on various details.
For example, I'm aware of at least one compiler that doesn't even give
consistent results for a case that *IS* guaranteed by the standard -
namely for sequential end-of-file. The standard specifies that there is
a unique status value to indicate end-of-file. I know about this because
when f2003 was adding a standard way to inquire about the end-of-file
status value, one vendor wanted to modify the proposal to allow a vector
of values to be returned because they had multiple end-of-file status
values. They were not particularly pleased when I pointed out that the
standard explicitly disallowed that (and has ever since f77), and thus
that their compiler was non-conforming in that regard.
I believe that they returned different status values depending on things
like whether the eof was from a formatted or unformatted read, and
possibly for internal versus external reads. I never quite figured out
what the benefit of such a distinction was supposed to be, as the user
presumably already knew whether the read was formatted or unformatted
(that being compile-time determined by the syntax of the read).
In any case, when there exist vendors who return multiple status values
for a case where the standard explicitly disallows such behavior, I
don't think you can count on consistency in a case where the standard
has no guarantee at all - and it doesn't.
You might be able to do something that works at least most of the time.
But you will *NOT* find anything that is guaranteed by the standard to
work; I can assure you of that.
--
Richard Maine
email: last name at domain . net
domain: summer-triangle
nospam47 | 3/30/2011 2:30:19 AM
On Mar 27, 1:12 pm, tho...@antispam.ham wrote:
> Is there a portable way to detect the end of a direct access file
> while at the same time distinguishing it from other kinds of read
> errors? I can recall discussions from long ago about whether
> reading past the end of a direct access file should be an error
> condition or an end-of-file condition. As I recall, some F77
> compilers would use a negative IOSTAT value when reading past the
> end of a direct access file, just like for sequential access files,
> while other compilers would use a positive value. For one specific
> application, I would like to detect the end-of-file condition so
> that a very specific action can be taken, one that I would not want
> to take if other kinds of read errors occur. With CVF 6.6c, I'm
> having it look for an IOSTAT value of 36, but that's not portable.
> Is there a more portable way, possibly with some new intrinsic
> introduced in later versions of the standard?
I have always found that there are two methods that work, one wholly
Fortran and one requiring a service call (SYSTEM).
1) open the random access file and perform a binary search for the
last record, because although there may be unwritten records WITHIN
the extent of the file, reading beyond the LAST written record will
return a signal of EOF or ERROR.
2) call SYSTEM to read the directory where the file is stored and
obtain the byte length of the file. Divide the file length by the
record length to get the number of records and hence the index to the
last record.
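Both methods can be sketched in a few lines, here in Python for illustration only. The sketch assumes fixed-length records stored back to back with no per-record headers, a record length known in bytes, and, for the binary search, that readability is monotonic (every record up to the last one reads successfully and every later one fails). The function names and the can_read callback are hypothetical, and neither assumption is guaranteed by the Fortran standard.

```python
import os
import tempfile


def record_count_from_size(path, reclen):
    """Number of whole fixed-length records implied by the file's byte length."""
    return os.path.getsize(path) // reclen


def last_readable_record(can_read, upper):
    """Binary search for the largest n in 1..upper for which can_read(n) is true.

    Assumes can_read is true for every record up to the last one and false for
    every record after it. Returns 0 when even record 1 cannot be read.
    """
    lo, hi = 0, upper            # records 1..lo are readable; the answer is at most hi
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if can_read(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "recs.bin")
        with open(path, "wb") as f:
            f.write(b"x" * 70)                # seven 10-byte records
        n = record_count_from_size(path, 10)
        print(n, last_readable_record(lambda k: k <= n, 1_000_000))
```

The search needs about log2(upper) probes, but it returns a wrong answer whenever a probe inside the file fails or a probe past the end succeeds.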
tbwright | 3/31/2011 12:03:55 AM
Terence <tbwright@cantv.net> wrote:
> On Mar 27, 1:12 pm, tho...@antispam.ham wrote:
> > Is there a portable way to detect the end of a direct access file
...
> I have always found that there are two methods that work, one wholly
> Fortran and one requiring a service call (SYSTEM).
>
> 1) open the random access file and perform a binary search for the
> last record, because although there may be unwritten records WITHIN
> the extent of the file, reading beyond the LAST written record will
> return a signal of EOF or ERROR.
>
> 2) call SYSTEM to read the directory where the file is stored and
> obtain the byte length of the file. Divide the file length by the
> record length to get the number of records and hence the index to the
> last record.
He explicitly asked for portable. Neither of those ways qualifies.
System is not portable; even more so when you then want to use it to
determine the file length. Your requirements might not include
portability, but that's explicitly what the OP asked for.
The Fortran method is not portable because your statement that "reading
beyond the last written record will return a signal of EOF or ERROR" is
wrong on multiple counts. First, as mentioned elsethread, no, it simply
is not so that it will necessarily return an EOF or ERROR. It will on
*SOME* systems. Emphasis on the "some". It will not work on others;
counterexamples exist. Something that works on some systems might indeed
be fine for some particular applications, but the OP didn't ask for
something that might be fine for some particular applications. He asked
for something portable; that method isn't.
Second, you appear to assume that reading an unwritten record within the
extent of the file will not return an error. That assumption also
is not portable.
--
Richard Maine
email: last name at domain . net
domain: summer-triangle
nospam47 | 3/31/2011 1:00:02 AM
On Mar 31, 11:00 am, nos...@see.signature (Richard Maine) wrote:
> Terence <tbwri...@cantv.net> wrote:
> > On Mar 27, 1:12 pm, tho...@antispam.ham wrote:
> > > Is there a portable way to detect the end of a direct access file
> ..
> > I have always found that there are two methods that work, one wholly
> > Fortran and one requiring a service call (SYSTEM).
>
> > 1) open the random access file and perform a binary search for the
> > last record, because although there may be unwritten records WITHIN
> > the extent of the file, reading beyond the LAST written record will
> > return a signal of EOF or ERROR.
>
> > 2) call SYSTEM to read the directory where the file is stored and
> > obtain the byte length of the file. Divide the file length by the
> > record length to get the number of records and hence the index to the
> > last record.
>
> He explicitly asked for portable. Neither of those ways qualifies.
>
> System is not portable; even more so when you then want to use it to
> determine the file length. Your requirements might not include
> portability, but that's explicitly what the OP asked for.
>
> The Fortran method is not portable because your statement that "reading
> beyond the last written record will return a signal of EOF or ERROR" is
> wrong on multiple counts. First, as mentioned elsethread, no, it simply
> is not so that it will necessarily return an EOF or ERROR. It will on
> *SOME* systems. Emphasis on the "some". It will not work on others;
> counterexamples exist. Something that works on some systems might indeed
> be fine for some particular applications, but the OP didn't ask for
> something that might be fine for some particular applications. He asked
> for something portable; that method isn't.
>
> Second, you appear to assume that reading an unwritten record within the
> extent of the file will not return an error. That assumption also
> is not portable.
>
> --
> Richard Maine
> email: last name at domain . net
> domain: summer-triangle
> Second, you appear to assume that reading an unwritten record within the
> extent of the file will not return an error. That assumption also
> is not portable.
>
No, I didn't assume anything. I sidestepped what might happen when
trying to read unwritten random access record spaces not
containing data. I never allow the case in software I write (by
checking any write index against the highest written record index plus 1).
I HAVE seen different responses (e.g. dummy or old data and 'bad
read' signals) when reading someone else's data. I suspect some
implementations actually write some signal in the track area for
"assigned, not written". However, I would not consider writing
random access files with unused record gaps to be part of a portable
design.
As to the "Fortran" method. As I said, I've always found that to work.
My commercial software depends on it and I haven't had a failure
report about it (1972 to 2010).
tbwright | 3/31/2011 5:54:59 AM
<tholen@antispam.ham> wrote in message news:imm9vk$amp$1@speranza.aioe.org...
| Is there a portable way to detect the end of a direct access file
| while at the same time distinguishing it from other kinds of read
| errors?
Where is the end of a direct access file?
Records in a DA file are accessed by means of a record number.
Access is not sequential. If you supply the number of a record
that does not exist, and attempt to read such a record, then all it can
tell you is that the record doesn't exist. It doesn't matter whether
the record number is higher than any existing record.
If you supply the number of a record that does not exist,
and you attempt to write that record, then the record is written.
It doesn't matter whether the record number is higher than
number of any record already in the file.
| I can recall discussions from long ago about whether
| reading past the end of a direct access file should be an error
| condition or an end-of-file condition. As I recall, some F77
| compilers would use a negative IOSTAT value when reading past the
| end of a direct access file, just like for sequential access files,
| while other compilers would use a positive value. For one specific
| application, I would like to detect the end-of-file condition so
| that a very specific action can be taken, one that I would not want
| to take if other kinds of read errors occur. With CVF 6.6c, I'm
| having it look for an IOSTAT value of 36, but that's not portable.
| Is there a more portable way, possibly with some new intrinsic
| introduced in later versions of the standard?
robin512 | 3/31/2011 6:56:46 AM
Terence <tbwright@cantv.net> wrote:
> On Mar 31, 11:00 am, nos...@see.signature (Richard Maine) wrote:
> > Terence <tbwri...@cantv.net> wrote:
> > > On Mar 27, 1:12 pm, tho...@antispam.ham wrote:
> > > > Is there a portable way to detect the end of a direct access file
> > ..
> > > I have always found that there are two methods that work, one wholly
> > > Fortran and one requiring a service call (SYSTEM).
> > He explicitly asked for portable. Neither of those ways qualifies.
[I elaborate about why they are not portable and don't necessarily
detect an end-of-file, including...]
> > Second, you appear to assume that reading an unwritten record within the
> > extent of the file will not return an error. That assumption also
> > is not portable.
> >
> No, I didn't assume anything.
[and explains why he never runs into the problem]
Alas, I did assume things. I assumed that if someone asks a specific
question, a response is probably intended to answer the question.
In particular, if someone asks if there is a portable way of detecting
an end of file, I tend to assume that replies are meant to suggest a
portable way of detecting an end of file rather than ways that aren't
portable and/or don't actually distinguish an end-of-file from other
conditions. While the answers might be flawed, for example in that
people often are not aware that some things aren't portable, I do tend
to assume that at least the intent is to answer the question asked.
My assumptions sometimes turn out to be wrong.
--
Richard Maine
email: last name at domain . net
domain: summer-triangle
nospam47 | 3/31/2011 7:05:20 AM
"robin" <robin51@dodo.mapson.com.au> writes:
>> Is there a portable way to detect the end of a direct access file
>> while at the same time distinguishing it from other kinds of read
>> errors?
> Where is the end of a direct access file?
As far as the operating system is concerned, it doesn't know the
access method that a program might use on a file. Therefore the
end of the file when accessed directly ought to be the same place
as the end of the file when accessed sequentially.
> Records in a DA file are accessed by means of a record number.
> Access is not sequential.
But access can be done sequentially by reading records in a DO loop
where the loop index is being incremented by just one each pass
through the loop and the loop index being used for the record
number. That's not an uncommon use.
> If you supply the number of a record
> that does not exist, and attempt to read such a record, then all it can
> tell you is that the record doesn't exist. It doesn't matter whether
> the record number is higher than any existing record.
It should be able to tell you that the record is beyond the end of
the file, based on the size of the file on the disk.
> If you supply the number of a record that does not exist,
> and you attempt to write that record, then the record is written.
> It doesn't matter whether the record number is higher than
> number of any record already in the file.
Not particularly relevant to the case at hand.
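The loop described above, stepping through a direct access file by record number and stopping once a record is no longer fully present, might look like the following Python sketch. It assumes a byte-addressed file system that knows the exact file length, fixed-length records with no headers, and a hypothetical function name read_all_records. On systems without those properties the stopping test would not be reliable.

```python
import os
import tempfile


def read_all_records(path, reclen):
    """Read records 1, 2, 3, ... by record number until one is not fully present."""
    records = []
    with open(path, "rb") as f:
        recno = 1
        while True:
            f.seek((recno - 1) * reclen)      # position computed from the record number
            data = f.read(reclen)
            if len(data) < reclen:            # empty or partial read: past the last whole record
                break
            records.append(data)
            recno += 1
    return records


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "recs.bin")
        with open(path, "wb") as f:
            f.write(b"AAAAAAAABBBBBBBBCCCCCCCCxyz")   # three 8-byte records plus a stray tail
        print(len(read_all_records(path, 8)))
```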
tholen | 4/1/2011 1:31:27 PM
<tholen@antispam.ham> wrote in message news:in4k3e$1l8$1@speranza.aioe.org...
| "robin" <robin51@dodo.mapson.com.au> writes:
|
| >> Is there a portable way to detect the end of a direct access file
| >> while at the same time distinguishing it from other kinds of read
| >> errors?
|
| > Where is the end of a direct access file?
|
| As far as the operating system is concerned, it doesn't know the
| access method that a program might use on a file.
It does, as soon as the first direct access READ is executed.
| Therefore the
| end of the file when accessed directly ought to be the same place
| as the end of the file when accessed sequentially.
That doesn't follow.
Read sequentially, there is an end of the file.
But we're talking about direct (random) access.
| > Records in a DA file are accessed by means of a record number.
| > Access is not sequential.
|
| But access can be done sequentially by reading records in a DO loop
| where the loop index is being incremented by just one each pass
| through the loop and the loop index being used for the record
| number. That's not an uncommon use.
I very much doubt that. Let's say that the file is 2Gb in size.
Do you really think that the OS is going to do a linear search
of that file for every record access?
| > If you supply the number of a record
| > that does not exist, and attempt to read such a record, then all it can
| > tell you is that the record doesn't exist. It doesn't matter whether
| > the record number is higher than any existing record.
|
| It should be able to tell you that the record is beyond the end of
| the file, based on the size of the file on the disk.
But there isn't an end.
| > If you supply the number of a record that does not exist,
| > and you attempt to write that record, then the record is written.
| > It doesn't matter whether the record number is higher than
| > number of any record already in the file.
|
| Not particularly relevant to the case at hand.
robin512 (309)
4/2/2011 12:57:42 PM

"robin" <robin51@dodo.mapson.com.au> writes:
>>>> Is there a portable way to detect the end of a direct access file
>>>> while at the same time distinguishing it from other kinds of read
>>>> errors?
>>> Where is the end of a direct access file?
>> As far as the operating system is concerned, it doesn't know the
>> access method that a program might use on a file.
> It does, as soon as the first direct access READ is executed.
The point that I tried to make is that the file has an end
associated with it quite independently of any program that
might access the file. Therefore the end of the file is a
property that shouldn't suddenly disappear because a program
opened that file for direct access.
>> Therefore the
>> end of the file when accessed directly ought to be the same place
>> as the end of the file when accessed sequentially.
> That doesn't follow.
> Read sequentially, there is an end of the file.
> But we're talking about direct (random) access.
No, I'm talking about the file independently of any access
method. It has an end. Once a program opens that file for
direct access, why should the known end of the file suddenly
become unknown? It's like that old National Lampoon album
about Watergate: "What did the President know, and when did
he stop knowing it?"
>>> Records in a DA file are accessed by means of a record number.
>>> Access is not sequential.
>> But access can be done sequentially by reading records in a DO loop
>> where the loop index is being incremented by just one each pass
>> through the loop and the loop index being used for the record
>> number. That's not an uncommon use.
> I very much doubt that. Let's say that the file is 2Gb in size.
> Do you really think that the OS is going to do a linear search
> of that file for every record access?
I said nothing about accessing every record. I said that records
could be accessed sequentially. Suppose I need to read 100
consecutive records starting at record 2000000. I'd use something
like:
      RecNum = 2000000
      DO I=1,100,1
         READ (UnitNum,REC=RecNum) Record
         RecNum = RecNum + 1
      END DO
But what if there are only 2000050 records in the file? When you
try to read record 2000051, you have an error condition. What I
would like to see is a portable way to distinguish an end-of-file
condition from an attempt to read character data with an integer
format edit descriptor (assuming formatted access). I have an
application where the remaining 50 records that need to be read
would be the ones starting at record number 1 (think of a circular
data file), but I need to know when I reached the end of the file
so that I can start over at the beginning of it.
>>> If you supply the number of a record
>>> that does not exist, and attempt to read such a record, then all it can
>>> tell you is that the record doesn't exist. It doesn't matter whether
>>> the record number is higher than any existing record.
>> It should be able to tell you that the record is beyond the end of
>> the file, based on the size of the file on the disk.
> But there isn't an end.
Of course there is. How else is the operating system going to
know that there is additional free space on the disk that can
be used for other files? Create a file that doesn't have any
end to it, and that file is infinitely big, therefore consuming
all available disk space.
>>> If you supply the number of a record that does not exist,
>>> and you attempt to write that record, then the record is written.
>>> It doesn't matter whether the record number is higher than
>>> number of any record already in the file.
>> Not particularly relevant to the case at hand.
tholen (16649)
4/2/2011 9:22:26 PM

tholen@antispam.ham wrote:
(snip)
> The point that I tried to make is that the file has an end
> associated with it quite independently of any program that
> might access the file. Therefore the end of the file is a
> property that shouldn't suddenly disappear because a program
> opened that file for direct access.
It is possible for a file system to create a block on an attempt
to access it. In that case, you wouldn't be able to find the
end by reading.
-- glen
gah (12241)
4/2/2011 11:04:53 PM

glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
> tholen@antispam.ham wrote:
> (snip)
> > The point that I tried to make is that the file has an end
> > associated with it quite independently of any program that
> > might access the file. Therefore the end of the file is a
> > property that shouldn't suddenly disappear because a program
> > opened that file for direct access.
>
> It is possible for a file system to create a block on an attempt
> to access it. In that case, you wouldn't be able to find the
> end by reading.
And in general, no, a direct access file doesn't necessarily have an
identifiable "end". For some implementations, it will, but the concept
of an end of the file is not inherent in the notion of a direct access
file, and isn't necessarily inherent in an implementation of one either.
For example, it is perfectly possible and standard conforming to
implement a direct access file with a key structure, using the record
numbers as keys. In that case, the records of the file might well not be
stored in the order of their keys. The record with the highest-numbered
key could be physically in the middle of the file.
That's not the usual implementation (well, some sparse storage
mechanisms can look something like that, but usually at a level lower
than what Fortran "sees"), but it is a valid one.
That really is an essential point here - that no, a direct access file
does not have an end as far as the Fortran standard is concerned. It
isn't just that the means for detecting the end got left out; rather
that the concept of a file end is not part of the definition. "End" is
inherently a sequential concept; not all things are sequential. Such a
thing might or might not be part of a particular implementation.
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam47 (9742)
4/2/2011 11:29:15 PM

Richard Maine <nospam@see.signature> wrote:
(snip)
> For example, it is perfectly possible and standard conforming to
> implement a direct access file with a key structure, using the record
> numbers as keys. In that case, the records of the file might well not be
> stored in the order of their keys. The record with the highest-numbered
> key could be physically in the middle of the file.
Like arrays in AWK. For OS/360 BDAM, and its VSAM equivalent,
RRDS (Relative Record Data Set) with fixed length records, records
are identified by position. VSAM/RRDS allows for variable length
records which, for obvious reasons, are not indexed by position,
but by a system supplied (and hidden from user programs) index.
> That's not the usual implementation (well, some sparse storage
> mechanisms can look something like that, but usually at a level lower
> than what Fortran "sees"), but it is a valid one.
I don't know VSAM well enough to know, but if you could convince
Fortran to write a variable length VSAM/RRDS file, even though
all records do have the same length, then it would do that.
> That really is an essential point here - that no, a direct access file
> does not have an end as far as the Fortran standard is concerned. It
> isn't just that the means for detecting the end got left out; rather
> that the concept of a file end is not part of the definition. "End" is
> inherently a sequential concept; not all things are sequential. Such a
> thing might or might not be part of a particular implementation.
-- glen
gah (12241)
4/3/2011 12:11:12 AM

On 4/2/2011 3:22 PM, tholen@antispam.ham wrote:
<snip>
>       RecNum = 2000000
>       DO I=1,100,1
>          READ (UnitNum,REC=RecNum) Record
>          RecNum = RecNum + 1
>       END DO
>
> But what if there are only 2000050 records in the file? When you
> try to read record 2000051, you have an error condition. What I
> would like to see is a portable way to distinguish an end-of-file
> condition from an attempt to read character data with an integer
> format edit descriptor (assuming formatted access). I have an
> application where the remaining 50 records that need to be read
> would be the ones starting at record number 1 (think of a circular
> data file), but I need to know when I reached the end of the file
> so that I can start over at the beginning of it.
>
Interesting. Can you say more about what you're doing?
For a practical and portable solution, could you add an EOF flag to your
record and whenever you add records write a sentinel record -- one with
EOF set and possibly nothing else -- when you're done? When you read
the file, you could test the flag instead of the end-of-file condition
you were hoping for, and then use the record index of the sentinel as
the place to start when you add more records. (You'd have to initialize
the file by writing a sentinel record.)
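
A minimal sketch of the sentinel idea, in case it helps. The record
layout (an integer flag plus one integer value), the unit number and the
file name are all made up for illustration; this is free-form f90, not
anything from the original application. The loop stops when it reads a
record whose flag is set, and treats any nonzero IOSTAT as a real error.

```fortran
program sentinel_demo
  implicit none
  ! Hypothetical layout: flag = 1 marks the sentinel, flag = 0 a data record.
  integer, parameter :: unit_num = 10
  integer :: rec_len, rec_num, flag, val, ios

  ! Ask the compiler for the record length of this I/O list, in its own units.
  inquire(iolength=rec_len) flag, val
  open(unit_num, file='sentinel.dat', form='unformatted', access='direct', &
       recl=rec_len, status='replace')

  ! Write three data records, then the sentinel record after them.
  do rec_num = 1, 3
     write(unit_num, rec=rec_num) 0, 10*rec_num
  end do
  write(unit_num, rec=4) 1, 0

  ! Read upward until the sentinel is seen.
  rec_num = 1
  do
     read(unit_num, rec=rec_num, iostat=ios) flag, val
     if (ios /= 0) then
        write(*,*) 'read error, iostat = ', ios
        exit
     end if
     if (flag == 1) then
        write(*,*) 'sentinel found at record ', rec_num
        exit
     end if
     write(*,*) 'record ', rec_num, ' value ', val
     rec_num = rec_num + 1
  end do
  close(unit_num, status='delete')
end program sentinel_demo
```

When more records are appended, the first new record would overwrite the
old sentinel and a fresh sentinel would be written after the last one.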
Louis
lkrupp_nospam1 (64)
4/3/2011 2:05:05 AM

On 4/2/2011 6:29 PM, Richard Maine wrote:
> glen herrmannsfeldt<gah@ugcs.caltech.edu> wrote:
>
>> tholen@antispam.ham wrote:
>> (snip)
>>> The point that I tried to make is that the file has an end
>>> associated with it quite independently of any program that
>>> might access the file. Therefore the end of the file is a
>>> property that shouldn't suddenly disappear because a program
>>> opened that file for direct access.
>>
>> It is possible for a file system to create a block on an attempt
>> to access it. In that case, you wouldn't be able to find the
>> end by reading.
>
> And in general, no, a direct access file doesn't necessarily have an
> identifiable "end". For some implementations, it will, but the concept
> of an end of the file is not inherent in the notion of a direct access
> file, and isn't necessarily inherent in an implementation of one either.
> For example, it is perfectly possible and standard conforming to
> implement a direct access file with a key structure, using the record
> numbers as keys. In that case, the records of the file might well not be
> stored in the order of their keys. The record with the highest-numbered
> key could be physically in the middle of the file.
>
> That's not the usual implementation (well, some sparse storage
> mechanisms can look something like that, but usually at a level lower
> than what Fortran "sees"), but it is a valid one.
>
> That really is an essential point here - that no, a direct access file
> does not have an end as far as the Fortran standard is concerned. It
> isn't just that the means for detecting the end got left out; rather
> that the concept of a file end is not part of the definition. "End" is
> inherently a sequential concept; not all things are sequential. Such a
> thing might or might not be part of a particular implementation.
>
Maybe we can just have a facility that returns the highest numbered
written "key"/record.
garylscott (1357)
4/3/2011 3:43:37 AM

tholen@antispam.ham wrote:
(snip)
> The point that I tried to make is that the file has an end
> associated with it quite independently of any program that
> might access the file. Therefore the end of the file is a
> property that shouldn't suddenly disappear because a program
> opened that file for direct access.
While on many systems one can read the same file as either
sequential or direct access, and that is a favorite way to
get around some other problems, that isn't required.
One should assume that a file is one or the other, and not both.
(snip)
> No, I'm talking about the file independently of any access
> method. It has an end. Once a program opens that file for
> direct access, why should the known end of the file suddenly
> become unknown? It's like that old National Lampoon album
> about Watergate: "What did the President know, and when did
> he stop knowing it?"
(snip, someone wrote)
>> I very much doubt that. Let's say that the file is 2Gb in size.
>> Do you really think that the OS is going to do a linear search
>> of that file for every record access?
> I said nothing about accessing every record. I said that records
> could be accessed sequentially. Suppose I need to read 100
> consecutive records starting at record 2000000. I'd use something
> like:
>       RecNum = 2000000
>       DO I=1,100,1
>          READ (UnitNum,REC=RecNum) Record
>          RecNum = RecNum + 1
>       END DO
As I said before, a common solution is to put special data
in the first record, such as the number of records in the file.
Note that all records normally have the same length, but that
doesn't mean that they must have the same data fields.
(And you don't have to read every bit, either.)
> But what if there are only 2000050 records in the file? When you
> try to read record 2000051, you have an error condition. What I
> would like to see is a portable way to distinguish an end-of-file
> condition from an attempt to read character data with an integer
> format edit descriptor (assuming formatted access). I have an
> application where the remaining 50 records that need to be read
> would be the ones starting at record number 1 (think of a circular
> data file), but I need to know when I reached the end of the file
> so that I can start over at the beginning of it.
Put the number in the first record. Read it first, then use it
when reading to know when you are at the end. Portable to all
implementations, unless your record length is too small to hold
the number.
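
Something like the following rough sketch, assuming record 1 is reserved
as a header holding the count and the data records start at record 2.
The file name, unit number and the tiny record count are invented for
the example; the wrap-around is done with MOD on the data record index
rather than by waiting for a failed read.

```fortran
program header_count_demo
  implicit none
  integer, parameter :: unit_num = 11
  integer :: rec_len, n_data, i, k, rec_num, val

  inquire(iolength=rec_len) val
  open(unit_num, file='circular.dat', form='unformatted', access='direct', &
       recl=rec_len, status='replace')

  ! Record 1 is the header holding the number of data records;
  ! data records occupy record numbers 2 .. n_data+1.
  n_data = 5
  write(unit_num, rec=1) n_data
  do i = 1, n_data
     write(unit_num, rec=i+1) 100 + i
  end do

  ! Read 4 consecutive data records starting at data record 4,
  ! wrapping to data record 1 after the last one.
  read(unit_num, rec=1) n_data
  k = 4
  do i = 1, 4
     rec_num = k + 1            ! skip the header record
     read(unit_num, rec=rec_num) val
     write(*,*) 'data record ', k, ' = ', val
     k = mod(k, n_data) + 1     ! next data record, wrapping at the end
  end do
  close(unit_num, status='delete')
end program header_count_demo
```

Whatever writes the file has to keep the count in record 1 up to date.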
(snip)
>>> It should be able to tell you that the record is beyond the end of
>>> the file, based on the size of the file on the disk.
>> But there isn't an end.
> Of course there is. How else is the operating system going to
> know that there is additional free space on the disk that can
> be used for other files? Create a file that doesn't have any
> end to it, and that file is infinitely big, therefore consuming
> all available disk space.
As Richard said, that isn't necessary with a keyed implementation.
Note that the AWK array assignment statement:
a[1000000000000000000000000000000000000000000000000000000]=3;
works just fine even on computers with small memories.
(I just tried it to be sure. Interestingly, the subscript
is actually 1e+54.)
>>>> If you supply the number of a record that does not exist,
>>>> and you attempt to write that record, then the record is written.
>>>> It doesn't matter whether the record number is higher than
>>>> number of any record already in the file.
>>> Not particularly relevant to the case at hand.
It also would work with keyed access records.
And even if the standard was changed in 2013, you might not see
it in compilers until 2020 or so.
-- glen
gah (12241)
4/3/2011 4:58:17 AM

glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
> tholen@antispam.ham wrote:
> > Create a file that doesn't have any
> > end to it, and that file is infinitely big, therefore consuming
> > all available disk space.
>
> As Richard said, that isn't necessary with a keyed implementation.
>
> Note that the AWK array assignment statement:
>
> a[1000000000000000000000000000000000000000000000000000000]=3;
>
> works just fine even on computers with small memories.
I've actually used a Fortran implementation with similar properties. As
I recall, Apple Fortran for the UCSD Pascal system on my Apple 2e used
sparse storage for direct access files. I once wrote a small benchmark
program just to illustrate that it was possible for my home Apple system
to beat the big machines in isolated cases. I created a direct access
file and wrote data to some huge record number. Then I read that data
back in and did something with it. My Apple ran the thing in negligible
time, creating a tiny disk file. The "big" machines I ran it on did
things like try to fill up more disk space than they had.
Not having an end does not mean that a file is infinite in size. It just
means that the concept is not applicable. If I have a cat, a dog, a
fish, and a bird, that set of things doesn't have an "end." That's
because it is not sequential; it is just a set of 4 things without an
ordering implied. But it isn't an infinite set, or even particularly
big. The concept of end just doesn't apply.
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam47 (9742)
4/3/2011 7:25:33 AM

I wanted to comment on Robin's response, but as noted many times here,
his e-mail address is not accepted by the Google filter, so the usual
response auto-header is not valid and responding to his postings is
not possible via Google.
My comment is that in my experience, as mentioned already, IF you try
to read a direct access record with a number beyond the physical file
space assignment, you WILL get an EOF signal returned to the read
attempt, because the operating system (whatever it is) for which the
Fortran compiler and linker were constructed, WILL know if that
operation is NOT possible. I cannot say the same for unwritten records
within the assigned space. Some implementations will return whatever
was last written to that space, others will return an error signal,
because, as part of that implementation of the assignment process,
special "empty" records are written.
This last situation is of course not defined as part of the Fortran
Standards for any version, but the EOF signal on exceeding the assigned
space IS implied in the standard, by the meaning of "End of File" and
when it is to be returned.
Lastly, I repeat: if you have a method of asking the operating system
for the assigned space created for a file written in direct mode
(e.g. SYSTEM and the use of the DIR command), then that file length can
be used first to determine the last record number, from the file size
divided by the record length, and then to check that number against
any attempt to access a data record. But as said, whether
you get junk or data will depend on the implementation and the care in
the programming that wrote the file.
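
For compilers that support it, the SIZE= specifier of INQUIRE
(Fortran 2003) may be an alternative to shelling out to DIR. The sketch
below is only that, a sketch: the file name and unit number are invented,
and it ASSUMES that RECL is measured in the same units that SIZE= reports
(file storage units), which is not true of every compiler or every set of
compiler options.

```fortran
program size_demo
  implicit none
  integer, parameter :: unit_num = 12
  integer :: rec_len, fsize, n_rec, i, val

  inquire(iolength=rec_len) val
  open(unit_num, file='size.dat', form='unformatted', access='direct', &
       recl=rec_len, status='replace')
  do i = 1, 3
     write(unit_num, rec=i) i
  end do
  flush(unit_num)   ! make sure the data has reached the file

  ! SIZE= reports the file size in file storage units,
  ! or -1 if the processor cannot determine it.
  inquire(unit=unit_num, size=fsize)
  if (fsize < 0) then
     write(*,*) 'file size not available'
  else
     ! ASSUMPTION: RECL and SIZE= use the same units on this compiler.
     n_rec = fsize / rec_len
     write(*,*) 'apparent number of records: ', n_rec
  end if
  close(unit_num, status='delete')
end program size_demo
```

The resulting count says nothing about whether every record below it was
ever written, so the caveat about junk versus data still applies.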
tbwright (1098)
4/4/2011 7:25:10 AM

Terence <tbwright@cantv.net> wrote:
> This last situation is of course not defined as part of the Fortran
> Standards for any version, but the EOF signal on exceeding the assigned
> space IS implied in the standard, by the meaning of "End of File" and
> when it is to be returned.
If you actually read the standard's words on the subject instead of
guessing what it might say, you'll find that the words don't even come
close to implying anything like that.
The standard's actual definition of when EOF is returned is that it
happens when an endfile record is read. It doesn't actually talk about
any concept of the end of the file as being the trigger - just about
reading an endfile record. The physical representation (or lack thereof)
of the endfile record is another question, which I'll not go into.
The standard also explicitly says that direct access files don't have
endfile records. No implication is needed; it is quite explicit. "If the
sequential access method is also a member of the set of allowed direct
access methods for the file, its endfile record, if any, is not
considered to be part of the file while it is connected for direct
access. If the sequential access method is not a member of the set of
direct access methods for a file, the file shall not contain an endfile
record."
Not only does the standard not imply that an EOF should be returned for
a direct access file, but doing so is essentially a violation of the
standard. The *ONLY* way that a compiler can return an EOF on a direct
access file and still claim standard conformance is via the "universal
out" that the program violates the standard in a way that allows the
compiler to do absolutely anything. Vendors don't usually try that line.
If you have a compiler that returns an EOF for a direct access file, you
should probably submit a bug report (if the vendor still exists to take
them). If it is that compiler that you have claimed in the past to be
bug free, then that's one. I think there have existed compilers where
that bug was in early versions, but got fixed later.
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam47 (9742)
4/4/2011 3:08:06 PM

Richard Maine <nospam@see.signature> wrote:
> Terence <tbwright@cantv.net> wrote:
>
> > This last situation is of course not defined as part of the Fortran
> > Standards for any version, but the EOF signal on exceeding the assigned
> > space IS implied in the standard, by the meaning of "End of File" and
> > when it is to be returned.
>
> If you actually read the standard's words on the subject instead of
> guessing what it might say, you'll find that the words don't even come
> close to implying anything like that....
> Not only does the standard not imply that an EOF should be returned for
> a direct access file, but doing so is essentially a violation of the
> standard. The *ONLY* way that a compiler can return an EOF on a direct
> access file and still claim standard conformance is via the "universal
> out" that the program violates the standard in a way that allows the
> compiler to do absolutely anything. Vendors don't usually try that line.
Hmm. It occurred to me to check a detail on that. Yep, I missed one bit.
At least if you are using END=, the compiler can't get by with even that
claim. Using END= with a direct access read violates a constraint, which
is one of the things that compilers are required to be able to diagnose.
Constraint C920 in f2003: "If the REC= specifier appears, an END=
specifier shall not appear,..."
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam47 (9742)
4/4/2011 3:26:11 PM

On Apr 5, 1:26 am, nos...@see.signature (Richard Maine) wrote:
> Richard Maine <nos...@see.signature> wrote:
> > Terence <tbwri...@cantv.net> wrote:
>
> > > This last situation is of course not defined as part of the Fortran
> > > Standards for any version, but the EOF signal on exceeding the assigned
> > > space IS implied in the standard, by the meaning of "End of File" and
> > > when it is to be returned.
>
> > If you actually read the standard's words on the subject instead of
> > guessing what it might say, you'll find that the words don't even come
> > close to implying anything like that....
> > Not only does the standard not imply that an EOF should be returned for
> > a direct access file, but doing so is essentially a violation of the
> > standard. The *ONLY* way that a compiler can return an EOF on a direct
> > access file and still claim standard conformance is via the "universal
> > out" that the program violates the standard in a way that allows the
> > compiler to do absolutely anything. Vendors don't usually try that line.
>
> Hmm. It occurred to me to check a detail on that. Yep, I missed one bit.
> At least if you are using END=, the compiler can't get by with even that
> claim. Using END= with a direct access read violates a constraint, which
> is one of the things that compilers are required to be able to diagnose.
>
> Constraint C920 in f2003: "If the REC= specifier appears, an END=
> specifier shall not appear,..."
>
> --
> Richard Maine                    | Good judgment comes from experience;
> email: last name at domain . net | experience comes from bad judgment.
> domain: summertriangle           | -- Mark Twain
There you go! The answer to one of my caveats noted is "F2003 is such
a compiler".
I bet that prior to that date, the compilers were quite happy with EOF
signals after a sequential file being read in direct access mode. I
can't say I tried all of them, but I processed a heck of a lot of
files provided by clients to suggest the truth of that.
I have about 66 programs in general distribution, compiled with CVF/
DVF V 6.6c, that seem to work (no reported complaints), most of which
DO use that mixed usage of formatted sequential and unformatted
direct.
One of my techniques, when a "data" file entering a program could be
any known format of market research data, including all text modes
and record sizes and formats, SAP 16-bit, IBM 12-bit binary and Quantum
coding, was to read the first block of data as random access
unformatted, then check out the many possible formats to identify just
what was out there, prior to closing and re-opening the file in the
really most suitable mode (even if again random access, but with a
different effective RECL size).
tbwright (1098)
4/5/2011 4:55:55 AM

Terence <tbwright@cantv.net> wrote:
> There you go! The answer to one of my caveats noted is "F2003 is such
> a compiler".
> I bet that prior to that date, the compilers were quite happy with EOF
> signals after a sequential file being read in direct access mode. I
> can't say I tried all of them, but I processed a heck of a lot of
> files provided by clients to suggest the truth of that.
I quoted f2003, but the restriction applies since f77. I have used
plenty of compilers that did *NOT* misbehave as you describe; in fact,
I'd say it was most of the compilers I've used.
> I have about 66 programs in general distribution, compiled with CVF/
> DVF V 6.6c, that seem to work (no reported complaints), most of which
> DO use that mixed usage of formatted sequential and unformatted
> direct.
I'm not 100% sure, but I think Intel fixed that bug.
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam47 (9742)
4/5/2011 6:48:17 AM

Terence <tbwright@cantv.net> wrote:
> On Apr 5, 1:26 am, nos...@see.signature (Richard Maine) wrote:
> > Richard Maine <nos...@see.signature> wrote:
> > Constraint C920 in f2003: "If the REC= specifier appears, an END=
> > specifier shall not appear,..."
> There you go! The answer to one of my caveats noted is "F2003 is such
> a compiler".
> I bet that prior to that date, the compilers were quite happy with EOF
> signals after a sequential file being read in direct access mode. I
> can't say I tried all of them, but I processed a heck of a lot of
> files provided by clients to suggest the truth of that.
I was in a bit of a rush when I wrote my prior reply, so I didn't go get
citations. I did previously mention that the restrictions date to f77.
Guess I'll need to drag my f77 standard out again, as you don't seem to
believe me. The same restriction in f77 is in 12.8.1
"A control information list must not contain both a record identifier
and an end-of-file specifier."
Just for kicks, I tried the following two programs on both compilers
that I have handy at the moment. (I was going to do 3, but discovered
that the license for my 3rd one had expired). Note that neither of these
are f2003 compilers. In fact, as I have regularly grumbled, I've never
seen an f2003 compiler. Both have the same behavior. They properly won't
even compile with the end= specifier. When I use iostat=, they both
properly indicate an error - not an eof.
I deliberately did these sticking almost completely to f77 (except that
I just can't bring myself to type in all caps). Thus, for example, I
hard-wired the record length to 4 instead of using inquire to find an
appropriate length. In this case, the 4 ought to work ok even for
compilers that measure in words; the file will just be a little longer
than needed. Do be careful about pre-existing files named clf.dat (for
example, if you run it with compilers that use different measure, there
might actually be a valid record 2 for one of them). I'd have used
status='replace' in f90 to avoid that problem.
algol:~/temp> cat clf.f
      program clf
      integer stuff
      open(10,file='clf.dat',form='unformatted',access='direct',recl=4)
      read(10,rec=2,end=100) stuff
  100 continue
      end
algol:~/temp> g95 clf.f
In file clf.f:4
read(10,rec=2,end=100) stuff
1
Error: REC tag at (1) is incompatible with END tag
algol:~/temp> gfortran clf.f
clf.f:4.27:
read(10,rec=2,end=100) stuff
1
Error: An END tag is not allowed with a REC= specifier at (1)
algol:~/temp> cat clf2.f
      program clf2
      integer stuff,iostat
      open(10,file='clf.dat',form='unformatted',access='direct',recl=4)
      stuff = 42
      write(10,rec=1) stuff
      stuff = 0
      read(10,rec=1) stuff
      write (*,*) 'record 1 read ok. Stuff = ', stuff
      read(10,rec=2,iostat=iostat) stuff
      if (iostat.eq.0) then
        write (*,*) 'Non-existant record 2 read ok.'
      else if (iostat.lt.0) then
        write(*,*) 'Buggy compiler got an eof.'
      else
        write (*,*) 'Iostat error code = ', iostat
      end if
      end
algol:~/temp> g95 clf2.f
algol:~/temp> ./a.out
record 1 read ok. Stuff = 42
Iostat error code = 213
algol:~/temp> gfortran clf2.f
algol:~/temp> ./a.out
record 1 read ok. Stuff = 42
Iostat error code = 5002
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam47 (9742)
4/5/2011 3:26:24 PM

On 4/5/2011 10:26 AM, Richard Maine wrote:
....
> I was in a bit of a rush when I wrote my prior reply, so I didn't go get
> citations. I did previously mention that the restrictions date to f77.
> Guess I'll need to drag my f77 standard out again, as you don't seem to
> believe me. The same restriction in f77 is in 12.8.1
>
> "A control information list must not contain both a record identifier
> and an end-of-file specifier."
>
> Just for kicks, I tried the following two programs on both compilers
> that I have handy at the moment. (I was going to do 3, but discovered
> that the license for my 3rd one had expired). Note that neither of these
> are f2003 compilers. In fact, as I have regularly grumbled, I've never
> seen an f2003 compiler. Both have the same behavior. They properly won't
> even compile with the end= specifier. When I use iostat=, they both
> properly indicate an error - not an eof.
....
And, CVF 6.6c barfs as well...
C:\Temp> df clf.f90
clf.f90
clf.f90(4) : Error: An END= specifier is not valid in a direct access
READ statement.
read(10,rec=2,end=100) stuff
--
none1568 (6639)
4/5/2011 5:03:28 PM

dpb <none@non.net> wrote:
> On 4/5/2011 10:26 AM, Richard Maine wrote:
> > The same restriction in f77 is in 12.8.1
> >
> > "A control information list must not contain both a record identifier
> > and an end-of-file specifier."
> >
> And, CVF 6.6c barfs as well...
>
> C:\Temp> df clf.f90
> clf.f90
> clf.f90(4) : Error: An END= specifier is not valid in a direct access
> READ statement.
> read(10,rec=2,end=100) stuff
Odd. Not that CVF barfs as it should, but that Terence specifically
cited CVF 6.6C as a compiler that didn't. I wonder whether he
misunderstood the distinction being made between eof versus err. If he
was using iostat and just looking for a non-zero iostat, without being
picky about whether it was positive or negative, I would indeed expect
that to work. In fact, as I mentioned elsewhere, I'd expect that to work
on darn near all compilers, which would match Terence's claim. The
standard doesn't require it, and exceptions probably exist, but I'd
expect the exceptions to be hard to find. That would not be detecting an
eof; it would be an error condition.
I suppose it is still vaguely possible that the compiler refuses the
end= specifier, but returns a negative iostat value. That would surely
be a bug, both in not following the standard and also in not even being
internally consistent, but it is at least possible. Did you try my other
sample as well?
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam47 (9742) | 4/5/2011 7:51:40 PM
On 4/5/2011 2:51 PM, Richard Maine wrote:
....
> I suppose it is still vaguely possible that the compiler refuses the
> end= specifier, but returns a negative iostat value. That would surely
> be a bug, both in not following the standard and also in not even being
> internally consistent, but it is at least possible. Did you try my other
> sample as well?
I hadn't, but won't take but a sec...
C:\Temp> clf2
record 1 read ok. Stuff = 42
Iostat error code = 36
C:\Temp>
Nope, it's ERR>0 so appears that by 6.6c it was fixed if it were an
earlier "feature".
Full text is--
"36 severe (36): Attempt to access non-existent record
FOR$IOS_ATTACCNON.
A direct-access READ or FIND statement attempted to access beyond the
end of a relative file (or a sequential file on disk with fixed-length
records) or access a record that was previously deleted from a relative
file."
The documentation doesn't enumerate the proscription specifically and I
had my eyes crossed and thought for a moment I'd found a bad example,
but it was ERR= not END= in the example. So, afaict, CVF 6.6c is
conforming in this regard altho it doesn't actually document the
restriction in the LRF (but, then again, a LRF isn't a Standards
document of course).
--
none1568 (6639) | 4/5/2011 9:46:09 PM
On Apr 6, 1:26 am, nos...@see.signature (Richard Maine) wrote:
> Terence <tbwri...@cantv.net> wrote:
> > On Apr 5, 1:26 am, nos...@see.signature (Richard Maine) wrote:
> > > Richard Maine <nos...@see.signature> wrote:
> > > Constraint C920 in f2003: "If the REC= specifier appears, an END=
> > > specifier shall not appear,..."
> > There you go! The answer to one of my caveats noted is "F2003 is such
> > a compiler".
> > I bet that prior to that date, the compilers were quite happy with EOF
> > signals after a sequential file being read in direct access mode. I
> > can't say I tried all of them, but I processed a Heck of a lot of
> > files provided by clients to suggest the truth of that.
>
> I was in a bit of a rush when I wrote my prior reply, so I didn't go get
> citations. I did previously mention that the restrictions date to f77.
> Guess I'll need to drag my f77 standard out again, as you don't seem to
> believe me. The same restriction in f77 is in 12.8.1
>
>   "A control information list must not contain both a record identifier
> and an end-of-file specifier."
>
> Just for kicks, I tried the following two programs on both compilers
> that I have handy at the moment. (I was going to do 3, but discovered
> that the license for my 3rd one had expired). Note that neither of these
> are f2003 compilers. In fact, as I have regularly grumbled, I've never
> seen an f2003 compiler. Both have the same behavior. They properly won't
> even compile with the end= specifier. When I use iostat=, they both
> properly indicate an error - not an eof.
>
> I deliberately did these sticking almost completely to f77 (except that
> I just can't bring myself to type in all caps). Thus, for example, I
> hard-wired the record length to 4 instead of using inquire to find an
> appropriate length. In this case, the 4 ought to work ok even for
> compilers that measure in words; the file will just be a little longer
> than needed. Do be careful about pre-existing files named clf.dat (for
> example, if you run it with compilers that use different measure, there
> might actually be a valid record 2 for one of them). I'd have used
> status='replace' in f90 to avoid that problem.
>
> algol:~/temp> cat clf.f
>       program clf
>       integer stuff
>       open(10,file='clf.dat',form='unformatted',access='direct',recl=4)
>       read(10,rec=2,end=100) stuff
>  100  continue
>       end
> algol:~/temp> g95 clf.f
> In file clf.f:4
>
>       read(10,rec=2,end=100) stuff
>                   1
> Error: REC tag at (1) is incompatible with END tag
> algol:~/temp> gfortran clf.f
> clf.f:4.27:
>
>       read(10,rec=2,end=100) stuff
>                            1
> Error: An END tag is not allowed with a REC= specifier at (1)
> algol:~/temp> cat clf2.f
>       program clf2
>       integer stuff,iostat
>       open(10,file='clf.dat',form='unformatted',access='direct',recl=4)
>       stuff = 42
>       write(10,rec=1) stuff
>       stuff = 0
>       read(10,rec=1) stuff
>       write (*,*) 'record 1 read ok. Stuff = ', stuff
>       read(10,rec=2,iostat=iostat) stuff
>       if (iostat.eq.0) then
>         write (*,*) 'Non-existant record 2 read ok.'
>       else if (iostat.lt.0) then
>          write(*,*) 'Buggy compiler got an eof.'
>       else
>          write (*,*) 'Iostat error code = ', iostat
>       end if
>       end
> algol:~/temp> g95 clf2.f
> algol:~/temp> ./a.out
>  record 1 read ok. Stuff =  42
>  Iostat error code =  213
> algol:~/temp> gfortran clf2.f
> algol:~/temp> ./a.out
>  record 1 read ok. Stuff =           42
>  Iostat error code =         5002
>
> --
> Richard Maine                    | Good judgment comes from experience;
> email: last name at domain . net | experience comes from bad judgment.
> domain: summertriangle           |  -- Mark Twain
Hey! That's not quite the same argument!
Yes, CVF/DVF does not accept EOF= in a direct access read, even in my
compiler, but that is NOT what I was suggesting.
I was stating that you can WRITE an unformatted direct access file,
close it and reopen it and READ it as a sequential formatted file, or
the reverse, because in both cases if you try to read past the last
record you will get an error signal (one that is clarified if you use
EOF= in the sequential mode). Yes, I know the file ends should be marked
differently according to later standards, but in theory (and in my
case practice) you should not reach that point.
Since we were discussing how to find the last record of an unformatted
direct access file, I suggested a binary search. There was some doubt
as to whether the error code you got back should be treated as an
indication you did not have a written record at that record number, or
whether un-written intermediate records would give an error signal. If
you are using an IOSTAT variable in F90 etc. to check why you got an
error, you may be able to find out.
I also suggested finding the file size and calculating the number of
records from knowledge of the physical record length. Incidentally, that
division process can indicate if you DO have any extra final
characters in the file, if the result is not an integer.
tbwright (1098) | 4/6/2011 7:30:32 AM
On Apr 6, 5:51 am, nos...@see.signature (Richard Maine) wrote:
> dpb <n...@non.net> wrote:
> > On 4/5/2011 10:26 AM, Richard Maine wrote:
> > > The same restriction in f77 is in 12.8.1
>
> > >    "A control information list must not contain both a record identifier
> > > and an end-of-file specifier."
>
> > And, CVF 6.6c barfs as well...
>
> > C:\Temp> df clf.f90
> > clf.f90
> > clf.f90(4) : Error: An END= specifier is not valid in a direct access
> > READ statement.
> >        read(10,rec=2,end=100) stuff
>
> Odd. Not that CVF barfs as it should, but that Terence specifically
> cited CVF 6.6C as a compiler that didn't. I wonder whether he
> misunderstood the distinction being made between eof versus err. If he
> was using iostat and just looking for a non-zero iostat, without being
> picky about whether it was positive or negative, I would indeed expect
> that to work. In fact, as I mentioned elsewhere, I'd expect that to work
> on darn near all compilers, which would match Terence's claim. The
> standard doesn't require it, and exceptions probably exist, but I'd
> expect the exceptions to be hard to find. That would not be detecting an
> eof; it would be an error condition.
>
> I suppose it is still vaguely possible that the compiler refuses the
> end= specifier, but returns a negative iostat value. That would surely
> be a bug, both in not following the standard and also in not even being
> internally consistent, but it is at least possible. Did you try my other
> sample as well?
>
> --
> Richard Maine                    | Good judgment comes from experience;
> email: last name at domain . net | experience comes from bad judgment.
> domain: summertriangle           |  -- Mark Twain
See previous response! CVF/DVF DOES work in the modes I specified, NOT
against any standard, because Richard misunderstood the process I
wrote about. I don't myself use IOSTAT, just the ERR= and EOF= branches
if and where permitted.
tbwright (1098) | 4/6/2011 7:34:08 AM
On Apr 6, 7:46 am, dpb <n...@non.net> wrote:
> On 4/5/2011 2:51 PM, Richard Maine wrote:
> ...
>
> > I suppose it is still vaguely possible that the compiler refuses the
> > end= specifier, but returns a negative iostat value. That would surely
> > be a bug, both in not following the standard and also in not even being
> > internally consistent, but it is at least possible. Did you try my other
> > sample as well?
>
> I hadn't, but won't take but a sec...
>
> C:\Temp> clf2
>   record 1 read ok. Stuff =           42
>   Iostat error code =           36
>
> C:\Temp>
>
> Nope, it's ERR>0 so appears that by 6.6c it was fixed if it were an
> earlier "feature".
>
> Full text is--
>
> "36 severe (36): Attempt to access non-existent record
> FOR$IOS_ATTACCNON.
>
> A direct-access READ or FIND statement attempted to access beyond the
> end of a relative file (or a sequential file on disk with fixed-length
> records) or access a record that was previously deleted from a relative
> file."
>
> The documentation doesn't enumerate the proscription specifically and I
> had my eyes crossed and thought for a moment I'd found a bad example,
> but it was ERR= not END= in the example.  So, afaict, CVF 6.6c is
> conforming in this regard altho it doesn't actually document the
> restriction in the LRF (but, then again, a LRF isn't a Standards
> document of course).
>
> --
That's nice. I suspected that you could get better information from
IOSTAT=, and this would probably allow you to distinguish unwritten
direct access records from record numbers that lie beyond the last
actual record when performing a binary search for the actual number of
reserved record spaces, whether written or not. The objective was to
find the LAST record, remember!
tbwright (1098) | 4/6/2011 7:38:41 AM
Terence <tbwright@cantv.net> wrote:
> Hey! That's not quite the same argument!
> Yes, CVF/DVF does not accept EOF= in a direct access read, even in my
> compiler, but that is NOT what I was suggesting.
I figured there might be some confusion. I guess actual code helped.
That might not be what you thought you were suggesting, but that's what
it sounded like. In particular, you said
> IF you try
> to read a direct access record with a number beyond the physical file
> space assignment, you WILL get an EOF signal returned to the read
> attempt,
That's what I was disagreeing with. I suppose, on rereading it, I can
see a possible ambiguity in the meaning in that you didn't actually say
that you were talking about a direct access read. But when you talk
about reading a direct access record, that's normally what I'd assume
unless it was really explicitly clear otherwise. I suppose you had
mentioned otherwise in a prior post, but then lots of things in the
prior posts just got selectively ignored (such as pretty much everything
that actually cited the standard), so it is hard for me to guess which
parts are being assumed to be the subject and which parts are being
ignored.
For that matter
> I bet that prior to that date, the compilers were quite happy with EOF
> signals after a sequential file being read in direct access mode.
sounds just the opposite. Sure sounds to me like that was talking about
getting an EOF signal when reading in direct access mode. I still haven't
figured out how to read that otherwise, but I suppose it doesn't matter
now that I think I understand your point from this last post.
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam47 (9742) | 4/6/2011 9:09:32 AM
Louis Krupp <lkrupp_nospam@indra.com.invalid> writes:
> Interesting. Can you say more about what you're doing?
Consider a list of locations (longitude and latitude) on the surface
of a sphere, sorted in order of longitude. Suppose you need a
program to extract the set of locations that fall within some
specified distance of one particular location. For example, suppose
you want a list of locations within 1 degree of longitude 359.9 and
latitude +48. Because the file is sorted by longitude, some of your
desired locations will be at the very beginning of the file between
0 and 0.9 longitude, and the rest will be at the end of the file
between 358.9 and whatever is at the very end of the file.
Now, for the vast majority of such extraction cases, the region
will not straddle the 359.99999 to 0 boundary, so one can start
reading the data file at some predefined record (using an
accelerator table, for example), testing for the longitude limit
and once the lower limit has been reached, testing for the latitude
constraint. One then keeps reading in a sequential fashion, though
using direct access methods, until the longitude exceeds the upper
limit, at which point the read loop exits.
The easiest way (at least that I've been able to come up with) to
handle the rare cases that straddle the wrap boundary is to simply
detect when the end of file has been reached (360), reset the record
number to 1 and resume the sequential reading until the upper limit
has been exceeded. Of course, one needs to include special tests
for the cases where the "upper" limit of 0.9 is numerically less
than the "lower" limit of 358.9 (using the example above).
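A minimal sketch of the wrap-around bookkeeping described above, under
the assumption that the number of records (nrec) has been obtained some
other way instead of from an end-of-file signal. The names wrap_scan,
next_record and in_window are invented for this illustration and are
not taken from the actual program.

```fortran
module wrap_scan
  implicit none
contains
  ! Next record number in a circular scan: record nrec wraps back to 1.
  ! nrec (the total record count) is assumed to be known in advance.
  pure function next_record(irec, nrec) result(nxt)
    integer, intent(in) :: irec, nrec
    integer :: nxt
    nxt = mod(irec, nrec) + 1
  end function next_record

  ! True if longitude lon (degrees, 0 <= lon < 360) lies in the window
  ! from lo to hi. When lo > hi the window straddles the 360 -> 0
  ! boundary, e.g. lo = 358.9 and hi = 0.9.
  pure function in_window(lon, lo, hi) result(inside)
    real, intent(in) :: lon, lo, hi
    logical :: inside
    if (lo <= hi) then
      inside = (lon >= lo .and. lon <= hi)
    else
      inside = (lon >= lo .or. lon <= hi)
    end if
  end function in_window
end module wrap_scan
```

The read loop would then call next_record to advance the REC= value and
stop as soon as in_window returns false for the longitude just read.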
> For a practical and portable solution, could you add an EOF flag to your
> record and whenever you add records write a sentinel record -- one with
> EOF set and possibly nothing else -- when you're done? When you read
> the file, you could test the flag instead of the end-of-file condition
> you were hoping for, and then use the record index of the sentinel as
> the place to start when you add more records. (You'd have to initialize
> the file by writing a sentinel record.)
The case I'm dealing with involves a data file provided by someone
else, and I'm not enthusiastic about mucking around with 80 GB
worth of data files.
tholen (16649) | 4/7/2011 10:47:56 PM
On 4/7/2011 4:47 PM, tholen@antispam.ham wrote:
> Louis Krupp<lkrupp_nospam@indra.com.invalid> writes:
>
>> Interesting. Can you say more about what you're doing?
>
> Consider a list of locations (longitude and latitude) on the surface
> of a sphere, sorted in order of longitude. Suppose you need a
> program to extract the set of locations that fall within some
> specified distance of one particular location. For example, suppose
> you want a list of locations within 1 degree of longitude 359.9 and
> latitude +48. Because the file is sorted by longitude, some of your
> desired locations will be at the very beginning of the file between
> 0 and 0.9 longitude, and the rest will be at the end of the file
> between 358.9 and whatever is at the very end of the file.
>
> Now, for the vast majority of such extraction cases, the region
> will not straddle the 359.99999 to 0 boundary, so one can start
> reading the data file at some predefined record (using an
> accelerator table, for example), testing for the longitude limit
> and once the lower limit has been reached, testing for the latitude
> constraint. One then keeps reading in a sequential fashion, though
> using direct access methods, until the longitude exceeds the upper
> limit, at which point the read loop exits.
>
> The easiest way (at least that I've been able to come up with) to
> handle the rare cases that straddle the wrap boundary is to simply
> detect when the end of file has been reached (360), reset the record
> number to 1 and resume the sequential reading until the upper limit
> has been exceeded. Of course, one needs to include specials tests
> for the cases where the "upper" limit of 0.9 is numerically less
> than the "lower" limit of 358.9 (using the example above).
>
>> For a practical and portable solution, could you add an EOF flag to your
>> record and whenever you add records write a sentinel record -- one with
>> EOF set and possibly nothing else -- when you're done? When you read
>> the file, you could test the flag instead of the end-of-file condition
>> you were hoping for, and then use the record index of the sentinel as
>> the place to start when you add more records. (You'd have to initialize
>> the file by writing a sentinel record.)
>
> The case I'm dealing with involves a data file provided by someone
> else, and I'm not enthusiastic about mucking around with 80 GB
> worth of data files.
>
The fact that the data file comes from elsewhere is as important as the
fact that you can't hit EOF reading from a direct file. You don't waste
your time asking the language to do something it's not going to, and
Glen and I don't waste your time with clever ideas on how to decorate a
data file that isn't yours to, as you put it, muck with.
If the file is dense -- i.e., no deleted or unwritten records followed
by written records -- you might be able to assume (or at least hope)
that a READ that returns an error just hit EOF. IOSTAT values might or
might not help you.
Do you (or could you) create that accelerator table yourself? You'd
have to read the file to create that table; could you count the records
in the file while you do that? You're going to have to keep track of
which record you're on to build the table anyway.
(It occurs to me that you may have figured all that out a long time ago,
but you're looking for a cleaner solution in the form of EOF detection.)
Louis
lkrupp_nospam1 (64) | 4/8/2011 8:57:36 AM
Louis Krupp<lkrupp_nospam@indra.com.invalid> writes:
> The fact that the data file comes from elsewhere is as important as the
> fact that you can't hit EOF reading from a direct file.
But you can try to read a record number that is higher than the highest
record number in the file. Detecting that condition is what I would
like to do in a portable way. The fact that it can't be done suggests
a deficiency in the language that perhaps the standards committee
should address. I don't know what sort of prior discussions have
taken place regarding possibilities like forcing the use of IOSTAT = -1
for end-of-file (as opposed to simply any negative integer) or
IOSTAT = -2 for reading a non-existent record in a direct access file.
It's easy to imagine that it has been discussed, but compiler vendors
didn't want to be forced to change the behavior of their product (they
might already use -2 for something else), which could cause problems
with customers who have written code to use certain IOSTAT values.
> You don't waste
> your time asking the language to do something it's not going to,
But it wasn't a waste of time to ask whether it's possible to do it.
Apparently it can't be done in a portable way now, but that doesn't
necessarily mean that the standard won't ever be changed to do it.
> and
> Glen and I don't waste your time with clever ideas on how to decorate a
> data file that isn't yours to, as you put it, muck with.
Oh, I could muck with it. I don't want to, however. Too easy to
screw things up, and 80 GB worth of data is too much to screw up.
> If the file is dense -- i.e., no deleted or unwritten records followed
> by written records -- you might be able to assume (or at least hope)
> that a READ that returns an error just hit EOF.
The data file is not just dense, but full. There are zero unwritten
records.
> IOSTAT values might or might not help you.
As I noted in my original post, I am currently using IOSTAT = 36 to
detect the condition, but that is only guaranteed to work with
CVF 6.6c and isn't portable. I'd prefer something portable. Or
is writing portable code one of those overrated ideals?
> Do you (or could you) create that accelerator table yourself?
The accelerator table was provided with the data file. I did not
create it myself.
> You'd
> have to read the file to create that table; could you count the records
> in the file while you do that? You're going to have to keep track of
> which record you're on to build the table anyway.
I suppose I could create yet another data file that contains the
known number of records, using brute force methods to determine
that number (like dividing the file size in bytes by the known number
of bytes per record). That way I wouldn't have to muck around with
the original data file or its accelerator table file, and I wouldn't
need to rely on an IOSTAT value, so it's looking like a portable way
to reach the goal, though not exactly my idea of elegant.
> (It occurs to me that you may have figured all that out a long time ago,
> but you're looking for a cleaner solution in the form of EOF detection.)
Or what I called EOF detection. Now that I've learned about keyed
files and how the concept of EOF isn't the same as what I was thinking
of, I need to call it something else. How about Beyond the upper limit
detection: BUL?
tholen (16649) | 4/8/2011 3:07:46 PM
In article <inn8c2$9th$1@speranza.aioe.org>, tholen@antispam.ham
wrote:
> The data file is not just dense, but full. There are zero unwritten
> records.
This simplifies things as far as portable detection.
Open a separate file with the same characteristics as your actual
file, write record 1, read record 2, and look at the resulting
IOSTAT value. That is the value that you need to detect in your
actual file. You will get different values on
different compilers (and also perhaps different file systems etc.),
but it should be a portable way to detect the error of reading a
record that has not been created.
This might not be portable to sparse files. A vendor might return
different IOSTAT values for the different types of missing records
that it might be able to detect. That is currently allowed, but not
required, by the fortran standard.
> As I noted in my original post, I am currently using IOSTAT = 36 to
> detect the condition, but that is only guaranteed to work with
> CVF 6.6c and isn't portable. I'd prefer something portable. Or
> is writing portable code one of those overrated ideals?
It should be a variable determined at runtime, not a constant.
Otherwise, what you are doing should work for dense files.
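For what it's worth, the runtime probe described above might look
something like the following. This is only a sketch: the module and
function names are made up, it uses a scratch file and the NEWUNIT=
specifier (f2008) for convenience, and nothing in the standard promises
that the value it returns is unique to the missing-record case.

```fortran
module probe_iostat
  implicit none
contains
  ! Return the IOSTAT value that this compiler and runtime produce when a
  ! direct-access READ asks for a record beyond the last one written.
  ! rl is the record length, in whatever units RECL= uses here.
  function missing_record_iostat(rl) result(ios)
    integer, intent(in) :: rl
    integer :: ios, u
    character(len=1) :: c
    c = 'x'
    open(newunit=u, status='scratch', form='unformatted', &
         access='direct', recl=rl)
    write(u, rec=1) c              ! the file now holds exactly one record
    read(u, rec=2, iostat=ios) c   ! record 2 was never written
    close(u)
  end function missing_record_iostat
end module probe_iostat
```

The returned value would be stored in a variable once at startup and
compared against the IOSTAT from reads of the real file. The clf2 runs
quoted earlier in this thread reported 213 (g95), 5002 (gfortran) and
36 (CVF 6.6c) for the same situation.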
[...]
> I suppose I could create yet another data file that contains the
> known number of records, using brute force methods to determine
> that number (like dividing the file size in bytes by the known number
> of bytes per record).
If you want to do this in a portable way, you also need to do some
stuff at runtime. You can use the INQUIRE statement to determine
the file size and you can use the INQUIRE statement to determine the
size of a record. The units of those results are machine dependent
(maybe bytes, maybe words, maybe other possibilities), but I think
they are required to be the same. Whatever they are, once you have
them you can divide one by the other to determine the number of
records.
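A rough sketch of that division, assuming the compiler supports the
f2003 SIZE= specifier in INQUIRE and that the caller supplies the record
length in the same units SIZE= reports (file storage units, usually
bytes). Some compilers measure RECL= for unformatted files in words by
default, so that assumption is worth checking. The names record_count
and count_records are hypothetical.

```fortran
module record_count
  use iso_fortran_env, only: int64
  implicit none
contains
  ! Number of whole records in a file, from its size and record length.
  ! leftover receives any trailing storage units that do not form a
  ! complete record; nrec is -1 if the size cannot be determined.
  subroutine count_records(fname, reclen, nrec, leftover)
    character(len=*), intent(in) :: fname
    integer(int64), intent(in)   :: reclen   ! same units as SIZE=
    integer(int64), intent(out)  :: nrec, leftover
    integer(int64) :: fsize
    inquire(file=fname, size=fsize)          ! -1 if the size is unknown
    if (fsize < 0 .or. reclen <= 0) then
      nrec = -1
      leftover = 0
    else
      nrec = fsize / reclen
      leftover = mod(fsize, reclen)
    end if
  end subroutine count_records
end module record_count
```

A 64-bit integer kind is used because an 80 GB file overflows a default
32-bit integer. A nonzero leftover would indicate extra trailing
characters or a wrong guess at the record length.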
The filesize thing is a recent addition to the standard (f2003, I
think). So if it is not supported in a particular compiler, then
you have to do a little extra work. You can use the above IOSTAT
value to test whether a particular record exists. With that you
could either search sequentially through the file or, better for an
80GB file, you can do a binary search for the end of the file. You
start off testing records 1, 2, 4, 8, 16... and so on until you get
an error. Then you do a binary search between the largest record
that exists and the smallest record that does not exist to locate the
last record. This effort scales as log(nrec) rather than nrec, so
it is better for large searches. But, if your compiler supports the
filesize enquiry, that is the best option.
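The doubling-then-bisection search could be sketched roughly as below.
It assumes a dense file already connected for unformatted direct access
on unit u, and it treats any nonzero IOSTAT as "record does not exist",
which the standard does not guarantee. The names last_record,
record_exists and find_last_record are made up for the example.

```fortran
module last_record
  use iso_fortran_env, only: int64
  implicit none
contains
  ! True if record irec of unit u can be read without error.
  ! Only the first character of the record is transferred.
  function record_exists(u, irec) result(ok)
    integer, intent(in) :: u
    integer(int64), intent(in) :: irec
    logical :: ok
    character(len=1) :: c
    integer :: ios
    read(u, rec=irec, iostat=ios) c
    ok = (ios == 0)
  end function record_exists

  ! Highest readable record number on unit u, or 0 if record 1 is missing.
  function find_last_record(u) result(lo)
    integer, intent(in) :: u
    integer(int64) :: lo, hi, mid
    lo = 0
    if (.not. record_exists(u, 1_int64)) return
    lo = 1
    hi = 2
    do while (record_exists(u, hi))   ! probe records 2, 4, 8, 16, ...
      lo = hi
      hi = 2 * hi
    end do
    ! Here record lo exists and record hi does not; bisect between them.
    do while (hi - lo > 1)
      mid = lo + (hi - lo) / 2
      if (record_exists(u, mid)) then
        lo = mid
      else
        hi = mid
      end if
    end do
  end function find_last_record
end module last_record
```

For a file with a few hundred million records this amounts to a few
dozen reads in each phase, rather than one read per record.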
Another possibility if filesize is not supported is to use some
machine-specific subroutine call, perhaps using C interop to access
the posix function. That would be portable among posix operating
systems, but not portable in general.
> Or what I called EOF detection. Now that I've learned about keyed
> files and how the concept of EOF isn't the same as what I was thinking
> of, I need to call it something else. How about Beyond the upper limit
> detection: BUL?
Currently, you can test whether the record exists or it doesn't.
For your file, dense with no missing records, that is sufficient.
If you want IOSTAT values to be standardized, I don't know how much
progress you can make with the vendors. This all depends on file
system characteristics, they are different for local and network
files, they differ depending on buffering of records, and lots of
other details. If something like this is hardwired into the fortran
standard, that means that fortran may not be portable to the next
hot file system that is developed. That's what the vendors, and
also the users to some extent, have to worry about.
$.02 -Ron Shepard
ron-shepard (1197) | 4/8/2011 4:22:34 PM
Ron Shepard <ron-shepard@NOSPAM.comcast.net> wrote:
> In article <inn8c2$9th$1@speranza.aioe.org>, tholen@antispam.ham
> wrote:
>
> > The data file is not just dense, but full. There are zero unwritten
> > records.
>
> This simplifies things as far as portable detection.
Ron then gives several suggestions which are very likely (even almost
certain) to work well. None of them are actually guaranteed by the
standard, but their odds in practice are high.
Another possibility is stream access. It is an f2003 feature (although
one implemented on most current compilers), so that does limit its
portability. In practice, the direct access tricks almost surely work on
more compilers, but stream access fits much better in principle. This is
much closer to the kind of thing stream was designed for.
Although I wasn't there during the design (heck, direct access was
introduced to the standard in f77 when I was still a young whelp) it
seems clear to me that direct access was not really designed as a way to
do transparent access to arbitrary files. The usual implementations
happen to be done in a way that allows that, but the standard doesn't
really support that kind of use, and I don't think it was intended to.
It has always looked to me, from a standards perspective, as though
Fortran direct access was designed as a simplified form of keyed access.
The keys are always integer and the records are fixed length. With those
two simplifications, it turns out to have an obvious and simple
implementation on most systems. I'd speculate that some people might
have wanted keyed access, but what we got was a compromise that was
simple enough to pass, while providing at least a base to build on. I
suppose my reverse crystal ball might be wrong, but that's the image I
see in it. Some people have obviously gotten used to the particular
implementation to the extent that they think of it instead of the specs
in the standard as being what Fortran direct access means. That's
apparent whenever anyone asks questions relating to end-of-file.
Stream access, on the other hand, specifically had a design goal of
reading files from other sources, including for example, files specified
by non-Fortran means. It was introduced under the general umbrella of C
interop, but has wider application than that. Stream files do have a
concept of end-of-file, and it does match the concept mentioned
elsethread. Direct access doesn't involve an end-of-file concept at all.
For sequential access, an end-of-file condition involves reading a
specific record called an endfile record. For stream files, though, an
end-of-file condition occurs "When an attempt is made to read beyond the
end of a stream file."
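As an illustration only, reading fixed-length records through stream
access might look like the sketch below. It assumes the file was opened
with access='stream' and form='unformatted', that it contains no record
markers, and that one file storage unit holds one default character
(true of the usual byte-oriented systems, but not guaranteed). The names
stream_probe and read_record_stream are invented; is_iostat_end is the
f2003 intrinsic.

```fortran
module stream_probe
  use iso_fortran_env, only: int64
  implicit none
contains
  ! Read fixed-length record irec (1-based) from unit u via stream access.
  ! len(buf) is taken as the record length. at_end is set when the read
  ! runs past the end of the file; ios carries the raw IOSTAT value.
  subroutine read_record_stream(u, irec, buf, at_end, ios)
    integer, intent(in) :: u
    integer(int64), intent(in) :: irec
    character(len=*), intent(out) :: buf
    logical, intent(out) :: at_end
    integer, intent(out) :: ios
    integer(int64) :: p
    p = (irec - 1) * len(buf, kind=int64) + 1   ! position of first unit
    read(u, pos=p, iostat=ios) buf
    at_end = is_iostat_end(ios)
  end subroutine read_record_stream
end module stream_probe
```

With this approach the end-of-file case can be told apart from other
read errors without knowing any compiler-specific IOSTAT number.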
I'll repeat that I agree it sounds like direct access is the most
portable way in practice today. But in terms of mentioning design
defects in the language and suggesting future "fixes" for direct access,
I think most of those "fixes" are in the wrong direction in that they
don't really fit with the apparent design goals of direct access.
Extending direct access to handle variable length records or non-integer
keys seems like a more sensible direction to me. I'm not particularly
pushing for that kind of extension at the moment, but it makes more
sense to me as a direction. We don't need to try to mold direct access
into something that assumes implementation details in order to become
more like stream; we have stream for that.
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam47 (9742) | 4/8/2011 4:58:09 PM
tholen@antispam.ham wrote:
> Louis Krupp<lkrupp_nospam@indra.com.invalid> writes:
>> The fact that the data file comes from elsewhere is as important as the
>> fact that you can't hit EOF reading from a direct file.
> But you can try to read a record number that is higher than the highest
> record number in the file. Detecting that condition is what I would
> like to do in a portable way. The fact that it can't be done suggests
> a deficiency in the language that perhaps the standards committee
> should address.
The standards people are pretty good at looking at what the hardware
can do and writing the standard around that. Often one hardware
implementation is enough to change what might otherwise have gone
into the standard. Note, for example, that Fortran (and C) both
still allow for radix complement, digit complement, and sign
magnitude integer representations. It is said
that there are still ones' complement machines in production,
but sign magnitude is pretty rare.
If there are implementations where the hardware doesn't report
EOF at the end of a direct-access file, then it is unlikely that
the standard will change. Since programs have been written for
years that didn't need it, there likely isn't much demand for it.
> I don't know what sort of prior discussions have
> taken place regarding possibilities like forcing the use of IOSTAT = -1
> for end-of-file (as opposed to simply any negative integer) or
> IOSTAT = -2 for reading a non-existent record in a direct access file.
> It's easy to imagine that it has been discussed, but compiler vendors
> didn't want to be forced to change the behavior of their product (they
> might already use -2 for something else), which could cause problems
> with customers who have written code to use certain IOSTAT values.
Well, that might make a lot of current implementations non-conforming,
which seems unlikely.
>> You don't waste
>> your time asking the language to do something it's not going to,
> But it wasn't a waste of time to ask whether it's possible to do it.
> Apparently it can't be done in a portable way now, but that doesn't
> necessarily mean that the standard won't ever be changed to do it.
If it is changed, you won't see compilers for about 10 years.
>> and
>> Glen and I don't waste your time with clever ideas on how to decorate a
>> data file that isn't yours to, as you put it, muck with.
> Oh, I could muck with it. I don't want to, however. Too easy to
> screw things up, and 80 GB worth of data is too much to screw up.
As I wrote a few times, the usual choice is a new record at the
beginning indicating the number of records. As big disks are
common and affordable now, it isn't that hard to rewrite the file
with one new record. (It will take less than 10 years.)
(snip)
-- glen
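
The header-record convention glen describes (record 1 holds the number of data records) could be sketched roughly as follows. This is only an illustration under stated assumptions: the file name, unit number, and record layout are invented for the demo, and it presumes you are free to write the file yourself, which is not the case for the original poster's existing data.

```fortran
program header_record_demo
   implicit none
   ! Assumptions for illustration only: file name, unit number, record layout.
   character(len=*), parameter :: fname = 'header_demo.dat'
   integer, parameter :: u = 16
   integer, parameter :: ndata = 5        ! number of data records written
   integer :: reclen, nrec, i, irec, ios
   real    :: vals(4)

   ! Record length for one data record, in the processor's RECL units.
   inquire(iolength=reclen) vals

   ! Write the file: record 1 is the header, data occupy records 2..ndata+1.
   open(unit=u, file=fname, access='direct', form='unformatted', &
        recl=reclen, status='replace', iostat=ios)
   if (ios /= 0) stop 'cannot create demo file'
   write(u, rec=1) ndata
   do i = 1, ndata
      vals = real(i) * [1.0, 2.0, 3.0, 4.0]
      write(u, rec=i+1) vals
   end do
   close(u)

   ! Read it back: fetch the count first, then never request a record beyond it.
   open(unit=u, file=fname, access='direct', form='unformatted', &
        recl=reclen, status='old', action='read', iostat=ios)
   if (ios /= 0) stop 'cannot reopen demo file'
   read(u, rec=1) nrec
   do i = 1, 8                            ! 8 reads over 5 records: wraps around
      irec = mod(i-1, nrec) + 1           ! logical record number 1..nrec
      read(u, rec=irec+1) vals            ! +1 skips the header record
      write(*,*) 'logical record', irec, ' first value', vals(1)
   end do
   close(u)
end program header_record_demo
```

Because the program knows the record count before it reads any data, it never depends on what IOSTAT value a particular processor returns for a missing record.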
gah (12241) | 4/8/2011 5:45:14 PM

Richard Maine <nospam@see.signature> wrote:
(snip)
> Although I wasn't there during the design (heck, direct access was
> introduced to the standard in f77 when I was still a young whelp) it
> seems clear to me that direct access was not really designed as a way to
> do transparent access to arbitrary files. The usual implementations
> happen to be done in a way that allows that, but the standard doesn't
> really support that kind of use, and I don't think it was intended to.
I don't know so much how it got into the standard. OS/360 Fortran
supports it using BDAM, which is the OS/360 support for direct
access by record number. OS/360 ISAM supports keyed access, I
believe originally for COBOL. (I still have never written a COBOL
program.)
> It has always looked to me, from a standards perspective, as though
> Fortran direct access was designed as a simplified form of keyed access.
> The keys are always integer and the records are fixed length. With those
> two simplifications, it turns out to have an obvious and simple
> implementation on most systems.
Well, PL/I has both record number and keyed access, presumably
derived from COBOL.
> I'd speculate that some people might
> have wanted keyed access, but what we got was a compromise that was
> simple enough to pass, while providing at least a base to build on. I
> suppose my reverse crystal ball might be wrong, but that's the image I
> see in it. Some people have obviously gotten used to the particular
> implementation to the extent that they think of it instead of the specs
> in the standard as being what Fortran direct access means. That's
> apparent whenever anyone asks questions relating to end-of-file.
OS/360 Fortran had direct access, HP BASIC had it before 1972,
(with records only up to 256 bytes), I believe VAX Fortran had it
before 1977.
For OS/360, you put the number of records on the DEFINE FILE statement
as a constant. I suppose one could still ask for EOF detection,
but at that point one is supposed to know it. I presume this
was one of the implementations used to guide its addition
to Fortran 77.
> Stream access, on the other hand, specifically had a design goal of
> reading files from other sources, including for example, files specified
> by non-Fortran means. It was introduced under the general umbrella of C
> interop, but has wider application than that. Stream files do have a
> concept of end-of-file, and it does match the concept mentioned
> elsethread. Direct access doesn't involve an end-of-file concept at all;
> the sequential access concept of an end-of-file condition involves reading a
> specific record called an endfile record, but for stream files, an
> end-of-file condition occurs "When an attempt is made to read beyond the
> end of a stream file."
Well, it seems to me that stream access models the Unix/C
file access conventions.
> I'll repeat that I agree it sounds like direct access is the most
> portable way in practice today. But in terms of mentioning design
> defects in the language and suggesting future "fixes" for direct access,
> I think most of those "fixes" are in the wrong direction in that they
> don't really fit with the apparent design goals of direct access.
> Extending direct access to handle variable length records or non-integer
> keys seems like a more sensible direction to me. I'm not particularly
> pushing for that kind of extension at the moment, but it makes more
> sense to me as a direction. We don't need to try to mold direct access
> into something that assumes implementation details in order to become
> more like stream; we have stream for that.
-- glen
gah (12241) | 4/8/2011 6:19:17 PM

glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
> Well, it seems to me that stream access models the Unix/C
> file access conventions.
which sounds like an exact match for what the OP was expecting. Thus,
why I mentioned it.
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
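
To make the stream-access point concrete, here is a minimal sketch of detecting the end-of-file condition with the f2003 intrinsic IS_IOSTAT_END while keeping other read errors separate. The file name, unit number, and record layout are assumptions chosen for the demo; it also assumes the file has no per-record overhead, which is typical for direct-access files but not guaranteed by the standard.

```fortran
program stream_eof_demo
   implicit none
   ! Assumptions for illustration only: file name, unit number, record layout.
   character(len=*), parameter :: fname = 'stream_demo.dat'
   integer, parameter :: u = 16, nwritten = 3
   integer :: i, ios, nread
   real    :: rec(4)

   ! Create a small file of fixed-size "records" using stream access.
   open(unit=u, file=fname, access='stream', form='unformatted', &
        status='replace', action='write', iostat=ios)
   if (ios /= 0) stop 'cannot create demo file'
   do i = 1, nwritten
      rec = real(i)
      write(u) rec
   end do
   close(u)

   ! Read until the end-of-file condition is raised.
   open(unit=u, file=fname, access='stream', form='unformatted', &
        status='old', action='read', iostat=ios)
   if (ios /= 0) stop 'cannot reopen demo file'
   nread = 0
   do
      read(u, iostat=ios) rec
      if (is_iostat_end(ios)) exit        ! end of file: take the special action here
      if (ios /= 0) stop 'read error other than end-of-file'
      nread = nread + 1
   end do
   close(u)
   write(*,*) 'records read before end-of-file:', nread
end program stream_eof_demo
```

Run as written, this should report three records. Random access to a given record would use the POS= specifier on the READ, with the position computed from the record number and the record size in file storage units.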
nospam47 (9742) | 4/8/2011 6:27:30 PM

On Apr 3, 12:58 am, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> tho...@antispam.ham wrote:
>
> (snip)
>
> > The point that I tried to make is that the file has an end
> > associated with it quite independently of any program that
> > might access the file. Therefore the end of the file is a
> > property that shouldn't suddenly disappear because a program
> > opened that file for direct access.
>
> While on many systems one can read the same file as either
> sequential or direct access, and that is a favorite way to
> get around some other problems, that isn't required.
>
> One should assume that a file is one or the other, and not both.
>
> (snip)
>
> > No, I'm talking about the file independently of any access
> > method. It has an end. Once a program opens that file for
> > direct access, why should the known end of the file suddenly
> > become unknown? It's like that old National Lampoon album
> > about Watergate: "What did the President know, and when did
> > he stop knowing it?"
>
> (snip, someone wrote)
>
> >> I very much doubt that. Let's say that the file is 2Gb in size.
> >> Do you really think that the OS is going to do a linear search
> >> of that file for every record access?
> > I said nothing about accessing every record. I said that records
> > could be accessed sequentially. Suppose I need to read 100
> > consecutive records starting at record 2000000. I'd use something
> > like:
> > RecNum = 2000000
> > DO I=1,100,1
> >   READ (UnitNum,REC=RecNum) Record
> >   RecNum = RecNum + 1
> > END DO
>
> As I said before, a common solution is to put special data
> in the first record, such as the number of records in the file.
> Note that all records normally have the same length, but that
> doesn't mean that they must have the same data fields.
> (And you don't have to read every bit, either.)
>
> > But what if there are only 2000050 records in the file? When you
> > try to read record 2000051, you have an error condition. What I
> > would like to see is a portable way to distinguish an end-of-file
> > condition from an attempt to read character data with an integer
> > format edit descriptor (assuming formatted access). I have an
> > application where the remaining 50 records that need to be read
> > would be the ones starting at record number 1 (think of a circular
> > data file), but I need to know when I reached the end of the file
> > so that I can start over at the beginning of it.
>
> Put the number in the first record. Read it first, then use it
> when reading to know when you are at the end. Portable to all
> implementations, unless your record length is too small to hold
> the number.
>
> (snip)
>
> >>> It should be able to tell you that the record is beyond the end of
> >>> the file, based on the size of the file on the disk.
> >> But there isn't an end.
> > Of course there is. How else is the operating system going to
> > know that there is additional free space on the disk that can
> > be used for other files? Create a file that doesn't have any
> > end to it, and that file is infinitely big, therefore consuming
> > all available disk space.
>
> As Richard said, that isn't necessary with a keyed implementation.
>
> Note that the AWK array assignment statement:
>
>     a[1000000000000000000000000000000000000000000000000000000]=3;
>
> works just fine even on computers with small memories.
> (I just tried it to be sure. Interestingly, the subscript
> is actually 1e+54.)
>
> >>>> If you supply the number of a record that does not exist,
> >>>> and you attempt to write that record, then the record is written.
> >>>> It doesn't matter whether the record number is higher than
> >>>> number of any record already in the file.
> >>> Not particularly relevant to the case at hand.
>
> It also would work with keyed access records.
>
> And even if the standard was changed in 2013, you might not see
> it in compilers until 2020 or so.
>
> -- glen
I recently had what I believe was a similar problem: I had to work with
files that lacked the self-describing elements that I include in any file
I create myself, as others have recommended. In the past I have found that
most compiler/OS combinations had some convention or extension to get
around this problem. But I thought that, given that I know the record size
and that it is a typical data file, I could get its size by opening it as
a stream file and positioning to the end on the open (avoiding having to
read the entire file, assuming it is a common "disk-resident" file), using
f2003 standard calls. It worked everywhere I tried it. A simplified example
follows. Is the method I used to get the size of the file a portable
solution? If so, I think it fits the original poster's requirements. Please
ignore the assumption that a REAL is 4 bytes in the example for now; that
is just to keep the example simple.
program size_of_file
   implicit none
   !------------------------------------
   integer,parameter :: ilines=5   ! number of lines in file
   ! the file name is an illustrative choice; the original post opened
   ! unit 16 without FILE= and relied on the processor's default name
   character(len=*),parameter :: fname='size_of_file.dat'
   integer :: ioplen, ioparc, isize, irecs, i10, i20, i20m
   real    :: r10, r1, r2, r3, r4
   !------------------------------------
   ioplen=4*4   ! the size of the record required can vary.
   ! commonly you know this size when working with direct access files.
   ! for this simple example it is assumed you know this value.
   ! INQUIRE, STORAGE_SIZE(), C_SIZEOF(), ... can handle making that more generic.
   !------------------------------------
   ! create small direct-access file and close it to simulate having an existing file to read
   open(unit=16,file=fname,iostat=ioparc,access='direct',form='unformatted',recl=ioplen,status='replace')
   do i10=1,ilines
      r10=real(i10)
      write(16,rec=i10) r10,sin(r10),cos(r10),sqrt(r10)
   enddo
   write(*,*)' created small direct-access file with number of records=',ilines
   close(16)
   !------------------------------------
   ! find size of file by positioning to end of file and, assuming record size is known,
   ! calculate number of records in file
   open(unit=16,file=fname,iostat=ioparc,access='stream',form='unformatted',position='append')
   inquire(16,pos=isize)   ! position of the next byte to be written, i.e. file size + 1
   irecs=(isize-1)/ioplen
   write(*,*)' size of file in bytes is ',isize-1
   write(*,*)' assuming size of record is ',ioplen
   write(*,*)' number of records in file is ',irecs
   close(16)
   !------------------------------------
   ! reopen for direct access and read 30 records, wrapping around at the end
   open(unit=16,file=fname,iostat=ioparc,access='direct',form='unformatted',recl=ioplen,status='old')
   do i20=1,30
      i20m=mod(i20-1,irecs)+1
      read(16,rec=i20m) r1,r2,r3,r4
      write(*,*)' count=',i20,' record read is ',i20m,' and values are ',r1,r2,r3,r4
   enddo
   close(16)
   !------------------------------------
end program size_of_file
! ===============================================================================
! created small direct-access file with number of records= 5
! size of file in bytes is 80
! assuming size of record is 16
! number of records in file is 5
! count= 1 record read is 1 and values are 1. 0.841471 0.5403023 1.
! count= 2 record read is 2 and values are 2. 0.9092974 -0.4161468 1.4142135
! count= 3 record read is 3 and values are 3. 0.14112 -0.9899925 1.7320508
! count= 4 record read is 4 and values are 4. -0.7568025 -0.6536436 2.
! count= 5 record read is 5 and values are 5. -0.9589243 0.28366217 2.236068
! count= 6 record read is 1 and values are 1. 0.841471 0.5403023 1.
! count= 7 record read is 2 and values are 2. 0.9092974 -0.4161468 1.4142135
! count= 8 record read is 3 and values are 3. 0.14112 -0.9899925 1.7320508
! count= 9 record read is 4 and values are 4. -0.7568025 -0.6536436 2.
! count= 10 record read is 5 and values are 5. -0.9589243 0.28366217 2.236068
! count= 11 record read is 1 and values are 1. 0.841471 0.5403023 1.
! count= 12 record read is 2 and values are 2. 0.9092974 -0.4161468 1.4142135
! count= 13 record read is 3 and values are 3. 0.14112 -0.9899925 1.7320508
! count= 14 record read is 4 and values are 4. -0.7568025 -0.6536436 2.
! count= 15 record read is 5 and values are 5. -0.9589243 0.28366217 2.236068
! count= 16 record read is 1 and values are 1. 0.841471 0.5403023 1.
! count= 17 record read is 2 and values are 2. 0.9092974 -0.4161468 1.4142135
! count= 18 record read is 3 and values are 3. 0.14112 -0.9899925 1.7320508
! count= 19 record read is 4 and values are 4. -0.7568025 -0.6536436 2.
! count= 20 record read is 5 and values are 5. -0.9589243 0.28366217 2.236068
! count= 21 record read is 1 and values are 1. 0.841471 0.5403023 1.
! count= 22 record read is 2 and values are 2. 0.9092974 -0.4161468 1.4142135
! count= 23 record read is 3 and values are 3. 0.14112 -0.9899925 1.7320508
! count= 24 record read is 4 and values are 4. -0.7568025 -0.6536436 2.
! count= 25 record read is 5 and values are 5. -0.9589243 0.28366217 2.236068
! count= 26 record read is 1 and values are 1. 0.841471 0.5403023 1.
! count= 27 record read is 2 and values are 2. 0.9092974 -0.4161468 1.4142135
! count= 28 record read is 3 and values are 3. 0.14112 -0.9899925 1.7320508
! count= 29 record read is 4 and values are 4. -0.7568025 -0.6536436 2.
! count= 30 record read is 5 and values are 5. -0.9589243 0.28366217 2.236068
urbanjost (37) | 4/9/2011 5:16:01 AM

On Apr 9, 2:22 am, Ron Shepard <ron-shep...@NOSPAM.comcast.net> wrote:
> In article <inn8c2$9t...@speranza.aioe.org>, tho...@antispam.ham
> wrote:
>
> > The data file is not just dense, but full. There are zero unwritten
> > records.
>
> This simplifies things as far as portable detection.
>
> Open a separate file with the same characteristics as your actual
> file, write record 1, read record 2, and look at the resulting
> IOSTAT value. That is the value that you need to detect in your
> actual file. You will get different values of this value on
> different compilers (and also perhaps different file systems etc.),
> but it should be a portable way to detect the error of reading a
> record that has not been created.
>
> This might not be portable to sparse files. A vendor might return
> different IOSTAT values for the different types of missing records
> that it might be able to detect. That is currently allowed, but not
> required, by the fortran standard.
>
> > As I noted in my original post, I am currently using IOSTAT = 36 to
> > detect the condition, but that is only guaranteed to work with
> > CVF 6.6c and isn't portable. I'd prefer something portable. Or
> > is writing portable code one of those overrated ideals?
>
> It should be a variable determined at runtime, not a constant.
> Otherwise, what you are doing should work for dense files.
>
> [...]
>
> > I suppose I could create yet another data file that contains the
> > known number of records, using brute force methods to determine
> > that number (like dividing the file size in bytes by the known number
> > of bytes per record).
>
> If you want to do this in a portable way, you also need to do some
> stuff at runtime. You can use the INQUIRE statement to determine
> the file size and you can use the INQUIRE statement to determine the
> size of a record. The units of those results are machine dependent
> (maybe bytes, maybe words, maybe other possibilities), but I think
> they are required to be the same. Whatever they are, once you have
> them you can divide one by the other to determine the number of
> records.
>
> The filesize thing is a recent addition to the standard (f2003, I
> think). So if it is not supported in a particular compiler, then
> you have to do a little extra work. You can use the above IOSTAT
> value to test whether a particular record exists. With that you
> could either search sequentially through the file or, better for an
> 80GB file, you can do a binary search for the end of the file. You
> start off testing records 1, 2, 4, 8, 16... and so on until you get
> an error. Then you do a binary search between the largest record
> that exists to the smallest record that does not exist to locate the
> last record. This effort scales as log(nrec) rather than (nrec), so
> it is better for large searches. But, if your compiler supports the
> filesize enquiry, that is the best option.
>
> Another possibility if filesize is not supported is to use some
> machine-specific subroutine call, perhaps using C interop to access
> the posix function. That would be portable among posix operating
> systems, but not portable in general.
>
> > Or what I called EOF detection. Now that I've learned about keyed
> > files and how the concept of EOF isn't the same as what I was thinking
> > of, I need to call it something else. How about Beyond the upper limit
> > detection: BUL?
>
> Currently, you can test whether the record exists or it doesn't.
> For your file, dense with no missing records, that is sufficient.
> If you want IOSTAT values to be standardized, I don't know how much
> progress you can make with the vendors. This all depends on file
> system characteristics, they are different for local and network
> files, they differ depending on buffering of records, and lots of
> other details. If something like this is hardwired into the fortran
> standard, that means that fortran may not be portable to the next
> hot file system that is developed. That's what the vendors, and
> also the users to some extent, have to worry about.
>
> $.02 -Ron Shepard
I'll give you another practical method.
1) Find the file size from the directory, before you execute your
program.
2) Modify the program to read the current file size as a manual input
response to a question, or as a parameter passed on execution (most of
my programs ask for missing parameters and therefore do it both ways).
3) In the program (as noted by many), divide the file size by the record
size and calculate exactly how many records there are currently in the
file.
4) Do what you need.
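
For compilers without the f2003 file-size inquiry, Ron Shepard's quoted suggestion (probe the processor's IOSTAT for a missing record at run time, then find the last record by doubling followed by a binary search) might be sketched as below. Everything named here is an assumption for the demo: the file name, the unit numbers, and the record layout. The approach itself is not guaranteed by the standard; it relies on the processor returning the same nonzero IOSTAT for a missing record in both files, and on the data file being dense.

```fortran
program probe_and_search
   implicit none
   ! Assumptions for illustration only: file name, unit numbers, record layout.
   character(len=*), parameter :: fname = 'probe_demo.dat'
   integer, parameter :: udata = 16, uprobe = 17, ntrue = 37
   integer :: reclen, ios, ios_missing, i, lo, hi, mid
   real    :: rec(4)

   inquire(iolength=reclen) rec

   ! Build a dense demo file with ntrue records (stands in for the real data file).
   open(unit=udata, file=fname, access='direct', form='unformatted', &
        recl=reclen, status='replace', iostat=ios)
   if (ios /= 0) stop 'cannot create demo file'
   do i = 1, ntrue
      rec = real(i)
      write(udata, rec=i) rec
   end do
   close(udata)

   ! Probe: in a scratch file with the same characteristics, write record 1,
   ! read record 2, and remember the processor's IOSTAT for a missing record.
   open(unit=uprobe, access='direct', form='unformatted', recl=reclen, &
        status='scratch', iostat=ios)
   if (ios /= 0) stop 'cannot create scratch file'
   rec = 0.0
   write(uprobe, rec=1) rec
   read(uprobe, rec=2, iostat=ios_missing) rec
   close(uprobe)
   if (ios_missing == 0) stop 'processor did not flag the missing record'

   open(unit=udata, file=fname, access='direct', form='unformatted', &
        recl=reclen, status='old', action='read', iostat=ios)
   if (ios /= 0) stop 'cannot reopen demo file'

   ! Doubling phase: test records 1, 2, 4, 8, ... until one is missing.
   ! Afterwards record lo exists (lo = 0 means none) and record hi does not.
   ! For very large files a wider integer kind would be needed for hi.
   lo = 0
   hi = 1
   do
      read(udata, rec=hi, iostat=ios) rec
      if (ios == ios_missing) exit
      if (ios /= 0) stop 'unexpected read error'
      lo = hi
      hi = 2*hi
   end do

   ! Bisection phase: keep "record lo exists, record hi does not" true.
   do while (hi - lo > 1)
      mid = lo + (hi - lo)/2
      read(udata, rec=mid, iostat=ios) rec
      if (ios == ios_missing) then
         hi = mid
      else if (ios /= 0) then
         stop 'unexpected read error'
      else
         lo = mid
      end if
   end do
   close(udata)
   write(*,*) 'last existing record:', lo
end program probe_and_search
```

The number of test reads grows with the logarithm of the record count, so even a very large file needs only a few dozen probes.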
tbwright (1098) | 4/9/2011 6:24:36 AM

On Apr 9, 2:58 am, nos...@see.signature (Richard Maine) wrote:
> Ron Shepard <ron-shep...@NOSPAM.comcast.net> wrote:
> > In article <inn8c2$9t...@speranza.aioe.org>, tho...@antispam.ham
> > wrote:
>
> > > The data file is not just dense, but full. There are zero unwritten
> > > records.
>
> > This simplifies things as far as portable detection.
>
> Ron then gives several suggestions which are very likely (even almost
> certain) to work well. None of them are actually guaranteed by the
> standard, but their odds in practice are high.
>
> Another possibility is stream access. It is an f2003 feature (although
> one implemented on most current compilers), so that does limit its
> portability. In practice, the direct access tricks almost surely work on
> more compilers, but stream access fits much better in principle. This is
> much closer to the kind of thing stream was designed for.
>
> Although I wasn't there during the design (heck, direct access was
> introduced to the standard in f77 when I was still a young whelp) it
> seems clear to me that direct access was not really designed as a way to
> do transparent access to arbitrary files. The usual implementations
> happen to be done in a way that allows that, but the standard doesn't
> really support that kind of use, and I don't think it was intended to.
> It has always looked to me, from a standards perspective, as though
> Fortran direct access was designed as a simplified form of keyed access.
> The keys are always integer and the records are fixed length. With those
> two simplifications, it turns out to have an obvious and simple
> implementation on most systems. I'd speculate that some people might
> have wanted keyed access, but what we got was a compromise that was
> simple enough to pass, while providing at least a base to build on. I
> suppose my reverse crystal ball might be wrong, but that's the image I
> see in it. Some people have obviously gotten used to the particular
> implementation to the extent that they think of it instead of the specs
> in the standard as being what Fortran direct access means. That's
> apparent whenever anyone asks questions relating to end-of-file.
>
> Stream access, on the other hand, specifically had a design goal of
> reading files from other sources, including for example, files specified
> by non-Fortran means. It was introduced under the general umbrella of C
> interop, but has wider application than that. Stream files do have a
> concept of end-of-file, and it does match the concept mentioned
> elsethread. Direct access doesn't involve an end-of-file concept at all;
> the sequential access concept of an end-of-file condition involves reading a
> specific record called an endfile record, but for stream files, an
> end-of-file condition occurs "When an attempt is made to read beyond the
> end of a stream file."
>
> I'll repeat that I agree it sounds like direct access is the most
> portable way in practice today. But in terms of mentioning design
> defects in the language and suggesting future "fixes" for direct access,
> I think most of those "fixes" are in the wrong direction in that they
> don't really fit with the apparent design goals of direct access.
> Extending direct access to handle variable length records or non-integer
> keys seems like a more sensible direction to me. I'm not particularly
> pushing for that kind of extension at the moment, but it makes more
> sense to me as a direction. We don't need to try to mold direct access
> into something that assumes implementation details in order to become
> more like stream; we have stream for that.
>
> --
> Richard Maine                    | Good judgment comes from experience;
> email: last name at domain . net | experience comes from bad judgment.
> domain: summertriangle           |  -- Mark Twain
The problem with this suggestion is that if you need to read this
stream-written file any other way later, you have to choose how you
will terminate the file, for as Richard has pointed out, the new
standards won't let you have your cake and eat it.
If you just close the file, you possibly could open it as direct
access; if you add an EOF marker you may be abusing the standard, even
if it lets you read the file as a sequential file.
I think we have arrived at a conclusion that ALL FILES SHOULD HAVE EOF
MARKERS. It's the only way that makes sense and allows all the access
modes to work with all implementations that can anticipate such a
marker.
So change the standard!
tbwright (1098) | 4/9/2011 6:30:46 AM

John <urbanjost@comcast.net> wrote:
> ...given I know the record size of the file and that it is a
> typical data file that I could get its size by opening it as a stream
> file and positioning to the end on the open (avoiding
> having to read the entire file assuming it is a common "disk-resident"
> file) using f2003 standard calls. It worked everywhere I
> tried it. A simplified example follows. Is the method I used to get
> the size of the file a portable solution?
[code elided]
Yes, I'd say so, at least to the extent that f2003 stream access is
implemented (getting pretty widespread in current compilers, but older
ones could be an issue).
The standard doesn't guarantee that all stream files are positionable,
as indeed some physically are not. But then, you aren't trying to do
this to all stream files, but to a "common disk-resident" one.
If one assumes availability of f2003 features, another option is a
direct inquire of the file size.
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam47 (9742) | 4/9/2011 3:21:32 PM

> I think we have arrived at a conclusion that ALL FILES SHOULD HAVE EOF
> MARKERS. It's the only way that makes sense and allows all the access
> modes to work with all implementations that can anticipate such a
> marker.
Maybe one of us has arrived at that conclusion. At least one of us
continues to maintain that the concept makes no sense for some kinds of
files and that trying to shoehorn it in where it does not fit is ill
advised.
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
nospam47 (9742) | 4/9/2011 3:21:32 PM

On Apr 10, 1:21 am, nos...@see.signature (Richard Maine) wrote:
> > I think we have arrived at a conclusion that ALL FILES SHOULD HAVE EOF
> > MARKERS. It's the only way that makes sense and allows all the access
> > modes to work with all implementations that can anticipate such a
> > marker.
>
> Maybe one of us has arrived at that conclusion. At least one of us
> continues to maintain that the concept makes no sense for some kinds of
> files and that trying to shoehorn it in where it does not fit is ill
> advised.
>
> --
> Richard Maine                    | Good judgment comes from experience;
> email: last name at domain . net | experience comes from bad judgment.
> domain: summertriangle           |  -- Mark Twain
Even backwards readable mainframe tape files had EOF markers.
And so did append-type files on any computers I can remember using
them on.
I cannot think of any file type and mode of access that could not work
if an EOF marker was present. Since all my Fortran-written files still
do have EOF #1a markers, even pure binary files (hence the Microsoft /A
and /B copy parameters to indicate how to treat them), reading
after the last record in direct access mode will return an error, and
stream mode will return EOF.
Note: I am not suggesting that you should or can TEST for such in
every mode of Fortran READ operation, unless this is permitted, but it
won't harm you if such is present.
tbwright (1098) | 4/11/2011 7:08:07 AM

On 2011-04-11, Terence <tbwright@cantv.net> wrote:
> On Apr 10, 1:21 am, nos...@see.signature (Richard Maine) wrote:
>> > I think we have arrived at a conclusion that ALL FILES SHOULD HAVE EOF
>> > MARKERS. It's the only way that makes sense and allows all the access
>> > modes to work with all implementations that can anticipate such a
>> > marker.
>>
>> Maybe one of us has arrived at that conclusion. At least one of us
>> continues to maintain that the concept makes no sense for some kinds of
>> files and that trying to shoehorn it in where it does not fit is ill
>> advised.
>>
>> --
>> Richard Maine                    | Good judgment comes from experience;
>> email: last name at domain . net | experience comes from bad judgment.
>> domain: summertriangle           |  -- Mark Twain
>
> Even backwards readable mainframe tape files had EOF markers.
> And so did append-type files on any computers I can remember using
> them.
AFAICT, no files that I have read/written using Fortran have had EOF
markers. Just to show that neither your experience nor mine is
particularly universal; OTOH, I don't refuse to believe that files
where the Fortran runtime inserts EOF markers exist.
> I cannot think of any file type and mode of access that could not work
> if an EOF marker was present.
Ah, "argument from personal incredulity". How convincing.
> Since all my Fortran-written files still
> do have EOF #1a markers,
There's of course nothing wrong if you always write some kind of
personal EOF marker in your own files to mark the end. It's a bit
different to argue that the standard should mandate that the Fortran
processor always makes sure there is an EOF marker at the "end"
(wherever that is defined to be).
--
JB
foo33 (1360) | 4/11/2011 7:55:23 AM

On Apr 9, 11:21 am, nos...@see.signature (Richard Maine) wrote:
> John <urbanj...@comcast.net> wrote:
> > ...given I know the record size of the file and that it is a
> > typical data file that I could get its size by opening it as a stream
> > file and positioning to the end on the open (avoiding
> > having to read the entire file assuming it is a common "disk-resident"
> > file) using f2003 standard calls. It worked everywhere I
> > tried it. A simplified example follows. Is the method I used to get
> > the size of the file a portable solution?
>
> [code elided]
>
> Yes, I'd say so, at least to the extent that f2003 stream access is
> implemented (getting pretty widespread in current compilers, but older
> ones could be an issue).
>
> The standard doesn't guarantee that all stream files are positionable,
> as indeed some physically are not. But then, you aren't trying to do
> this to all stream files, but to a "common disk-resident" one.
>
> If one assumes availability of f2003 features, another option is a
> direct inquire of the file size.
>
> --
> Richard Maine                    | Good judgment comes from experience;
> email: last name at domain . net | experience comes from bad judgment.
> domain: summertriangle           |  -- Mark Twain
Thanks for pointing that out --
I had missed that INQUIRE had that feature in f2003+. This worked
everywhere I tried it:
   inquire(unit=IREAD,SIZE=isize_inquired)
That is indeed much "cleaner"; and I did not see it mentioned in any
of the other related posts.
I gave up compilers that don't have all or almost all f2003 features
for New Year's so this works for me;
and works with g95(1) and gfortran(1).
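
Putting the INQUIRE pieces together, a record count could be obtained along these lines. This is a sketch under assumptions: the file name, unit number, and record layout are invented for the demo, and it assumes the processor measures RECL, IOLENGTH, and SIZE in the same file storage units, as f2003 specifies for unformatted files. Some compilers need an option to behave that way, so check the documentation for yours.

```fortran
program count_records_by_size
   implicit none
   ! Assumptions for illustration only: file name, unit number, record layout.
   character(len=*), parameter :: fname = 'size_demo.dat'
   integer, parameter :: u = 16, nwritten = 5
   integer, parameter :: i8 = selected_int_kind(18)   ! wide enough for an 80 GB file
   integer(kind=i8) :: fsize, nrec
   integer :: reclen, i, ios
   real    :: rec(4)

   ! Record length for one record's I/O list, in the processor's RECL units.
   inquire(iolength=reclen) rec

   ! Create a small dense direct-access file to stand in for the real one.
   open(unit=u, file=fname, access='direct', form='unformatted', &
        recl=reclen, status='replace', iostat=ios)
   if (ios /= 0) stop 'cannot create demo file'
   do i = 1, nwritten
      rec = real(i)
      write(u, rec=i) rec
   end do
   close(u)

   ! File size in file storage units; a negative value means it is unavailable.
   inquire(file=fname, size=fsize)
   if (fsize < 0) stop 'file size not available'
   nrec = fsize / reclen
   write(*,*) 'file size', fsize, ' record length', reclen, ' records', nrec
end program count_records_by_size
```

With the record count known up front, a circular read can wrap with MOD on the record number and never trigger the missing-record error at all.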
urbanjost (37) | 4/11/2011 6:41:14 PM

On Apr 11, 5:55 pm, JB <f...@bar.invalid> wrote:
> On 2011-04-11, Terence <tbwri...@cantv.net> wrote:
>
> > On Apr 10, 1:21 am, nos...@see.signature (Richard Maine) wrote:
> >> > I think we have arrived at a conclusion that ALL FILES SHOULD HAVE EOF
> >> > MARKERS. It's the only way that makes sense and allows all the access
> >> > modes to work with all implementations that can anticipate such a
> >> > marker.
>
> >> Maybe one of us has arrived at that conclusion. At least one of us
> >> continues to maintain that the concept makes no sense for some kinds of
> >> files and that trying to shoehorn it in where it does not fit is ill
> >> advised.
>
> >> --
> >> Richard Maine                    | Good judgment comes from experience;
> >> email: last name at domain . net | experience comes from bad judgment.
> >> domain: summertriangle           |  -- Mark Twain
>
> > Even backwards readable mainframe tape files had EOF markers.
> > And so did append-type files on any computers I can remember using
> > them.
>
> AFAICT, no files that I have read/written using Fortran have had EOF
> markers. Just to show that neither your experience nor mine is
> particularly universal; OTOH, I don't refuse to believe that files
> where the Fortran runtime inserts EOF markers exist.
>
> > I cannot think of any file type and mode of access that could not work
> > if an EOF marker was present.
>
> Ah, "argument from personal incredulity". How convincing.
>
> > Since all my Fortran-written files still
> > do have EOF #1a markers,
>
> There's of course nothing wrong if you always write some kind of
> personal EOF marker in your own files to mark the end. It's a bit
> different to argue that the standard should mandate that the Fortran
> processor always makes sure there is an EOF marker at the "end"
> (wherever that is defined to be).
>
> --
> JB
No, I don't write the EOF markers, unless I am writing stream files.
(And direct files don't get one placed there.)
But I find that closing any sequential file puts a #1a at the
end.
I close my files explicitly; I don't rely on the default closing them
for me on program exit.
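Whether a given compiler really appends a #1a on close is easy to check rather than argue about. The following is a rough sketch, assuming an f2003 compiler with stream access and a file storage unit of one byte; the unit number and file name are made up for illustration. It writes a short sequential file, closes it, and then looks at the last raw byte.

```fortran
program check_trailing_byte
  implicit none
  integer, parameter :: lun = 11                         ! hypothetical unit number
  character(len=*), parameter :: fname = 'demo_seq.txt'  ! hypothetical file name
  character(len=1) :: last
  integer :: nbytes, ios

  ! Write a short formatted sequential file and close it explicitly.
  open(unit=lun, file=fname, access='sequential', form='formatted', &
       status='replace')
  write(lun, '(a)') 'hello'
  close(lun)

  ! Reopen the same file as an unformatted stream to see the raw bytes.
  open(unit=lun, file=fname, access='stream', form='unformatted', &
       status='old', iostat=ios)
  if (ios /= 0) stop 'open failed'
  inquire(unit=lun, size=nbytes)
  if (nbytes < 1) stop 'file is empty or size is unknown'

  ! POS= is 1-based, so POS=nbytes addresses the final byte of the file.
  read(lun, pos=nbytes) last
  if (iachar(last) == 26) then
     print *, 'last byte is Z''1A'' (Ctrl-Z)'
  else
     print *, 'last byte is not Z''1A''; its value is', iachar(last)
  end if
  close(lun, status='delete')
end program check_trailing_byte
```

The result is processor dependent, which is rather the point of the disagreement above: the check only tells you what one compiler on one system does.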
tbwright (1098)
4/12/2011 1:44:03 AM
On 4/11/11 8:44 PM, Terence wrote:
> No, I don't write the EOF markers, unless I am writing stream files.
> (And direct files don't get one placed there.)
> But I find that closing any sequential file puts a #1a at the
> end.
> I close my files explicitly; I don't rely on the default closing them
> for me on program exit.
You have an interesting/contradictory attitude here. On the one hand,
the standard is clear (to me, anyhow) that sometimes there isn't an EOF
on some kinds of files--yet you insist that there ALWAYS is one. On the
other hand, the standard is clear (to me) when it says in, e.g., F77
"12.10.2.1 Implicit Close at Termination of Execution
At termination of execution of an executable program for reasons other
than an error condition, all units that are connected are closed."
and yet you don't trust the files to be closed. It's an odd view of
what standards are trying to promote.
Dick Hendrickson
dick.hendrickson (1286)
4/12/2011 2:39:23 AM