Read a list with undefined length


When I use the READ statement, I use data file of this kind:

npts
25

x   y
2   5
7   9
10 13
.....

In other words, the number of records is stated explicitly in the file (as an
INTEGER variable).
I would like not to have to specify the number of records.
I am reading the CVF guide and I found this:
"err, end, eor
Are branch specifiers if an error (ERR=label), end-of-file
(END=label), or end-of-record (EOR=label) condition occurs.
EOR can only be specified for nonadvancing READ statements."

Could I use these optional specifiers for my purpose?
I have just posed this question, but I need a more complete answer.
Thanks
Reply Allamarein 3/25/2011 7:25:48 PM

On 2011-03-25 16:25:48 -0300, Allamarein said:

> When I use the READ statement, I use data file of this kind:
> 
> npts
> 25
> 
> x   y
> 2   5
> 7   9
> 10 13
> ....
> 
> In other words in the file the number of records is explicit (an
> INTEGER variable).
> I would not specify the number of records.
> I am reading the CVF guide and I found this:
> "err, end, eor
> Are branch specifiers if an error (ERR=label), end-of-file
> (END=label), or end-of-record (EOR=label) condition occurs.
> EOR can only be specified for nonadvancing READ statements."
> 
> Could I use this optional input for my purpose?
> I have just posed this question, but I need a more complete answer.
> Thanks

An answer to your question would be a long list of questions. :-(

That is because now you do not care what happens in the "26th" record.
The question is what do you want the "26th" record to be? If nothing
then you will find the END to be useful as it will allow you to read
nothing gracefully. EOR and ERR are much more for do-it-yourself error
processing.

If you want the "26th" record to be another set of data then the discussion
gets longer.

I have given two answers to the question of what the "26th" record might be.

I expect that there will be multiple responses to your question, where each
will supply their own answer to the "what next" and then proceed from there.




Reply Gordon 3/25/2011 7:41:48 PM


On 3/25/11 2:25 PM, Allamarein wrote:
> When I use the READ statement, I use data file of this kind:
>
> npts
> 25
>
> x   y
> 2   5
> 7   9
> 10 13

I'm not sure I completely understand your question, but the normal way 
to do things like this is
      integer  ::  npts, x(1000), y(1000)
      read (unit,*) npts
      if (npts > 1000) stop
      read (unit, *)  (x(i), y(i), i=1,npts)

   [  or even
          do i = 1,npts
             read (unit,*) x(i), y(i)
          enddo    ]

Where you first read in the npts value and then use this to control a 
loop.  You pick dimensions for x and y that you are sure should be big 
enough.  If you actually have blank lines and lines with the characters 
"npts" and "x  y", you'll have to skip those lines.  You might also have 
to use an explicit format, rather than the * I used.  A lot depends on 
exactly what you have in the file.

Lines like
        read(unit,format) npts, (x(i), y(i), i=1, npts)
are very common if you know for sure that the values will be reasonable.
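
A minimal sketch of the count-driven read described above, including the skipping of label and blank lines. The scratch file, the sample values, the bound maxpts and the program name are assumptions made for illustration; a real program would open the actual data file instead.

```fortran
program read_counted
   implicit none
   integer, parameter :: maxpts = 1000   ! upper bound chosen in advance (assumed)
   integer :: u, i, npts, ios
   integer :: x(maxpts), y(maxpts)

   ! Hypothetical sample file, written to a scratch unit so the sketch runs as is.
   open(newunit=u, status='scratch')
   write(u, '(a)') 'npts', '3', '', 'x   y', '2   5', '7   9', '10 13'
   rewind(u)

   read(u, '(a)')                 ! skip the "npts" label line
   read(u, *, iostat=ios) npts    ! the record count
   if (ios /= 0) stop 'could not read npts'
   if (npts < 0 .or. npts > maxpts) stop 'npts out of range'
   read(u, '(a)')                 ! skip the blank line
   read(u, '(a)')                 ! skip the "x   y" label line

   ! The count read from the file controls the loop.
   do i = 1, npts
      read(u, *, iostat=ios) x(i), y(i)
      if (ios /= 0) stop 'fewer data records than npts'
   end do
   close(u)

   do i = 1, npts
      print *, x(i), y(i)
   end do
end program read_counted
```

A READ with a format and an empty input list simply advances past one record, which is what makes the skipping work.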

Dick Hendrickson
> ....
>
> In other words in the file the number of records is explicit (an
> INTEGER variable).
> I would not specify the number of records.
> I am reading the CVF guide and I found this:
> "err, end, eor
> Are branch specifiers if an error (ERR=label), end-of-file
> (END=label), or end-of-record (EOR=label) condition occurs.
> EOR can only be specified for nonadvancing READ statements."
>
> Could I use this optional input for my purpose?
> I have just posed this question, but I need a more complete answer.
> Thanks

Reply Dick 3/25/2011 8:14:46 PM

Allamarein <matteo.diplomacy@gmail.com> wrote:
> When I use the READ statement, I use data file of this kind:
 
> npts
> 25
 
> x   y
> 2   5
> 7   9
> 10 13
> ....
 
(snip)
> Are branch specifiers if an error (ERR=label), end-of-file
> (END=label), or end-of-record (EOR=label) condition occurs.
> EOR can only be specified for nonadvancing READ statements."
 
> Could I use this optional input for my purpose?

This has resulted in long discussions in the past.
There is no easy answer.

Well, in the days of static allocation it was easy.  One allocated
the arrays as large as possible (on the single task machine) and
read in as much as one could.

With dynamic allocation, one has to know how big to allocate
the array to read in the data, but one doesn't know how big
until the data has been read.  There are ways around that, but
not so easy to describe in one post.

My least favorite is to read the whole file counting records
but not storing data, REWIND, allocate, and read again.  I am
not recommending it, but that is what some do.
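
For reference, a sketch of that count, REWIND, allocate, read-again approach. It assumes a seekable file holding only "x y" records (no header lines); the scratch file and sample values are made up so that the program runs on its own.

```fortran
program two_pass
   implicit none
   integer :: u, ios, n, i
   real :: xtmp, ytmp
   real, allocatable :: x(:), y(:)

   ! Hypothetical sample data in a scratch file.
   open(newunit=u, status='scratch')
   write(u, '(a)') '2 5', '7 9', '10 13'
   rewind(u)

   ! Pass 1: count the records without storing them.
   n = 0
   do
      read(u, *, iostat=ios) xtmp, ytmp
      if (ios /= 0) exit          ! end-of-file (ios < 0) or read error (ios > 0)
      n = n + 1
   end do

   ! Pass 2: go back to the start, allocate the exact size, read again.
   rewind(u)
   allocate(x(n), y(n))
   do i = 1, n
      read(u, *) x(i), y(i)
   end do
   close(u)

   print *, 'records:', n
   print *, 'x =', x
   print *, 'y =', y
end program two_pass
```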

-- glen
Reply glen 3/25/2011 8:15:19 PM

> I am reading the CVF guide and I found this:
> "err, end, eor


I have a plotting utility that uses "END=" to detect when its task has
finished. Data are not stored, but each point is processed on the fly.
In this way the utility can process any size of file.
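
A small sketch of this kind of processing, not the actual utility: each point is handled as it is read, and END= branches out of the loop at end-of-file. The scratch file, the sample values and the "largest y" computation are stand-ins chosen for illustration.

```fortran
program process_on_the_fly
   implicit none
   integer :: u, n
   real :: x, y, ymax

   ! Hypothetical sample data in a scratch file.
   open(newunit=u, status='scratch')
   write(u, '(a)') '2 5', '7 9', '10 13'
   rewind(u)

   n = 0
   ymax = -huge(ymax)
   do
      read(u, *, end=100) x, y     ! branch to label 100 at end-of-file
      n = n + 1
      ymax = max(ymax, y)          ! stand-in for the real per-point work
   end do
100 continue
   close(u)

   print *, 'points processed:', n
   if (n > 0) print *, 'largest y:', ymax
end program process_on_the_fly
```

Nothing is stored per point, so memory use does not depend on the length of the file.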

Arjan
Reply Arjan 3/25/2011 10:14:17 PM

Arjan <arjan.van.dijk@rivm.nl> wrote:
(snip, and previous snip, regarding  "err, end, eor")
 
> I have a plotting utility that uses "END=" to detect when its task has
> finished. Data are not stored, but each point is processed on the fly.
> In this way the utility can process any size of file.

Yes that is my favorite way.  Way too many programs read all
the data into an array when it can easily be processed one at
a time and not stored.

One of my favorites is the algorithm for doing linear least
squares fits with sums over the appropriate combinations of the
input data, (and with a little luck to avoid intermediate overflow).
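
As an illustration of that idea, here is a sketch of a straight-line fit y = a + b*x that keeps only running sums and never stores the points. The textbook normal-equation formulas are used; the scratch file, sample values and variable names are assumptions, and no attempt is made to guard against overflow or cancellation.

```fortran
program running_fit
   implicit none
   integer, parameter :: dp = selected_real_kind(15)
   integer :: u, ios, n
   real(dp) :: x, y, sx, sy, sxx, sxy, denom, slope, intercept

   ! Hypothetical sample data in a scratch file.
   open(newunit=u, status='scratch')
   write(u, '(a)') '2 5', '7 9', '10 13'
   rewind(u)

   n = 0
   sx = 0.0_dp
   sy = 0.0_dp
   sxx = 0.0_dp
   sxy = 0.0_dp
   do
      read(u, *, iostat=ios) x, y
      if (ios /= 0) exit
      n = n + 1
      sx  = sx  + x        ! sum of x
      sy  = sy  + y        ! sum of y
      sxx = sxx + x*x      ! sum of x squared
      sxy = sxy + x*y      ! sum of x times y
   end do
   close(u)

   denom = n*sxx - sx*sx
   if (n >= 2 .and. denom /= 0.0_dp) then
      slope = (n*sxy - sx*sy)/denom
      intercept = (sy - slope*sx)/n
      print *, 'slope =', slope, ' intercept =', intercept
   else
      print *, 'not enough distinct points for a fit'
   end if
end program running_fit
```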

For Unix/C, with a little luck, you can have one array increase
in size without moving (after it gets reasonably large).  

Otherwise, you need to do an allocate/copy/deallocate as the
data grows, which can be inefficient in time and memory.
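
A sketch of that grow-as-you-go approach, assuming a Fortran 2003 compiler so that MOVE_ALLOC is available. The capacity is doubled each time the array fills up, which keeps the number of copies small. The scratch file, the tiny initial capacity and the names are illustrative only.

```fortran
program grow_array
   implicit none
   integer :: u, ios, n
   real :: xin
   real, allocatable :: x(:), tmp(:)

   ! Hypothetical sample data in a scratch file.
   open(newunit=u, status='scratch')
   write(u, '(a)') '2', '7', '10', '13', '21'
   rewind(u)

   allocate(x(2))                    ! deliberately small initial capacity
   n = 0
   do
      read(u, *, iostat=ios) xin
      if (ios /= 0) exit
      if (n == size(x)) then         ! full: allocate double, copy, swap
         allocate(tmp(2*size(x)))
         tmp(1:n) = x(1:n)
         call move_alloc(tmp, x)     ! x takes over tmp's storage; tmp is deallocated
      end if
      n = n + 1
      x(n) = xin
   end do
   close(u)

   ! Trim to the number of values actually read.
   allocate(tmp(n))
   tmp = x(1:n)
   call move_alloc(tmp, x)

   print *, size(x), ' values'
   print *, x
end program grow_array
```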

-- glen
Reply glen 3/25/2011 10:25:01 PM

In article <imj4nt$p65$1@dont-email.me>,
 glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:

> Otherwise, you need to do an allocate/copy/deallocate as the
> data grows, which can be inefficient in time and memory.

Another approach is to allocate space for temporary blocks of data, 
say 100 elements at a time.  When a block is full, then allocate 
another one.  When the input data is exhausted, and you know how 
many elements were read in, then allocate an array of the correct 
size, copy the blocks into that array, deallocate the temporary 
blocks, and you're done.  You do need double the memory this way, 
at least temporarily, but you don't need to read through the input 
file twice.  These days, memory is cheaper than I/O time, especially 
for formatted data.

$.02 -Ron Shepard
Reply Ron 3/26/2011 5:07:18 AM

Ok, could you propose a code example?
At the moment I use this code:

read(10,*) n !number of records, explicit in dat file
allocate(x(1:n),y(1:n))
do i=1,n
 read(10,*) x(i), y(i)
end do

As I explained, I would like to avoid reading n.
Since I often change the number of records in my data file, I would
rather not count them again at each modification.
Reply Allamarein 3/26/2011 11:23:55 AM

Dick Hendrickson wrote in message <8v4bdnF5laU1@mid.individual.net>...
>On 3/25/11 2:25 PM, Allamarein wrote:
>> When I use the READ statement, I use data file of this kind:
>>
>> npts
>> 25
>>
>> x   y
>> 2   5
>> 7   9
>> 10 13
>
>I'm not sure I completely understand your question, but the normal way
>to do things like this is
>      integer  ::  npts, x(1000), y(1000)
>      read (unit,*) npts
>      if (npts > 1000) stop
>      read (unit, *)  (x(i), y(i), i=1,npts)
>
>   [  or even
>          do i = 1,npts
>             read (unit,*) x(i), y(i)
>          enddo    ]
>
>Where you first read in the npts value and then use this to control a
>loop.  You pick dimensions for x and y that you are sure should be big
>enough.

That is the old-fashioned FORTRAN 77 and earlier way.
Better to allocate an array of the exact size from npts.
Then the arrays x and y are handy to pass to subroutines.


Reply robin 3/26/2011 1:31:01 PM

Allamarein wrote in message <6f328f60-5e30-4d90-b8cb-7ac2fe78a732@n1g2000yqm.googlegroups.com>...
>When I use the READ statement, I use data file of this kind:
>
>npts
>25
>
>x   y
>2   5
>7   9
>10 13
>....
>
>In other words in the file the number of records is explicit (an
>INTEGER variable).
>I would not specify the number of records.
>I am reading the CVF guide and I found this:
>"err, end, eor
>Are branch specifiers if an error (ERR=label), end-of-file
>(END=label), or end-of-record (EOR=label) condition occurs.
>EOR can only be specified for nonadvancing READ statements."

Why do you want to do that?
With the number of points given first in the data,
you can read the data points into an exact-size array,
which is perfect for passing to subroutines etc.

real, allocatable :: x(:), y(:)
read (u,*) npts
read (u,*)
allocate (x(1:npts),  y(1:npts))
read (u,*) (x(i), y(i), i=1, npts)
call sub(x, y)




Reply robin 3/26/2011 1:32:10 PM

On 3/26/2011 6:23 AM, Allamarein wrote:
> Ok, could you purpose a code of example?
> In this moment I use this code:
>
> read(10,*) n !number of records, explicit in dat file
> allocate(x(1:n),y(1:n))
> do i=1,n
>   read(10,*) x(i), y(i)
> end do
>
> As I explained, I would avoit to read n.
> Since I often change the number of records in my data file, I would
> not count out at each modification.

If you're changing the number of records written, might as well change 
the record count as well.

IMO, if you have the file under your control, it's much cleaner to use 
the record size and ALLOCATE then read rather than the machinations to 
allocate some initial size, read until either run out of data or need to 
reallocate some more room (rinse and repeat) and then clean up 
afterwards by truncating (or having to keep around a final count of how 
many records were written into a larger than necessary array for the 
lifetime of the program).

All in all, what you're now doing is optimal in many respects and where 
you're headed is less so.

--

Reply dpb 3/26/2011 2:35:56 PM

On Mar 25, 3:25 pm, Allamarein <matteo.diplom...@gmail.com> wrote:
> When I use the READ statement, I use data file of this kind:
>
> npts
> 25
>
> x   y
> 2   5
> 7   9
> 10 13
> ....
>
> In other words in the file the number of records is explicit (an
> INTEGER variable).
> I would not specify the number of records.
> I am reading the CVF guide and I found this:
> "err, end, eor
> Are branch specifiers if an error (ERR=label), end-of-file
> (END=label), or end-of-record (EOR=label) condition occurs.
> EOR can only be specified for nonadvancing READ statements."
>
> Could I use this optional input for my purpose?
> I have just posed this question, but I need a more complete answer.
> Thanks

Another possibility for reading a file without explicitly knowing the
number of data points is to inquire about the file size. Knowing the
file size and data type you can deduce the number of variables.
Example:

Say you have 2 variables of data written as 8-byte floating point:

integer, parameter     :: dp = selected_real_kind(15) ! double precision, typically 8 bytes
real(dp), allocatable  :: x(:), y(:)
integer                :: fsize, fid, npnts
character(200)         :: fname

! fname must be set to the name of the data file before the open
open(newunit=fid, file=trim(fname), access='stream', form='unformatted')

! Find the file size. Typically it is given in bytes, but check your compiler doc.
inquire(fid, size=fsize)

! Divide by 8 bytes per data point * 2 variables -> gives # points per variable
npnts = fsize/(8*2)

allocate(x(npnts))
allocate(y(npnts))

Of course you will have to get rid of the "header" part of the file
that tells how many variables you have.
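
A self-contained sketch of the size-based count, assuming a headerless unformatted stream file that holds nothing but interleaved x, y values. Here the file is a scratch file written by the program itself, and the size of one value is taken from STORAGE_SIZE and FILE_STORAGE_SIZE (Fortran 2008) instead of a hard-coded 8; all names are illustrative.

```fortran
program size_from_inquire
   use, intrinsic :: iso_fortran_env, only: file_storage_size
   implicit none
   integer, parameter :: dp = selected_real_kind(15)
   real(dp), allocatable :: x(:), y(:)
   integer :: u, fsize, units_per_value, npnts, i

   ! Hypothetical data: three (x, y) pairs written as raw values, no header.
   open(newunit=u, status='scratch', access='stream', form='unformatted')
   write(u) 2.0_dp, 5.0_dp, 7.0_dp, 9.0_dp, 10.0_dp, 13.0_dp
   flush(u)

   ! File size in file storage units (bytes on most systems).
   inquire(unit=u, size=fsize)
   if (fsize < 0) stop 'file size could not be determined'

   ! Storage units occupied by one real(dp) value, then points per variable.
   units_per_value = storage_size(1.0_dp) / file_storage_size
   npnts = fsize / (2*units_per_value)

   allocate(x(npnts), y(npnts))
   read(u, pos=1) (x(i), y(i), i = 1, npnts)   ! reposition to the start and read
   close(u)

   print *, 'file size in storage units:', fsize
   print *, 'points per variable:', npnts
   do i = 1, npnts
      print *, x(i), y(i)
   end do
end program size_from_inquire
```

This only works because every value occupies a known, fixed number of storage units; it does not carry over to a text file.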

Mr. Herrmannsfeldt's idea about reading and processing individual points
is interesting. I typically read in all the data at once. With really
large data sets I could see how this could be very useful. Is there a
performance advantage (other than memory) to doing this?

Eric
Reply Eric 3/26/2011 4:59:20 PM

On 3/25/2011 2:15 PM, glen herrmannsfeldt wrote:
> Allamarein<matteo.diplomacy@gmail.com>  wrote:
>> When I use the READ statement, I use data file of this kind:
>
>> npts
>> 25
>
>> x   y
>> 2   5
>> 7   9
>> 10 13
>> ....
>
> (snip)
>> Are branch specifiers if an error (ERR=label), end-of-file
>> (END=label), or end-of-record (EOR=label) condition occurs.
>> EOR can only be specified for nonadvancing READ statements."
>
>> Could I use this optional input for my purpose?
>
> This has resulted in long discussions in the past.
> There is no easy answer.
>
<snip>
>
> My least favorite is to read the whole file counting records
> but not storing data, REWIND, allocate, and read again.  I am
> not recommending it, but that is what some do.
>

No, it's not going to get any style points, but this might be the 
quickest and easiest way to solve the problem.  Unless we're talking 
about files with millions of records, it will take longer to code a 
solution -- any solution -- than it will take to run the program.  A 
program that takes ten minutes to write will be quicker in the end than 
one that takes an hour to code.

(It's the same thing that makes scripting languages like Perl so 
popular.  Sure, they're slow, but if that fact is irrelevant to the task 
at hand, it's a fact that's easily forgotten.)

Louis


Reply Louis 3/26/2011 5:49:00 PM

Louis Krupp <lkrupp_nospam@indra.com.invalid> wrote:

(snip)
>> My least favorite is to read the whole file counting records
>> but not storing data, REWIND, allocate, and read again.  I am
>> not recommending it, but that is what some do.
 
> No, it's not going to get any style points, but this might be the 
> quickest and easiest way to solve the problem.  Unless we're talking 
> about files with millions of records, it will take longer to code a 
> solution -- any solution -- than it will take to run the program.  A 
> program that takes ten minutes to write will be quicker in the end than 
> one that takes an hour to code.

Sometimes I do have files with millions, or even billions of points.
 
> (It's the same thing that makes scripting languages like Perl so 
> popular.  Sure, they're slow, but if that fact is irrelevant to the task 
> at hand, it's a fact that's easily forgotten.)

Yes.  For smaller problems, interpreted languages, which often
include the ability to read data sets without knowing the
size in advance, make programming faster.  Until the really
big problems come along.

The other problem with read, REWIND, read is that it doesn't
work with non-seekable sources, such as the output of a pipe.

-- glen
Reply glen 3/26/2011 6:47:55 PM

This is the answer that I prefer:

Say you have 2 variables of data written as 8-byte floating point:
integer,parameter      :: dp = selected_real_kind(15) ! Gives double precision 8-byte
real(dp),allocatable   :: x(:),y(:)
integer                :: fsize,fid,npnts
character(200)         :: fname
open(newunit=fid,file=trim(fname),access='stream',form='unformatted')
inquire(fid,size=fsize) ! find file size. Typically file size is given in bytes but check your compiler doc
npnts = fsize/(8*2) ! Divide by 8-bytes per data point * 2 variables

In my data file I could have some headers (e.g. name of files,
description of variables, creation data)
Could these headers alter the measuring of my DAT file?

Reply Allamarein 3/27/2011 12:31:42 AM

Allamarein <matteo.diplomacy@gmail.com> wrote:

> In my data file I could have some headers (e.g. name of files,
> description of variables, creation data)
> Could these headers alter the measuring of my DAT file?

I think I must be misunderstanding the question here. I suppose I could
give the obvious (and correct) answer of "yes", but that's a little
*TOO* obvious. Yes, if you have things in a file, they tend to take up
space in it, and thus affect the size. Surely that could not be the
question you intended to ask, though, could it? That's what you did ask,
though.

-- 
Richard Maine                    | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle           |  -- Mark Twain
Reply nospam 3/27/2011 2:01:28 AM

Richard Maine <nospam@see.signature> wrote:
> Allamarein <matteo.diplomacy@gmail.com> wrote:
 
>> In my data file I could have some headers (e.g. name of files,
>> description of variables, creation data)
>> Could these headers alter the measuring of my DAT file?

From the file size, you could get an estimate of the size
of array needed.  That might help, though, in the case of a text
file it could be pretty far off.  For UNFORMATTED you might be able
to predict the array size more accurately.  

But in the case of a pipe, there is no size.
 
> I think I must be misunderstanding the question here. I suppose I could
> give the obvious (and correct) answer of "yes", but that's a little
> *TOO* obvious. Yes, if you have things in a file, they tend to take up
> space in it, and thus affect the size. Surely that could not be the
> question you intended to ask, though, could it? That's what you did ask,
> though.

There might have been a suggestion that you could get the exact
array size from the file size.  That doesn't seem likely.

-- glen
 
Reply glen 3/27/2011 5:06:33 AM

Allamarein wrote in message ...

>This is the answer that I prefer:
>
>Say you have 2 variables of data written as 8-byte floating point:
>integer,parameter      :: dp = selected_real_kind(15) ! Gives double precision 8-byte
>real(dp),allocatable   :: x(:),y(:)
>integer                :: fsize,fid,npnts
>character(200)         :: fname
>open(newunit=fid,file=trim(fname),access='stream',form='unformatted')
>inquire(fid,size=fsize) ! find file size. Typically file size is given in bytes but check your compiler doc
>npnts = fsize/(8*2) ! Divide by 8-bytes per data point * 2 variables

The only problem with this is that your file is text, while the
poster of that example assumes that the data is written in internal form,
typically with binary headers and trailers.
And each number in your text file is definitely not 8 bytes.

>In my data file I could have some headers (e.g. name of files,
>description of variables, creation data)
>Could these headers alter the measuring of my DAT file?

Any values in a file take up space.


Reply robin 3/27/2011 3:02:12 PM

"glen herrmannsfeldt" <gah@ugcs.caltech.edu> wrote in message news:imit4n$9am$3@dont-email.me...

| Well, in the days of static allocation it was easy.  One allocated
| the arrays as large as possible (on the single task machine) and
| read in as much as one could.
|
| With dynamic allocation, one has to know how big to allocate
| the array to read in the data, but one doesn't know how big
| until the data has been read.  There are ways around that, but
| not so easy to describe in one post.
|
| My least favorite is to read the whole file counting records
| but not storing data, REWIND, allocate, and read again.  I am
| not recommending it, but that is what some do.

That's not a great way of doing it. 


Reply robin512 (309) 3/31/2011 7:22:08 AM

"Ron Shepard" <ron-shepard@NOSPAM.comcast.net> wrote in message
news:ron-shepard-79B439.00071826032011@news60.forteinc.com...

| Another approach is to allocate space for temporary blocks of data,
| say 100 elements at a time.  When a block is full, then allocate
| another one.

That's something that's possible in PL/I, because successive allocations
of the same array are stacked (as in a push-down pop-up stack).

But not in Fortran, as array allocations cannot be stacked.

However, it is possible using a subroutine, thus:

!! Reads in an arbitrary quantity of values, and stores them in an array B.
PROGRAM INPUT
   IMPLICIT NONE
   INTEGER :: K
   INTEGER, ALLOCATABLE :: B(:)

   OPEN (UNIT=10, FILE = 'ARRAY-IN.DAT')
   K = 0
   CALL GET
   PRINT *, SIZE(B), ' VALUES'
   PRINT *, B

CONTAINS

RECURSIVE SUBROUTINE GET
   IMPLICIT NONE
   INTEGER, PARAMETER :: NE = 20
   INTEGER :: A(NE)
   INTEGER :: I

   DO I = 1, NE
      READ (10, *, END=5) A(I)
   END DO
   I = NE+1
   K = K + NE
   CALL GET
   DO I = NE, 1, -1
      B(K) = A(I)
      K = K - 1
   END DO
   RETURN
5  IF (I > 1 .AND. I <= NE+1) THEN
      K = K + I - 1
   END IF
   IF (K == 0) STOP 0
   ALLOCATE (B(K))
   DO I = I-1, 1, -1
      B(K) = A(I)
      K = K - 1
   END DO
   RETURN
END SUBROUTINE GET

END PROGRAM INPUT

|  When the input data is exhausted, and you know how
| many elements were read in, then allocate an array of the correct
| size, copy the blocks into that array, deallocate the temporary
| blocks, and your're done.  You do need double the memory this way,
| at least temporarily, but you don't need to read through the input
| file twice.  These days, memory is cheaper than I/O time, especially
| for formatted data. 


Reply robin512 (309) 3/31/2011 7:33:41 AM

In article <4d9454b6$0$88199$c30e37c6@exi-reader.telstra.net>,
 "robin" <robin51@dodo.mapson.com.au> wrote:

> | Another approach is to allocate space for temporary blocks of data,
> | say 100 elements at a time.  When a block is full, then allocate
> | another one.
> 
> That's something that's possible in PL/I, because successive allocations
> of the same array are stacked (as in a push-down pop-up stack).
> 
> But not in Fortran, as array allocations cannot be stacked.
> 
> However, it is possible using a subroutine, thus:

That is one approach, a recursive internal subroutine with a local 
array.  

However, I probably would not do it that way, I would do it with a 
linked list.  Here is what I had in mind when I wrote the previous 
post:

program linked_read
   implicit none
   integer, parameter :: size=100
   type buffer
      integer :: array(size)
      type(buffer), pointer :: next => null()
   end type buffer
   type(buffer), pointer :: first, current, temp

   integer :: total, i, i1, i2, ierr
   integer, allocatable :: array(:)

   allocate(first)   ! first buffer.
   current => first

   total = 0
   mainloop: do   ! read in the data.
      do i = 1, size
         read(5,*,iostat=ierr) current%array(i)
         if ( ierr < 0 ) exit mainloop
         if ( current%array(i) < 0 ) exit mainloop
         total = total + 1
      enddo
      allocate(current%next)  ! new buffer.
      current => current%next
   enddo mainloop

   write(*,*) 'total=', total
   allocate(array(total))  ! copy data from the buffers to the final array
   current => first
   i2 = 0
   do  ! loop over the buffers
      i1 = i2 + 1
      i2 = i2 + min(size, total-i2 )
      array(i1:i2) = current%array(1:i2-i1+1)
      temp => current
      current => current%next
      deallocate(temp)
      if (i2 == total) exit
   enddo
   write(*,*) 'array=', array
end program linked_read

I generally avoid pointers as much as possible in fortran, but this 
is the kind of situation where a linked list seems simple enough to 
make the dangers worthwhile.  I think the above works as intended, 
with the same number of allocation and deallocations, but maybe I 
missed some corner cases.  It currently stops reading input for 
either EOF or for a negative input value.  One or the other of those 
tests might not be appropriate in the actual application, of course, 
but I left the two tests in just in case they need to be treated 
differently.

$.02 -Ron Shepard
Reply ron-shepard (1197) 3/31/2011 4:39:54 PM

On Apr 1, 2:39 am, Ron Shepard <ron-shep...@NOSPAM.comcast.net> wrote:
> (snip: the linked_read program and accompanying discussion, quoted in full)

Children, Children!
  All that fuss!
  When creating the file for the first time, you write one record as
defined below. Then close the file, re-open it, and read and rewrite
that record as the first thing you do to get the file parameters.

  If you change the size or nature of the data, you just write the
first record with all the latest parameters (number of records,
record length, array size, and whatever) AFTER you have written the
last file record and closed the file, as noted above, and then re-opened
the file as direct access, with the length of the first record
(ONLY) as the supposed value of RECL.

So you read and re-write that first record whenever updating the data
(e.g. as what you do AFTER you have written the last records!). The
file is then closed.
It doesn't matter what is the size of the rest of the records as long
as you can calculate where they are, and read or write in blocked mode
(doing your own blocking).
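
A simplified sketch of the header-record idea. It does not mix access modes: the whole file is opened for unformatted direct access with a single fixed record length, record 1 holds the count, and the header is written last, after the data. The record layout, the scratch file and the names are assumptions for illustration.

```fortran
program header_record
   implicit none
   integer, parameter :: dp = selected_real_kind(15)
   integer :: u, rl, n, i
   real(dp) :: x, y

   ! Record length needed for one (x, y) pair, in processor-dependent units.
   inquire(iolength=rl) x, y
   open(newunit=u, status='scratch', access='direct', &
        form='unformatted', recl=rl)

   ! Writer: data goes in records 2..n+1; the header in record 1 is written last.
   n = 3
   do i = 1, n
      write(u, rec=i+1) real(i, dp), real(i*i, dp)
   end do
   write(u, rec=1) n

   ! Reader: the header tells how many data records follow.
   n = 0
   read(u, rec=1) n
   print *, 'records according to header:', n
   do i = 1, n
      read(u, rec=i+1) x, y
      print *, x, y
   end do
   close(u)
end program header_record
```

Whenever records are added later, only record 1 has to be rewritten with the new count.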

I hope no-one will now say that since version f2010 or something of
Fortran compilers, they don't allow mixing access modes on the same
file!
Reply tbwright (1098) 4/4/2011 7:52:27 AM

Terence <tbwright@cantv.net> wrote:

> I hope no-one will now say that since version f2010 or something of
> Fortran compilers, they don't allow mixing access modes on the same
> file!

Not f2010 or something. That was in f77, when the OPEN statement was
introduced to the standard. "There is a processor-dependent set of
access methods for the file." (That might not quite be word-for-word
because I put the standard back before sitting down here, but it is at
least very close.)

So it is up to the processor. A processor might allow it, but is not
required to. In my experience, very few processors allow it in the sense
defined by the standard. As defined by the standard, you would see the
same content in the records when read either way.

I have used some processors that could pull that trick, but most don't.
It basically requires that the two access methods be able to at least
deal with each other's record structures. (On one implementation I'm
thinking of, direct and sequential access files were by default created
with different record structures, but the file system recorded
information about the file's record structure and either kind of access
could deal with whichever record structure an existing file had.)

For most implementations, you can't successfully read a direct access
file as sequential at all because it won't have the required record
headers. You usually can read a sequential file as direct access, but
you don't see the same records - just raw data with the sequential
record headers coming through as part of the data. That's not the way
the standard describes mixing access modes, so that counts as an
extension rather than as standard conforming. It is an extension I have
been known to use, and yes, I agree that it works with darn near all
current systems. But it is not per the standard.

-- 
Richard Maine                    | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle           |  -- Mark Twain
Reply nospam47 (9742) 4/4/2011 3:46:21 PM

In article <1jz6yby.5fewkyb6qxj4N%nospam@see.signature>,
 nospam@see.signature (Richard Maine) wrote:

> I have used some processors that could pull that trick, but most don't.

For formatted files, I thought most systems allow both sequential 
and direct access provided the records are all the same fixed 
length.  I don't do this very often, so I'm not certain, but it 
seems like a lot of people talk about that here in c.l.f.

It is less common for unformatted files because of the record 
header/trailer information that is usually included for sequential 
files.  That information is what allows backspace and partial record 
reads to work correctly.  But I have also used file systems that did 
allow records to be written one way and then read the other.  
Typically, you would write the file sequentially, then after a close 
and an open, read the records randomly.  I forget exactly what kind 
of magic was required in the open statement, but I do remember that 
it involved specifying recl for the sequential file, something that 
is generally not required (or done).

$.02 -Ron Shepard
Reply ron-shepard (1197) 4/4/2011 4:28:51 PM

Ron Shepard <ron-shepard@NOSPAM.comcast.net> wrote:

> In article <1jz6yby.5fewkyb6qxj4N%nospam@see.signature>,
>  nospam@see.signature (Richard Maine) wrote:
> 
> > I have used some processors that could pull that trick, but most don't.
> 
> For formatted files, I thought most systems allow both sequential 
> and direct access provided the records are all the same fixed 
> length.  I don't do this very often, so I'm not certain, but it 
> seems like a lot of people talk about that here in c.l.f.
> 
> It is less common for unformatted files because of the record 
> header/trailer information that is usually included for sequential 
> files.

Formatted direct access files are pretty rare in my experience. I've
heard of a few people using them, but not many. I had occasion to do so
once long ago, but that was on an obscure system for quirky reasons.
Oddly, on that system, formatted direct access was the only way to do a
substitute for what I'd today call unformatted stream (aka binary, aka
several other things). Unformatted direct access is what you use on
almost all systems, but that didn't do the trick on that one; I forget
whether it had record headers or it was some other quirk. Formatted
direct access with "A" formatting did the trick. (Slow, but it worked.)

That aside, I doubt you will find many systems that allow both direct
and sequential formatted access to the same file as described by the
standard. I have used such systems, but they are rare. Note the "as
described by the standard" part; that's my point. Just like with
unformatted, the problem is the record header/trailer; formatted
sequential files usually have them too - just different ones from
unformatted sequential. In the case of formatted sequential, it is most
commonly something like a CR and/or LF character at the end of the
record. If you take a formatted sequential file whose records are all
the same length, and read it as a formatted direct access file, on most
systems, you will see the CR and/or LF as part of the record data. That
is not as described by the standard. Per the standard you would see the
same record content both ways.

So what you usually see counts as an extension rather than as an
implementation that supports accessing the file both ways as described in
the standard.
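
A minimal sketch of that effect, not taken from any poster's code. It
assumes a system where a formatted sequential record ends in a single LF,
so RECL is the payload length plus one; on a CR-LF system it would be plus
two. The file name demo_seq.txt, the unit number and the 8-character
payload are made up for illustration, and whether the processor lets you
reopen the file this way at all is exactly the extension being discussed.

```fortran
program seq_as_direct
  implicit none
  integer, parameter :: u = 10               ! arbitrary unit number
  integer, parameter :: reclen = 8           ! payload characters per record (assumed)
  character(len=reclen)   :: payload
  character(len=reclen+1) :: raw             ! payload plus one terminator byte
  integer :: i, ios

  ! Write three fixed-length records as an ordinary formatted sequential file.
  open(unit=u, file='demo_seq.txt', access='sequential', form='formatted', &
       status='replace', action='write')
  do i = 1, 3
     write(payload, '(a,i2.2)') 'record', i  ! 6 + 2 = 8 characters
     write(u, '(a)') payload
  end do
  close(u)

  ! Reopen the same file as formatted direct access (an extension on most
  ! processors). RECL = reclen + 1 assumes a single LF terminator.
  open(unit=u, file='demo_seq.txt', access='direct', form='formatted', &
       recl=reclen+1, status='old', action='read')
  do i = 3, 1, -1                            ! read in reverse to show random access
     read(u, '(a)', rec=i, iostat=ios) raw
     if (ios /= 0) then
        print '(a,i0)', 'read failed, iostat = ', ios
        exit
     end if
     ! The last byte is whatever the sequential writer used as a terminator.
     print '(a,i0,a,a,a,i0)', 'rec ', i, ': "', raw(1:reclen), &
          '"  code of last byte = ', iachar(raw(reclen+1:reclen+1))
  end do
  close(u)
end program seq_as_direct
```

Where this works, the terminator shows up as the last byte of each
direct-access record (typically code 10 for LF), which is the "not as
described by the standard" behavior: the two access methods do not see the
same record content.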

-- 
Richard Maine                    | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle           |  -- Mark Twain
Reply nospam47 (9742) 4/4/2011 4:45:09 PM

Ron Shepard <ron-shepard@nospam.comcast.net> wrote:
> In article <1jz6yby.5fewkyb6qxj4N%nospam@see.signature>,
> nospam@see.signature (Richard Maine) wrote:
 
>> I have used some processors that could pull that trick, but most don't.
 
> For formatted files, I thought most systems allow both sequential 
> and direct access provided the records are all the same fixed 
> length.  I don't do this very often, so I'm not certain, but it 
> seems like a lot of people talk about that here in c.l.f.

For OS/360 through current z/OS, you should be able to read 
formatted direct files as sequential, but BDAM files must be
unblocked.  A blocked sequential file could not be read as direct.
Unformatted sequential is RECFM=VBS (variable, blocked, spanned)
and completely different from unformatted direct.
 
> It is less common for unformatted files because of the record 
> header/trailer information that is usually included for sequential 
> files.  That information is what allows backspace and partial record 
> reads to work correctly.  

The header information is part of the VBS file, but otherwise, yes.

> But I have also used file systems that did 
> allow records to be written one way and then read the other.  
> Typically, you would write the file sequentially, then after a close 
> and an open, read the records randomly.  

For OS/360, it is required to write all the records as a sequential
file first.  That is normally done by the Fortran library on
the first open to the file.  (Until you do, the records don't
exist and there would be no way to write them.)

> I forget exactly what kind 
> of magic was required in the open statement, but I do remember that 
> it involved specifying recl for the sequential file, something that 
> is generally not required (or done).

-- glen
Reply gah (12236) 4/4/2011 6:24:46 PM

In article <ind2de$fmd$1@dont-email.me>,
glen herrmannsfeldt  <gah@ugcs.caltech.edu> wrote:
>Ron Shepard <ron-shepard@nospam.comcast.net> wrote:
>> In article <1jz6yby.5fewkyb6qxj4N%nospam@see.signature>,
>> nospam@see.signature (Richard Maine) wrote:
> 
>>> I have used some processors that could pull that trick, but most don't.
> 
>> For formatted files, I thought most systems allow both sequential 
>> and direct access provided the records are all the same fixed 
>> length.  I don't do this very often, so I'm not certain, but it 
>> seems like a lot of people talk about that here in c.l.f.
>
>For OS/360 through current z/OS, you should be able to read 
>formatted direct files as sequential, but BDAM files must be
>unblocked.  A blocked sequential file could not be read as direct.

Er, not quite.  FBS ones could be, and the better Fortran run-time
systems supported them - though, if I recall, IBM's didn't.  Many
other mainframes had the equivalent of FBS, often by default.

>Unformatted sequential is RECFM=VBS (variable, blocked, spanned)
>and completely different from unformatted direct.

Oh, yes, indeed :-)

>> But I have also used file systems that did 
>> allow records to be written one way and then read the other.  
>> Typically, you would write the file sequentially, then after a close 
>> and an open, read the records randomly.  
>
>For OS/360, it is required to write all the records as a sequential
>file first.  That is normally done by the Fortran library on
>the first open to the file.  (Until you do, the records don't
>exist and there would be no way to write them.)

And a right horrible collection of hacks were used to achieve
that :-(


Regards,
Nick Maclaren.
Reply nmm12 (898) 4/4/2011 9:44:17 PM

On Apr 5, 1:46 am, nos...@see.signature (Richard Maine) wrote:
(snipped)
> So it is up to the processor. A processor might allow it, but is not
> required to. In my experience, very few processors allow it in the sense
> defined by the standard. As defined by the standard, you would see the
> same content in the records when read either way.
>
> I have used some processors that could pull that trick, but most don't.
> It basically requires that the two access methods be able to at least
> deal with each other's record structures. (On one implementation I'm
> thinking of, direct and sequential access files were by default created
> with different record structures, but the file system recorded
> information about the file's record structure and either kind of access
> could deal with whichever record structure an existing file had.)
>
> For most implementations, you can't successfully read a direct access
> file as sequential at all because it won't have the required record
> headers. You usually can read a sequential file as direct access, but
> you don't see the same records - just raw data with the sequential
> record headers coming through as part of the data. That's not the way
> the standard describes mixing access modes, so that counts as an
> extension rather than as standard conforming. It is an extension I have
> been known to use, and yes, I agree that it works with darn near all
> current systems. But it is not per the standard.
>
> Richard Maine

I looked at that above text in stark disbelief at first, then I caught
the idea that Richard was really writing about trying to read a
FORMATTED DIRECT access file in the same way as a FORMATTED SEQUENTIAL
file.

I agree that IBM and possibly other targets implementing Fortran
programs DID put bridging prefixes and postfixes in the formatted
direct files, which certainly made it difficult to treat these files
any way you wanted in the same program. I remember now having run into
this and having to write code to find and bypass those
count-and-marker tags.

However, I have NEVER written a program to write FORMATTED DIRECT
access files, EVER, in over 40 years of Fortran programming. I have
sometimes expressed my opinion here (and in the sixties up the
channels in IBM) that the "fortratted direct" access mode was an
aberration of design.

My direct access files have ALWAYS been unformatted, even if records
ended with cr-lf characters, PRECISELY, because you can always also
read these files as sequential formatted and "binary" or "Transparent"
mode (now described as stream files) as needed and convenient.
Or, more accurately, as long as no weird computer manufacturer ever
tried to implement the direct access unformatted mode as other than
pure sequential padded-to-equal length data records.

And believe me, using just these three modes of file access you can do
anything, especially by opening and closing these files in any of the
three modes within the same program.

I never have met a problem using those three modes intermixed
(formatted sequential, unformatted direct and
stream=binary=transparent sequential).

Reply tbwright (1098) 4/5/2011 4:26:39 AM

On Apr 5, 2:28 am, Ron Shepard <ron-shep...@NOSPAM.comcast.net> wrote:
> In article <1jz6yby.5fewkyb6qxj4N%nos...@see.signature>,
>  nos...@see.signature (Richard Maine) wrote:
>
> > I have used some processors that could pull that trick, but most don't.
>
> For formatted files, I thought most systems allow both sequential
> and direct access provided the records are all the same fixed
> length.  I don't do this very often, so I'm not certain, but it
> seems like a lot of people talk about that here in c.l.f.
>
> It is less common for unformatted files because of the record
> header/trailer information that is usually included for sequential
> files.  That information is what allows backspace and partial record
> reads to work correctly.  But I have also used file systems that did
> allow records to be written one way and then read the other.
> Typically, you would write the file sequentially, then after a close
> and an open, read the records randomly.  I forget exactly what kind
> of magic was required in the open statement, but I do remember that
> it involved specifying recl for the sequential file, something that
> is generally not required (or done).
>
> $.02 -Ron Shepard

A pretty fair description, and accurate as long as the direct access
was defined as unformatted (although some formatted implementations
would also work if there were no extra bytes stuffed in there by the
implementation).
Remember that you can write direct access records of length equal to
or less than the RECL parameter; padding to the full length occurs
(one way or another: defined or junk).
The "magic" needed is to put in any needed cr-lf at the end of the
direct access-written record to be able to get sequential formatted
data back out from text data written.
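
A rough sketch of that "magic", under stated assumptions: the processor
writes unformatted direct access records with no extra header bytes, and a
formatted sequential record ends in a single LF. Neither is guaranteed by
the standard. The file name demo_direct.dat, the unit number and the
8-character text width are hypothetical. INQUIRE(IOLENGTH=...) is used
because the units of RECL for unformatted files are processor dependent.

```fortran
program direct_as_seq
  implicit none
  integer, parameter :: u = 11               ! arbitrary unit number
  integer, parameter :: textlen = 8          ! text characters per record (assumed)
  character(len=textlen+1) :: rec            ! text plus one terminator byte
  character(len=textlen)   :: line
  integer :: i, ios, rl

  ! Ask the processor what RECL value holds one such record.
  inquire(iolength=rl) rec

  ! Write fixed-length records with unformatted direct access, each one
  ! carrying its own LF so the file looks like text afterwards.
  open(unit=u, file='demo_direct.dat', access='direct', form='unformatted', &
       recl=rl, status='replace', action='write')
  do i = 1, 3
     write(rec(1:textlen), '(a,i2.2)') 'record', i
     rec(textlen+1:textlen+1) = achar(10)    ! LF; a CR-LF system needs two bytes
     write(u, rec=i) rec
  end do
  close(u)

  ! Reopen as formatted sequential and read lines until IOSTAT is nonzero.
  open(unit=u, file='demo_direct.dat', access='sequential', form='formatted', &
       status='old', action='read')
  do
     read(u, '(a)', iostat=ios) line
     if (ios /= 0) exit                      ! end of file or error
     print '(a)', line
  end do
  close(u)
end program direct_as_seq
```

If the processor adds record headers to unformatted direct access files,
or uses a different line terminator, the second OPEN/READ will not see
clean text lines, which is Richard's point about this being an extension.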
Reply tbwright (1098) 4/5/2011 4:34:13 AM

Terence <tbwright@cantv.net> wrote:

> I looked at that above text in stark disbelief at first, then I caught
> the idea that Richard was really writing about trying to read a
> FORMATTED DIRECT access file in the same way as a FORMATTED SEQUENTIAL
> file.

No. It applies equally well to unformatted. I also very rarely used
formatted direct access. I did describe one rare case from several
decades ago.

As I said, but I don't think you picked up, you can pretty much always
succeed in reading anything as unformatted direct access, but that is
not what the standard says, and what usually happens is an extension -
not standard conforming.

-- 
Richard Maine                    | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle           |  -- Mark Twain
Reply nospam47 (9742) 4/5/2011 6:38:37 AM

On 25.03.11 20:25, Allamarein wrote:
> When I use the READ statement, I use data file of this kind:
>
> npts
> 25
>
> x   y
> 2   5
> 7   9
> 10 13
> ....
>
> In other words in the file the number of records is explicit (an
> INTEGER variable).
> I would not specify the number of records.
> I am reading the CVF guide and I found this:
> "err, end, eor
> Are branch specifiers if an error (ERR=label), end-of-file
> (END=label), or end-of-record (EOR=label) condition occurs.
> EOR can only be specified for nonadvancing READ statements."
>
> Could I use this optional input for my purpose?
> I have just posed this question, but I need a more complete answer.
> Thanks

I am no expert, but:
1) Write a script to convert the ASCII file to HDF5 (I would use Python
+ h5py, estimated code length < 10 lines). HDF5 takes less space and is
easier to read in Fortran, since arrays have a given size.
2) Let it run overnight, or however long. It doesn't matter for most
people. I hope you are one of them. The task is I/O bound anyway, I guess.
3) Write a nice Fortran program that asks for the array length,
allocate()s an array, and reads it from disk.
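
For comparison, the allocate-then-read idea in step 3 can also be sketched
directly on the plain ASCII file, with no npts line and no HDF5, by using
IOSTAT= (the non-branching cousin of END=) to find the end of the data.
This is only a sketch: the file name points.txt and the unit number are
hypothetical, the layout is assumed to be one "x y" header line followed by
pairs, and the program writes its own small sample file so that it runs
stand-alone.

```fortran
program read_unknown_length
  implicit none
  integer, parameter :: u = 12                          ! arbitrary unit number
  character(len=*), parameter :: fname = 'points.txt'   ! hypothetical file name
  real, allocatable :: x(:), y(:)
  real :: xt, yt
  integer :: n, i, ios

  ! Create a small sample file so the sketch is self-contained.
  open(unit=u, file=fname, status='replace', action='write')
  write(u, '(a)') 'x   y'
  write(u, '(a)') '2   5'
  write(u, '(a)') '7   9'
  write(u, '(a)') '10 13'
  close(u)

  open(unit=u, file=fname, status='old', action='read')

  ! Pass 1: skip the header line, then count records until IOSTAT becomes
  ! nonzero (negative at end of file, positive on a read error).
  read(u, *)
  n = 0
  do
     read(u, *, iostat=ios) xt, yt
     if (ios /= 0) exit
     n = n + 1
  end do

  ! Pass 2: allocate to the counted size, rewind, and read the data.
  allocate(x(n), y(n))
  rewind(u)
  read(u, *)
  do i = 1, n
     read(u, *) x(i), y(i)
  end do
  close(u)

  print '(a,i0,a)', 'read ', n, ' points'
  do i = 1, n
     print '(2f8.2)', x(i), y(i)
  end do
end program read_unknown_length
```

The two-pass approach reads the file twice, which is usually acceptable
for I/O of this kind; a one-pass alternative would have to grow the arrays
as it goes.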

Just a thought.
Paul.
Reply paul.anton.letnes (55) 4/8/2011 1:32:19 PM
