Binary write file size limit?


Hi,

I have a binary write statement that errors out when I'm writing huge
files (1 Gbit +). On smaller files (~100 Mb) it works just fine. Was
wondering if there is a size limit on binary write in fortran. If so,
can I take advantage of recl= switch to avoid the problem? One extra
data point, if I switch to ascii write (formatted), same write
statement with the same data size, works ok.

Thanks.

BTW,

  OS: WinXP,
  Compiler: CVF 6.6c

Reply mdanesh1 (14) 2/9/2005 8:45:20 PM

pacman wrote:

> I have a binary write statement that errors out when I'm writing huge
> files (1 Gbit +). On smaller files (~100 Mb) it works just fine. Was
> wondering if there is a size limit on binary write in fortran. If so,
> can I take advantage of recl= switch to avoid the problem? 

It is somewhat better to try to write in smaller blocks,
and shouldn't be that much harder to code.

Are you writing one huge array out in one WRITE statement?

It might be that RECL= will allow it, but the system might 
allocate a buffer or two the size you set for RECL.

If you are writing a two (or more) dimensional array,
try it in a DO loop so that the records aren't quite as big.
Otherwise, maybe something like:

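  ! Write the full 1000-element records first
  DO I = 0, N/1000-1
     WRITE(1) BIGARRAY(1000*I+1:1000*(I+1))
  ENDDO
  ! Then write whatever is left over as one shorter record
  IF (MOD(N,1000) > 0) WRITE(1) BIGARRAY(N/1000*1000+1:N)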

You might try timing it with the smaller arrays to see how fast
each one is.
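
A minimal, self-contained sketch of the same idea, for anyone who wants to try it. The array size, chunk size, unit number, and file name are made up for illustration, and this has not been tried on CVF 6.6c:

```fortran
program chunked_write
  implicit none
  integer, parameter :: n = 2500        ! hypothetical array size
  integer, parameter :: chunk = 1000    ! hypothetical elements per record
  real :: bigarray(n)
  integer :: first, last

  call random_number(bigarray)

  ! Sequential unformatted file: each WRITE produces one record.
  open(unit=10, file='big.dat', form='unformatted', &
       status='replace', action='write')

  ! Step through the array so no record exceeds CHUNK elements;
  ! MIN() takes care of the shorter final record.
  do first = 1, n, chunk
     last = min(first + chunk - 1, n)
     write(10) bigarray(first:last)
  end do

  close(10)
end program chunked_write
```

Whatever reads the file back has to use the same record layout, that is, one READ per chunk with matching lengths.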

 > One extra
 > data point, if I switch to ascii write (formatted), same write
 > statement with the same data size, works ok.

Normally formatted output won't put the whole array on one line,
which is what determines the RECL value.

-- glen

Reply gah (12302) 2/9/2005 9:21:45 PM


In article <1107981920.033933.243360@o13g2000cwo.googlegroups.com>,
pacman <mdanesh@gmail.com> wrote:

> Was wondering if there is a size limit on binary write in fortran.

It is usually the case that there isn't any limit less than 2
gigabytes, but this depends on the compiler.

-- greg

Reply lindahl (696) 2/9/2005 10:44:07 PM

Thanks for the replies. I also think that 2GB is the limit. However, I
think I've found a solution. It turns out that if I ask for IOSTAT on
the write statement, the problem goes away. So (and this is my
interpretation) the Fortran IO sub-system was getting confused on
multiple tiny writes and was missing data. Looks like checking that the
actual write has occurred or not, forces the compiler to "do the actual
write".

Thanks.

Reply mdanesh1 (14) 2/10/2005 1:10:34 AM

"pacman" <mdanesh@gmail.com> writes:

> It turns out that if I ask for IOSTAT on
> the write statement, the problem goes away. So (and this is my
> interpretation) the Fortran IO sub-system was getting confused on
> multiple tiny writes and was missing data. Looks like checking that the
> actual write has occurred or not, forces the compiler to "do the actual
> write".

You said just that you "ask for iostat". Do you also check the
value? If iostat returns a zero value, then I don't have a handy
explanation of the difference - though I'd tend to suspect something
else has changed - perhaps even something not obviously related.
While many things are possible, I wouldn't put "iostat reminded the
runtimes to finish the job" as very high on the list of likely
explanations.

But if iostat returns a nonzero value, then you are just seeing normal
iostat functionality. When you specify iostat, you are supposed to
get an iostat value returned instead of an abort.  This does *NOT* mean
that the problem is solved - just that the information about the
problem is returned in a different form.
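
To illustrate, here is a small sketch of what checking the value might look like. The unit number, file name, and buffer are hypothetical, and IOMSG= is a Fortran 2003 feature that an older compiler may not accept:

```fortran
program check_iostat
  implicit none
  integer :: ios
  character(len=256) :: msg
  real :: buf(1000)

  buf = 1.0
  open(unit=11, file='test.dat', form='unformatted', &
       status='replace', action='write', iostat=ios, iomsg=msg)
  if (ios /= 0) then
     print *, 'open failed, iostat = ', ios, ': ', trim(msg)
     stop 1
  end if

  ! With IOSTAT= present, a failed WRITE no longer aborts the program;
  ! it just sets IOS to a nonzero value, so the value must be tested.
  write(11, iostat=ios, iomsg=msg) buf
  if (ios /= 0) then
     print *, 'write failed, iostat = ', ios, ': ', trim(msg)
     stop 1
  end if

  close(11)
end program check_iostat
```

If the test after the WRITE is left out, a failure passes silently and the program carries on with an incomplete file.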

-- 
Richard Maine
email: my last name at domain
domain: summertriangle dot net
Reply nospam47 (9742) 2/10/2005 2:20:16 AM

I do check the IOSTAT value, and its value is always zero, and I
understand that the check only returns the status of the write attempt.
But at the same time, I can replicate the error by removing the IOSTAT
and "fix it" by putting it back. Very strange ...

Reply mdanesh1 (14) 2/10/2005 5:39:44 AM

pacman wrote:
> 
> I do check the IOSTAT value, and its value is always zero, and I
> understand that the check only returns the status of the write attempt.
> But at the same time, I can replicate the error by removing the IOSTAT
> and "fix it" by putting it back. Very strange ...

Perhaps it is time for you to post code that exhibits the behaviour - 
if that is possible.

FYI:
The 2GB limit exists because on most current systems a (signed) integer 
of 4 bytes is used to address the position within a file (the maximum
value is then 2**31-1 ~ 2.1E9). But most OSs and compilers also offer
a way out of this, because files of such sizes are much more common
than a few years ago :).
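
The arithmetic can be checked with a short sketch like the one below. It assumes a compiler that provides the INT32 and INT64 kinds from ISO_FORTRAN_ENV (Fortran 2008), which a compiler of CVF's vintage will not have:

```fortran
program offset_limits
  use iso_fortran_env, only: int32, int64
  implicit none

  ! Largest value of a signed 4-byte integer: 2**31 - 1 = 2147483647,
  ! i.e. just under 2 GB when used as a byte offset into a file.
  print *, 'largest 4-byte signed offset: ', huge(0_int32)

  ! Largest value of a signed 8-byte integer: 2**63 - 1.
  print *, 'largest 8-byte signed offset: ', huge(0_int64)
end program offset_limits
```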

One other thing: is the file written on a local disk or on a network
disk? Some networks can not handle such large files (even though
the OS and the compiler can).

Regards,

Arjen
Reply arjen.markus (2628) 2/10/2005 7:52:28 AM

Arjen Markus wrote:
> pacman wrote:

>>I do check the IOSTAT value, and its value is always zero, and I
>>understand that the check only returns the status of the write attempt.
>>But at the same time, I can replicate the error by removing the IOSTAT
>>and "fix it" by putting it back. Very strange ...

> Perhaps it is time for you to post code that exhibits the behaviour - 
> if that is possible.

Good idea.

Maybe I missed the point of the question.  Since RECL= was mentioned,
I thought it was trying to write the whole file as one record.
RECL= specifies the maximum record size, which for UNFORMATTED files is 
that written by one WRITE statement.  I would keep RECL much lower
than 2GB, though 100K might not be too big.

> FYI:
> The 2GB limit exists because on most current systems a (signed) integer 
> of 4 bytes is used to address the position within a file (the maximum
> value is then 2**31-1 ~ 2.1E9). But most OSs and compilers also offer
> a way out of this, because files of such sizes are much more common
> than a few years ago :).

It is more complicated than that.  Many OS have no trouble writing 
larger files, but the C library uses int (the C signed integer type) for 
file positioning, the argument of the fseek() function and return value 
of ftell().  If a program uses those functions for positioning, it could 
overwrite previously written data without any indication that it did so.
To prevent accidental overwrite, an attempt to do so signals a fatal 
error.   As many Fortran libraries use the C library to actually do I/O,
this is propagated to Fortran.

The ability to seek is enough, it is not required that the program 
actually do it.  I have had programs that read stdin/write stdout that 
would fail, but

cat infile | program | cat > outfile

would work, as the pipe is not seekable.  That may not be a solution
for the OP, though.   The official solution for those systems is to give 
the program the LARGEFILES attribute (system dependent), but be sure
that it really doesn't do any seeks.

> One other thing: is the file written on a local disk or on a network
> disk? Some networks can not handle such large files (even though
> the OS and the compiler can).

I have also seen it work the other way around.

-- glen

Reply gah (12302) 2/10/2005 8:13:49 AM

In article <L-CdnY8itY4pipbfRVn-vQ@comcast.com>,
glen herrmannsfeldt  <gah@ugcs.caltech.edu> wrote:

>It is more complicated than that.  Many OS have no trouble writing 
>larger files, but the C library uses int (the C signed integer type) for 
>file positioning, the argument of the fseek() function and return value 
>of ftell().  If a program uses those functions for positioning, it could 
>overwrite previously written data without any indication that it did so.
>To prevent accidental overwrite, an attempt to do so signals a fatal 
>error.   As many Fortran libraries use the C library to actually do I/O,
>this is propagated to Fortran.

This is a half-truth at best. First off, it's a long, not an int, and
second off, there are ANSI-standard functions which can be used
instead of fseek() and ftell() which are 64-bits in most OSes,
including 32-bit OSes.

> The official solution for those systems is to give 
> the program the LARGEFILES attribute (system dependent), but be sure
> that it really doesn't do any seeks.

This generally needs to be given when compiling the Fortran
library. If the library (generally written in C) isn't correct, there
usually isn't anything you can do as a Fortran user.

-- greg

Reply lindahl (696) 2/10/2005 8:38:58 AM

Greg Lindahl wrote:

> In article <L-CdnY8itY4pipbfRVn-vQ@comcast.com>,
> glen herrmannsfeldt  <gah@ugcs.caltech.edu> wrote:

>>It is more complicated than that.  Many OS have no trouble writing 
>>larger files, but the C library uses int (the C signed integer type) for 
>>file positioning, the argument of the fseek() function and return value 
>>of ftell().  If a program uses those functions for positioning, it could 
>>overwrite previously written data without any indication that it did so.
>>To prevent accidental overwrite, an attempt to do so signals a fatal 
>>error.   As many Fortran libraries use the C library to actually do I/O,
>>this is propagated to Fortran.

> This is a half-truth at best. First off, it's a long, not an int, and
> second off, there are ANSI-standard functions which can be used
> instead of fseek() and ftell() which are 64-bits in most OSes,
> including 32-bit OSes.

Yes, it is a long, but then since Alpha no-one seems to want to 
make a C compiler where long is longer than 32 bits.

Some systems have fseek64(), and some have fseeko(), I don't 
know that either is ANSI, but they might be.  In any case, there 
is not a lot of code using them.

>>The official solution for those systems is to give 
>>the program the LARGEFILES attribute (system dependent), but be sure
>>that it really doesn't do any seeks.

> This generally needs to be given when compiling the Fortran
> library. If the library (generally written in C) isn't correct, there
> usually isn't anything you can do as a Fortran user.

-- glen

Reply gah (12302) 2/10/2005 6:14:41 PM

> Yes, it is a long, but then since Alpha no-one seems to want to
> make a C compiler where long is longer than 32 bits.

Every C compiler but one that I have seen for a 64-bit environment,
including SPARC V9, MIPS, PowerPC G5, and AMD64, makes
longs 64 bits long.  The compilers for 32-bit environments make
longs 32 bits, but that makes sense for a 32-bit environment.
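
One way to see what a given platform does is to ask the Fortran compiler about its companion C compiler's types. This is only a sketch, assuming a compiler with ISO_C_BINDING (Fortran 2003); the numbers printed depend on the platform and its data model, so none are shown here:

```fortran
program long_size
  use iso_c_binding, only: c_int, c_long, c_long_long
  implicit none

  ! BIT_SIZE reports the number of bits in each integer kind, so this
  ! shows how wide the companion C compiler's types are on this system.
  print *, 'bits in C int:       ', bit_size(0_c_int)
  print *, 'bits in C long:      ', bit_size(0_c_long)
  print *, 'bits in C long long: ', bit_size(0_c_long_long)
end program long_size
```

If C long comes out as 32 bits, an fseek()/ftell() style offset held in a long tops out at 2**31-1 bytes.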


    Sincerely,

    Bob Corbett

Reply robert.corbett2 (862) 2/11/2005 8:41:06 AM

Pacman:  The Lahey compiler IOCS does not
depend on the limit of the OS. The following is from Lahey Support. 

From the Lahey Windows and Linux users guide(s), Limits of Operation section:
I/O file size (including transparent access)
 18,446,744,073,709,551,614 bytes (2**64 - 2) or 16 Exabytes - 2
 
The Lahey user's guides are available online at http://www.lahey.com/doc.htm

Skip Knoble

On 9 Feb 2005 12:45:20 -0800, "pacman" <mdanesh@gmail.com> wrote:

-|Hi,
-|
-|I have a binary write statement that errors out when I'm writing huge
-|files (1 Gbit +). On smaller files (~100 Mb) it works just fine. Was
-|wondering if there is a size limit on binary write in fortran. If so,
-|can I take advantage of recl= switch to avoid the problem? One extra
-|data point, if I switch to ascii write (formatted), same write
-|statement with the same data size, works ok.
-|
-|Thanks.
-|
-|BTW,
-|
-|  OS: WinXP,
-|  Compiler: CVF 6.6c

Reply SkipKnobleLESS1 (689) 2/11/2005 1:17:15 PM

robert.corbett@sun.com wrote:
> > Yes, it is a long, but then since Alpha no-one seems to want to
> > make a C compiler where long is longer than 32 bits.

> Every C compiler but one that I have seen for a 64-bit environment,
> including SPARC V9, MIPS, PowerPC G5, and AMD64, makes
> longs 64 bits long.  The compilers for 32-bit environments make
> longs 32 bits, but that makes sense for a 32-bit environment.

Google for "LLP64" and weep.


-- 
pa at panix dot com
Reply pa1184 (387) 2/11/2005 3:42:15 PM
