maximum file record lengths?


What have folks seen as typical/practical (but especially, portable) 
maximum for text-file record lengths (eg delimited by LF)  
across various Unix systems?  

32,768, 65,535, etc., up to 2GB? (further?)


Reply Joe 4/11/2004 4:51:48 AM


Joe Bloggs wrote:
> 
> What have folks seen as typical/practical (but especially, portable)
> maximum for text-file record lengths (eg delimited by LF)
> across various Unix systems?

There are no records in UNIX. Lines can have any length,
but programs have the right to abort on too long lines...
Reply Lorinczy 4/11/2004 8:11:28 AM


> What have folks seen as typical/practical (but especially, portable) 
> maximum for text-file record lengths (eg delimited by LF)  
> across various Unix systems?  
> 
> 32,768, 65,535, etc., up to 2GB? (further?)

Text files, like binary files, contain a series of bytes interspersed with 
line feeds. There is no inherent maximum line length.

-- 
Jem Berkes
http://www.sysdesign.ca/
Reply Jem 4/11/2004 5:22:48 PM

On Sun, 11 Apr 2004 10:11:28 +0200, Lorinczy Zsigmond / Domonyik
Mariann <lzsiga@axelero.hu> wrote:

>Joe Bloggs wrote:
>> 
>> What have folks seen as typical/practical (but especially, portable)
>> maximum for text-file record lengths (eg delimited by LF)
>> across various Unix systems?
>
>There are no records in UNIX. Lines can have any length,
>but programs have the right to abort on too long lines...

So, no real limits on line lengths, in practice, then?

I was mostly wondering if there was any rule of thumb folks use...

(for example, lines longer than N will make sed, grep, etc.
 on XYZ Unix blow chunks)

That, and/or whether there is some POSIX definition for this.


Reply Joe 4/12/2004 10:37:11 PM

On Sun, 11 Apr 2004 04:51:48 GMT Joe Bloggs <JBloggs@acme.com> wrote:

| What have folks seen as typical/practical (but especially, portable) 
| maximum for text-file record lengths (eg delimited by LF)  
| across various Unix systems?  
| 
| 32,768, 65,535, etc., up to 2GB? (further?)

While there are many programs that have fixed-length buffers and as such are
limited in capability, other programs impose no such limits and
deal with whatever length they can.  One example is the GNU make program.  It
will handle lines as long as you can give it.  If you need to have a dependency
on 30 million files for one target, it will handle it if you have enough
memory for it to build the internal structures.  It can do this by processing
the names as it gets them, long before it ever sees the newline.  I don't know
for sure that it does it this way, but it is plausible.  It could instead read
the whole line before processing.  But either way, it has no hard-coded limits.

I suggest finding a way to handle things as dynamically as possible.  If,
for example, you are reading words on a line, and need to count the words,
then just count them as you read them, looking for that newline as you go.
I may be able to give more specific ideas if I know more about what you are
trying to do.  I did write a function to parse words, quoted strings, and
other things, from an input stream.  It returns what it gets as it gets it.
One of the "things" it can return is an indication that it hit a newline.
The caller may not even care about newlines and just ignore that.
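
A minimal sketch of the count-as-you-read idea above, written in Python
purely for illustration (the thread does not name a language, and this is
not the function described in the post).  The name count_words_per_line
and the chunk_size parameter are hypothetical.  It reads fixed-size chunks,
so no line is ever held in memory as a whole:

```python
import io


def count_words_per_line(stream, chunk_size=4096):
    """Yield the number of words on each line of a binary stream.

    The input is consumed in fixed-size chunks and scanned byte by byte,
    so line length is limited only by the input, not by a buffer size.
    """
    words = 0         # words seen so far on the current line
    in_word = False   # True while the scanner is inside a word
    saw_data = False  # True once the current line has at least one byte
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for byte in chunk:
            if byte == 0x0A:  # LF terminates the line
                yield words
                words, in_word, saw_data = 0, False, False
            else:
                saw_data = True
                if byte in b" \t\r\v\f":
                    in_word = False
                elif not in_word:
                    in_word = True
                    words += 1
    if saw_data:  # last line had no trailing LF
        yield words


if __name__ == "__main__":
    sample = io.BytesIO(b"one two  three\n\nlast line")
    print(list(count_words_per_line(sample, chunk_size=3)))
```

A caller that does not care about line boundaries could simply sum the
yielded counts.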

-- 
-----------------------------------------------------------------------------
| Phil Howard KA9WGN       | http://linuxhomepage.com/      http://ham.org/ |
| (first name) at ipal.net | http://phil.ipal.org/   http://ka9wgn.ham.org/ |
-----------------------------------------------------------------------------
Reply phil 4/12/2004 10:57:57 PM

On Mon, 12 Apr 2004, Joe Bloggs wrote:

> So, no real limits on line lengths, in practice, then?

None in THEORY, no.

> I was mostly wondering if there was any rule of thumb folks use...
>
> (for example, lines longer than N will make sed, grep, etc.
>  on XYZ Unix blow chunks)

Ah, well there you're getting into the realm of what an application
decides to do.  UNIX itself has no concept of a "line length" (outside
the scope of terminals), but application writers typically define
the maximum length of a line they want to support.  So if you write
an application, you might take a decision to support lines less than
1024 characters in length.

-- 
Rich Teer, SCNA, SCSA

President,
Rite Online Inc.

Voice: +1 (250) 979-1638
URL: http://www.rite-online.net
Reply Rich 4/12/2004 11:08:05 PM

In article <Pine.SOL.4.58.0404121605040.3314@zaphod.rite-group.com>,
 Rich Teer <rich.teer@rite-group.com> wrote:

> On Mon, 12 Apr 2004, Joe Bloggs wrote:
> 
> > So, no real limits on line lengths, in practice, then?
> 
> None in THEORY, no.
> 
> > I was mostly wondering if there was any rule of thumb folks use...
> >
> > (for example, lines longer than N will make sed, grep, etc.
> >  on XYZ Unix blow chunks)
> 
> Ah, well there you're getting into the realm of what an application
> decides to do.  UNIX itself has no concept of a "line length" (outside
> the scope of terminals), but application writers typically define
> the maximum length of a line they want to support.  So if you write
> an application, you might take a decision to support lines less than
> 1024 characters in length.

He's not writing an application; he's using existing applications and
wants to know what line lengths are likely to choke them.

In my experience, many Unix applications have limits of 1K or 2K.  But 
if you use the GNU versions of these applications, they often have no 
hard-coded limit at all.
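
One way to check a file against limits like these before handing it to a
tool is to measure its longest line.  The following is only a sketch in
Python; the helper name longest_line_length and the chunk_size parameter
are hypothetical, and the 1024 used in the demo is just the "1K" figure
mentioned above, not a documented limit of any particular utility:

```python
import io


def longest_line_length(stream, chunk_size=65536):
    """Return the length in bytes of the longest line, counting its LF.

    The stream is read in chunks, so even a multi-gigabyte line is
    measured without being loaded into memory in one piece.
    """
    longest = 0
    current = 0  # bytes counted so far on the current line
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        start = 0
        while True:
            nl = chunk.find(b"\n", start)
            if nl == -1:
                # No more newlines in this chunk; the line continues.
                current += len(chunk) - start
                break
            current += nl - start + 1  # include the LF itself
            longest = max(longest, current)
            current = 0
            start = nl + 1
    # Account for a final line that lacks a trailing LF.
    return max(longest, current)


if __name__ == "__main__":
    sample = io.BytesIO(b"short\n" + b"x" * 3000 + b"\n")
    length = longest_line_length(sample)
    if length > 1024:
        print("longest line is", length, "bytes; some tools may choke")
```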

-- 
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** Please don't copy me on replies. ***
Reply Barry 4/13/2004 4:24:24 AM

"Joe Bloggs" <JBloggs@acme.com> wrote, on Mon, 12 Apr 2004:

>>There are no records in UNIX. Lines can have any length,
>>but programs have the right to abort on too long lines...
> 
> So, no real limits on line lengths, in practice, then?
> 
> I was mostly wondering if there was any rule of thumb folks use...
> 
> (for example, lines longer than N will make sed, grep, etc.
>  on XYZ Unix blow chunks)
> 
> That, and/or whether there is some POSIX definition for this.

POSIX defines a limit called LINE_MAX and requires utilities that
read text files to accept lines of at least that length (including
the newline terminator).  The minimum value it allows for LINE_MAX
is 2048.

So as far as standards-compliant systems are concerned the general
rule of thumb is: keep line lengths to 2048 bytes or less (newline
included).  If you want a specific rule for a particular system, use
getconf LINE_MAX to find out the limit for that system (or use
sysconf(_SC_LINE_MAX) in C code).
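
The same lookup is available from Python through the standard-library call
os.sysconf("SC_LINE_MAX").  The sketch below is illustrative only: the
helper name line_max and the fallback to 2048 when the value cannot be
queried are assumptions, not something POSIX or this post prescribes.

```python
import os

# Smallest value POSIX allows for LINE_MAX (as stated above).
POSIX2_LINE_MAX = 2048


def line_max():
    """Return this system's LINE_MAX, or 2048 if it cannot be queried."""
    try:
        value = os.sysconf("SC_LINE_MAX")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing (non-Unix) or the name is not supported.
        return POSIX2_LINE_MAX
    # A non-positive result means the limit is indeterminate here.
    return value if value > 0 else POSIX2_LINE_MAX


if __name__ == "__main__":
    print("LINE_MAX:", line_max())
```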

There may be ancient versions of utilities that barf on lines shorter
than 2048 bytes.  (These could be either utilities on ancient systems, or
historical utilities on modern systems, such as those you get on
Solaris if you don't set your PATH properly.  Always put /usr/xpg4/bin
before other system directories, or better still put $(getconf PATH)
near the front of your PATH, as that works for all POSIX systems.)
 
-- 
Geoff Clare <nospam@gclare.org.uk>
Reply Geoff 4/15/2004 12:40:20 PM
