q on 'record', 'field' separators in text files

What characters if any are typically used to delimit 'fields' and 
'records' in a text file?

I'm working on a rewrite of an app that stores some data in text files 
and uses '|' and "\n" as 'field' and 'record' separators, respectively.  
I want to replace these and have been thinking of using "\e" and "\0", 
respectively, to allow data in a field to use the current two 
separators.  Would it be ill-advised in particular to use \0?  (The app 
currently runs on *nix, in case this is important.)

thanks
spark
sparkane
12/6/2003 2:58:52 PM
comp.lang.perl.misc

On Sat, 6 Dec 2003 08:58:52 -0600
sparkane <nertz@numb.no> wrote:

> What characters if any are typically used to delimit 'fields' and 
> 'records' in a text file?
> 
> I'm working on a rewrite of an app that stores some data in text
> files and uses '|' and "\n" as 'field' and 'record' separators,
> respectively.  I want to replace these and have been thinking of
> using "\e" and "\0", respectively, to allow data in a field to use
> the current two separators.  Would it be ill-advised in particular
> to use \0?  (The app currently runs on *nix, in case this is
> important.)

More often than not, ':' and ',' are used for text (aka CSV or "flat
file") databases.  Use of other characters *may* lead to issues you
have not thought of, mostly around maintenance.  You might leave your
organization, or start a project and run out of time to maintain it,
and someone else has to pick up where you left off.  Or it is just
Father Time: I know I have looked at code I wrote a few years ago,
found I had done something "funky", and had no idea what the logic
behind it was.  I code much better now, so I don't run into that
situation too often, but it is something to consider.

Just a suggestion - you *may* want to consider using DBI for database
functionality.  One driver available is DBD::CSV - which *may* fit the
bill for you.
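
DBD::CSV is a Perl module and is not shown here. As a rough,
language-neutral illustration of the same idea (let a CSV library do
the quoting so fields can contain the delimiter and newlines), here is
a minimal sketch using Python's standard csv module. The function
names dump_records and load_records are made up for this example.

```python
import csv
import io


def dump_records(records):
    """Serialize rows to CSV text; the writer quotes fields only when needed."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, delimiter=",", quoting=csv.QUOTE_MINIMAL)
    writer.writerows(records)
    return buf.getvalue()


def load_records(text):
    """Parse CSV text back into a list of rows (each a list of strings)."""
    return [row for row in csv.reader(io.StringIO(text, newline=""))]


if __name__ == "__main__":
    rows = [
        ["spark", "likes a|b, c", "line one\nline two"],
        ["x", "", 'say "hi"'],
    ]
    # Fields containing commas, pipes, quotes and newlines survive a round trip.
    assert load_records(dump_records(rows)) == rows
```

The trade-off is that a quoted field may span several physical lines,
so the file can no longer be treated as strictly one record per line.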

HTH

-- 
Jim

Copyright notice: all code written by the author in this post is
 released under the GPL. http://www.gnu.org/licenses/gpl.txt 
for more information.

a fortune quote ...
It has been said that man is a rational animal.  All my life I
have been searching for evidence which could support this.   -- 
Bertrand Russell 
James
12/6/2003 10:12:50 PM
sparkane <nertz@numb.no> said:
>What characters if any are typically used to delimit 'fields' and 
>'records' in a text file?

This doesn't matter - as you can nominate a third character to be
used as an "escape" character. This'd typically be '\', and is used
to strip special meanings off characters (at minimum, the two separators
_and_ the escape character itself).
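
A minimal sketch of that escaping scheme, in Python for illustration
and assuming the original '|' and "\n" separators with '\' as the
escape character. The names escape_field, encode and decode are
hypothetical, not taken from any existing app.

```python
FIELD_SEP = "|"
RECORD_SEP = "\n"
ESCAPE = "\\"
SPECIAL = (FIELD_SEP, RECORD_SEP, ESCAPE)


def escape_field(value):
    """Prefix both separators and the escape character itself with ESCAPE."""
    out = []
    for ch in value:
        if ch in SPECIAL:
            out.append(ESCAPE)
        out.append(ch)
    return "".join(out)


def encode(records):
    """Join escaped fields with FIELD_SEP; end every record with RECORD_SEP."""
    return "".join(
        FIELD_SEP.join(escape_field(f) for f in rec) + RECORD_SEP
        for rec in records
    )


def decode(text):
    """Single pass over the text; an escaped character is always taken literally."""
    records, fields, current = [], [], []
    escaped = False
    for ch in text:
        if escaped:
            current.append(ch)
            escaped = False
        elif ch == ESCAPE:
            escaped = True
        elif ch == FIELD_SEP:
            fields.append("".join(current))
            current = []
        elif ch == RECORD_SEP:
            fields.append("".join(current))
            records.append(fields)
            fields, current = [], []
        else:
            current.append(ch)
    if escaped:
        raise ValueError("dangling escape character at end of input")
    if current or fields:
        # Tolerate a final record that lacks its terminating separator.
        fields.append("".join(current))
        records.append(fields)
    return records
```

Note that the reader has to scan character by character: a plain
split on '|' would also split on escaped separators.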
-- 
Wolf  a.k.a.  Juha Laiho     Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
         PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)
Juha
12/8/2003 5:37:01 PM
sparkane <nertz@numb.no> writes:

> What characters if any are typically used to delimit 'fields' and 
> 'records' in a text file?

Commas, whitespace, pipes, newlines, double-newlines...

> I'm working on a rewrite of an app that stores some data in text files 
> and uses '|' and "\n" as 'field' and 'record' separators, respectively.  
> I want to replace these and have been thinking of using "\e" and
> "\0", 

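If you are going to use non-printable characters, why not just use the
separator characters that ASCII[1] already defines for this: unit
separator "\x1f" between fields and record separator "\x1e" between
records?  (ASCII's "\x1c" is FS, the file separator, which ranks above
groups and records.)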

This, of course, has nothing to do with Perl.

[1] Or Unicode, same thing in this case - codepoints 0-0x7F of Unicode
are ASCII.
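
A minimal sketch of the ASCII-separator approach, in Python for
illustration. ASCII names 0x1F the unit separator (US) and 0x1E the
record separator (RS), while 0x1C is the file separator, so this
sketch puts US between fields and RS after each record. The function
names are hypothetical, and rejecting data that already contains a
separator is an assumption made for the example, not something
proposed in the thread.

```python
UNIT_SEP = "\x1f"    # ASCII US: separates fields within a record
RECORD_SEP = "\x1e"  # ASCII RS: terminates each record


def encode(records):
    """Join fields with US and terminate every record with RS."""
    for rec in records:
        for field in rec:
            if UNIT_SEP in field or RECORD_SEP in field:
                raise ValueError("field contains a separator character")
    return "".join(UNIT_SEP.join(rec) + RECORD_SEP for rec in records)


def decode(text):
    """Split on RS, drop the empty piece after the final RS, then split on US."""
    parts = text.split(RECORD_SEP)
    if parts and parts[-1] == "":
        parts.pop()
    return [part.split(UNIT_SEP) for part in parts]


if __name__ == "__main__":
    rows = [["a|b", "line one\nline two"], ["", "z"]]
    # '|' and newline are now ordinary data and need no escaping.
    assert decode(encode(rows)) == rows
```

Since no escaping is involved, reading is just two splits, at the
cost of a file that is awkward to view or edit in a text editor.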
        
-- 
     \\   ( )
  .  _\\__[oo
 .__/  \\ /\@
 .  l___\\
  # ll  l\\
 ###LL  LL\\
Brian
12/11/2003 5:19:22 PM