Parsing delimiter-separated data.


I'm writing a hash (string keys, string values) to a text file (via STDOUT) 
for reading later, and I decided on the following format:

key|value|
another key|another value|

to make the file clearly human-readable (the values and keys can contain 
spaces).  I've also provided for escaping ``|'' and ``\'' in the data with 
``\|'' and ``\\'' respectively.

Here's the output routine:

    foreach $key (keys(%table) ) {
        $value = $table{$key} ;
        $key   =~ s/\\/\\\\/g ;    # escape backslashes first...
        $key   =~ s/\|/\\\|/g ;    # ...then pipes
        $value =~ s/\\/\\\\/g ;
        $value =~ s/\|/\\\|/g ;
        print($key . "|" . $value . "|\n") ;
    }

and here's the input routine:

    while ($line = <>) {
        chomp($line) ;
        $line =~ /^(.*([^\\]|\\\\))\|(.*)\|$/ ;
        $key   =  $1 ;
        $value =  $3 ;
        $key   =~ s/\\\|/\|/g ;
        $key   =~ s/\\\\/\\/g ;
        $value =~ s/\\\|/\|/g ;
        $value =~ s/\\\\/\\/g ;
        $table{$key} = $value ;
    }


They seem to work, but I'm not sure how efficient they are (in particular 
I have doubts about the regexp), so I'd appreciate any suggestions for 
improvement.
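
On the efficiency point, one possible tightening of the output side, offered as a sketch rather than a measured improvement: both special characters can be escaped in a single substitution using a character class, so each field is scanned once instead of twice. The helper name escape_field and the sample %table contents are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: escape backslash and pipe in one pass.
sub escape_field {
    my ($text) = @_;
    $text =~ s/([\\|])/\\$1/g;    # put a backslash in front of each \ or |
    return $text;
}

# Sample data (an assumption, not taken from the original post).
my %table = (
    'another key' => 'a|b',
    'path'        => 'c:\\dir',   # the string c:\dir
);

# Sorting is optional; it only makes the output order repeatable.
for my $key (sort keys %table) {
    print escape_field($key), '|', escape_field($table{$key}), "|\n";
}
```

With this sample data the script prints `another key|a\|b|` followed by `path|c:\\dir|`. Because the substitution works on a copy of the field, the hash itself is left unchanged.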

I've also just noticed that the input routine would not correctly handle a 
line like this:

blah\\|blahblah|

What's the best way to reverse the escapes?


Reply a24061 (4) 11/14/2003 9:14:35 AM

Adam (a24061@yahoo.munged) wrote on MMMDCCXXVII September MCMXCIII in
<URL:news:%11tb.3616$4a5.30444640@news-text.cableinet.net>:
//  What's the best way to reverse the escapes?


What is the best way? Here is *a* way of dealing with it:

#!/usr/bin/perl

use strict;
use warnings;

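while (<DATA>) {
    chomp;
    # Each field is a run of ordinary characters, optionally interleaved
    # with backslash escapes (a backslash plus any one character).
    my ($key, $value) = /^([^\\|]*(?:\\.[^\\|]*)*)\|([^\\|]*(?:\\.[^\\|]*)*)\|$/
         or next;    # skip lines that do not fit the format
    # Drop the backslash from every escape, keeping the escaped character.
    s/\\(.)/$1/g for $key, $value;
    print "[$key] [$value]\n";
}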

__DATA__
hello|world|
he\|\|o|wor\|d|
blah\\|blahblah|


Running this gives:

[hello] [world]
[he||o] [wor|d]
[blah\] [blahblah]
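
The same pattern can be spelled out with /x and wrapped in a subroutine, which may make it easier to see why an escaped backslash followed by an escaped pipe is handled correctly. This is only a sketch under stated assumptions: the names $field and parse_line are hypothetical, and the extra test lines (a value containing a backslash followed by a pipe, an empty key, a malformed line) are not from the original post.

```perl
use strict;
use warnings;

# One field: ordinary characters, optionally interleaved with escapes.
my $field = qr{
    [^\\|]*            # a run of characters other than backslash and pipe
    (?: \\ .           # an escape: a backslash plus the character it protects
        [^\\|]* )*     # then another run of ordinary characters
}x;

# Hypothetical helper: returns (key, value), or an empty list on a bad line.
sub parse_line {
    my ($line) = @_;
    my ($key, $value) = $line =~ /^($field)\|($field)\|$/
        or return;
    s/\\(.)/$1/g for $key, $value;    # strip the escaping backslashes
    return ($key, $value);
}

# A quoted heredoc keeps every backslash exactly as typed.
my @lines = split /\n/, <<'END';
blah\\|blahblah|
k|v\\\|w|
|empty key|
no separator
END

for my $line (@lines) {
    my @pair = parse_line($line);
    if (@pair) { print "[$pair[0]] [$pair[1]]\n" }
    else       { print "skipped: $line\n" }
}
```

This prints `[blah\] [blahblah]`, `[k] [v\|w]`, `[] [empty key]` and `skipped: no separator`. The second line is the interesting one: the regex in the question appears to treat the pipe after the three backslashes as the separator, whereas consuming escapes two characters at a time keeps it inside the value.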



Abigail
-- 
sub _'_{$_'_=~s/$a/$_/}map{$$_=$Z++}Y,a..z,A..X;*{($_::_=sprintf+q=%X==>"$A$Y".
"$b$r$T$u")=~s~0~O~g;map+_::_,U=>T=>L=>$Z;$_::_}=*_;sub _{print+/.*::(.*)/s};;;
*_'_=*{chr($b*$e)};*__=*{chr(1<<$e)};                # Perl 5.6.0 broke this...
_::_(r(e(k(c(a(H(__(l(r(e(P(__(r(e(h(t(o(n(a(__(t(us(J())))))))))))))))))))))))
Reply Abigail 11/14/2003 9:35:01 AM

