extract parts of file - newbie

Hello. I'm new to Perl and trying to figure out whether there is a
better way to do the following (in ActiveState Perl under Windows 2000):

I have this DOS text file with about 20,000 lines. In the simple
example below I can extract lines that contain a particular string.

$db = "work.txt"; 
open (FILE,"$db"); 
@LINES=<FILE>; 
close(FILE); 
$SIZE=@LINES; 
print $SIZE,"\n";
for ($i=0;$i<=$SIZE;$i++) 
{ 
   $_=$LINES[$i]; 
   if (/motion/i)
   {print "$_";}
} 


How can I extract:

1. 5 lines before and after the string
2. Columns positions 5-15 (for all selected)
3. Limit selection to rows 5000-7000
4. The last 5 lines of the entire file

Many Thanks for any help or information!!
Reply jason 2/16/2004 3:59:26 PM

jason@cyberpine.com wrote:
> In the simple example below I can extract lines that contain a
> particular string.
> 
> $db = "work.txt"; 
> open (FILE,"$db"); 
> @LINES=<FILE>; 
> close(FILE); 
> $SIZE=@LINES; 
> print $SIZE,"\n";
> for ($i=0;$i<=$SIZE;$i++) 
> { 
>    $_=$LINES[$i]; 
>    if (/motion/i)
>    {print "$_";}
> } 
> 
> How can I extract:
> 
> 1. 5 lines before and after the string
> 2. Columns positions 5-15 (for all selected)
> 3. Limit selection to rows 5000-7000
> 4. The last 5 lines of the entire file

By using your imagination and possibly learning a little more Perl.

What have you tried so far? What difficulties did you run into that
you could not solve with the help of the documentation and the FAQ?

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Reply Gunnar 2/16/2004 4:02:10 PM


jason@cyberpine.com wrote:
> 
> Hello. I'm new to Perl and trying to figure out whether there is a
> better way to do the following (in ActiveState Perl under Windows 2000):
> 
> I have this DOS text file with about 20,000 lines. In the simple
> example below I can extract lines that contain a particular string.
> 
> $db = "work.txt";
> open (FILE,"$db");
> @LINES=<FILE>;
> close(FILE);
> $SIZE=@LINES;
> print $SIZE,"\n";
> for ($i=0;$i<=$SIZE;$i++)
> {
>    $_=$LINES[$i];
>    if (/motion/i)
>    {print "$_";}
> }

A more Perl-ish version of that would be:

use warnings;
use strict;

my $db = 'work.txt';
open FILE, $db or die "Cannot open $db: $!";
my @lines = <FILE>;
close FILE;
print @lines . "\n";
for ( @lines )
{
   print if /motion/i;
}


> How can I extract:
> 
> 1. 5 lines before and after the string

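for my $i ( 0 .. $#lines )
{
   next unless $lines[$i] =~ /motion/i;   # test the line at $i, not $_
   my $first = $i < 5 ? 0 : $i - 5;       # do not wrap around to the end
   my $last  = $i + 5 > $#lines ? $#lines : $i + 5;
   print @lines[ $first .. $last ];
}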


> 2. Columns positions 5-15 (for all selected)

for ( @lines )
{
   print substr( $_, 4, 11 ), "\n" if /motion/i;
}


> 3. Limit selection to rows 5000-7000

for ( @lines[ 4999 .. 6999 ] )
{
   print if /motion/i;
}


> 4. The last 5 lines of the entire file

for ( @lines[ $#lines - 4 .. $#lines ] )
{
   print if /motion/i;
}



John
-- 
use Perl;
program
fulfillment
Reply John 2/16/2004 9:58:52 PM

jason@cyberpine.com wrote:

> I have this DOS text file with about 20,000 lines. 

[snip]

> How can I extract:
> 
> 1. 5 lines before and after the string

Search the Google usenet archives; you'll find a number of solutions. In 
particular, see the thread starting with a post by Tom Christiansen, 
message ID 37e1043e@cs.colorado.edu. (I searched for the words "before 
after lines match" (but not as a phrase), and TC's thread was the first 
match. Lots of other hits, too.)

> 2. Columns positions 5-15 (for all selected)

perldoc -f substr

> 3. Limit selection to rows 5000-7000

Check out $. in perlvar and read the section on range operators in perlop.
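
As a rough sketch of the $. idea (not necessarily what anyone in this
thread had in mind): the sub name grep_line_range and the in-memory test
data are made up for illustration. With literal line numbers the idiom
from perlop is simply "print if 5000 .. 7000"; plain comparisons against
$. are used here because the bounds are variables and because they let
the loop stop reading once it is past the range.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines numbered $from..$to (1-based,
# inclusive) that match $re, reading one line at a time.
sub grep_line_range {
    my ($fh, $from, $to, $re) = @_;
    my @hits;
    while ( my $line = <$fh> ) {
        next if $. < $from;    # $. is the current input line number
        last if $. > $to;      # past the range, stop reading
        push @hits, $line if $line =~ $re;
    }
    return @hits;
}

# Demo on an in-memory file rather than work.txt (made-up data).
my $text = '';
$text .= "line $_ " . ( $_ % 2 ? "motion" : "still" ) . "\n" for 1 .. 10;
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print grep_line_range( $fh, 3, 7, qr/motion/i );
close $fh;
```

With the demo data above this should print lines 3, 5 and 7. For the
original question the call would be something like
grep_line_range( $fh, 5000, 7000, qr/motion/i ).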

> 4. The last 5 lines of the entire file

Left as an exercise...  :-)
Reply David 2/16/2004 10:44:06 PM

jason@cyberpine.com <jason@cyberpine.com> wrote:

> New to Perl 


We can tell that from the code.  :-)


> open (FILE,"$db"); 


You should not quote a lone variable.

You should always, yes *always*, check the return value from open():

    open(FILE, $db) or die "could not open '$db' $!";


> @LINES=<FILE>; 
> close(FILE); 
> $SIZE=@LINES; 
> print $SIZE,"\n";
> for ($i=0;$i<=$SIZE;$i++) 
> { 
>    $_=$LINES[$i]; 


Phew!

Don't read it ALL into memory only to process it line-by-line,
just read and process a line at a time.

If you do that, you can replace that whole chunk of code with just this:

   while ( <FILE> ) {


>    if (/motion/i)
>    {print "$_";}
            ^  ^
            ^  ^ more useless quotes, remove them
> } 
> 
> 
> How can I extract:
> 
> 1. 5 lines before and after the string


Oh. _Now_ you might want them all in an array.  :-)

   foreach my $index ( $i - 5 .. $i + 5 ) {
      print $LINES[$index];
   }

Or you could use an "array slice" (see perldata.pod):

   print @LINES[ $i - 5 .. $i + 5 ];


What do you want to do if the matched line is in the first or
last 5 lines? ...

You could still process line-by-line if you maintained a 5-line buffer
of the previous lines.
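
A minimal sketch of that buffer idea, assuming you want overlapping
windows merged so that no line comes out twice. The sub name
grep_context and the demo data are invented for illustration; near the
start or end of the file it simply returns whatever context exists.

```perl
use strict;
use warnings;

# Hypothetical helper: return every line within $n lines of a match,
# reading one line at a time and never returning a line twice.
sub grep_context {
    my ($fh, $re, $n) = @_;
    my (@before, @out);
    my $after = 0;                      # lines still owed after the last match
    while ( my $line = <$fh> ) {
        if ( $line =~ $re ) {
            push @out, @before, $line;  # flush the buffered "before" lines
            @before = ();
            $after  = $n;
        }
        elsif ( $after > 0 ) {
            push @out, $line;           # trailing context
            $after--;
        }
        else {
            push @before, $line;
            shift @before if @before > $n;   # keep only the last $n lines
        }
    }
    return @out;
}

# Demo on an in-memory file (made-up data): only row 6 matches.
my $text = '';
$text .= "row $_" . ( $_ == 6 ? " motion" : "" ) . "\n" for 1 .. 12;
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print grep_context( $fh, qr/motion/i, 2 );
close $fh;
```

With a context of 2 the demo should print rows 4 through 8. For the
original question, pass 5 as the last argument.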


> 2. Columns positions 5-15 (for all selected)


   print substr($_, 4, 11), "\n" if /motion/i;


> 3. Limit selection to rows 5000-7000


   my @selected = @LINES[ 4999 .. 6999 ];   # rows 5000-7000; indices start at 0


> 4. The last 5 lines of the entire file


   print @LINES[ $#LINES-4 .. $#LINES ];


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas
Reply Tad 2/16/2004 11:46:28 PM

jason@cyberpine.com wrote:

: Hello. I'm new to Perl and trying to figure out whether there is a
: better way to do the following (in ActiveState Perl under Windows 2000):
: 
: I have this DOS text file with about 20,000 lines. In the simple
: example below I can extract lines that contain a particular string.
: 
: $db = "work.txt"; 
: open (FILE,"$db"); 
: @LINES=<FILE>; 
: close(FILE); 
: $SIZE=@LINES; 
: print $SIZE,"\n";
: for ($i=0;$i<=$SIZE;$i++) 
: { 
:    $_=$LINES[$i]; 
:    if (/motion/i)
:    {print "$_";}
: } 

You've committed several novice mistakes there.
1. Using package variables instead of lexicals.
2. Quoting "$vars"
3. Not checking the return from open() for success.
4. Slurping an entire file to perform line-by-line processing.
5. Iterating across an array's indices instead of iterating across its
elements.

Sequentially processing the records in a file is such a common task that
you should learn the Perlish way of doing it.

    my $db = "work.txt"; 
    open (FILE, '<', $db) or die "Cannot open '$db' for read: $!";
    while(<FILE>) {
        print if /motion/i;
    }

: How can I extract:
: 
: 1. 5 lines before and after the string

Store the previous five lines in an array.  When your program recognizes
the desired record, have it output the contents of this buffer and note
that it should output the next five records.

: 2. Columns positions 5-15 (for all selected)

The substr() function will do that.  See perlfunc.
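
For what it's worth, a small sketch of the arithmetic: substr() counts
from 0 and takes a length, so columns 5-15 become offset 4, length 11.
The sub name columns is made up here, and treating a too-short line as
an empty string is an assumption about what you would want.

```perl
use strict;
use warnings;

# Hypothetical helper: return columns $first..$last (1-based, inclusive)
# of one line, without its trailing newline.
sub columns {
    my ($line, $first, $last) = @_;
    chomp $line;
    return '' if length($line) < $first;   # avoids "substr outside of string"
    return substr $line, $first - 1, $last - $first + 1;
}

print columns( "abcdefghijklmnopqrstuvwxyz\n", 5, 15 ), "\n";   # prints "efghijklmno"
```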

: 3. Limit selection to rows 5000-7000

The '..' range operator is imbued with special juju for that purpose.
See perlop.

: 4. The last 5 lines of the entire file

The buffer implemented for requirement 1 can be made to handle that as
well.
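
Taken on its own, the last-lines part of that could be sketched as a
rolling buffer like the one below. The sub name tail_lines and the demo
data are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last $n lines of a filehandle while
# holding at most $n lines in memory.
sub tail_lines {
    my ($fh, $n) = @_;
    my @tail;
    while ( my $line = <$fh> ) {
        push @tail, $line;
        shift @tail if @tail > $n;   # drop the oldest line
    }
    return @tail;
}

# Demo on an in-memory file (made-up data).
my $text = '';
$text .= "row $_\n" for 1 .. 8;
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print tail_lines( $fh, 5 );
close $fh;
```

The demo should print rows 4 through 8. A file shorter than $n lines
just comes back whole.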
 
Altogether, the program might go like:

    #!perl
    use warnings;
    use strict;
    my $db = 'work.txt';
    open my $fh, '<', $db or die "Cannot open '$db' for read: $!";
    my (@w, $n);
    while (<$fh>) {
        chomp;
        push @w, substr($_,4,11) . "\n"; # requirement 2: columns 5-15 start at offset 4
        $n=6
            if 5000 .. 7000 # requirement 3
            and /motion/i;
        if($n) {
            print @w;
            @w = ();
            $n--;
        }
        else {
            splice @w, 0, -5; # limit window to five previous records
        }
    }
    print @w; # requirement 4

Reply tiltonj 2/17/2004 12:05:17 AM
