


Pulling out lines of text from a text file

Hi everybody,

I'm looking to write a Catalyst model that uses a newline-delimited
text file as a database, so that each line of the file corresponds to a
datum.  My question isn't about the module per se, but about smart
algorithms to pull a single line of text from an arbitrary text file.
I know the following would work:

#!/usr/bin/perl

use warnings;
use strict;

print "Enter a file\n";
my $file = <STDIN>;

print "Enter a number\n";
my $number = <STDIN>;

my @array;

open FILE, "$file";

while (<FILE>) { push @array; }
close FILE;

print $array[$number];
__END__

or something close to it should work.  (I hope there aren't any errors
there, but if there are, I hope you get the idea of the naive
implementation I'm talking about).  Can anyone point the way to a faster
algorithm?

Thanks,
'cid 'ooh

0
poopdeville (133)
3/9/2006 8:54:52 AM
comp.lang.perl.misc


poopdeville@gmail.com wrote:
> ... pull a single line of text from an arbitrary text file.
> I know the following would work:
> 
> #!/usr/bin/perl
> 
> use warnings;
> use strict;
> 
> print "Enter a file\n";
> my $file = <STDIN>;
> 
> print "Enter a number\n";
> my $number = <STDIN>;
> 
> my @array;
> 
> open FILE, "$file";
> 
> while (<FILE>) { push @array; }
> close FILE;
> 
> print $array[$number];
> __END__
> 
> or something close to it should work.  (I hope there aren't any errors
> there, but if there are, I hope you get the idea of the naive
> implementation I'm talking about).  An anyone point the way to a faster
> algorithm?

Use the FAQ answer provided in "perldoc -q middle", i.e. Tie::File.

I leave it to you to compare the speed. ;-)
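
For readers who have not used Tie::File, here is a minimal sketch of what the FAQ answer amounts to. The sample file, the helper name line_at and the read-only mode are assumptions made for illustration and are not part of the original question. Tie::File reads records on demand and returns them without the trailing newline.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use File::Temp qw(tempfile);
use Tie::File;

# Build a small sample file so the sketch is self-contained.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "alpha\nbeta\ngamma\n";
close $tmp_fh or die "Can't close $path: $!";

# Hypothetical helper: return line $n (1-based) of $path without its
# newline, or undef if there is no such line.
sub line_at {
    my ($file, $n) = @_;
    return undef if $n < 1;    # avoid negative array indices

    # Read-only tie; records are fetched lazily rather than slurped.
    tie my @lines, 'Tie::File', $file, mode => O_RDONLY
        or die "Can't tie $file: $!";
    my $line = $lines[ $n - 1 ];
    untie @lines;
    return $line;
}

print line_at($path, 2), "\n";    # prints "beta"
```

Whether this is faster than a plain read loop depends on the file size and on how many lookups are made per tie, so it is worth benchmarking on real data.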

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
0
Gunnar
3/9/2006 9:08:06 AM
<poopdeville@gmail.com> wrote in message
news:1141894492.223365.64420@i39g2000cwa.googlegroups.com...
> Hi everybody,
>
> I'm looking to write a Catalyst model to basically use a newline
> delimited database, so that each line of the text file corresponds to a
> datum.  My question isn't about the module per se, but on smart
> algorithms to pull a single line of text from an arbitrary text file.
> I know the following would work:
>
> #!/usr/bin/perl
>
> use warnings;
> use strict;
>
> print "Enter a file\n";
> my $file = <STDIN>;
>
> print "Enter a number\n";
> my $number = <STDIN>;
>
> my @array;
>
> open FILE, "$file";
>
> while (<FILE>) { push @array; }
> close FILE;
>
> print $array[$number];
> __END__
>
> or something close to it should work.  (I hope there aren't any errors
> there, but if there are, I hope you get the idea of the naive
> implementation I'm talking about).  An anyone point the way to a faster
> algorithm?
>
> Thanks,
> 'cid 'ooh
>

If I understand you correctly (also untested) :

use warnings;
use strict;

print "Enter a file\n";
my $file = <STDIN>;

print "Enter a number\n";
my $number = <STDIN>;

# Always check that open succeeds
open FILE, "$file" or die "Can't open: $!";

while (<FILE>) {
           if($. == $number) {print $_}
          last; # no need to keep reading
          }

# Always check that close succeeds
close FILE or die "Can't close: $!";

__END__

See the documentation for $. in 'perldoc perlvar'.
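
As a small illustration of what $. holds, the sketch below records the value of $. next to each line read. The sample file and variable names are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Sample data for the demonstration (not from the original post).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "one\ntwo\nthree\n";
close $tmp_fh or die "Can't close $path: $!";

open my $fh, '<', $path or die "Can't open $path: $!";

my @seen;
while (my $line = <$fh>) {
    chomp $line;
    # $. is the current line number of the filehandle last read from.
    push @seen, "$.:$line";
}
close $fh or die "Can't close $path: $!";

print "@seen\n";    # prints "1:one 2:two 3:three"
```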

Cheers,
Rob


0
Sisyphus
3/9/2006 9:10:51 AM
 <poopdeville@gmail.com> wrote in comp.lang.perl.misc:
> Hi everybody,
> 
> I'm looking to write a Catalyst model to basically use a newline
> delimited database, so that each line of the text file corresponds to a
> datum.  My question isn't about the module per se, but on smart
> algorithms to pull a single line of text from an arbitrary text file.
> I know the following would work:
> 
> #!/usr/bin/perl
> 
> use warnings;
> use strict;
> 
> print "Enter a file\n";
> my $file = <STDIN>;
> 
> print "Enter a number\n";
> my $number = <STDIN>;
> 
> my @array;
> 
> open FILE, "$file";
> 
> while (<FILE>) { push @array; }

You don't need an explicit loop here:

    @array = <FILE>;

> close FILE;
> 
> print $array[$number];
> __END__
> 
> or something close to it should work.  (I hope there aren't any errors
> there, but if there are, I hope you get the idea of the naive
> implementation I'm talking about).  An anyone point the way to a faster
> algorithm?

Look into Tie::File, it will simplify things.

Anno
-- 
If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article.  Click on 
"show options" at the top of the article, then click on the 
"Reply" at the bottom of the article headers.
0
anno4000
3/9/2006 9:17:22 AM
On 2006-03-09, poopdeville@gmail.com <poopdeville@gmail.com> wrote:
> Hi everybody,
>
> I'm looking to write a Catalyst model to basically use a newline
> delimited database, so that each line of the text file corresponds to a
> datum.  My question isn't about the module per se, but on smart
> algorithms to pull a single line of text from an arbitrary text file.
> I know the following would work:
>
> #!/usr/bin/perl
>
> use warnings;
> use strict;

> print "Enter a file\n";

# For more of a standard prompt you probably just want:
print "Enter a file: ";

> my $file = <STDIN>;

# $file will have a \n on the end; use chomp to remove it:
chomp ($file);

> print "Enter a number\n";

# Ditto:
print "Enter a number: ";

> my $number = <STDIN>;

# Ditto:
chomp ($number);

> my @array;
>
> open FILE, "$file";

# You should check the return code of open rather than assuming it worked;
# you also don't need the quotes around $file since you aren't adding
# any extra text.

open FILE, $file or die "Can't open $file: $!\n";

> while (<FILE>) { push @array; }

# Replace this with:

@array = <FILE>;

# @lines would be a better name for the array though, @array is
# redundant and generic, we already know it is an array from the @

> close FILE;

# Though less of an issue when you are just reading a file, you should
# get in the habit of checking the return code of close as well.

> print $array[$number];
> __END__
>
> or something close to it should work.  (I hope there aren't any errors
> there, but if there are, I hope you get the idea of the naive
> implementation I'm talking about).  An anyone point the way to a faster
> algorithm?

If you are just trying to read a whole file into an array by line, @var
= <FH> is better than a while loop and push.  If all your program is
doing is printing out a specific line from a file (and especially if the
file is going to be large), it may be better to loop over the file until
you get to that line number (or EOF) and then print the line (or an
error).  This way a 100k-line file isn't stored in RAM just to print
line 10.
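
A rough sketch of that loop-until-found approach follows. The sub name read_line, the lexical filehandle, the three-argument open and the sample file are choices made for this example and do not appear in the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return line $number (1-based) of $path, newline
# included, or undef if the file has fewer lines than that.
sub read_line {
    my ($path, $number) = @_;

    open my $fh, '<', $path or die "Can't open $path: $!";

    my $found;
    while (my $line = <$fh>) {
        next unless $. == $number;    # $. is the current line number
        $found = $line;
        last;                         # stop reading once we have it
    }

    close $fh or die "Can't close $path: $!";
    return $found;
}

# Sample data so the sketch runs on its own.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "one\ntwo\nthree\n";
close $tmp_fh or die "Can't close $path: $!";

my $line = read_line($path, 2);
print defined $line ? $line : "No such line\n";    # prints "two"
```

Only one line is held in memory at a time, and reading stops as soon as the requested line has been seen.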

-- 
Michael
michael@thegrebs.com
SpamStats: http://spam.thegrebs.com
0
Michael
3/9/2006 9:19:57 AM
Sisyphus <sisyphus1@nomail.afraid.org> wrote:

> If I understand you correctly (also untested) :

> print "Enter a file\n";
> my $file = <STDIN>;


   chomp $file;   # ?


> # Always check that open succeeds
> open FILE, "$file" or die "Can't open: $!";
             ^^^^^^^
             ^^^^^^^

Never quote a lone variable.


   perldoc -q vars

       What's wrong with always quoting "$vars"?
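
One of the problems that FAQ entry describes is that quoting turns a reference into a plain string. A minimal sketch, with made-up names:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count the elements if given an array reference,
# otherwise report 0.
sub count_items {
    my ($arg) = @_;
    return ref($arg) eq 'ARRAY' ? scalar @$arg : 0;
}

my @data = (1, 2, 3);
my $ref  = \@data;

print count_items($ref),   "\n";    # prints 3
print count_items("$ref"), "\n";    # prints 0: the quotes stringified the reference
```

With a filename the quotes are merely redundant, but the habit bites as soon as the variable holds a reference or an object.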


> while (<FILE>) {
>            if($. == $number) {print $_}
>           last; # no need to keep reading
>           }


A loop that must execute exactly zero or one time isn't much of a loop...


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas
0
Tad
3/9/2006 12:49:24 PM
"Tad McClellan"
..
..
>
> > while (<FILE>) {
> >            if($. == $number) {print $_}
> >           last; # no need to keep reading
> >           }
>
>
> A loop that must execute exactly zero or one time isn't much of a loop...
>

Heh ... indeed .... better make that (still untested):

 while (<FILE>) {
           if($. == $number) {
              print $_;
              last; # no need to keep reading
              }
          }

Cheers,
Rob


0
Sisyphus
3/9/2006 10:25:56 PM
>>>>> "S" == Sisyphus  <sisyphus1@nomail.afraid.org> writes:

  S> "Tad McClellan"
  S> .
  S> .
  >> 
  >> > while (<FILE>) {
  >> >            if($. == $number) {print $_}
  >> >           last; # no need to keep reading
  >> >           }
  >> 
  >> 
  >> A loop that must execute exactly zero or one time isn't much of a loop...
  >> 

  S> Heh ... indeed .... better make that (still untested):

  S>  while (<FILE>) {
  S>            if($. == $number) {
  S>               print $_;
  S>               last; # no need to keep reading
  S>               }
  S>           }

even better:

	while (<FILE>) {
		next if $. < $number ;
		print $_ ;
		last;
	}

saves a whole block and those damned expensive {}. also it has a more
common style of indenting.

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org
0
Uri
3/9/2006 10:36:59 PM
poopdeville@gmail.com wrote:
> Hi everybody,
>
> I'm looking to write a Catalyst model to basically use a newline
> delimited database, so that each line of the text file corresponds to a
> datum.  My question isn't about the module per se, but on smart
> algorithms to pull a single line of text from an arbitrary text file.

Thanks everybody!
'cid 'ooh

0
poopdeville
3/9/2006 11:12:02 PM
Uri Guttman schreef:

> even better:
> 
> while (<FILE>) {
>   next if $. < $number ;
>   print $_ ;
>   last;
> }

while (<FILE>) {
  $. == $number or next;
  print;
  last;
}

-- 
Affijn, Ruud

"Gewoon is een tijger."
0
Dr
3/9/2006 11:46:26 PM
Dr.Ruud wrote:
> Uri Guttman schreef:
>
> > even better:
> >
> > while (<FILE>) {
> >   next if $. < $number ;
> >   print $_ ;
> >   last;
> > }
>
> while (<FILE>) {
>   $. == $number or next;
>   print;
>   last;
> }
>

why not just:

while(<FILE>) {
    print and last if $. == $number;
}

Xicheng

> Affijn, Ruud
> "Gewoon is een tijger."

0
Xicheng
3/10/2006 12:36:25 AM
"Dr.Ruud" <rvtol+news@isolution.nl> wrote:

> Uri Guttman schreef:
> 
>> even better:
>> 
>> while (<FILE>) {
>>   next if $. < $number ;
>>   print $_ ;
>>   last;
>> }
> 
> while (<FILE>) {
>   $. == $number or next;
>   print;
>   last;
> }

Ok I just closed my "Reply" :-D.

The reason why I prefer this one is that I read the $. == $number as what 
must be true for the rest to happen. Since I read left to right, I "see" 
it faster compared to next if ...

-- 
John                Experienced Perl programmer: http://castleamber.com/
0
John
3/10/2006 1:04:57 AM
"Xicheng" <xicheng@gmail.com> wrote in message
news:1141950984.990067.235690@v46g2000cwv.googlegroups.com...
> Dr.Ruud wrote:
> > Uri Guttman schreef:
> >
> > > even better:
> > >
> > > while (<FILE>) {
> > >   next if $. < $number ;
> > >   print $_ ;
> > >   last;
> > > }
> >
> > while (<FILE>) {
> >   $. == $number or next;
> >   print;
> >   last;
> > }
> >
>
> why not just:
>
> while(<FILE>) {
>     print and last if $. == $number;
> }
>

This is heading towards the providing of a good example of my reservations
about that construct that does away with the curly braces.

I mean - if:

while(<FILE>) {
    print $_;
}

can be replaced with:

print $_ while <FILE>

then I expect that:

while(<FILE>) {
    print and last if $. == $number;
}

can be replaced with:

print and last if $. == $number while <FILE>;

There's an inconsistency in the implementation of this feature that leaves
me feeling rather cold.

Cheers,
Rob


0
Sisyphus
3/10/2006 1:41:18 AM
>>>>> "S" == Sisyphus  <sisyphus1@nomail.afraid.org> writes:

  S> I mean - if:

  S> while(<FILE>) {
  S>     print $_;
  S> }

  S> can be replaced with:

  S> print $_ while <FILE>

  S> then I expect that:

  S> while(<FILE>) {
  S>     print and last if $. == $number;
  S> }

  S> can be replaced with:

  S> print and last if $. == $number while <FILE>;

the problem with multiple statement modifiers is that they are not clear
in what they mean and are tricky to parse correctly. larry has
stated that perl5 will never get them but i think it has been discussed
in the perl6 lists.

anyway, i would write that (if i wanted to write code like this, which i
don't):

	$. == $number and print and last while <FILE>;

or even (and the precedence is correct, as , binds tighter than 'and'):

	$. == $number and print, last while <FILE>;

but i don't like compound statements like that (which is not the same as
a boolean expression which modifies a statement).

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org
0
Uri
3/10/2006 4:11:11 AM
Sisyphus <sisyphus1@nomail.afraid.org> wrote:
> "Xicheng" <xicheng@gmail.com> wrote in message
> news:1141950984.990067.235690@v46g2000cwv.googlegroups.com...


>> while(<FILE>) {
>>     print and last if $. == $number;
>> }
>>
> 
> This is heading towards the providing of a good example of my reservations
> about that construct that does away with the curly braces.


The docs (perlsyn) call them "modifiers".


> I mean - if:
> 
> while(<FILE>) {
>     print $_;
> }
> 
> can be replaced with:
> 
> print $_ while <FILE>
> 
> then I expect that:
> 
> while(<FILE>) {
>     print and last if $. == $number;
> }
> 
> can be replaced with:
> 
> print and last if $. == $number while <FILE>;


You shouldn't expect that because it is documented to be disallowed.  <g>

perlsyn says you can have only 1 modifier, you are using 2 modifiers.


> There's an inconsistency in the implementation of this feature that leaves
> me feeling rather cold.


Yes, I see that too.

I'm pretty sure that Larry saw that too too.  :-)

I remember hearing/reading somewhere that he purposely limited
it to 1 modifier because it would/could lead to some really
hard-to-understand code.

I thought the "somewhere" was in the std docs, and that

   grep BASIC *.pod

would find it (because the feature was borrowed from BASIC-PLUS),
but I don't see it in there anymore (v5.8.7).



The objection to lack of curly brackets is the "dangling else"
problem in disguise, and so is probably a widely held objection. 

Which may be why Larry limited modifiers to only one...?


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas
0
Tad
3/10/2006 4:41:33 AM
John Bokma:
> Dr.Ruud:
>> Uri Guttman:

>>> even better:
>>>
>>> while (<FILE>) {
>>>   next if $. < $number ;
>>>   print $_ ;
>>>   last;
>>> }
>>
>> while (<FILE>) {
>>   $. == $number or next;
>>   print;
>>   last;
>> }
>
> Ok I just closed my "Reply" :-D.
>
> The reason why I prefer this one is that I read the $. == $number as
> what must be true for the rest to happen. Since I read left to right,
> I "see" it faster compared to next if ...

It is much like what "perl -MO=Deparse,-x7" makes of this:

  while (<FILE>) {
    next unless $. == $number;
    print;
    last;
  }

------ 

The "next if $. != $number" version becomes more like:

  while (<FILE>) {
    $. != $number and next;
    print;
    last;
  }

======

The "$. < $number" version has a problem if the same block can be used
in situations where $number can be any number: with $number set to 0 or
2.5 it prints line 1 or line 3 instead of nothing.

  while (<FILE>) {
    die "huh?" if $. > $number;
    next       if $. < $number;
    print;
    last;
  }

------ 

If you don't like to use 'next' but have no problem with 'last':

  while (<FILE>) {
    die "huh?" if $. > $number;
    if ($. == $number) {
      print;
      last;
    }
  }

------ 

If you don't like to use 'next' nor 'last':

  for (my $last = 0 ; not $last and defined($_ = <FILE>) ; ) {
    die "huh?" if $. > $number;
    if ($. == $number) {
      print;
      $last = 1;
    }
  }

(hey, just trying to scare you)

-- 
Affijn, Ruud

"Gewoon is een tijger."

0
Dr
3/10/2006 8:25:11 AM
"Tad McClellan" <tadmc@augustmail.com> wrote in message
..
..
>
> I remember hearing/reading somewhere that he purposely limited
> it to 1 modifier because it would/could lead to some really
> hard-to-understand code.
>

When I look at the code that uri has just provided, I start to think that
the horse has already bolted on that score ... and that allowing one
modifier is one modifier too many :-)

To save you looking it up, uri presented the following 2 alternatives:

$. == $number and print and last while <FILE>;
$. == $number and print, last while <FILE>;

That's plenty hard enough for *me* to understand :-)

(And, yes - I've also pondered that "dangling else" as you called it ....
didn't realize it had a name.)

Cheers,
Rob



0
Sisyphus
3/10/2006 9:27:07 AM
Sisyphus <sisyphus1@nomail.afraid.org> wrote:

> (And, yes - I've also pondered that "dangling else" as you called it ....
> didn't realize it had a name.)


It is a common topic in computer science, google it.


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas
0
Tad
3/10/2006 6:02:50 PM