Print Section of file

Hi!

How do I print a section of a file, where the section is specified by
shell variables. Something like

awk -v fi="$1" fu="$2" '/^fi/,/^fu/{print}' file

does not work. This also holds true for appending the shell variables,
i.e. awk '...' fi="$1" fu="$2". However, when I substitute fi, fu with
the desired values, the output is as desired.
With thanks
Yours Wolfgang
Reply Gehricht 4/8/2009 6:02:44 PM

On Wednesday 8 April 2009 20:02, Gehricht@googlemail.com wrote:

> Hi!
> 
> How do I print a section of a file, where the section is specified by
> shell variables. Something like
> 
> awk -v fi="$1" fu="$2" '/^fi/,/^fu/{print}' file

try 

awk -v fi="$1" -v fu="$2" '$0~fi,$0~fu' file

Reply pk 4/8/2009 6:06:26 PM


On Wednesday 8 April 2009 20:06, pk wrote:

> On Wednesday 8 April 2009 20:02, Gehricht@googlemail.com wrote:
> 
>> Hi!
>> 
>> How do I print a section of a file, where the section is specified by
>> shell variables. Something like
>> 
>> awk -v fi="$1" fu="$2" '/^fi/,/^fu/{print}' file
> 
> try
> 
> awk -v fi="$1" -v fu="$2" '$0~fi,$0~fu' file

oops, you need to match against "^"fi and "^"fu.

Reply pk 4/8/2009 6:09:39 PM

On Apr 8, 1:02 pm, "Gehri...@googlemail.com" <Gehri...@googlemail.com>
wrote:
> Hi!
>
> How do I print a section of a file, where the section is specified by
> shell variables. Something like
>
> awk -v fi="$1" fu="$2" '/^fi/,/^fu/{print}' file
>
> does not work. This also keeps true for appending the shell variables,
> i.e. awk '...' fi="$1" fu="$2". However, when I substitute fi, fu with
> the desired values, the output is as desired.
> With thanks
> Yours Wolfgang

Are fi and fu line numbers or regular expressions or something else?
Depending on the answer, one of these MAY be what you want:

awk -v fi="$1" -v fu="$2" 'NR>=fi; NR==fu{exit}' file

awk -v fi="$1" -v fu="$2" '$0~"^"fi , $0~"^"fu' file

     Ed.
Reply Ed 4/8/2009 6:10:17 PM
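
A small sketch of why the original attempt fails and the suggestions above work. The five-line sample file and the values start, end, 2 and 4 are made up for illustration and are not from the thread. Inside /.../ the text fi is literal, so /^fi/ looks for lines beginning with the two letters "fi"; only an expression such as $0 ~ "^" fi uses the value of the awk variable.

```shell
# Hypothetical sample data, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'fi literal' 'start here' 'middle' 'end here' 'after' > "$tmp"

# Regex literals: /^fi/ matches the text "fi", not the variable fi.
# The range opens on line 1 and /^fu/ never matches, so it runs to the end.
literal=$(awk -v fi="start" -v fu="end" '/^fi/,/^fu/' "$tmp")

# Dynamic regexes: "^" fi is a string built from the variable, and the
# ~ operator makes awk interpret that string as a regex.
dynamic=$(awk -v fi="start" -v fu="end" '$0 ~ "^" fi, $0 ~ "^" fu' "$tmp")

# Line-number variant: print from line fi, stop after line fu.
bylines=$(awk -v fi=2 -v fu=4 'NR>=fi; NR==fu{exit}' "$tmp")

printf 'literal:\n%s\n\ndynamic:\n%s\n\nbylines:\n%s\n' \
    "$literal" "$dynamic" "$bylines"
rm -f "$tmp"
```

With this sample data the literal version prints the whole file, while the other two print only the three lines from "start here" to "end here".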

Thanks (to both of you) for your suggestions!

>
> awk -v fi="$1" -v fu="$2" '$0~"^"fi , $0~"^"fu' file
>

Yes, that is exactly what I was looking for. Just for the sake of
curiosity, could you explain to me how this works?
With thanks
Yours Wolfgang
Reply Gehricht 4/9/2009 7:21:17 AM

On Apr 9, 3:21 am, "Gehri...@googlemail.com" <Gehri...@googlemail.com>
wrote:
> Thanks (to both of you) for your suggestions!
>
>
>
> > awk -v fi="$1" -v fu="$2" '$0~"^"fi , $0~"^"fu' file
>
> Yes, that is exactly what I was looking for. Just for the sake of
> curiosity, could you explain to me how this works?
> With thanks
> Yours Wolfgang

Does anyone else find it strange that on this file f1

aa
baa
baaa
baaa
ccc
ccc
ccc
ddd

 awk '/^aa/,/^b/' f1
 awk '$0~"^aa",$0~"^b"' f1
and
 awk '$0~"^""aa",$0~"^""b"' f1

all yield the expected output

aa
baa


but the similar

  awk '"^aa","^b"' f1
  awk '"^aa","b"' f1
and
  awk '"^aa",/b/' f1

all yield a different output,

aa
baa
baaa
baaa
ccc
ccc
ccc
ddd

And one might also try

  awk '/aa/,"b"' f1
  awk '/aa/,/b/' f1
and
  awk '/^aa/,"aa"' f1

to be really puzzled!



Reply r 4/9/2009 11:02:13 PM

On Friday 10 April 2009 01:02, r.p.loui@gmail.com wrote:

> Does anyone else find it strange that on this file f1
> 
> aa
> baa
> baaa
> baaa
> ccc
> ccc
> ccc
> ddd
> 
>  awk '/^aa/,/^b/' f1
>  awk '$0~"^aa",$0~"^b"' f1
> and
>  awk '$0~"^""aa",$0~"^""b"' f1
> 
> all yield the expected output
> 
> aa
> baa
> 
> 
> but the similar
> 
>   awk '"^aa","^b"' f1
>   awk '"^aa","b"' f1
> and
>   awk '"^aa",/b/' f1
> 
> all yield a different output,
> 
> aa
> baa
> baaa
> baaa
> ccc
> ccc
> ccc
> ddd
> 
> And one might also try
> 
>   awk '/aa/,"b"' f1
>   awk '/aa/,/b/' f1
> and
>   awk '/^aa/,"aa"' f1
> 
> to be really puzzled!

That is simply expected behavior. Using a string alone just evaluates to
true (unless it's the empty string), so these programs

awk '"^aa","^b"' f1
awk '"^aa","b"' f1
awk '"^aa",/b/' f1

are equivalent to

awk '1,1' f1
awk '1,1' f1
awk '1,/b/' f1

and the output is as expected (if you understand how ranges work).

Similarly, these

awk '/aa/,"b"' f1
awk '/aa/,/b/' f1
awk '/^aa/,"aa"' f1

are equivalent to

awk '/aa/,1' f1
awk '/aa/,/b/' f1
awk '/^aa/,1' f1

and the results are again as expected.

Reply pk 4/9/2009 11:44:15 PM
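
The point about bare strings can be checked directly. This is a minimal sketch that recreates the file f1 from the post above in a temporary file; the variable names are only for illustration.

```shell
# Recreate the sample file f1 from the thread.
f1=$(mktemp)
printf '%s\n' aa baa baaa baaa ccc ccc ccc ddd > "$f1"

# A non-empty string constant used as a pattern is simply true,
# so this behaves like awk '1,1' and prints every line.
all=$(awk '"^aa","^b"' "$f1")

# The empty string is false, so this range never starts: no output.
none=$(awk '"",1' "$f1")

# An explicit match with ~ turns the strings into regexes.
range=$(awk '$0 ~ "^aa", $0 ~ "^b"' "$f1")

printf 'all:\n%s\n\nnone:\n%s\n\nrange:\n%s\n' "$all" "$none" "$range"
rm -f "$f1"
```

Only the last command selects the two-line section aa, baa; the first prints the whole file and the second prints nothing.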

r.p.loui@gmail.com wrote:
> On Apr 9, 3:21 am, "Gehri...@googlemail.com" <Gehri...@googlemail.com>
> wrote:
> 
>>Thanks (to both of you) for your suggestions!
>>
>>
>>
>>
>>>awk -v fi="$1" -v fu="$2" '$0~"^"fi , $0~"^"fu' file
>>
>>Yes, that is exactly what I was looking for. Just for the sake of
>>curiosity, could you explain to me how this works?
>>With thanks
>>Yours Wolfgang
> 
> 
> Does anyone else find it strange

No.

> that on this file f1
> 
> aa
> baa
> baaa
> baaa
> ccc
> ccc
> ccc
> ddd
> 
>  awk '/^aa/,/^b/' f1
>  awk '$0~"^aa",$0~"^b"' f1
> and
>  awk '$0~"^""aa",$0~"^""b"' f1
> 
> all yield the expected output
> 
> aa
> baa
> 
> 
> but the similar
> 
>   awk '"^aa","^b"' f1
>   awk '"^aa","b"' f1
> and
>   awk '"^aa",/b/' f1
> 
> all yield a different output,
> 
> aa
> baa
> baaa
> baaa
> ccc
> ccc
> ccc
> ddd
> 
> And one might also try
> 
>   awk '/aa/,"b"' f1
>   awk '/aa/,/b/' f1
> and
>   awk '/^aa/,"aa"' f1
> 
> to be really puzzled!

All output as expected.

("string" is no regexp but a predicate.)

Janis
Reply Janis 4/9/2009 11:45:10 PM

On Apr 9, 7:44 pm, pk <p...@pk.invalid> wrote:
> On Friday 10 April 2009 01:02, r.p.l...@gmail.com wrote:
>
>
>
>
>
> > Does anyone else find it strange that on this file f1
>
> > aa
> > baa
> > baaa
> > baaa
> > ccc
> > ccc
> > ccc
> > ddd
>
> >  awk '/^aa/,/^b/' f1
> >  awk '$0~"^aa",$0~"^b"' f1
> > and
> >  awk '$0~"^""aa",$0~"^""b"' f1
>
> > all yield the expected output
>
> > aa
> > baa
>
> > but the similar
>
> >   awk '"^aa","^b"' f1
> >   awk '"^aa","b"' f1
> > and
> >   awk '"^aa",/b/' f1
>
> > all yield a different output,
>
> > aa
> > baa
> > baaa
> > baaa
> > ccc
> > ccc
> > ccc
> > ddd
>
> > And one might also try
>
> >   awk '/aa/,"b"' f1
> >   awk '/aa/,/b/' f1
> > and
> >   awk '/^aa/,"aa"' f1
>
> > to be really puzzled!
>
> That is simply expected behavior. Using a string alone just evaluates to
> true (unless it's the empty string), so these programs
>
> awk '"^aa","^b"' f1
> awk '"^aa","b"' f1
> awk '"^aa",/b/' f1
>
> are equivalent to
>
> awk '1,1' f1
> awk '1,1' f1
> awk '1,/b/' f1
>
> and the output is as expected (if you understand how ranges work).
>
> Similarly, these
>
> awk '/aa/,"b"' f1
> awk '/aa/,/b/' f1
> awk '/^aa/,"aa"' f1
>
> are equivalent to
>
> awk '/aa/,1' f1
> awk '/aa/,/b/' f1
> awk '/^aa/,1' f1
>
> and the results are again as expected.

Ah, I did not know that.  Thanks.

So why can I compose regexps from strings in other contexts where a
regexp is expected,
such as

FS="a""b"

or

match(x,"^"string)

and so on?
Reply r 4/10/2009 4:03:57 PM

On Friday 10 April 2009 18:03, r.p.loui@gmail.com wrote:

> So why can I compose regexps from strings in other contexts where a
> regexp is expected,
> such as
> 
> FS="a""b"
> 
> or
> 
> match(x,"^"string)

Because if awk expects a regex, whatever it finds is interpreted as a regex.
Specifically, FS *can* be a regex (if it's not a single character, then awk
assumes it's a regex). And of course, match() expects its second argument
to be a regex.
Of course, all that does not mean that a string is *always* treated as a
regex (in fact it never is, unless the context requires a regex).
Reply pk 4/10/2009 4:23:18 PM
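
A short sketch of the contexts mentioned above where a string is interpreted as a regex, next to the pattern context where it is not. The input strings and the variable s are invented for the example.

```shell
# FS longer than one character is treated as a regex:
# here, one or more digits separate the fields.
fields=$(printf 'a12b345c\n' |
    awk 'BEGIN { FS = "[0-9]+" } { print $1, $2, $3 }')

# match() interprets its second argument as a regex, even when it is
# a string built by concatenation.
pos=$(printf 'foobar\n' |
    awk -v s="foo" '{ print match($0, "^" s), RLENGTH }')

# Used alone as a pattern, the same string is only a truth value:
# "^foo" is non-empty, so the line is printed although it does not match.
truthy=$(printf 'xyz\n' | awk -v s="foo" '("^" s)')

printf '%s\n%s\n%s\n' "$fields" "$pos" "$truthy"
```

The third command is the surprising one: it prints xyz, because nothing in that context asks awk for a regex.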

On Apr 10, 12:23 pm, pk <p...@pk.invalid> wrote:
> On Friday 10 April 2009 18:03, r.p.l...@gmail.com wrote:
>
> > So why can I compose regexps from strings in other contexts where a
> > regexp is expected,
> > such as
>
> > FS="a""b"
>
> > or
>
> > match(x,"^"string)
>
> Because if awk expects a regex, whatever it finds is interpreted as a regex.
> Specifically, FS *can* be a regex (if it's not a single character, then awk
> assumes it's a regex). And of course, match() expects its second argument
> to be a regex.
> Of course, all that does not mean that a string is *always* treated as a
> regex (in fact it never is, unless the context requires a regex).

So a range a,b does not require a pair of regexp's a and b?  Right.
It can be a relational expression or a boolean combination of
patterns...  So the non-null string in that context evaluates to 1
(null string to 0) because awk chooses first to see it as a relational
expression rather than a regular expression?

Seems a bit unfortunate..., but I do like the rule you formulated --
"in fact it never is, unless the context requires a regex"... is this
documented in Robbins?
Reply r 4/10/2009 8:21:07 PM

r.p.loui@gmail.com wrote:
> On Apr 10, 12:23 pm, pk <p...@pk.invalid> wrote:
> 
>>On Friday 10 April 2009 18:03, r.p.l...@gmail.com wrote:
>>
>>
>>>So why can I compose regexps from strings in other contexts where a
>>>regexp is expected,
>>>such as
>>
>>>FS="a""b"
>>
>>>or
>>
>>>match(x,"^"string)
>>
>>Because if awk expects a regex, whatever it finds is interpreted as a regex.
>>Specifically, FS *can* be a regex (if it's not a single character, then awk
>>assumes it's a regex). And of course, match() expects its second argument
>>to be a regex.
>>Of course, all that does not mean that a string is *always* treated as a
>>regex (in fact it never is, unless the context requires a regex).
> 
> 
> So a range a,b does not require a pair of regexp's a and b?  Right.

Yes.

> It can be
> a relational expression or a boolean combination of patterns... 

Not quite. It's just a _predicate_. (Some like to call it a condition.)

(With the special case extension of a range defined by predicates
as in the examples of this thread.)

Or consider relational expressions which actually evaluate again
to a _predicate_.

See also thread "What is an awk statement ?" (2009-03-17) in c.l.a.

> So
> the non-null string
> in that context evaluates to 1 (null string to 0) because awk chooses
> first to see it as
> a relational expression rather than a regular expression?
> 
> Seems a bit unfortunate...,

The unfortunate thing is the long established misnomer "pattern" in

   pattern  { action }

instead of

  predicate { action }

  /pattern/ { action }

where in the latter case /pattern/ is a shortcut for $0 ~ /pattern/
which is actually, again, a predicate.

Janis

> but I do like the rule you formulated --
> "in fact it never is, unless the context requires a regex"... is this
> documented in Robbins?
Reply Janis 4/10/2009 8:50:25 PM

On Friday 10 April 2009 22:21, r.p.loui@gmail.com wrote:

>> Because if awk expects a regex, whatever it finds is interpreted as a
>> regex. Specifically, FS *can* be a regex (if it's not a single character,
>> then awk assumes it's a regex). And of course, match() expects its second
>> argument to be a regex.
>> Of course, all that does not mean that a string is *always* treated as a
>> regex (in fact it never is, unless the context requires a regex).
> 
> So a range a,b does not require a pair of regexp's a and b?  Right.
> It can be
> a relational expression or a boolean combination of patterns...  So
> the non-null string
> in that context evaluates to 1 (null string to 0) because awk chooses
> first to see it as
> a relational expression rather than a regular expression?

Right. Ranges are fairly generic, so you can do

awk 'NR==4,NR==10' to print from line 4 to 10, or

awk 'NR==4,0' to print from line 4 to the end, or

awk '$5<6,$5>20' to print from a line where the 5th field is less than 6 to a
line where it's greater than 20 (assuming that makes sense for the
particular problem).

It may be worth noting that when you do something like

awk '/foo/,/bar/'

you are effectively doing

awk '$0~/foo/,$0~/bar/'

due to the way the behavior of regex literals is defined in that context. Plain
strings are NOT defined to behave that way anywhere.

> Seems a bit unfortunate..., but I do like the rule you formulated --
> "in fact it never is, unless the context requires a regex"... is this
> documented in Robbins?

It's documented (implicitly, possibly explicitly - I didn't check) in the awk
language specification. I suggest you read the specification for awk.

http://www.opengroup.org/onlinepubs/9699919799/utilities/awk.html
Reply pk 4/10/2009 9:02:39 PM
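
The range examples above can be tried on generated input. In this sketch the twelve numbered lines and the foo/x/bar/y lines are made-up test data.

```shell
# Twelve lines containing the numbers 1..12.
nums=$(awk 'BEGIN { for (i = 1; i <= 12; i++) print i }')

# Range built from two comparisons: lines 4 through 10.
mid=$(printf '%s\n' "$nums" | awk 'NR==4,NR==10')

# An end condition of 0 is never true, so the range runs to the last line.
rest=$(printf '%s\n' "$nums" | awk 'NR==10,0')

# A bare regex literal is shorthand for a match against $0,
# so these two programs select the same lines.
short=$(printf '%s\n' foo x bar y | awk '/foo/,/bar/')
long=$(printf '%s\n' foo x bar y | awk '$0 ~ /foo/, $0 ~ /bar/')

printf 'mid: %s\n' "$(printf '%s' "$mid" | tr '\n' ' ')"
printf 'rest: %s\n' "$(printf '%s' "$rest" | tr '\n' ' ')"
printf 'short and long agree: %s\n' "$([ "$short" = "$long" ] && echo yes || echo no)"
```

Each side of the comma is an ordinary expression evaluated as true or false for the current line; the regex literal form is the only one that gets the implicit match against $0.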

On Apr 10, 5:02 pm, pk <p...@pk.invalid> wrote:
> On Friday 10 April 2009 22:21, r.p.l...@gmail.com wrote:
>
> >> Because if awk expects a regex, whatever it finds is interpreted as a
> >> regex. Specifically, FS *can* be a regex (if it's not a single character,
> >> then awk assumes it's a regex). And of course, match() expects its second
> >> argument to be a regex.
> >> Of course, all that does not mean that a string is *always* treated as a
> >> regex (in fact it never is, unless the context requires a regex).
>
> > So a range a,b does not require a pair of regexp's a and b?  Right.
> > It can be
> > a relational expression or a boolean combination of patterns...  So
> > the non-null string
> > in that context evaluates to 1 (null string to 0) because awk chooses
> > first to see it as
> > a relational expression rather than a regular expression?
>
> Right. Ranges are fairly generic, so you can do
>
> awk 'NR==4,NR==10' to print from line 4 to 10, or
>
> awk 'NR==4,0' to print from line 4 to the end, or
>
> awk '$5<6,$5>20' to print from a line where the 5th field is less than 6 to a
> line where it's greater than 20 (assuming that makes sense for the
> particular problem).
>
> It may be worth noting that when you do something like
>
> awk '/foo/,/bar/'
>
> you are effectively doing
>
> awk '$0~/foo/,$0~/bar/'
>
> due to the way regex literals behavior is defined in that context. Plain
> strings are NOT defined to behave that way anywhere.
>
> > Seems a bit unfortunate..., but I do like the rule you formulated --
> > "in fact it never is, unless the context requires a regex"... is this
> > documented in Robbins?
>
> It's documented (implicitly, possibly explicitly - I didn't check) in the awk
> language specification. I suggest you read the specification for awk.
>
> http://www.opengroup.org/onlinepubs/9699919799/utilities/awk.html

Aha, that is nicer reading than the man pages for sure.  Here is the
relevant passage:


  Patterns
  A pattern is any valid expression, a range specified by two
expressions separated by a comma, or one of the two special patterns
BEGIN or END.

   Expression Patterns
   An expression pattern shall be evaluated as if it were an
expression in a Boolean context. If the result is true, the pattern
shall be considered to match, and the associated action (if any) shall
be executed. If the result is false, the action shall not be executed.

which would make the rule explicit in this document (it is not clear
from the man page, I contend).  Since strings are presumably "valid
expressions" and "expression patterns", they get evaluated in a
Boolean context (as a predicate, hence, true for non-null and false
for null).  I would even add something like "Regular expressions are
also valid expression patterns.  Unlike other contexts in which a
string can be treated as a regular expression, strings cannot function
like regular expressions in the pattern or range context."

That would be helpful for people who are expecting the coercion from
string to regexp...
Reply r 4/14/2009 5:00:05 PM

On Tuesday 14 April 2009 19:00, r.p.loui@gmail.com wrote:

>   Patterns
>   A pattern is any valid expression, a range specified by two
> expressions separated by a comma, or one of the two special patterns
> BEGIN or END.
> 
>    Expression Patterns
>    An expression pattern shall be evaluated as if it were an
> expression in a Boolean context. If the result is true, the pattern
> shall be considered to match, and the associated action (if any) shall
> be executed. If the result is false, the action shall not be executed.
> 
> which would make the rule explicit in this document (it is not clear
> from the man page, I contend).  Since strings are presumably "valid
> expressions" and "expression patterns", they get evaluated in a
> Boolean context (as a predicate, hence, true for non-null and false
> for null).  I would even add something like "Regular expressions are
> also valid expression patterns.  Unlike other contexts in which a
> string can be treated as a regular expression, strings cannot function
> like regular expressions in the pattern or range context."

It's the other way round. This is the relevant part:

"When an ERE token appears as an expression in any context other than as the
right-hand of the '~' or "!~" operator or as one of the built-in function
arguments described below, the value of the resulting expression shall be
the equivalent of:

$0 ~ /ere/"

*That* is the special case, and it's clearly (imho) understandable as such. 
Knowing how that is evaluated by awk, you can then read the part you quoted
(even if it comes first in the page).
It's not written anywhere (nor can it be inferred) that the above would or
should be expected to happen for "normal" data types like strings and
numbers.
 
> That would be helpful for people who are expecting the coercion from
> string to regexp...

As I said, from what's written in the language description, one should not
expect that to happen.
Reply pk 4/14/2009 7:38:00 PM
