extracting a particular pattern from a line



I have lines of text in a file where, inserted in each line, a pattern
exists (digits and dashes) such as:

    one two 9-8 three 2-up
    once 8-11 4-over quadruple
    time never 22-20 always

What I want to extract is this:

    9-8
    7-11
    22-8

The patterns of digits and dashes are always surrounded by spaces, and
each run of digits is at most two digits long.

If possible,  I would also like to extract the digits followed by
alphabetic characters (if they exist) such as:

    2-up
    4-over

In all instances, digits are preceded by spaces and followed by a
dash.

Can this be done with sed, awk or perl?  Thanks for any help.


Reply Lao 9/9/2010 1:50:31 AM

On 09/09/10 03:50, Lao Ming wrote:
> I have lines of text in a file where, inserted in each line, a pattern
> exists (digits and dashes) such as:
> 
>     one two 9-8 three 2-up
>     once 8-11 4-over quadruple
>     time never 22-20 always
> 
> What I want to extract is this:
> 
>     9-8
>     7-11
>     22-8

These lines of output don't seem to match your input.

Try

  awk 'match($0,/[0-9]+-[0-9]+/) {print substr($0,RSTART,RLENGTH)}'
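For anyone who wants to try this without a data file, here is a minimal sketch that feeds the three sample lines from the question to the same one-liner. The wrapper name `first_pair` is made up for illustration. `match()` sets `RSTART` and `RLENGTH` to the position and length of the first match on the line, so at most one pair per line is printed.

```shell
# first_pair is a hypothetical wrapper name; the awk program is the
# one-liner shown above, reading from standard input.
first_pair() {
  awk 'match($0, /[0-9]+-[0-9]+/) { print substr($0, RSTART, RLENGTH) }'
}

# Feed the sample lines from the question through it.
printf '%s\n' \
  '    one two 9-8 three 2-up' \
  '    once 8-11 4-over quadruple' \
  '    time never 22-20 always' | first_pair
```

With the sample input this prints 9-8, 8-11 and 22-20, one per line.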


> 
> The patterns of digits and dashes are always surrounded by spaces and
> the digits are never more than sequences of two digits in a row.
> 
> If possible,  I would also like to extract the digits followed by
> alphabetic characters (if they exist) such as:
> 
>     2-up
>     4-over

For both the above question and this one, try

awk '
  match($0,/[0-9]+-[0-9]+/) {print substr($0,RSTART,RLENGTH)}
  match($0,/[0-9]+-[[:alpha:]]+/) {print substr($0,RSTART,RLENGTH)}
'
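An alternative sketch, not the program above: since the question says the tokens are always surrounded by spaces and the digit runs are at most two digits long, awk's default field splitting can do the isolating, and each field can be tested as a whole. The function name `dash_tokens` and the decision to limit the first part to one or two digits are assumptions made for illustration.

```shell
# dash_tokens is a hypothetical name. It assumes the tokens are
# whitespace-separated, as stated in the question.
dash_tokens() {
  awk '{
    # Print a field only if it is, in full: one or two digits, a dash,
    # then either one or two digits or a run of letters.
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[0-9][0-9]?-([0-9][0-9]?|[A-Za-z]+)$/)
        print $i
  }'
}

printf '%s\n' \
  '    one two 9-8 three 2-up' \
  '    once 8-11 4-over quadruple' \
  '    time never 22-20 always' | dash_tokens
```

Unlike the two `match()` rules, this prints every qualifying token in the order it appears on the line, so the sample input gives 9-8, 2-up, 8-11, 4-over, 22-20.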


Janis

> 
> In all instances, digits are preceded by spaces and followed by a
> dash.
> 
> Can this be done with sed, awk or perl?  Thanks for any help.
> 
> 

Reply Janis 9/9/2010 2:00:06 AM


On Sep 8, 7:00 pm, Janis Papanagnou <janis_papanag...@hotmail.com>
wrote:
> On 09/09/10 03:50, Lao Ming wrote:
>
> > I have lines of text in a file where, inserted in each line, a pattern
> > exists (digits and dashes) such as:
>
> >     one two 9-8 three 2-up
> >     once 8-11 4-over quadruple
> >     time never 22-20 always
>
> > What I want to extract is this:
>
> >     9-8
> >     7-11
> >     22-8
>
> These lines of output don't seem to match your input.
>
> Try
>
>   awk 'match($0,/[0-9]+-[0-9]+/) {print substr($0,RSTART,RLENGTH)}'
>
>
>
> > The patterns of digits and dashes are always surrounded by spaces and
> > the digits are never more than sequences of two digits in a row.
>
> > If possible, I would also like to extract the digits followed by
> > alphabetic characters (if they exist) such as:
>
> >     2-up
> >     4-over
>
> For both the above question and this one, try
>
> awk '
>   match($0,/[0-9]+-[0-9]+/) {print substr($0,RSTART,RLENGTH)}
>   match($0,/[0-9]+-[[:alpha:]]+/) {print substr($0,RSTART,RLENGTH)}
> '

Wow! Perfect!
Be careful.  I think that the Mafia could soon be after you.


Because you know too much.

:)


> Janis
>
>
>
>
>
> > In all instances, digits are preceded by spaces and followed by a
> > dash.
>
> > Can this be done with sed, awk or perl?  Thanks for any help.

Reply Lao 9/9/2010 2:19:35 AM

On Sep 8, 8:50 pm, Lao Ming <laoming...@gmail.com> wrote:
> I have lines of text in a file where, inserted in each line, a pattern
> exists (digits and dashes) such as:
>
>     one two 9-8 three 2-up
>     once 8-11 4-over quadruple
>     time never 22-20 always
>
> What I want to extract is this:
>
>     9-8
>     7-11
>     22-8

Let's assume the above matched your input :-).

> The patterns of digits and dashes are always surrounded by spaces and
> the digits are never more than sequences of two digits in a row.
>
> If possible, I would also like to extract the digits followed by
> alphabetic characters (if they exist) such as:
>
>     2-up
>     4-over
>
> In all instances, digits are preceded by spaces and followed by a
> dash.
>
> Can this be done with sed, awk or perl?  Thanks for any help.

$ cat file
    one two 9-8 three 2-up
    once 8-11 4-over quadruple
    time never 22-20 always

$ sed -n 's/.* \([[:digit:]][[:digit:]]*-[[:digit:]][[:digit:]]*\).*/\1/p' file
9-8
8-11
22-20

$ sed -n 's/.* \([[:digit:]][[:digit:]]*-[[:alpha:]][[:alpha:]]*\).*/\1/p' file
2-up
4-over

That'll match sequences of more than 2 digits and/or letters - if
that's a problem, let us know. Also, each command prints only one
match per line (the last one, since the leading ".*" is greedy).
Adding a "g" to the end of the command won't change that, because
the expression already consumes the whole line.
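If several occurrences on one line do need to be printed, one way to sketch it is in awk rather than sed: call `match()` repeatedly on the part of the line that has not been scanned yet. The function name `all_pairs` is hypothetical, and like the sed commands above this does not limit the digit runs to two digits.

```shell
# all_pairs is a hypothetical name; it prints every digits-dash-digits
# occurrence on each input line, not just one.
all_pairs() {
  awk '{
    rest = $0
    # Each successful match is printed, then the scanned part is dropped.
    while (match(rest, /[0-9]+-[0-9]+/)) {
      print substr(rest, RSTART, RLENGTH)
      rest = substr(rest, RSTART + RLENGTH)
    }
  }'
}

echo 'a 1-2 b 33-44 c' | all_pairs
```

For the test line this prints 1-2 and then 33-44.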

    Ed.
Reply Ed 9/9/2010 3:50:59 PM

Lao Ming wrote:
> I have lines of text in a file where, inserted in each line, a pattern
> exists (digits and dashes) such as:
>
>      one two 9-8 three 2-up
>      once 8-11 4-over quadruple
>      time never 22-20 always
>
> What I want to extract is this:
>
>      9-8
>      7-11
>      22-8
>
> The patterns of digits and dashes are always surrounded by spaces and
> the digits are never more than sequences of two digits in a row.
>
> If possible,  I would also like to extract the digits followed by
> alphabetic characters (if they exist) such as:
>
>      2-up
>      4-over
>
> In all instances, digits are preceded by spaces and followed by a
> dash.
>
> Can this be done with sed, awk or perl?  Thanks for any help.

$ echo "    one two 9-8 three 2-up
     once 8-11 4-over quadruple
     time never 22-20 always

" | perl -lne'print for 
/(?:^|(?<=\s))(\d\d?-(?:\d\d?|[[:alpha:]]+))(?=\s|$)/g'
9-8
2-up
8-11
4-over
22-20





John
-- 
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction.                   -- Albert Einstein
Reply John 9/9/2010 7:42:57 PM

On 9 Sep, 02:50, Lao Ming <laoming...@gmail.com> wrote:
> I have lines of text in a file where, inserted in each line, a pattern
> exists (digits and dashes) such as:
>
>     one two 9-8 three 2-up
>     once 8-11 4-over quadruple
>     time never 22-20 always
>
> What I want to extract is this:
>
>     9-8
>     7-11
>     22-8
>

sed -ne 's|.* \([0-9]\+-[0-9]\+\).*|\1|p' infile

> The patterns of digits and dashes are always surrounded by spaces and
> the digits are never more than sequences of two digits in a row.
>
> If possible, I would also like to extract the digits followed by
> alphabetic characters (if they exist) such as:
>
>     2-up
>     4-over
>

sed -ne 's|.* \([0-9]\+-[a-zA-Z]\+\).*|\1|p' infile
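A point worth checking with sed expressions of this shape: the leading `.*` is greedy, so unless something such as a literal space separates it from the captured group, it can take all but the last digit of a multi-digit number. A small sketch of the difference, using hypothetical function names and `[0-9][0-9]*` in place of `\+` so that it does not depend on GNU sed:

```shell
# last_pair is a hypothetical name. The space before \( stops the greedy
# leading .* from swallowing the first digits of the pair.
last_pair() {
  sed -n 's/.* \([0-9][0-9]*-[0-9][0-9]*\).*/\1/p'
}

# Same expression without that space, for comparison only.
last_pair_unanchored() {
  sed -n 's/.*\([0-9][0-9]*-[0-9][0-9]*\).*/\1/p'
}

echo '    time never 22-20 always' | last_pair             # prints 22-20
echo '    time never 22-20 always' | last_pair_unanchored  # prints 2-20
```

The version with the space relies on the statement in the question that the tokens are always preceded by a space.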

> In all instances, digits are preceded by spaces and followed by a
> dash.
>
> Can this be done with sed, awk or perl?  Thanks for any help.

Reply pcb1962 9/10/2010 7:53:38 AM
