I have lines of text in a file where, inserted in each line, a pattern
exists (digits and dashes) such as:
one two 9-8 three 2-up
once 8-11 4-over quadruple
time never 22-20 always
What I want to extract is this:
9-8
7-11
22-8
The patterns of digits and dashes are always surrounded by spaces and
the digits are never more than sequences of two digits in a row.
If possible, I would also like to extract the digits followed by
alphabetic characters (if they exist) such as:
2-up
4-over
In all instances, digits are preceded by spaces and followed by a
dash.
Can this be done with sed, awk or perl? Thanks for any help.
Lao, 9/9/2010 1:50:31 AM

On 09/09/10 03:50, Lao Ming wrote:
> I have lines of text in a file where, inserted in each line, a pattern
> exists (digits and dashes) such as:
>
> one two 9-8 three 2-up
> once 8-11 4-over quadruple
> time never 22-20 always
>
> What I want to extract is this:
>
> 9-8
> 7-11
> 22-8
These lines of output don't seem to match your input.
Try
awk 'match($0,/[0-9]+-[0-9]+/) {print substr($0,RSTART,RLENGTH)}'
>
> The patterns of digits and dashes are always surrounded by spaces and
> the digits are never more than sequences of two digits in a row.
>
> If possible, I would also like to extract the digits followed by
> alphabetic characters (if they exist) such as:
>
> 2-up
> 4-over
For both, the above question and this one, try
awk '
match($0,/[0-9]+-[0-9]+/) {print substr($0,RSTART,RLENGTH)}
match($0,/[0-9]+-[[:alpha:]]+/) {print substr($0,RSTART,RLENGTH)}
'
Janis
>
> In all instances, digits are preceded by spaces and followed by a
> dash.
>
> Can this be done with sed, awk or perl? Thanks for any help.
>
>
Janis, 9/9/2010 2:00:06 AM

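A note on the awk answer above: match() reports only the leftmost match in the string it is given, so a line holding two digit-dash-digit tokens would yield just the first one. Below is a minimal sketch that keeps searching the remainder of each line and prints every token in order of appearance. The function name extract_all is made up for illustration, the combined regex is my assumption about what counts as a token, and [A-Za-z] stands in for [[:alpha:]] only because some older awk builds lack POSIX character classes.

```shell
# Hypothetical helper: print every digits-dash-digits or
# digits-dash-letters token on each input line, one per output line.
extract_all() {
  awk '{
    rest = $0
    # match() sets RSTART and RLENGTH for the leftmost match in rest
    while (match(rest, /[0-9]+-([0-9]+|[A-Za-z]+)/)) {
      print substr(rest, RSTART, RLENGTH)
      # drop everything up to the end of this match, then search again
      rest = substr(rest, RSTART + RLENGTH)
    }
  }' "$@"
}

# Usage with the sample lines from the question
printf '%s\n' 'one two 9-8 three 2-up' \
              'once 8-11 4-over quadruple' \
              'time never 22-20 always' | extract_all
```

Unlike the two-rule awk script, this does not insist on surrounding spaces either, so a token glued to other text (say "x9-8") would still be reported.
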
On Sep 8, 7:00 pm, Janis Papanagnou <janis_papanag...@hotmail.com>
wrote:
> On 09/09/10 03:50, Lao Ming wrote:
>
> > I have lines of text in a file where, inserted in each line, a pattern
> > exists (digits and dashes) such as:
>
> >     one two 9-8 three 2-up
> >     once 8-11 4-over quadruple
> >     time never 22-20 always
>
> > What I want to extract is this:
>
> >     9-8
> >     7-11
> >     22-8
>
> These lines of output don't seem to match your input.
>
> Try
>
>   awk 'match($0,/[0-9]+-[0-9]+/) {print substr($0,RSTART,RLENGTH)}'
>
> > The patterns of digits and dashes are always surrounded by spaces and
> > the digits are never more than sequences of two digits in a row.
>
> > If possible, I would also like to extract the digits followed by
> > alphabetic characters (if they exist) such as:
>
> >     2-up
> >     4-over
>
> For both, the above question and this one, try
>
> awk '
>   match($0,/[0-9]+-[0-9]+/) {print substr($0,RSTART,RLENGTH)}
>   match($0,/[0-9]+-[[:alpha:]]+/) {print substr($0,RSTART,RLENGTH)}
> '
Wow! Perfect!
Be careful. I think that the Mafia could soon be after you.
Because you know too much.
:)
> Janis
>
>
>
>
>
> > In all instances, digits are preceded by spaces and followed by a
> > dash.
>
> > Can this be done with sed, awk or perl? Thanks for any help.
Lao, 9/9/2010 2:19:35 AM

On Sep 8, 8:50 pm, Lao Ming <laoming...@gmail.com> wrote:
> I have lines of text in a file where, inserted in each line, a pattern
> exists (digits and dashes) such as:
>
>     one two 9-8 three 2-up
>     once 8-11 4-over quadruple
>     time never 22-20 always
>
> What I want to extract is this:
>
>     9-8
>     7-11
>     22-8
Let's assume the above matched your input :-).
> The patterns of digits and dashes are always surrounded by spaces and
> the digits are never more than sequences of two digits in a row.
>
> If possible, I would also like to extract the digits followed by
> alphabetic characters (if they exist) such as:
>
>     2-up
>     4-over
>
> In all instances, digits are preceded by spaces and followed by a
> dash.
>
> Can this be done with sed, awk or perl? Thanks for any help.
$ cat file
one two 9-8 three 2-up
once 8-11 4-over quadruple
time never 22-20 always
$ sed -n 's/.* \([[:digit:]][[:digit:]]*-[[:digit:]][[:digit:]]*\).*/\1/p' file
9-8
8-11
22-20
$ sed -n 's/.* \([[:digit:]][[:digit:]]*-[[:alpha:]][[:alpha:]]*\).*/\1/p' file
2-up
4-over
That'll match sequences of more than 2 digits and/or letters - if
that's a problem, let us know. Also, if you want to match multiple
occurrences of each pattern on a single line, just add a "g" to the
end of the command.
Ed.
Ed, 9/9/2010 3:50:59 PM

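On the two-digit limit and the multiple-occurrence case mentioned above: as far as I can tell, a "g" flag would not pick up a second token with sed commands of that shape, because the .* on both sides of the group makes the first match cover the whole line, leaving nothing for a second substitution. Since the tokens are always delimited by spaces, one alternative is to test each whitespace-separated field on its own. This is only a rough sketch, assuming awk's default field splitting fits the data; extract_fields is an illustrative name, not something from the thread.

```shell
# Hypothetical helper: print each whitespace-delimited field that is
# exactly one or two digits, a dash, then either one or two digits
# or a run of letters.
extract_fields() {
  awk '{
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^[0-9][0-9]?-([0-9][0-9]?|[A-Za-z]+)$/) {
        print $i
      }
    }
  }' "$@"
}

# Usage: "123-4" is skipped because it has more than two leading digits
printf '%s\n' 'one two 9-8 three 2-up' '123-4 x 22-20' | extract_fields
```

Anchoring the regex with ^ and $ against a single field is what enforces both the surrounding spaces and the digit limit, without needing lookarounds.
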
Lao Ming wrote:
> I have lines of text in a file where, inserted in each line, a pattern
> exists (digits and dashes) such as:
>
> one two 9-8 three 2-up
> once 8-11 4-over quadruple
> time never 22-20 always
>
> What I want to extract is this:
>
> 9-8
> 7-11
> 22-8
>
> The patterns of digits and dashes are always surrounded by spaces and
> the digits are never more than sequences of two digits in a row.
>
> If possible, I would also like to extract the digits followed by
> alphabetic characters (if they exist) such as:
>
> 2-up
> 4-over
>
> In all instances, digits are preceded by spaces and followed by a
> dash.
>
> Can this be done with sed, awk or perl? Thanks for any help.
$ echo " one two 9-8 three 2-up
once 8-11 4-over quadruple
time never 22-20 always
" | perl -lne'print for
/(?:^|(?<=\s))(\d\d?-(?:\d\d?|[[:alpha:]]+))(?=\s|$)/g'
9-8
2-up
8-11
4-over
22-20
John
--
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction. -- Albert Einstein
John, 9/9/2010 7:42:57 PM

On 9 Sep, 02:50, Lao Ming <laoming...@gmail.com> wrote:
> I have lines of text in a file where, inserted in each line, a pattern
> exists (digits and dashes) such as:
>
>     one two 9-8 three 2-up
>     once 8-11 4-over quadruple
>     time never 22-20 always
>
> What I want to extract is this:
>
>     9-8
>     7-11
>     22-8
>
sed -ne 's|.* \([0-9]\+-[0-9]\+\).*|\1|p' infile
> The patterns of digits and dashes are always surrounded by spaces and
> the digits are never more than sequences of two digits in a row.
>
> If possible, I would also like to extract the digits followed by
> alphabetic characters (if they exist) such as:
>
>     2-up
>     4-over
>
sed -ne 's|.* \([0-9]\+-[a-zA-Z]\+\).*|\1|p' infile
> In all instances, digits are preceded by spaces and followed by a
> dash.
>
> Can this be done with sed, awk or perl? Thanks for any help.
pcb1962, 9/10/2010 7:53:38 AM
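
A caution on sed patterns of this shape: a greedy .* in front of the capture group takes as much of the line as it can, and that includes leading digits of the token unless something (here, the space the question guarantees) pins down where the token starts. A small sketch comparing the two forms on one of the sample lines; it uses [0-9][0-9]* instead of \+ only so that it does not depend on GNU sed, and the variable names are just for illustration.

```shell
line='time never 22-20 always'

# No anchor: the greedy .* keeps the first "2" of "22" for itself,
# because the group still matches with a single leading digit.
unanchored=$(printf '%s\n' "$line" |
  sed -n 's/.*\([0-9][0-9]*-[0-9][0-9]*\).*/\1/p')

# Requiring a space right before the group fixes where the token starts.
anchored=$(printf '%s\n' "$line" |
  sed -n 's/.* \([0-9][0-9]*-[0-9][0-9]*\).*/\1/p')

printf 'unanchored: %s\nanchored:   %s\n' "$unanchored" "$anchored"
```

The anchored form relies on the token being preceded by a space, so it would miss a token sitting at the very start of a line.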