I've been learning sed and whilst the complicated things seem to work,
some of the simple ones don't. For example:
I have a file with records of the form:
PMID- 14974909
OWN - NLM
STAT- completed
DA - 20040225
DCOM- 20040325
IS - 0300-0664
VI - 59
IP - 6
DP - 2003 Dec
PG - 690-8
FAU - Nugent, Ailish G
AU - Nugent AG
FAU - Leung, Kin-Chuen
AU - Leung KC
FAU - Sullivan, David
AU - Sullivan D
FAU - Reutens, Anne T
AU - Reutens AT
FAU - Ho, Ken K Y
AU - Ho KK
LA - eng
PT - Clinical Trial
PT - Journal Article
PT - Randomized Controlled Trial
PL - England
TA - Clin Endocrinol (Oxf)
JID - 0346653
RN - 0 (Carrier Proteins)
SO - Clin Endocrinol (Oxf) 2003 Dec;59(6):690-8.
How can I get sed (or grep) to output only the lines tagged with,
say, AU and PT (in the same order as in the source file)? I thought '()'
allowed me to group patterns, a la:
cat filename |grep (^AU,^PT)
but, apparently not.
I thought that maybe:
cat filename |sed /^[^PT,^AU]/\!d
might do it, but no.
Also how can I get sed to remove CR/LFs at the end of specific lines?
china-rider (63) | 7/2/2004 11:29:36 AM

try this:
cat filename | grep -e ^AU -e PT
news9932 (46) | 7/2/2004 9:37:35 AM

On Fri, 02 Jul 2004 11:37:35 +0200, Ed wrote:
correction:
cat filename | grep -e ^AU -e ^PT
news9932 (46) | 7/2/2004 9:38:21 AM

On Fri, 02 Jul 2004 13:29:36 +0200, John Stolz <china-rider@wanadoo.fr> wrote:
| I've been learning sed and whilst the complicated things seem to work,
| some of the simple ones don't. For example:
|
| I have a file with records of the form:
| PMID- 14974909
| OWN - NLM
| STAT- completed
| DA - 20040225
| DCOM- 20040325
| IS - 0300-0664
| VI - 59
| IP - 6
| DP - 2003 Dec
| PG - 690-8
| FAU - Nugent, Ailish G
| AU - Nugent AG
| FAU - Leung, Kin-Chuen
| AU - Leung KC
| FAU - Sullivan, David
| AU - Sullivan D
| FAU - Reutens, Anne T
| AU - Reutens AT
| FAU - Ho, Ken K Y
| AU - Ho KK
| LA - eng
| PT - Clinical Trial
| PT - Journal Article
| PT - Randomized Controlled Trial
| PL - England
| TA - Clin Endocrinol (Oxf)
| JID - 0346653
| RN - 0 (Carrier Proteins)
| SO - Clin Endocrinol (Oxf) 2003 Dec;59(6):690-8.
|
| How can I get sed (or grep) to output only the lines tagged with,
| say, AU and PT (in the same order as in the source file)? I thought '()'
| allowed me to group patterns, a la:
|
| cat filename |grep (^AU,^PT)
Almost there. Try:
grep -E '^AU|^PT' filename
Also, 'egrep' is shorthand for 'grep -E'.
The pipe '|' means 'or', and the brackets do group the expression. You need the
quotes to stop the shell interpreting the '|' and bracket characters itself.
You don't need the grouping for this simple expression, but you could do:
grep -E '^(AU|PT)' filename
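To make the two forms above concrete, here is a small sketch run against a cut-down copy of the record from the original post. The temporary file and the variable names (`sample`, `ungrouped`, `grouped`, `with_sed`) are only illustrative, and the sed line is one possible equivalent, since the question asked about sed as well; it is not the only way to write it.

```shell
# Build a small sample in the thread's record format (tag, " - ", value).
sample=$(mktemp)
cat > "$sample" <<'EOF'
PMID- 14974909
FAU - Nugent, Ailish G
AU - Nugent AG
LA - eng
PT - Clinical Trial
PT - Journal Article
PL - England
EOF

# Alternation with each branch anchored separately.
ungrouped=$(grep -E '^AU|^PT' "$sample")

# Same match, with the anchor factored out of the group.
grouped=$(grep -E '^(AU|PT)' "$sample")

# A sed equivalent: -n suppresses the default output,
# and each p prints only the lines its address matches.
with_sed=$(sed -n -e '/^AU/p' -e '/^PT/p' "$sample")

printf '%s\n' "$grouped"
rm -f "$sample"
```

All three commands should select the same lines, in file order. The FAU line is not selected because '^' anchors the match to the start of the line.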
| but, apparently not.
| I thought that maybe:
|
| cat filename |sed /^[^PT,^AU]/\!d
| might do it, but no.
|
| Also how can I get sed to remove CR/LFs at the end of specific lines?
Not sure how to do this in sed, sorry.
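For reference, a rough sketch of how this might be approached in sed. The question can be read two ways, so both are shown: joining a matching line to the line after it, and stripping a DOS-style carriage return from matching lines only. The choice of FAU and AU as the "specific lines", the " | " separator, and the variable names are all assumptions made for illustration.

```shell
sample=$(mktemp)

# Reading 1: remove the newline after specific lines (join with the next line).
printf 'FAU - Nugent, Ailish G\nAU - Nugent AG\nLA - eng\n' > "$sample"
# On lines starting with FAU: N appends the next input line to the pattern
# space, then s replaces the embedded newline with a visible separator.
joined=$(sed '/^FAU/{N;s/\n/ | /;}' "$sample")
printf '%s\n' "$joined"

# Reading 2: strip a trailing carriage return from specific lines only.
printf 'AU - Nugent AG\r\nLA - eng\r\n' > "$sample"
# Hold a literal CR in a variable, since \r in a sed pattern is not portable.
cr=$(printf '\r')
# Only lines starting with AU lose their trailing CR; other lines keep theirs.
stripped=$(sed "/^AU/s/${cr}\$//" "$sample")

rm -f "$sample"
```

If every line should lose its carriage return, dropping the `/^AU/` address applies the substitution to the whole file.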
--
Reverend Paul Colquhoun, ULC. http://andor.dropbear.id.au/~paulcol
Asking for technical help in newsgroups? Read this first:
http://catb.org/~esr/faqs/smart-questions.html#intro
postmaster5 (179) | 7/2/2004 11:20:01 AM

> Almost there. Try:
>
> grep -E '^AU|^PT' filename
I like the grep solution, but
cat filename | grep -E '^AU -|^PT -'
is more appropriate, since it won't pick up other tags that merely begin
with AU or PT, unless you know more about the tags than I do.
Regards...Dan.
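A quick sketch of the difference between the two patterns. The AUID and PTX lines below are made-up tags, added only to show what the looser pattern would let through; the spacing follows the records as they appear in this thread, so the pattern may need adjusting if the real file pads its tags differently.

```shell
sample=$(mktemp)
cat > "$sample" <<'EOF'
AU - Nugent AG
AUID- hypothetical-id
PT - Clinical Trial
PTX - hypothetical value
EOF

# Loose pattern: any line that starts with AU or PT, whatever follows.
loose=$(grep -E '^AU|^PT' "$sample")

# Strict pattern: the tag must be followed by " -", so longer tags are skipped.
strict=$(grep -E '^AU -|^PT -' "$sample")

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
rm -f "$sample"
```

Here the loose pattern returns all four lines, while the strict one returns only the AU and PT lines.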
JDanSkinner (96) | 7/2/2004 4:33:29 PM

On 2004-07-02, Dan Skinner wrote:
>
> cat filename | grep -E '^AU -|^PT -'
> is more appropriate unless you know more about the tags than I do.
There's no need for cat:
grep -E '^AU -|^PT -' filename
--
Chris F.A. Johnson http://cfaj.freeshell.org/shell
===================================================================
My code (if any) in this post is copyright 2004, Chris F.A. Johnson
and may be copied under the terms of the GNU General Public License
cfajohnson (1783) | 7/2/2004 6:02:10 PM

On Fri, 02 Jul 2004 11:38:21 +0200, Ed hath writ:
> On Fri, 02 Jul 2004 11:37:35 +0200, Ed wrote:
>
> correction:
> cat filename | grep -e ^AU -e ^PT
*Correct* correction:
grep -e ^AU -e ^PT filename
Jonesy
bit-bucket (345) | 7/3/2004 9:42:52 PM

On Fri, 02 Jul 2004 18:02:10 +0000, Chris F.A. Johnson wrote:
> On 2004-07-02, Dan Skinner wrote:
>>
>> cat filename | grep -E '^AU -|^PT -'
>> is more appropriate unless you know more about the tags than I do.
>
> There's no need for cat:
>
> grep -E '^AU -|^PT -' filename
Thanks everyone - this works a treat now.
china-rider (63) | 7/5/2004 9:57:35 AM