Retrieve string between parentheses?

Hello all,

   (Is the group alt.comp.lang.sed dead?)

   I need to parse the file name from "md5" output (the file name is
enclosed in parentheses). With my very limited shell fu, I managed to
tell sed the following:

$ md5 * | sed 's/.*(//; s/).*=\ /:/'

   However, this will - obviously - fail if the file name contains
parentheses. I need to scan for a ")" backwards from the end of the line.

   Can sed do it, or do I need awk?

-- 
Kind regards,
Jan Danielsson
Reply Jan 11/17/2006 10:03:17 PM

Jan Danielsson wrote:

> Hello all,
>
>    (Is the group alt.comp.lang.sed dead?)
>
>    I need to parse the file name from an "md5" output (the file name is
> enclosed in parenthesis). With my very limited shell fu, I managed to
> tell sed the following:
>
> $ md5 * | sed 's/.*(//; s/).*=\ /:/'
>
>    However, this will - obviously - fail if the file name contains
> parenthesis. I need to scan for a ")" backwards from the end of the line.
>
>    Can sed do it, or do I need awk?
>
> --
> Kind regards,
> Jan Danielsson

I don't know md5 format.

Could you give a few lines of input and expected output that covers the
range of input edge cases?

- Paddy.

Reply Paddy 11/17/2006 10:08:26 PM


Jan Danielsson wrote:
> Hello all,
> 
>    (Is the group alt.comp.lang.sed dead?)
> 
>    I need to parse the file name from an "md5" output (the file name is
> enclosed in parenthesis).

(My md5sum does not output parentheses.)

> With my very limited shell fu, I managed to
> tell sed the following:
> 
> $ md5 * | sed 's/.*(//; s/).*=\ /:/'
> 
>    However, this will - obviously - fail if the file name contains
> parenthesis. I need to scan for a ")" backwards from the end of the line.
> 
>    Can sed do it, or do I need awk?
> 

If there's only one pair of parentheses on the line...

   awk -F'[()]' '{print $2}'

...otherwise adjust the field number.
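
As a quick check of how the field splitting behaves, here is a small sketch run against md5-style lines of the form "MD5 (name) = hash". The helper name second_field is made up for this illustration, and the second sample name is invented too.

```shell
# Hypothetical helper: print the second field when splitting on "(" or ")".
second_field() {
    awk -F'[()]' '{print $2}'
}

# One pair of parentheses: field 2 is the whole file name.
printf '%s\n' 'MD5 (test.sh) = 5387c69d47140869a1707f343f11019a' | second_field

# Parentheses inside the name: field 2 stops at the next "(" or ")".
printf '%s\n' 'MD5 (foo(bar).txt) = 5387c69d47140869a1707f343f11019a' | second_field
```

The first call prints test.sh; the second prints only foo, so a fixed field number is not enough once the name itself contains parentheses.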

Janis
Reply Janis 11/17/2006 10:23:12 PM

Paddy wrote:
>>    I need to parse the file name from an "md5" output (the file name is
>> enclosed in parenthesis). With my very limited shell fu, I managed to
>> tell sed the following:
>>
>> $ md5 * | sed 's/.*(//; s/).*=\ /:/'
>>
>>    However, this will - obviously - fail if the file name contains
>> parenthesis. I need to scan for a ")" backwards from the end of the line.
>>
>>    Can sed do it, or do I need awk?
> I don't know md5 format.
> 
> Could you give a few lines of input and expected output that covers the
> range of input edge cases?

   Sure,

$ md5 test.sh
MD5 (test.sh) = 5387c69d47140869a1707f343f11019a

   The problem is, like I mentioned above, that a file name can contain
'(' and ')', and they - obviously - need not be nested.

   So this is a (theoretical) possibility:

MD5 (foo(bar)))))())(()(()(.txt) = 5387c69d47140869a1707f343f11019a

-- 
Kind regards,
Jan Danielsson
Reply Jan 11/17/2006 10:24:50 PM

Jan Danielsson wrote:
> Paddy wrote:
> 
>>>   I need to parse the file name from an "md5" output (the file name is
>>>enclosed in parenthesis). With my very limited shell fu, I managed to
>>>tell sed the following:
>>>
>>>$ md5 * | sed 's/.*(//; s/).*=\ /:/'
>>>
>>>   However, this will - obviously - fail if the file name contains
>>>parenthesis. I need to scan for a ")" backwards from the end of the line.
>>>
>>>   Can sed do it, or do I need awk?
>>
>>I don't know md5 format.
>>
>>Could you give a few lines of input and expected output that covers the
>>range of input edge cases?
> 
> 
>    Sure,
> 
> $ md5 test.sh
> MD5 (test.sh) = 5387c69d47140869a1707f343f11019a
> 
>    The problem is, like I mentioned above, that a file name can contain
> '(' and ')', and they - obviously - need not be nested.
> 
>    So this is a (theoretical) possibility:
> 
> MD5 (foo(bar)))))())(()(()(.txt) = 5387c69d47140869a1707f343f11019a
> 

$ awk '{print substr($0,6,length($0)-41)}'

MD5 (test.sh) = 5387c69d47140869a1707f343f11019a
test.sh

MD5 (foo(bar)))))())(()(()(.txt) = 5387c69d47140869a1707f343f11019a
foo(bar)))))())(()(()(.txt
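
The offsets come from the fixed parts of the line: "MD5 (" is 5 characters, and ") = " plus the 32-digit hash is 36 characters, so the name starts at position 6 and is length-41 characters long. A sketch that spells the arithmetic out, assuming every line has exactly this layout (md5_name is a hypothetical helper name):

```shell
# Hypothetical helper; assumes every line looks like "MD5 (NAME) = <32 hex digits>".
md5_name() {
    awk '{
        prefix = length("MD5 (")        # 5 characters before the name
        suffix = length(") = ") + 32    # 4 characters plus the 32-digit hash
        print substr($0, prefix + 1, length($0) - prefix - suffix)
    }'
}

printf '%s\n' 'MD5 (foo(bar)))))())(()(()(.txt) = 5387c69d47140869a1707f343f11019a' | md5_name
```

Because nothing here looks at the parentheses at all, it does not matter what the name contains, but it does break if the hash length or the prefix ever differs.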


Janis
Reply Janis 11/17/2006 10:40:56 PM

Janis Papanagnou wrote:
[---]
>> MD5 (foo(bar)))))())(()(()(.txt) = 5387c69d47140869a1707f343f11019a
> 
> $ awk '{print substr($0,6,length($0)-41)}'
[---]

   Hmm... I think I'd better stop staring at sed, and start learning
awk. It seems much better suited for this kind of thing.

   Thanks!

-- 
Kind regards,
Jan Danielsson
Reply Jan 11/17/2006 10:46:51 PM

Jan Danielsson wrote:

> Hello all,
> 
>    (Is the group alt.comp.lang.sed dead?)
> 
>    I need to parse the file name from an "md5" output (the file name is
> enclosed in parenthesis). With my very limited shell fu, I managed to
> tell sed the following:
> 
> $ md5 * | sed 's/.*(//; s/).*=\ /:/'
> 
>    However, this will - obviously - fail if the file name contains
> parenthesis. I need to scan for a ")" backwards from the end of the line.
> 
>    Can sed do it, or do I need awk?
> 

I'm assuming the output from md5 * looks like this (where we only want the
first filename, file1):
$ cat text
(file1)
(file2
)file3
()4
$ sed -n 's#.*(\(.\{1,\}\)).*#\1#p' text
file1

Some info:
-n suppresses sed's default output, so we need the p flag to print the
lines where the substitution matched.

.\{1,\} matches one or more characters (the filename), so an empty name
such as () is not matched.

.*(\(<filename>\)).* then comes down to: anything ( filename ) anything,
where the filename is wrapped in \( \) so that it can be used in the
output as \1.

s#regexp#output#p uses # as the delimiter instead of the usual /, which
I find a bit clearer.
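
Applied to the md5 format posted earlier in the thread rather than the assumed one, the greedy leading .*( runs to the last "(" on the line, so a name containing parentheses is cut short. One possible variant, offered as a sketch and not as a general solution, anchors on the first "(" instead. The names name_greedy and name_anchored are invented here.

```shell
line='MD5 (foo(bar)))))())(()(()(.txt) = 5387c69d47140869a1707f343f11019a'

# The expression from the post: the leading ".*(" is greedy, so it runs to the last "(".
name_greedy() {
    sed -n 's#.*(\(.\{1,\}\)).*#\1#p'
}

# Variant: "[^(]*(" can only consume the first "(", and the greedy group
# then extends to the last ")" on the line.
name_anchored() {
    sed -n 's#^[^(]*(\(.\{1,\}\)).*#\1#p'
}

printf '%s\n' "$line" | name_greedy
printf '%s\n' "$line" | name_anchored
```

The first call prints only .txt; the second prints the full name, foo(bar)))))())(()(()(.txt.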

@newsgroup being dead: it looks like it. If you need more, try the Yahoo sed
group: http://tech.groups.yahoo.com/group/sed-users/
Cya!
Reply waka 11/21/2006 12:47:42 PM

Jan Danielsson <jan.m.danielsson@gmail.com> writes:
>$ md5 test.sh
>MD5 (test.sh) = 5387c69d47140869a1707f343f11019a
>
>   The problem is, like I mentioned above, that a file name can contain
>'(' and ')', and they - obviously - need not be nested.
>
>   So this is a (theoretical) possibility:
>
>MD5 (foo(bar)))))())(()(()(.txt) = 5387c69d47140869a1707f343f11019a

sed 's/MD5 (//;s/)[^)]*$//'
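
The second expression works because ")[^)]*$" is a ")" followed only by non-")" characters up to the end of the line, which can only be the last ")". If the name:hash output of the original attempt is still wanted, the same end-anchoring idea can be sketched as below, assuming lines of the form "MD5 (NAME) = HASH" with a lowercase hex hash (name_and_hash is a made-up name):

```shell
# Hypothetical helper; assumes lines of the form "MD5 (NAME) = <lowercase hex hash>".
name_and_hash() {
    # \1 is greedy, and ") = " followed by hex digits is pinned to end of line,
    # so parentheses inside NAME are kept.
    sed 's/^MD5 (\(.*\)) = \([0-9a-f]*\)$/\1:\2/'
}

printf '%s\n' 'MD5 (foo(bar)))))())(()(()(.txt) = 5387c69d47140869a1707f343f11019a' | name_and_hash
```

This prints foo(bar)))))())(()(()(.txt:5387c69d47140869a1707f343f11019a. Lines that do not fit the assumed layout pass through unchanged.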
-- 
John Savage                   (my news address is not valid for email)
Reply John 11/22/2006 4:22:45 AM

Jan Danielsson wrote:
>    The problem is, like I mentioned above, that a file name can contain
> '(' and ')', and they - obviously - need not be nested.
>
>    So this is a (theoretical) possibility:
>
> MD5 (foo(bar)))))())(()(()(.txt) = 5387c69d47140869a1707f343f11019a

Using gawk:

gawk '{if (match($0,/\((.*)\)/,f)) print f[1]}'
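
The three-argument form of match() is a gawk extension. For other awks, the same greedy match can be sketched with the standard RSTART and RLENGTH variables; paren_name is a hypothetical helper name.

```shell
# Hypothetical helper using only POSIX awk features.
paren_name() {
    awk '{
        # Greedy match from the first "(" to the last ")" on the line.
        if (match($0, /\(.*\)/))
            print substr($0, RSTART + 1, RLENGTH - 2)
    }'
}

printf '%s\n' 'MD5 (foo(bar)))))())(()(()(.txt) = 5387c69d47140869a1707f343f11019a' | paren_name
```

RSTART + 1 and RLENGTH - 2 drop the outer "(" and ")" from the matched text, leaving just the name.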

Regards,
Andy

Reply Andrew 11/22/2006 3:49:14 PM
