replace hex sequence

I need to replace a sequence of hex characters with awk.

Let's say that $0 contains the hex sequence: 0d0c0d

sub("\f","") removes the 0c and leaves 0d0d, but

sub("\r\f\r","") will not do anything.

Is there a way to match a string of more than one hex char?

Thanks,
Doug

Reply Doug 7/25/2003 6:08:42 PM

Doug wrote:
> I need to replace a sequence of hex characters with awk.
>
> Let's say that $0 contains the hex sequence: 0d0c0d
>
> sub("\f","") removes the 0c and leaves 0d0d, but
>
> sub("\r\f\r","") will not do anything.
>
> Is there a way to match a string of more than one hex char?
>
> Thanks,
> Doug

But how do you _know_ it's not doing anything?  You haven't shown any
output for the commands.  For multiple replacements in awk you need
gsub() not sub().

This works for me:

$ echo -e "\r\f\r\c" | awk '{gsub(/\r|\f/,"a");print}'
aaa

$

I use GNU echo, hence the -e to enable escapes.
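
For a sequence of several control characters rather than an alternation, a minimal sketch along the same lines follows. It assumes an awk that accepts \r and \f inside a regex constant (the POSIX awk escapes) and uses printf rather than echo -e so the escapes do not depend on the shell. The sample input is made up for illustration.

```shell
# Feed the bytes 0d 0c 0d, then "text" and a newline, to awk.
# The regex constant matches the three-byte sequence as a whole.
out=$(printf '\r\f\rtext\n' | awk '{ sub(/\r\f\r/, ""); print }')
printf '%s\n' "$out"
```

If the sequence is present in the record, only "text" should remain, which can be confirmed by piping the result through od -c.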

-- 
Peter S Tillier
"Who needs perl when you can write dc, sokoban,
arkanoid and an unlambda interpreter in sed?"
Reply Peter 7/26/2003 2:05:34 AM


Peter S Tillier wrote:
>> I need to replace a sequence of hex characters with awk.
>>
>> Let's say that $0 contains the hex sequence: 0d0c0d
>>
>> sub("\f","") removes the 0c and leaves 0d0d, but
>>
>> sub("\r\f\r","") will not do anything.
>>
>
> But how do you _know_ it's not doing anything?  You haven't shown any
> output for the commands.  For multiple replacements in awk you need
> gsub() not sub().
> 
> This works for me:
> 
> $ echo -e "\r\f\r\c" | awk '{gsub(/\r|\f/,"a");print}'
> aaa
> 
> I use GNU echo, hence the -e to enable escapes.
> 

Using the "echo -e" command as you illustrate did seem to work fine.
However, the real application is different, because the input is a file. I
believe it is a byte-order (little-endian) thing. Also, the example you
give is still a single-byte match because of the "|" (or) operator.

To my surprise, I found that sub(/\f\r\r/,"") worked where 
sub(/\r\f\r/,"") did not.

This was explained by the fact that 'hexdump -C testfile' shows the 
first three bytes as 0d0c0d, but a plain 'hexdump testfile' shows the 
first three bytes as 0c0d0d.
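
The difference between the two dumps can be reproduced with a small sketch. It uses od instead of hexdump (a swapped-in tool, chosen because its -t option is specified by POSIX), and the four-byte scratch file is a made-up stand-in for the real testfile. The point it illustrates is that a byte-wise dump shows file order, while a dump in two-byte words shows each pair in host byte order.

```shell
# Write the four bytes 0d 0c 0d 0a to a scratch file (stand-in for testfile).
tmp=$(mktemp)
printf '\r\f\r\n' > "$tmp"

# One byte at a time: always the order in which the bytes sit in the file.
bytes=$(od -An -tx1 "$tmp" | tr -s ' \n' ' ' | sed 's/^ //; s/ $//')
echo "bytes: $bytes"

# Two-byte words: each pair is printed in host byte order, so on a
# little-endian machine the pairs look swapped (0c0d 0a0d).
words=$(od -An -tx2 "$tmp" | tr -s ' \n' ' ' | sed 's/^ //; s/ $//')
echo "words: $words"

rm -f "$tmp"
```

Under this reading, the byte-wise listing is the one to trust when writing the awk pattern, since awk sees the bytes in file order.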

To quote the GNU awk manual on the use of hex values in regex: "...using 
more than two hexadecimal digits produces undefined results."
(http://www.gnu.org/manual/gawk-3.1.1/html_mono/gawk.html#Regexp)

I gave up on trying a multi-char hex replacement and took a different 
approach to eliminate the line with the form feed in it.
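
One way such an approach might look, as a sketch only: the pattern and the sample input below are assumptions for illustration, not necessarily what was used on the real file. It drops every record that contains a form feed and prints the rest unchanged.

```shell
# Three input lines; the middle one carries the 0d 0c 0d sequence.
# awk prints only the records that do NOT match /\f/.
out=$(printf 'keep 1\n\r\f\r\nkeep 2\n' | awk '!/\f/')
printf '%s\n' "$out"
```

Because the test is a single-character match on \f, it does not depend on what surrounds the form feed in the record.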

Thanks,
Doug

Reply Doug 7/28/2003 7:16:39 PM
