Detecting ascii nulls

I have input files with ascii nulls, i.e., hex 00, octal 000.

I'd like to, among other things, detect these nulls, count them, print
data about the offending row, and ultimately, delete the null
character.

/\000/	{nulls++
	gsub(/\000/,"")
	}

seems like it ought to be a start - and it is, using gawk, on my PC
under windows.

Under Solaris, using /usr/xpg4/bin/awk, nulls are detected on EVERY
line, whether the lines have nulls, or not.

Any ideas?

Thanks!

Gerry
Reply Gerard 7/25/2004 6:58:53 PM

Gerard C Blais <gerard.blais@mci.com> wrote:
> 
> 
> I have input files with ascii nulls, i.e., hex 00, octal 000.
> 
> I'd like to, among other things, detect these nulls, count them, print
> data about the offending row, and ultimately, delete the null
> character.
> 
> /\000/  {nulls++
>        gsub(/\000/,"")
>        }
> 
> seems like it ought to be a start - and it is, using gawk, on my PC
> under windows.
> 
> Under Solaris, using /usr/xpg4/bin/awk, nulls are detected on EVERY
> line, whether the lines have nulls, or not.

Some awks don't handle \000 well, but gawk does.
For example, where awk is gawk,
find . -print0|awk 'BEGIN{RS="\000"}//'
prints the list of files one per line.

Reply Ian 7/25/2004 7:56:26 PM


In article <ac08g0pj1iof1ro206qjersge6rguia323@4ax.com>,
Gerard C Blais  <gerard.blais@mci.com> wrote:
>
>
>I have input files with ascii nulls, i.e., hex 00, octal 000.
>
>I'd like to, among other things, detect these nulls, count them, print
>data about the offending row, and ultimately, delete the null
>character.
>
>/\000/	{nulls++
>	gsub(/\000/,"")
>	}
>
>seems like it ought to be a start - and it is, using gawk, on my PC
>under windows.
>
>Under Solaris, using /usr/xpg4/bin/awk, nulls are detected on EVERY
>line, whether the lines have nulls, or not.
>
>Any ideas?

Use Perl

Awk and sed were meant to deal with text, and nulls are not generally 
considered text.


Chuck Demas

-- 
  Eat Healthy        |   _ _   | Nothing would be done at all,
  Stay Fit           |   @ @   | If a man waited to do it so well,
  Die Anyway         |    v    | That no one could find fault with it.
  demas@theworld.com |  \___/  | http://world.std.com/~cpd
Reply demas 7/25/2004 9:12:09 PM

In article <ce17n9$hv$1@pcls3.std.com>,
Charles Demas <demas@TheWorld.com> wrote:
....
>Use Perl
>
>Awk and sed were meant to deal with text, and nulls are not generally 
>considered text.

Use C (or, better yet, assembler).  Unix text utilities (Awk, sed, Perl,
cut, join, etc) were meant to deal with text, and nulls are not generally
considered text.

Actually, both Gawk & TAWK (the only flavors of AWK anyone should ever use)
handle nulls just fine.

Reply gazelle 7/26/2004 12:19:13 AM

In article <ce17n9$hv$1@pcls3.std.com>,
Charles Demas <demas@TheWorld.com> wrote:

% Use Perl
% 
% Awk and sed were meant to deal with text, and nulls are not generally 
% considered text.

Whereas perl was not meant to deal with anything in particular,
so if it doesn't handle nulls well, it goes unnoticed.
-- 

Patrick TJ McPhee
East York  Canada
ptjm@interlog.com
Reply ptjm 7/26/2004 2:53:43 AM

> Actually, both Gawk & TAWK (the only flavors of AWK anyone should ever use)


Without wishing to get into a flame war about this: I also find mawk a 
useful tool to have around. It's often somewhat faster than gawk. That 
applies to the versions that were around when I installed the software 
on this computer, anyhow.

-Ed


> handle nulls just fine.




-- 
(You can't go wrong with psycho-rats.)       (er258)(@)(eng.cam)(.ac.uk)

/d{def}def/f{/Times findfont s scalefont setfont}d/s{10}d/r{roll}d f 5/m
{moveto}d -1 r 230 350 m 0 1 179{1 index show 88 rotate 4 mul 0 rmoveto}
for /s 15 d f pop 240 420 m 0 1 3 { 4 2 1 r sub -1 r show } for showpage

Reply E 7/26/2004 10:17:22 AM

Hello,

In article <4104DA32.8040409@my.sig>, E. Rosten wrote:
>> Actually, both Gawk & TAWK (the only flavors of AWK anyone should ever use)
> 
> I also find mawk a useful tool [...] It's [...] faster than gawk.

I agree, and I'm sure Arnold would agree too.  If the speed of execution of
your code is critical, mawk can help.

Stepan
Reply Stepan 7/26/2004 12:18:42 PM

>>>Actually, both Gawk & TAWK (the only flavors of AWK anyone should ever use)
>>
>>I also find mawk a useful tool [...] It's [...] faster than gawk.
> 
> 
> I agree, and I'm sure Arnold would agree too.  If the speed of execution of
> your code is critical, mawk can help.

I have a habit of using awk as a general quick & easy interpreted 
language for all sorts of stuff (reasonably often including image 
manipulation), so speed of execution starts to matter quite a lot.

But gawk offers enough neat features that I have it around as well.

-Ed



Reply E 7/26/2004 5:37:09 PM

Thanks for all the suggestions.

I've found a mawk, and will try to get it installed on the Solaris
box.

Gerry

On Sun, 25 Jul 2004 14:58:53 -0400, Gerard C Blais
<gerard.blais@mci.com> wrote:

>
>
>I have input files with ascii nulls, i.e., hex 00, octal 000.
>
>I'd like to, among other things, detect these nulls, count them, print
>data about the offending row, and ultimately, delete the null
>character.
>
>/\000/	{nulls++
>	gsub(/\000/,"")
>	}
>
>seems like it ought to be a start - and it is, using gawk, on my PC
>under windows.
>
>Under Solaris, using /usr/xpg4/bin/awk, nulls are detected on EVERY
>line, whether the lines have nulls, or not.
>
>Any ideas?
>
>Thanks!
>
>Gerry

Reply Gerard 7/26/2004 7:07:51 PM

"Kenny McCormack" <gazelle@yin.interaccess.com> wrote...
....
>Use C (or, better yet, assembler).  Unix text utilities (Awk, sed, Perl,
>cut, join, etc) were meant to deal with text, and nulls are not generally
>considered text.
>
>Actually, both Gawk & TAWK (the only flavors of AWK anyone should ever use)
>handle nulls just fine.

It's been a long, long time since Perl was just a text utility. It handles
NULLs just fine.

That said, if the OP wanted to remove NULLs from files, wouldn't tr -d work
for that? If the OP wanted to count NULLs and report on their lines, the
file could be passed through od and a vanilla awk script used to keep track
of NULLs and newlines. All I'm trying to show is that the OP's tasks could
be performed with the standard Solaris POSIX tools.
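
A rough sketch of that od-plus-awk idea, offered as an illustration rather
than a tested Solaris recipe. It assumes od accepts the POSIX options
-An -v -t x1 and prints hex bytes in lowercase; the function names
count_nuls and strip_nuls are made up for this example. Because awk only
ever sees od's hex dump, it never has to handle a real NUL byte.

```shell
#!/bin/sh
# Sketch only: count_nuls and strip_nuls are hypothetical helper names.

# count_nuls FILE: report the NUL count for each line that contains any,
# then a grand total.
count_nuls() {
    # -An: no offset column; -v: never collapse repeated rows;
    # -t x1: one two-digit hex field per input byte
    od -An -v -t x1 "$1" | awk '
        BEGIN { line = 1 }
        {
            for (i = 1; i <= NF; i++) {
                if ($i == "00") {            # a NUL byte
                    n++; total++
                } else if ($i == "0a") {     # newline: close the current line
                    if (n) printf "line %d: %d NUL(s)\n", line, n
                    n = 0; line++
                }
            }
        }
        END {
            # a final line without a trailing newline still gets reported
            if (n) printf "line %d: %d NUL(s)\n", line, n
            printf "total: %d\n", total
        }'
}

# strip_nuls FILE: copy FILE to stdout with every NUL byte deleted
strip_nuls() {
    tr -d '\000' < "$1"
}
```

Usage would be along the lines of count_nuls input.dat followed by
strip_nuls input.dat > clean.dat, where input.dat stands in for the real
file.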



Reply Harlan 7/28/2004 4:14:13 AM

In article <410727e8$1_1@127.0.0.1>, Harlan Grove <hrlngrv@aol.com> wrote:
>"Kenny McCormack" <gazelle@yin.interaccess.com> wrote...
>...
>>Use C (or, better yet, assembler).  Unix text utilities (Awk, sed, Perl,
>>cut, join, etc) were meant to deal with text, and nulls are not generally
>>considered text.
>>
>>Actually, both Gawk & TAWK (the only flavors of AWK anyone should ever use)
>>handle nulls just fine.
>
>It's been a long, long time since Perl was just a text utility. It handles
>NULLs just fine.
>
>That said, if the OP wanted to remove NULLs from files, wouldn't tr -d work
>for that?

I seem to recall that some versions of tr don't handle nulls, or maybe
it was that nulls were automatically deleted by that version of tr.
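
One way to find out how a particular tr behaves is to push a known NUL
through it and count the bytes that come out. A minimal sketch, assuming a
POSIX printf and wc; tr_keeps_nuls is a made-up name for the example.

```shell
#!/bin/sh
# Sketch only: tr_keeps_nuls is a hypothetical helper name.
# Succeeds (exit status 0) if the local tr passes NUL bytes through.
tr_keeps_nuls() {
    # Send three bytes (a, NUL, b) through a tr call that changes nothing.
    # A NUL-safe tr emits all three; one that drops NULs emits fewer.
    n=$(printf 'a\000b' | tr 'z' 'z' | wc -c | tr -d ' ')
    [ "$n" -eq 3 ]
}
```

Something like tr_keeps_nuls && echo "tr is NUL-safe here" would then show
whether tr -d '\000' can be trusted on that system.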



Chuck Demas

Reply demas 7/28/2004 4:16:13 PM
