substitute string for ascii control character

Hi,
  I'm trying to substitute all
occurrences of the ASCII control
characters with strings
\000 - \010     "&ccrs000;" - "&ccrs010;"
\013 - \014     "&ccrs013;" - "&ccrs014;"
\016 - \037     "&ccrs016;" - "&ccrs037;"
\177            "&ccrs177;"

I've tried several approaches - the following
is wrong in that it inserts the replacement
string following the substitution character
rather than replacing it. Any help with the
gsub function or another approach will be
greatly appreciated.

Thanks
Jeff Higgins

/^/{
    gsub(/[\000]/,"&ccrs000;")
    gsub(/[\001]/,"&ccrs001;")
    gsub(/[\002]/,"&ccrs002;")
    gsub(/[\003]/,"&ccrs003;")
    gsub(/[\004]/,"&ccrs004;")
    gsub(/[\005]/,"&ccrs005;")
    gsub(/[\006]/,"&ccrs006;")
    gsub(/[\007]/,"&ccrs007;")
    gsub(/[\010]/,"&ccrs010;")
    gsub(/[\013]/,"&ccrs013;")
    gsub(/[\014]/,"&ccrs014;")
    gsub(/[\016]/,"&ccrs016;")
    gsub(/[\017]/,"&ccrs017;")
    gsub(/[\020]/,"&ccrs020;")
    gsub(/[\021]/,"&ccrs021;")
    gsub(/[\022]/,"&ccrs022;")
    gsub(/[\023]/,"&ccrs023;")
    gsub(/[\024]/,"&ccrs024;")
    gsub(/[\025]/,"&ccrs025;")
    gsub(/[\026]/,"&ccrs026;")
    gsub(/[\027]/,"&ccrs027;")
    gsub(/[\030]/,"&ccrs030;")
    gsub(/[\031]/,"&ccrs031;")
    gsub(/[\032]/,"&ccrs032;")
    gsub(/[\033]/,"&ccrs033;")
    gsub(/[\034]/,"&ccrs034;")
    gsub(/[\035]/,"&ccrs035;")
    gsub(/[\036]/,"&ccrs036;")
    gsub(/[\037]/,"&ccrs037;")
    gsub(/[\177]/,"&ccrs177;")
    print
} 


Reply Jeff 11/5/2005 4:41:56 PM

Jeff Higgins wrote:
> Hi,
>   I'm trying to substitute all
> occurrences of the ASCII control
> characters with strings
> \000 - \010     "&ccrs000;" - "&ccrs010;"
> \013 - \014     "&ccrs013;" - "&ccrs014;"
> \016 - \037     "&ccrs016;" - "&ccrs037;"
> \177            "&ccrs177;"
> 
> I've tried several approaches - the following
> is wrong in that it inserts the replacement
> string following the substitution character
> rather than replacing it. Any help with the
> gsub function or another approach will be
> greatly appreciated.
> 
> Thanks
> Jeff Higgins
> 
> /^/{
>     gsub(/[\000]/,"&ccrs000;")

       gsub(/[\000]/,"\\&ccrs000;")

Is this what you want?

Janis

>     ...
>     print
> } 
Reply Janis 11/5/2005 5:13:18 PM
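
The difference between the two replacement strings can be checked on a one-line input. The following is only a minimal sketch: the function names demo_bare and demo_escaped are hypothetical, and it assumes a POSIX shell whose printf understands octal escapes plus an awk that accepts octal escapes in regular expressions.

```shell
# Hypothetical demo functions; the names are not part of the original thread.

# Bare "&": gsub substitutes the matched text for it, so the control
# character \001 stays in the output and "ccrs001;" is added after it.
demo_bare() {
    printf 'a\001b\n' | awk '{ gsub(/\001/, "&ccrs001;"); print }'
}

# "\\&": the string literal evaluates to \& and gsub then inserts a
# literal ampersand, so the control character is really replaced.
demo_escaped() {
    printf 'a\001b\n' | awk '{ gsub(/\001/, "\\&ccrs001;"); print }'
}
```

Calling demo_escaped should print a&ccrs001;b, while demo_bare leaves the \001 byte in place.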


Janis Papanagnou wrote:
[snip]
>>
>> /^/{
>>     gsub(/[\000]/,"&ccrs000;")
>
>       gsub(/[\000]/,"\\&ccrs000;")
>
> Is this what you want?
>
> Janis
>

Yes. Thank you Janis.

I now understand that "\&" escapes
the ampersand allowing me to insert
it as literal text rather than inserting
the string matched by the regex.

I'm afraid I do not understand how
the "\\&" prevents the regex match string
from being output. Will you explain?

Thank you,
Jeff Higgins


Reply Jeff 11/5/2005 6:14:45 PM

Jeff Higgins wrote:
> Janis Papanagnou wrote:
> [snip]
> 
>>>/^/{
>>>    gsub(/[\000]/,"&ccrs000;")
>>
>>      gsub(/[\000]/,"\\&ccrs000;")
>>
>>Is this what you want?
>>
> Yes. Thank you Janis.
> 
> I now understand that "\&" escapes
> the ampersand allowing me to insert
> it as literal text rather than inserting
> the string matched by the regex.
> 
> I'm afraid I do not understand how
> the "\\&" prevents the regex match string
> from being output. Will you explain?

The backslash is the escape character in _all_ strings, including the
one passed as the replacement argument to gsub. For gsub to insert a
literal ampersand rather than the matched text, it has to receive an
escaped ampersand, i.e. the two characters "\&". If you write "\&",
the backslash is consumed when the string is evaluated, so gsub
receives just "&" and interprets it as the matched text. If OTOH you
write "\\&", the string expression evaluates to "\&"; that is what is
passed to gsub, which sees the escaped ampersand and takes it
literally.

Janis
Reply Janis 11/5/2005 8:20:04 PM
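
The first of the two evaluation steps can be made visible by printing the string literal instead of handing it to gsub. This is just an illustrative sketch; show_replacement is a hypothetical name, not something from the thread.

```shell
# Hypothetical helper: print the replacement string exactly as gsub
# would receive it, after awk has processed the string literal.
show_replacement() {
    awk 'BEGIN { print "\\&ccrs000;" }'
}
```

The output is \&ccrs000; which shows that one backslash survives string evaluation; gsub then consumes that backslash and emits a literal ampersand.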

This is good to know.  Thanks!  This has bitten me in the past.

Scotty

Reply Scotty 11/5/2005 9:14:39 PM

Janis Papanagnou wrote:
> The backslash is the escape character in _all_ strings, including the
> one passed as the replacement argument to gsub. For gsub to insert a
> literal ampersand rather than the matched text, it has to receive an
> escaped ampersand, i.e. the two characters "\&". If you write "\&",
> the backslash is consumed when the string is evaluated, so gsub
> receives just "&" and interprets it as the matched text. If OTOH you
> write "\\&", the string expression evaluates to "\&"; that is what is
> passed to gsub, which sees the escaped ampersand and takes it
> literally.

Thanks!!!  That is good to know.  This has bitten me before...

Reply Scotty 11/5/2005 9:28:48 PM

Jeff Higgins wrote:
> Any help with the
> gsub function or another approach will be
> greatly appreciated.
>    gsub(/[\000]/,"&ccrs000;")
>
>>Janis Papanagnou wrote:
>>
>>    gsub(/[\000]/,"\\&ccrs000;")
>>
>>Is this what you want?
>>
>>>Jeff Higgins wrote:
>>>I'm afraid I do not understand how
>>>the "\\&" prevents the regex match string
>>>from being output. Will you explain?

>>>>Janis Papanagnou wrote:
>>>>The backslash is the escape character in _all_ strings, including the
>>>>one passed as the replacement argument to gsub. For gsub to insert a
>>>>literal ampersand rather than the matched text, it has to receive an
>>>>escaped ampersand, i.e. the two characters "\&". If you write "\&",
>>>>the backslash is consumed when the string is evaluated, so gsub
>>>>receives just "&" and interprets it as the matched text. If OTOH you
>>>>write "\\&", the string expression evaluates to "\&"; that is what is
>>>>passed to gsub, which sees the escaped ampersand and takes it
>>>>literally.

Very helpful and much appreciated. Thank you Janis.
Jeff Higgins 


Reply Jeff 11/7/2005 1:08:54 AM
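
As for the "another approach" part of the original question, the thirty gsub calls could probably be collapsed into a table built once in a BEGIN block. The sketch below is an untested-in-the-thread variation, not anything a poster suggested: escape_ctrl and the rep array are hypothetical names, and \000 is deliberately left out because not every awk implementation copes with NUL bytes in strings.

```shell
# Hypothetical wrapper around a table-driven version of the script.
escape_ctrl() {
    awk '
    BEGIN {
        # Map each control character to its replacement string.
        # Codes 9, 10 and 13 (tab, newline, carriage return) are kept,
        # matching the ranges in the original post; code 0 is skipped
        # as an assumption about portability.
        for (i = 1; i <= 127; i++)
            if (i <= 8 || i == 11 || i == 12 || (i >= 14 && i <= 31) || i == 127)
                rep[sprintf("%c", i)] = sprintf("\\&ccrs%03o;", i)
    }
    {
        # Each replacement starts with \& so gsub emits a literal ampersand.
        for (c in rep)
            gsub(c, rep[c])
        print
    }'
}
```

It reads standard input, so it can be used as `escape_ctrl < infile > outfile`. The control characters are not regex metacharacters, which is why they can be passed to gsub as dynamic regexps unchanged.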
