Converting plain text to TeX

Hi,
I am trying to convert a plain text file to TeX format. For this I wrote
an awk script to convert double quotes to the correct type in TeX, as in

#!/usr/bin/awk -f
# DQ2TQ : Converts double quotes in a plain text file to TeX format
# USAGE
#       dq2tq filename > newfilename
BEGIN   { count = 0 }
/"/     { for (i=1; i<=NF; i++) {if (count % 2 == 0)
              {sub(/"/,"``",$i);count++} else {sub(/"/,"''",$i);count++}} }
        { print }


However, this does not seem to work right. Given the following input:

"
" "
" " "
" " " "
The Indians, Columbus reported, "are so naive and so free with their
possessions that no one who has not witnessed them would believe it.
When you ask for something they have, they never say no. To the
contrary, they offer to share with anyone...." He concluded his report
by  asking for a little help from their Majesties, and in return he
would bring them from his next voyage "as much gold as they need ... and
as many slaves as they ask." He was full of religious talk: "Thus the
eternal God, our Lord, gives victory to those who follow His way over
apparent impossibilities."


It produces the following output :

``
'' ``
'' `` ''
`` '' `` ''
The Indians, Columbus reported, ``are so naive and so free with their
possessions that no one who has not witnessed them would believe it.
When you ask for something they have, they never say no. To the
contrary, they offer to share with anyone....`` He concluded his report
by  asking for a little help from their Majesties, and in return he
would bring them from his next voyage ``as much gold as they need ...
and as many slaves as they ask.'' He was full of religious talk: ``Thus the
eternal God, our Lord, gives victory to those who follow His way over
apparent impossibilities.''


Could you kindly explain what may be wrong with the script, and suggest
a possible fix?

sincerely
B Thomas

B
10/27/2005 4:48:07 PM
comp.lang.awk

B Thomas <thomasb@math.ohio-state.edu> wrote:
> Hi,
> I am trying to convert a plain text file to TeX format. For this I wrote
> an awk script to convert double quotes to the correct type in TeX, as in
> 
> #!/usr/bin/awk -f
> # DQ2TQ : Converts double quotes in a plain text file to TeX format
> # USAGE
> #       dq2tq filename > newfilename
> BEGIN   { count = 0 }
> /"/     { for (i=1; i<=NF; i++) {if (count % 2 == 0)
>               {sub(/"/,"``",$i);count++} else {sub(/"/,"''",$i);count++}} }
>         { print }

If you're simply replacing doublequotes, then just split on
doublequotes, like
    awk -F'"' ...
then generate the output with correctly paired `` or ''.

-- 
William Park <opengeometry@yahoo.ca>, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
	   http://home.eol.ca/~parkw/thinflash.html
BashDiff: Super Bash shell
	  http://freshmeat.net/projects/bashdiff/
William
10/27/2005 5:02:36 PM
Hi,

Thanks. I did set FS in my script to " as in 
BEGIN   { FS = /"/ ; count = 0 }
but then only the first double quote in each line gets changed and the
result is:

``
'' "
`` " "
'' " " "

regards
b thomas
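
One possible explanation for the output above, offered as a sketch rather than a confirmed diagnosis: in awk, a bare regex literal used as a value, as in FS = /"/, is evaluated as the match $0 ~ /"/. In a BEGIN block $0 is empty, so FS ends up as 0 instead of a double quote. Lines that contain no 0 character are then never split, the field loop runs once per line, and only the first quote is replaced. Assigning the string "\"" splits on the quote itself. The function names below are made up for the illustration.

```sh
#!/bin/sh
# Hypothetical helpers: each prints the number of fields awk sees per input line.

# FS assigned from a bare regex literal: awk evaluates ($0 ~ /"/) in BEGIN,
# where $0 is empty, so FS becomes 0 rather than a double quote.
nf_with_regex_literal() {
    awk 'BEGIN { FS = /"/ } { print NF }'
}

# FS assigned from a string containing one double quote.
nf_with_string() {
    awk 'BEGIN { FS = "\"" } { print NF }'
}

printf '%s\n' '" " "' | nf_with_regex_literal
printf '%s\n' '" " "' | nf_with_string
```

For the three-quote line, the first call is expected to report a single field and the second four fields (the empty strings before the first quote and after the last one count as fields).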

B
10/27/2005 5:46:39 PM
Hi,

Thank you for your help. I did get it fixed. I just discovered gsub.
Using it solved the problem.
Thanks again.

b thomas

B
10/27/2005 5:50:01 PM
Oops,

No, it didn't. I'll be damned.

b t
B
10/27/2005 5:52:15 PM
B Thomas wrote:
> Could you kindly explain what may be wrong with the script, and any
> possible fix.
>
> sincerely
> B Thomas

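# Split each line on double quotes; field i (i >= 2) is the text after a quote.
BEGIN { FS = "\"" ; OFS = "" }
{
  if (NF)
  { for (i=2;i<=NF;i++)
    { if( q = !q )          # toggle: 1 means this quote opens a quotation
        $i = "``" $i
      else
        $i = "''" $i
    }
  }
  else
    q = 0                   # empty line: next quote is an opening one again
  print
}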

William
10/27/2005 6:32:04 PM
In article <bx78f.71595$Hs.28990@tornado.ohiordc.rr.com>,
B Thomas  <thomasb> wrote:

% I am trying to convert a plain text file to TeX format. For this I wrote
% an awk script to convert double quotes to the correct type in TeX, as in

I find it's helpful to match the quotes to the surrounding context,
for instance a quote at the start of the line or following a space
becomes ``. One at the end of the line or with a following space becomes
''. You need additional rules to deal with punctuation. What worries me
about your approach is that it requires perfectly balanced quotes. As
soon as one quote is omitted, all the following quotes will be pointing
the wrong way.
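
A minimal sketch of the context-based idea, under simplifying assumptions: a quote at the start of a line, or right after a space, a tab, or an opening parenthesis, is treated as opening, and every other quote as closing. The function name tex_quotes_by_context and the lq/rq variables are made up for the illustration, and the additional punctuation rules are left out.

```sh
#!/bin/sh
# Hypothetical filter: pick the TeX quote from the surrounding context
# instead of counting quotes. lq and rq are passed with -v so the awk
# program itself contains no single quotes.
tex_quotes_by_context() {
    awk -v lq='``' -v rq="''" '
    {
        # A quote at the very start of the line opens a quotation.
        sub(/^"/, lq)
        # So does a quote right after a space, a tab, or an opening parenthesis.
        # RSTART points at that preceding character, which is kept.
        while (match($0, /[ \t(]"/))
            $0 = substr($0, 1, RSTART) lq substr($0, RSTART + 2)
        # Whatever is left is treated as a closing quote.
        gsub(/"/, rq)
        print
    }'
}

printf '%s\n' 'He said "hi" and ("bye").' | tex_quotes_by_context
```

Because each quote is judged on its own, one missing quote does not flip all the later ones. The trade-off is that a closing quote preceded by a space would be misread as an opening one.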

Anyway, that's a design issue. The problem with your implementation
is straightforward:

% #!/usr/bin/awk -f
% # DQ2TQ : Converts double quotes in a plain text file to TeX format
% # USAGE
% #       dq2tq filename > newfilename
% BEGIN   { count = 0 }
% /"/     { for (i=1; i<=NF; i++) {if (count % 2 == 0)
%               {sub(/"/,"``",$i);count++} else {sub(/"/,"''",$i);count++}} }

Your algorithm calls for you to replace " by `` when count is even,
and replace " by '' when count is odd. For this to work, you should
increment count only when you've made a replacement. Instead, you're
incrementing count for every field. You could have this:

  if (count % 2 == 0 && sub(/"/, "``", $i)) count++
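
Put together, the counting fix might look like the sketch below. It is one way to complete the idea, not the only one: count advances only when sub() reports a replacement, and the inner loop repeats so that a field holding two quotes, such as "word", gets both converted. The function name and the lq/rq variables are made up for the illustration.

```sh
#!/bin/sh
# Hypothetical filter: alternate between opening and closing TeX quotes,
# counting only the quotes that were actually replaced.
tex_quotes_by_count() {
    awk -v lq='``' -v rq="''" '
    /"/ {
        for (i = 1; i <= NF; i++)
            # sub() returns 1 only when it replaced a quote, so count
            # advances once per converted quote, not once per field.
            while (sub(/"/, (count % 2 == 0 ? lq : rq), $i))
                count++
    }
    { print }'
}

printf '%s\n' 'a "b c" d' '"e"' | tex_quotes_by_count
```

As in the original script, assigning to a field makes awk rebuild the line with single spaces between fields, so runs of whitespace on lines that contain quotes are collapsed.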

You could also take advantage of the fact that sub replaces only
the first occurrence of its RE, and do something like this:

 BEGIN { p["``"] = "''"; p["''"] = "``"; pat = "``" }
 /"/ { while (sub(/"/, pat)) pat = p[pat] }
 { print }

You could also use " as the field separator, but I'm not sure it buys
you anything in this case.
-- 

Patrick TJ McPhee
North York  Canada
ptjm@interlog.com
ptjm
10/28/2005 5:06:05 AM