Replace all new lines, only if...


I want to replace all carriage return &
linefeed pairs in a file with a space,
but only if the lines contain text.

----
Before:

a
b
c

d
e
f

After (Note the empty line is preserved):

a b c

d e f
----

This has me stumped, any suggestions?

Mike


Reply Just 9/5/2006 12:14:21 AM

Just Me wrote:
> I want to replace all carriage return &
> linefeed pairs in a file with a space,
> but only if the lines contain text.
>
> ----
> Before:
>
> a
> b
> c
>
> d
> e
> f
>
> After (Note the empty line is preserved):
>
> a b c
>
> d e f
> ----
>
> This has me stumped, any suggestions?
>
> Mike

This solution complies with the letter of your request:

# Test the length, not $0 itself: a line holding only "0" is false as a pattern.
length($0)  { printf "%s ", $0 }
!length($0) { print }

Note that there is a space at the end of every line, except the last
one.

Reply Vassilis 9/5/2006 12:56:08 AM


In article <12fpgdhrifv3of9@corp.supernews.com>, Just Me <nothanks> wrote:
>I want to replace all carriage return &
>linefeed pairs in a file with a space,
>but only if the lines contain text.
>
>----
>Before:
>
>a
>b
>c
>
>d
>e
>f
>
>After (Note the empty line is preserved):
>
>a b c
>
>d e f
>----
>
>This has me stumped, any suggestions?

ORS=NF?" ":"\n"
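
For anyone unfamiliar with the idiom: the one-liner is a pattern with no action. The assignment sets ORS to a space when the line has fields and to a newline when it does not; either value is a non-empty string, so the pattern is always true and awk's default action (print) runs with the new ORS. Below is a sketch of the same logic written out in full. The wrapper name join_nonblank is hypothetical and only there so the snippet can be run as-is.

```shell
# join_nonblank is a made-up helper name, not something from the thread.
join_nonblank() {
    awk '{
        ORS = (NF ? " " : "\n")   # space after text lines, newline after empty ones
        print                     # what the bare pattern does implicitly
    }' "$@"
}

printf 'a\nb\nc\n\nd\ne\nf\n' | join_nonblank
```

With this input the empty line turns into the newline that ends "a b c ", each joined line keeps a trailing space, and the last line is not newline-terminated.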
Reply gazelle 9/5/2006 1:12:03 AM

Okay both of these work as advertised,
(many thanks) but after trying them out,
I realize I've failed to phrase my
question properly.

I'll take another crack at it...

When (M)AWK encounters a block of text
all the new lines need to be replaced
with spaces. When an empty line is found,
it should skip that empty line.
The end result is close to word wrap.

---
Before:

aaa
bbb
ccc

ddd
eee
fff

After:

aaabbbccc

dddeeefff
---

Sorry for the mix up, still learning...

Mike


Reply Just 9/5/2006 2:32:34 AM

Just Me wrote:
> Okay both of these work as advertised,
> (many thanks) but after trying them out,
> I realize I've failed to phrase my
> question properly.
>
> I'll take another crack at it...
>
> When (M)AWK encounters a block of text
> all the new lines need to be replaced
> with spaces. When an empty line is found,
> it should skip that empty line.
> The end result is close to word wrap.
>
> ---
> Before:
>
> aaa
> bbb
> ccc
>
> ddd
> eee
> fff
>
> After:
>
> aaabbbccc
>
> dddeeefff
> ---
>
> Sorry for the mix up, still learning...
>
> Mike

Your description of the problem contradicts the example output. I take
it that the description is correct and the output erroneous:

BEGIN { RS = "\n\n" }
$0 { gsub(/\n/, " "); print }

Reply Vassilis 9/5/2006 3:32:49 AM

Just Me wrote:
> Okay both of these work as advertised,
> (many thanks) but after trying them out,
> I realize I've failed to phrase my
> question properly.
> 
> I'll take another crack at it...
> 
> When (M)AWK encounters a block of text
> all the new lines need to be replaced
> with spaces. When an empty line is found,
> it should skip that empty line.
> The end result is close to word wrap.
> 
> ---
> Before:
> 
> aaa
> bbb
> ccc
> 
> ddd
> eee
> fff
> 
> After:
> 
> aaabbbccc
> 
> dddeeefff
> ---
> 
> Sorry for the mix up, still learning...
> 
> Mike
> 

The posted output looks nothing like your description. Where are the 
spaces between the concatenated lines? Why is there still a blank line 
between the text lines?

	Ed.
Reply Ed 9/5/2006 4:00:59 AM

Just Me wrote:
> Okay both of these work as advertised,
> (many thanks) but after trying them out,
> I realize I've failed to phrase my
> question properly.
> 
> I'll take another crack at it...
> 
> When (M)AWK encounters a block of text
> all the new lines need to be replaced
> with spaces. When an empty line is found,
> it should skip that empty line.
> The end result is close to word wrap.
> 
> ---
> Before:
> 
> aaa
> bbb
> ccc
> 
> ddd
> eee
> fff
> 
> After:
> 
> aaabbbccc
> 
> dddeeefff
> ---
> 
> Sorry for the mix up, still learning...
> 
> Mike
> 
> 


Your expected data can be created by a variant of Kenny's suggestion...

   ORS=NF?"\0":"\n\n"

(Not quite sure whether "\0" is supported by all awks, though.)
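
One way to sidestep the "\0" question, as a sketch only: set ORS inside an action instead of using the assignment as a pattern, so that a truly empty ORS can be used. (Used as a pattern, ORS=NF?"":"\n\n" would evaluate to the empty string for text lines, which is false, and those lines would not be printed.) The function name glue_lines and the variable text are hypothetical.

```shell
# glue_lines is a hypothetical name used only for this sketch.
glue_lines() {
    awk '
    {
        ORS = (NF ? "" : "\n\n")  # nothing after text lines, a blank line at each gap
        print
        text = NF                 # remember whether the last line had text
    }
    END { if (text) printf "\n" } # finish the final line if the input ended in text
    ' "$@"
}

printf 'aaa\nbbb\nccc\n\nddd\neee\nfff\n' | glue_lines
```

For the sample data this gives aaabbbccc, an empty line, then dddeeefff, with no NUL bytes in the output.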

Janis
Reply Janis 9/5/2006 4:49:31 AM

Hi Vassilis & Ed.

I'm having trouble conveying what I'd like to accomplish it seems...

All lines that contain text should have '\n' replaced with '\s' unless the line is empty.

Where '\n' = both a carriage return and a linefeed pair, and '\s' = space.
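
A side note, in case the file really does use carriage return plus linefeed endings: a line that looks empty then still holds a "\r", so awk does not see it as an empty line. Paragraph mode will not treat it as a separator, and a test on the length of $0 reports a non-empty record. A minimal sketch, assuming that is the situation, strips the trailing carriage return first (strip_cr is a hypothetical name):

```shell
# strip_cr is a made-up helper; it removes one trailing CR from every line.
strip_cr() {
    awk '{ sub(/\r$/, ""); print }' "$@"
}

# od -c makes the line endings visible before handing the data to another script.
printf 'a\r\nb\r\n\r\nc\r\n' | strip_cr | od -c
```

The cleaned output can then be piped into whichever joining script is used.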

--

Here is the data now:

a
b
c

d
e
f

--

Here is what I need:

abc

def

--

Notice how the line that's empty (between abc and def) is not touched...

Mike


Reply Just 9/5/2006 5:04:58 AM

Just Me wrote:
> Hi Vassilis & Ed.
> 
> I'm having trouble conveying what I'd like to accomplish it seems...

Yes, your descriptions have never fit your examples.

> 
> All lines that contain text should have '\n' replaced with '\s' unless the line is empty.
> 
> Where '\n' = both a carriage return and a linefeed pair, and '\s' = space.
> 
> --
> 
> Here is the data now:
> 
> a
> b
> c
> 
> d
> e
> f
> 
> --
> 
> Here is what I need:
> 
> abc
> 
> def

You say the newline (\n) should be replaced with space,
but there are no spaces in your output.

If viewed as a stream, your original input would look like:

    "a\nb\nc\n\nd\ne\nf\n"
           ^^^^
Note, a blank line is just two adjacent newlines.
After the substitution you describe this would be (real spaces put in):

    "a b c \nd e f "

Note, no newline after "f " (f + space) unless there was a blank line
in the original input after "f\n".  Also note, there is no longer any
blank line (two \n's together) so there wouldn't be any in the output.

The output would be:

    a b c
    d e f But there would be no newline here the next text is on the same line.

And there is a space, not newline, after the "c" and "f".

Tell us what you really want and why your spaces don't show in your output.



> 
> --
> 
> Notice how the line that's empty (between abc and def) is not touched...
> 
> Mike
> 
> 
Reply Jon 9/5/2006 2:17:14 PM

Jon LaBadie writes:

> Yes, your descriptions have never fit your examples.

Assuming \n = newline and \s = space
This is what the file looks like before it's processed.

aaa\n
bbb\n
ccc\n
\n
ddd\n
eee\n
fff\n

This is what I'd like the file to
look like after I run it through awk:

aaa\sbbb\sccc\n
\n
ddd\seee\sfff\n

I don't know how else to express it...

Mike


Reply Just 9/5/2006 6:32:56 PM

Thank you, Janis, but since I'm having
trouble describing my problem, the solution
you've offered will not work (my fault).

Please see my latest reply to Jon LaBadie higher
in this thread. Thanks for your help.

Mike


Reply Just 9/5/2006 6:39:13 PM

Just Me wrote:
> Jon LaBadie writes:
> 
>> Yes, your descriptions have never fit your examples.
> 
> Assuming \n = newline and \s = space
> This is what the file looks like before it's processed.
> 
> aaa\n
> bbb\n
> ccc\n
> \n
> ddd\n
> eee\n
> fff\n
> 
> This is what I'd like the file to
> look like after I run it through awk:
> 
> aaa\sbbb\sccc\n
> \n
> ddd\seee\sfff\n
> 
> I don't know how else to express it...
> 

One approach

awk '
BEGIN {prev = 0}
NF > 0 {
         printf "%s%s", prev ? " " : "", $0
         prev = 1
}
NF == 0 {
         printf "\n\n"
         prev = 0
}
END { printf "%s", prev ? "\n" : "" }
' datafile
Reply Jon 9/5/2006 7:44:47 PM

In article <12frgpd18b33821@corp.supernews.com>, Just Me <nothanks> wrote:
>Jon LaBadie writes:
>
>> Yes, your descriptions have never fit your examples.
>
>Assuming \n = newline and \s = space
>This is what the file looks like before it's processed.
>
>aaa\n
>bbb\n
>ccc\n
>\n
>ddd\n
>eee\n
>fff\n
>
>This is what I'd like the file to
>look like after I run it through awk:
>
>aaa\sbbb\sccc\n
>\n
>ddd\seee\sfff\n
>
>I don't know how else to express it...

Seems pretty clear to me.  I think I got it the first time.

My original idea (Yes, this is the whole program!):

    ORS=NF?" ":"\n"

is close, but nitpickers may find fault with it.

Here's another one:

BEGIN {RS="";ORS="\n\n"}
$1=$1
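
A sketch of why this works, for anyone following along: with RS set to the empty string awk reads in paragraph mode, so each run of non-empty lines is one record and newlines count as field separators. Assigning $1=$1 makes awk rebuild $0 with the fields joined by OFS (a single space by default), and ORS="\n\n" puts a blank line after every record. The variant below is not the code above but an alternative under a couple of assumptions: it prints the separator before every record except the first, so no extra blank line is left at the end, and it uses an action rather than the $1=$1 pattern so that a paragraph whose first word is 0 is not skipped. The name join_paragraphs is hypothetical.

```shell
# join_paragraphs is a made-up name for this sketch.
join_paragraphs() {
    awk '
    BEGIN { RS = "" }             # paragraph mode: blank lines separate records
    {
        $1 = $1                   # rebuild $0, joining fields with OFS (one space)
        printf "%s%s\n", (NR > 1 ? "\n" : ""), $0
    }' "$@"
}

printf 'aaa\nbbb\nccc\n\nddd\neee\nfff\n' | join_paragraphs
```

Note that rebuilding the record also squeezes runs of spaces or tabs inside a line down to one space, and that several blank lines in a row are treated as a single separator.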

Reply gazelle 9/5/2006 9:41:44 PM

Kenny McCormack wrote:
> In article <12frgpd18b33821@corp.supernews.com>, Just Me <nothanks> wrote:
>> Jon LaBadie writes:
>>
>>> Yes, your descriptions have never fit your examples.
>> Assuming \n = newline and \s = space
>> This is what the file looks like before it's processed.
>>
>> aaa\n
>> bbb\n
>> ccc\n
>> \n
>> ddd\n
>> eee\n
>> fff\n
>>
>> This is what I'd like the file to
>> look like after I run it through awk:
>>
>> aaa\sbbb\sccc\n
>> \n
>> ddd\seee\sfff\n
>>
>> I don't know how else to express it...
> 
> Seems pretty clear to me.  I think I got it the first time.
> 
> My original idea (Yes, this is the whole program!):
> 
>     ORS=NF?" ":"\n"
> 
> is close, but nitpickers may find fault with it.
> 
> Here's another one:
> 
> BEGIN {RS="";ORS="\n\n"}
> $1=$1
> 

Seeing as the outputs of your two versions differ
from one another it must be at least a little muddy.

The first version eliminates the blank line separating
two collections of data.  It also preserves a space at
the end of each output data line and doesn't add a \n
to the last data line.

The second adds an extra blank line to the end of the
data that wasn't there in the input.
Reply Jon 9/6/2006 2:10:40 AM

In article <L5WdnTeuXte9s2PZnZ2dnUVZ_oednZ2d@comcast.com>,
Jon LaBadie  <jxlabadie@axcxmx.org> wrote:
....
>Seeing as the outputs of your two versions differ
>from one another it must be at least a little muddy.
>
>The first version eliminates the blank line separating
>two collections of data.  It also preserves a space at
>the end of each output data line and doesn't add a \n
>to the last data line.

True.

>The second adds an extra blank line to the end of the
>data that wasn't there in the input.

Your grasp of the obvious is stunning.

Reply gazelle 9/6/2006 2:18:25 AM

Kenny McCormack writes:

> Here's another one:
>
> BEGIN {RS="";ORS="\n\n"}
> $1=$1

Spot on, dead ringer, 100% what I needed Kenny.

Thanks very much! I've been trying to clean over
300 meg of data (by hand) for better than 2 days.

With his script and mawk on my oldish computer,
the task was completed without error, spanning 29
files in less than 5 mins...

btw... My sincerest thanks to all others that
helped me as well.

Mike



Reply Just 9/6/2006 4:48:19 AM

something like this, maybe?

{
printf ("%s",$1)
if ($1=="")
        {
        print ("\n")
        }
}
END {
printf("\n")
}

awk -f script.awk file

HTH




Just Me wrote:
> Kenny McCormack writes:
>
> > Here's another one:
> >
> > BEGIN {RS="";ORS="\n\n"}
> > $1=$1
>
> Spot on, dead ringer, 100% what I needed Kenny.
>
> Thanks very much! I've been trying to clean over
> 300 meg of data (by hand) for better than 2 days.
>
> With his script and mawk on my oldish computer,
> the task was completed without error, spanning 29
> files in less than 5 mins...
>
> btw... My sincerest thanks to all others that
> helped me as well.
> 
> Mike

Reply Mag 9/7/2006 4:17:46 AM

Mag Gam wrote:

> something like this, maybe?

Please don't top-post.

> {
> printf ("%s",$1)
> if ($1=="")
>         {
>         print ("\n")
>         }
> }

Whether it works or not, the above could be written more awk-like as:

{ printf "%s", (NF ? $1 : ORS ORS) }

Regards,

	Ed.

> END {
> printf("\n")
> }
> 
> awk -f script.awk file
> 
> HTH
> 
> 
> 
> 
> Just Me wrote:
> 
>>Kenny McCormack writes:
>>
>>
>>>Here's another one:
>>>
>>>BEGIN {RS="";ORS="\n\n"}
>>>$1=$1
>>
>>Spot on, dead ringer, 100% what I needed Kenny.
>>
>>Thanks very much! I've been trying to clean over
>>300 meg of data (by hand) for better than 2 days.
>>
>>With his script and mawk on my oldish computer,
>>the task was completed without error, spanning 29
>>files in less than 5 mins...
>>
>>btw... My sincerest thanks to all others that
>>helped me as well.
>>
>>Mike
> 
> 
Reply Ed 9/7/2006 11:27:53 AM

Ed:

Sorry, not sure how I am 'top posting'. I am using google groups, and I
think it does it automatically.

As for the awk comments, I too am new to awk. Still learning its ways.
Why is the BEGIN{} wrong? I want to make it clear to the user that I am
declaring a var with 0 value. I am used to C-style syntax.


Ed Morton wrote:
> Mag Gam wrote:
>
> > something like this, maybe?
>
> Please don't top-post.
>
> > {
> > printf ("%s",$1)
> > if ($1=="")
> >         {
> >         print ("\n")
> >         }
> > }
>
> Whether it works or not, the above could be written more awk-like as:
>
> { printf "%s", (NF ? $1 : ORS ORS) }
>
> Regards,
>
> 	Ed.
>
> > END {
> > printf("\n")
> > }
> >
> > awk -f script.awk file
> >
> > HTH
> >
> >
> >
> >
> > Just Me wrote:
> >
> >>Kenny McCormack writes:
> >>
> >>
> >>>Here's another one:
> >>>
> >>>BEGIN {RS="";ORS="\n\n"}
> >>>$1=$1
> >>
> >>Spot on, dead ringer, 100% what I needed Kenny.
> >>
> >>Thanks very much! I've been trying to clean over
> >>300 meg of data (by hand) for better than 2 days.
> >>
> >>With his script and mawk on my oldish computer,
> >>the task was completed without error, spanning 29
> >>files in less than 5 mins...
> >>
> >>btw... My sincerest thanks to all others that
> >>helped me as well.
> >>
> >>Mike
> > 
> >

Reply Mag 9/7/2006 1:39:37 PM

In article <1157636377.763545.103870@h48g2000cwc.googlegroups.com>,
Mag Gam <magawake@gmail.com> wrote:
>Ed:
>
>Sorry, not sure how I am 'top posting'. I am using google groups, and I
>think it does it automatically.

Well, that's the problem.  By the standards of most (non-google using)
Usenetters, "google groups" is broken and should not be used.  As the
saying goes, "Here's a nickel, kid.  Buy yourself a real newsreader."

That said, it goes without saying that if that's what you're stuck with,
then there's not much you can do about it.

>As for the awk comments, I too am new to awk. Still learning its ways.
>Why is the BEGIN{} wrong?

I think you got your threads crossed.  You are referring to a comment Ed
made in the thread titled: Furthest char from the beginning of line.

>I want to make it clear to the user that I am declaring a var with 0 value.
>I am used to C-style syntax.

Well, that's the problem.  If you want to be accepted in the AWK
community, you will need to learn to speak idiomatic AWK.

Reply gazelle 9/7/2006 1:59:27 PM

Mag Gam wrote:

> Ed:
> 
> Sorry, not sure how I am 'top posting'. I am using google groups, and I
> think it does it automatically.

I've never posted using google groups, but I believe I've seen some 
advice given that you just have to type below the quoted text rather 
than above it. That'd be OT for this NG though.

> As for the awk comments, I too am new to awk. Still learning its ways.

No problem. It's hard to change paradigms.

> Why is the BEGIN{} wrong? I want to make it clear to the user that I am
> declaring a var with 0 value.

It's already clear to people who know awk how variables get initialised
to zero or null. By adding the BEGIN, you're adding unnecessary code,
thus obfuscating the program and adding a little extra maintenance
headache if variable names change.
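
As a small illustration of the point (a sketch, with a made-up function name and variable): a variable that has never been assigned behaves as 0 in numeric context and as the empty string in string context, so a counter can be incremented without any BEGIN block.

```shell
# count_nonblank is a hypothetical helper; n is never initialised explicitly.
count_nonblank() {
    awk 'NF { n++ } END { print n + 0 }' "$@"   # n + 0 prints 0 even if n was never set
}

printf 'a\n\nb\nc\n' | count_nonblank

# An unset variable compares equal to both 0 and "".
awk 'BEGIN { print (x == 0), (x == "") }'
```

If the counter were renamed later, there would be only one place to change rather than two.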

> I am used to C-style syntax.

But you're not using C, you're using awk. C programmers who switch to 
C++ likewise have problems because C++ supports C syntax so they 
initially stick to what they know and it's only after doing that for a 
while that they start to realise how they should have been using it.

	Ed.
Reply Ed 9/7/2006 8:43:15 PM

On 2006-09-07, Mag Gam wrote:
> Ed:
>
> Sorry, not sure how I am 'top posting'.

   You are putting your answer before the question. That makes a
   thread very hard to follow.

> I am using google groups, and I think it does it automatically.

   Google Groups may be an amazing service, but it doesn't type your
   message for you; it goes wherever you put it.

   Delete irrelevant quoted material.
   Move your cursor after the section to which you are replying.
   Enter your text.

-- 
   Chris F.A. Johnson, author   |    <http://cfaj.freeshell.org>
   Shell Scripting Recipes:     |  My code in this post, if any,
   A Problem-Solution Approach  |         is released under the
   2005, Apress                 |    GNU General Public Licence
Reply Chris 9/8/2006 12:19:55 AM
