split at a pattern


I have a big file that I need to split after a certain pattern is
found into multiple files.
Whenever it finds EOD I need to write a new file name and then
continue reading and write another file when the pattern is found
again, and so on.
Can you please help?
0
Reply ayman_zekry 10/14/2008 2:23:29 PM

On Tuesday 14 October 2008 16:23, ayman_zekry@yahoo.com wrote:

> I have a big file that I need to split after a certain pattern is
> found into multiple files.
> Whenever it finds EOD I need to write a new file name and then
> continue reading and write another file when the pattern is found
> again, and so on.
> Can you please help?

You don't say whether EOD itself must go in the files. Assuming it
shouldn't:

awk -v f="file1" '/EOD/{n++;f="file"n;next}{print>f}' originalfile

that creates a series of files "file1", "file2", "file3", etc.

"file1" contains from the beginning of file to the first EOD (not inclusive)
"file2" contains from the first EOD (not inclusive) to the next EOD (not
inclusive)
etc.

If you need the EODs to go into the file too, just remove the ';next' in the
above program. Also, if you're going to create thousands of files, it's
probably better to close each file before writing to the new one. This
would do that:

awk -v f="file1" '/EOD/{close(f);n++;f="file"n;next}{print>f}' originalfile

0
Reply pk 10/14/2008 2:43:34 PM


On 10/14/2008 9:43 AM, pk wrote:
> On Tuesday 14 October 2008 16:23, ayman_zekry@yahoo.com wrote:
> 
> 
>>I have a big file that I need to split after a certain pattern is
>>found into multiple files.
>>Whenever it finds EOD I need to write a new file name and then
>>continue reading and write another file when the pattern is found
>>again, and so on.
>>Can you please help?
> 
> 
> You don't say whether EOD itself must go in the files, however, assuming it
> shouldn't:
> 
> awk -v f="file1" '/EOD/{n++;f="file"n;next}{print>f}' originalfile
> 
> that creates a series of files "file1", "file2", "file3", etc.

Assuming the file doesn't start with EOD, that'd put the first two sets of data
both into file1 since n starts at zero so the first hit of n++, i.e. at the
first occurrence of EOD, would set it to 1.

ITYM:

awk -v n=1 '/EOD/{n++;next}{print>"file"n}' originalfile

	Ed.

> "file1" contains from the beginning of file to the first EOD (not inclusive)
> "file2" contains from the first EOD (not inclusive) to the next EOD (not
> inclusive)
> etc.
> 
> If you need the EODs to go into the file too, just remove the ';next' in the
> above program. Also, if you're going to create thousands of files, it's
> probably better to close each file before writing to the new one. This
> would do that:
> 
> awk -v f="file1" '/EOD/{close(f);n++;f="file"n;next}{print>f}' originalfile
> 

0
Reply Ed 10/14/2008 3:08:13 PM
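
A minimal sketch of the corrected command on a small sample, to make the chunk boundaries concrete. The scratch directory, the `dir` variable and the sample lines are assumptions made for illustration and do not come from the thread; the redirection target is also parenthesized here as a portability precaution, which the commands posted above do not do.

```shell
# Hypothetical sample data in a scratch directory (not from the thread).
workdir=$(mktemp -d)
printf '%s\n' a b EOD c d EOD e > "$workdir/originalfile"

# Start n at 1 so the data before the first EOD lands in file1.
# Each EOD line bumps the counter and is itself skipped by "next".
awk -v n=1 -v dir="$workdir" '
    /EOD/ { n++; next }
    { print > (dir "/file" n) }
' "$workdir/originalfile"

ls "$workdir"    # lists file1, file2, file3 and originalfile
```

With this input, file1 holds the lines a and b, file2 holds c and d, and file3 holds e.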

On Tuesday 14 October 2008 17:08, Ed Morton wrote:

>> awk -v f="file1" '/EOD/{n++;f="file"n;next}{print>f}' originalfile
>> 
>> that creates a series of files "file1", "file2", "file3", etc.
> 
> Assuming the file doesn't start with EOD, that'd put the first two sets of
> data both into file1 since n starts at zero so the first hit of n++, i.e.
> at the first occurrence of EOD, would set it to 1.
> 
> ITYM:
> 
> awk -v n=1 '/EOD/{n++;next}{print>"file"n}' originalfile

Right, thanks for the correction.

0
Reply pk 10/14/2008 3:15:22 PM

On Oct 14, 5:15 pm, pk <p...@pk.invalid> wrote:
> On Tuesday 14 October 2008 17:08, Ed Morton wrote:
>
> >> awk -v f="file1" '/EOD/{n++;f="file"n;next}{print>f}' originalfile
>
> >> that creates a series of files "file1", "file2", "file3", etc.
>
> > Assuming the file doesn't start with EOD, that'd put the first two sets of
> > data both into file1 since n starts at zero so the first hit of n++, i.e.
> > at the first occurrence of EOD, would set it to 1.
>
> > ITYM:
>
> > awk -v n=1 '/EOD/{n++;next}{print>"file"n}' originalfile
>
> Right, thanks for the correction.

Thanks for your help.
The file does not start with EOD and EOD should be included.
I tried awk -v n=1 '/EOD/{n++;next}{print>"file"n}' originalfile
but it gives the following error:

nawk: syntax error at source line 1
 context is
         >>> /EOD/{n++;next}{print>"file"n <<< }
nawk: illegal statement at source line

I am using nawk on solaris 8
0
Reply ayman_zekry 10/15/2008 7:43:25 AM

On Wednesday 15 October 2008 09:43, ayman_zekry@yahoo.com wrote:

> The file does not start with EOD and EOD should be included.
> I tried awk -v n=1 '/EOD/{n++;next}{print>"file"n}' originalfile

If EOD should be included, do

awk -v n=1 '/EOD/{n++}{print>"file"n}' originalfile

or (probably better if you'll create lots of chunks)

awk -v n=1 '/EOD/{close("file"n);n++}{print>"file"n}' originalfile

> but it gives the following error:
> 
> nawk: syntax error at source line 1
>  context is
>          >>> /EOD/{n++;next}{print>"file"n <<< }
> nawk: illegal statement at source line
> 
> I am using nawk on solaris 8

Under solaris, use /usr/xpg4/bin/awk to get an awk more likely to work fine.

0
Reply pk 10/15/2008 7:56:57 AM
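
The advice above can be sketched as a small wrapper that picks which awk to run. This is only an illustration: the `AWK` and `dir` variables, the scratch directory and the sample lines are hypothetical, and the redirection target is parenthesized as a portability precaution that the thread's commands do not use.

```shell
# Prefer the XPG4 awk where it exists (as on Solaris); otherwise fall
# back to whatever "awk" is first on PATH.
if [ -x /usr/xpg4/bin/awk ]; then
    AWK=/usr/xpg4/bin/awk
else
    AWK=awk
fi

# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' a EOD b > "$workdir/originalfile"

# EOD starts a new chunk; the finished chunk is closed first.
"$AWK" -v n=1 -v dir="$workdir" '
    /EOD/ { close(dir "/file" n); n++ }
    { print > (dir "/file" n) }
' "$workdir/originalfile"
```

With this input, file1 holds the line a, and file2 holds EOD followed by b.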

On Wednesday 15 October 2008 09:56, pk wrote:

> If EOD should be included, do
> 
> awk -v n=1 '/EOD/{n++}{print>"file"n}' originalfile
> 
> or (probably better if you'll create lots of chunks)
> 
> awk -v n=1 '/EOD/{close("file"n);n++}{print>"file"n}' originalfile

In case EOD is to be included in the *previous* chunk, do this:

awk -v n=1 '{print>"file"n} /EOD/{n++}' originalfile

or

awk -v n=1 '{print>"file"n} /EOD/{close("file"n);n++}' originalfile

respectively.

0
Reply pk 10/15/2008 7:59:37 AM
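
To make the difference between the two placements visible, here is a sketch that runs both on the same input. The sample lines, the scratch directory and the `head`/`tail` output prefixes are made up for illustration; the redirection target is parenthesized as a portability precaution, which the commands above do not do.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' a b EOD c d EOD > "$workdir/originalfile"

# Variant 1: bump n before printing, so each EOD line opens the next chunk.
awk -v n=1 -v out="$workdir/head" '
    /EOD/ { close(out n); n++ }
    { print > (out n) }
' "$workdir/originalfile"

# Variant 2: print before bumping n, so each EOD line ends its own chunk.
awk -v n=1 -v out="$workdir/tail" '
    { print > (out n) }
    /EOD/ { close(out n); n++ }
' "$workdir/originalfile"
```

With this input the first variant writes head1 (a, b), head2 (EOD, c, d) and a final head3 holding only the last EOD line, while the second writes tail1 (a, b, EOD) and tail2 (c, d, EOD) and nothing else.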

On Oct 15, 9:56 am, pk <p...@pk.invalid> wrote:
> On Wednesday 15 October 2008 09:43, ayman_ze...@yahoo.com wrote:
>
> > The file does not start with EOD and EOD should be included.
> > I tried awk -v n=1 '/EOD/{n++;next}{print>"file"n}' originalfile
>
> If EOD should be included, do
>
> awk -v n=1 '/EOD/{n++}{print>"file"n}' originalfile
>
> or (probably better if you'll create lots of chunks)
>
> awk -v n=1 '/EOD/{close("file"n);n++}{print>"file"n}' originalfile
>
> > but it gives the following error:
>
> > nawk: syntax error at source line 1
> >  context is
> >          >>> /EOD/{n++;next}{print>"file"n <<< }
> > nawk: illegal statement at source line
>
> > I am using nawk on solaris 8
>
> Under solaris, use /usr/xpg4/bin/awk to get an awk more likely to work fine.

awk gives the following error:

awk: syntax error near line 1
awk: bailing out near line 1
0
Reply ayman_zekry 10/15/2008 8:06:14 AM

On Wednesday 15 October 2008 10:06, ayman_zekry@yahoo.com wrote:

>> Under solaris, use /usr/xpg4/bin/awk to get an awk more likely to work
>> fine.
> 
> awk gives the following error:
> 
> awk: syntax error near line 1
> awk: bailing out near line 1

Sorry, I can't help you further. That works just fine for me with GNU awk,
and AFAICT it uses pretty much standard awk syntax.

0
Reply pk 10/15/2008 8:17:41 AM

On Oct 15, 10:17 am, pk <p...@pk.invalid> wrote:
> On Wednesday 15 October 2008 10:06, ayman_ze...@yahoo.com wrote:
>
> >> Under solaris, use /usr/xpg4/bin/awk to get an awk more likely to work
> >> fine.
>
> > awk gives the following error:
>
> > awk: syntax error near line 1
> > awk: bailing out near line 1
>
> Sorry, I can't help you further. That works just fine for me with GNU awk,
> and AFAICT it uses pretty much standard awk syntax.

Thanks for your help, I will see if I can get a linux machine and try
it there.
0
Reply ayman_zekry 10/15/2008 8:47:53 AM

On Oct 15, 10:47 am, "ayman_ze...@yahoo.com" <ayman_ze...@yahoo.com> wrote:
> On Oct 15, 10:17 am, pk <p...@pk.invalid> wrote:
>
> > On Wednesday 15 October 2008 10:06, ayman_ze...@yahoo.com wrote:
>
> > >> Under solaris, use /usr/xpg4/bin/awk to get an awk more likely to work
> > >> fine.
>
> > > awk gives the following error:
>
> > > awk: syntax error near line 1
> > > awk: bailing out near line 1
>
> > Sorry, I can't help you further. That works just fine for me with GNU awk,
> > and AFAICT it uses pretty much standard awk syntax.
>
> Thanks for your help, I will see if I can get a linux machine and try
> it there.

On my Solaris 8 machine this works fine with /usr/xpg4/bin/awk, while using
/usr/bin/awk gives me the same error. Did you really try this:

/usr/xpg4/bin/awk -v n=1 '/EOD/{close("file"n);n++}{print>"file"n}' originalfile
0
Reply schot 10/15/2008 10:42:19 AM

On 10/15/2008 3:06 AM, ayman_zekry@yahoo.com wrote:
> On Oct 15, 9:56 am, pk <p...@pk.invalid> wrote:
> 
>>On Wednesday 15 October 2008 09:43, ayman_ze...@yahoo.com wrote:
>>
>>
>>>The file does not start with EOD and EOD should be included.
>>>I tried awk -v n=1 '/EOD/{n++;next}{print>"file"n}' originalfile
>>
>>If EOD should be included, do
>>
>>awk -v n=1 '/EOD/{n++}{print>"file"n}' originalfile
>>
>>or (probably better if you'll create lots of chunks)
>>
>>awk -v n=1 '/EOD/{close("file"n);n++}{print>"file"n}' originalfile
>>
>>
>>>but it gives the following error:
>>
>>>nawk: syntax error at source line 1
>>> context is
>>>         >>> /EOD/{n++;next}{print>"file"n <<< }
>>>nawk: illegal statement at source line
>>
>>>I am using nawk on solaris 8
>>
>>Under solaris, use /usr/xpg4/bin/awk to get an awk more likely to work fine.
> 
> 
> awk gives the following error:
> 
> awk: syntax error near line 1
> awk: bailing out near line 1

That message comes from old, broken awk (/usr/bin/awk on Solaris). Use the awk
that pk suggested, though I'm very surprised that nawk complains.

	Ed.

0
Reply Ed 10/15/2008 12:13:48 PM
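
A possible explanation for the nawk complaint, offered as an assumption since the thread never pins it down: some awk implementations reject an unparenthesized concatenation on the right-hand side of `>`, and the GNU awk manual recommends parenthesizing the target for that reason. The error context nawk printed stops right at the redirection target, which is at least consistent with this. A sketch of the same command written that way follows; the sample lines, the scratch directory and the `dir` variable are hypothetical.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' x EOD y > "$workdir/originalfile"

# Same logic as the command that failed under nawk, but with the output
# filename expression wrapped in parentheses:
#   print > "file" n      (may be rejected by some awks)
#   print > ("file" n)    (unambiguous)
awk -v n=1 -v dir="$workdir" '
    /EOD/ { n++; next }
    { print > (dir "/file" n) }
' "$workdir/originalfile"
```

Whether this would have satisfied the original poster's nawk is not confirmed here; in the thread, switching to /usr/xpg4/bin/awk was the fix that was actually reported to work.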

In article <gd48mf$bcj$1@aioe.org>, pk  <pk@pk.invalid> wrote:
>On Wednesday 15 October 2008 10:06, ayman_zekry@yahoo.com wrote:
>
>>> Under solaris, use /usr/xpg4/bin/awk to get an awk more likely to work
>>> fine.
>> 
>> awk gives the following error:
>> 
>> awk: syntax error near line 1
>> awk: bailing out near line 1
>
>Sorry, I can't help you further. That works just fine for me with GNU awk,
>and AFAICT it uses pretty much standard awk syntax.

(Yes, another poster said this as well, but I want to emphasize one
specific point).

Whenever you see "bailing out", that's a sure sign that you're using the
old (ancient!) AWK.

0
Reply gazelle 10/15/2008 2:05:53 PM

Kenny McCormack wrote:
> 
> Whenever you see "bailing out", that's a sure sign that you're using the
> old (ancient!) AWK.
> 

Not necessarily:

$ gawk --nostalgia
awk: bailing out near line 1
Segmentation fault (core dumped)

$ gawk --version|head -1
GNU Awk 3.1.6

;-) Hermann
0
Reply Hermann 10/15/2008 3:06:01 PM

In article <48F606D9.3020107@gmx.eu>, Hermann Peifer  <peifer@gmx.eu> wrote:
>Kenny McCormack wrote:
>> 
>> Whenever you see "bailing out", that's a sure sign that you're using the
>> old (ancient!) AWK.
>> 
>
>Not necessarily:
>
>$ gawk --nostalgia
>awk: bailing out near line 1
>Segmentation fault (core dumped)
>
>$ gawk --version|head -1
>GNU Awk 3.1.6
>
>;-) Hermann

Funny.  I never knew about that.
Talk about preserving backwards compatibility!

0
Reply gazelle 10/15/2008 3:24:54 PM

Kenny McCormack wrote:
> In article <48F606D9.3020107@gmx.eu>, Hermann Peifer  <peifer@gmx.eu> wrote:
>> Kenny McCormack wrote:
>>> Whenever you see "bailing out", that's a sure sign that you're using the
>>> old (ancient!) AWK.
>>>
>> Not necessarily:
>>
>> $ gawk --nostalgia
>> awk: bailing out near line 1
>> Segmentation fault (core dumped)
>>
>> $ gawk --version|head -1
>> GNU Awk 3.1.6
>>
>> ;-) Hermann
> 
> Funny.  I never knew about that.
> Talk about preserving backwards compatibility!
> 

I guess that this feature is undocumented, on purpose. I found it by chance, somewhere in main.c (when actually searching for some other error message).

/* nostalgia --- print the famous error message and die */

static void
nostalgia()
{
        /*
         * N.B.: This string is not gettextized, on purpose.
         * So there.
         */
        fprintf(stderr, "awk: bailing out near line 1\n");
        fflush(stderr);
        abort();
}


0
Reply Hermann 10/15/2008 3:33:29 PM

In article <48F60D49.40107@gmx.eu>, Hermann Peifer  <peifer@gmx.eu> wrote:
....
>I guess that this feature is undocumented, on purpose. I found it
>by chance, somewhere in main.c (when actually searching for some
>other error message).
>
>/* nostalgia --- print the famous error message and die */
>
>static void
>nostalgia()

Interesting.  I have worked on the GAWK source code (not in any official
capacity - just to add some features for myself), and specifically in
main.c, so it is a little funny that I never saw that in there.

0
Reply gazelle 10/15/2008 3:36:26 PM

Kenny McCormack wrote:
> In article <48F60D49.40107@gmx.eu>, Hermann Peifer  <peifer@gmx.eu> wrote:
> ...
>> I guess that this feature is undocumented, on purpose. I found it
>> by chance, somewhere in main.c (when actually searching for some
>> other error message).
>>
>> /* nostalgia --- print the famous error message and die */
>>
>> static void
>> nostalgia()
> 
> Interesting.  I have worked on the GAWK source code (not in any official
> capacity - just to add some features for myself), and specifically in
> main.c, so it is a little funny that I never saw that in there.
> 

It must have been there for many years. The oldest related ChangeLog entry on savannah.gnu.org is:

> Sun Sep 10 10:37:35 2000  Arnold D. Robbins  <arnold@skeeve.com>
>
>     * main.c (nostalgia): Add call to fflush(stderr).

Hermann

0
Reply Hermann 10/15/2008 3:47:12 PM

On Oct 15, 12:42 pm, schot <jeroensc...@gmail.com> wrote:
> On Oct 15, 10:47 am, "ayman_ze...@yahoo.com" <ayman_ze...@yahoo.com>
> wrote:
>
> > On Oct 15, 10:17 am, pk <p...@pk.invalid> wrote:
>
> > > On Wednesday 15 October 2008 10:06, ayman_ze...@yahoo.com wrote:
>
> > > >> Under solaris, use /usr/xpg4/bin/awk to get an awk more likely to work
> > > >> fine.
>
> > > > awk gives the following error:
>
> > > > awk: syntax error near line 1
> > > > awk: bailing out near line 1
>
> > > Sorry, I can't help you further. That works just fine for me with GNU awk,
> > > and AFAICT it uses pretty much standard awk syntax.
>
> > Thanks for your help, I will see if I can get a linux machine and try
> > it there.
>
> On my Solaris 8 machine this works fine with /usr/xpg4/bin/awk, while using
> /usr/bin/awk gives me the same error. Did you really try this:
>
> /usr/xpg4/bin/awk -v n=1 '/EOD/{close("file"n);n++}{print>"file"n}' originalfile

Sorry to all of you; I have not been looking for a while as I thought
the case was closed.
I tried /usr/xpg4/bin/awk and it works great.
Thank you very much for your help.
0
Reply ayman_zekry 10/17/2008 1:26:13 PM
