I would appreciate any pointers on how to do this.
I need to truncate the first line of a file to, for example, 10
characters, and the last line to 20 characters. The lines in between are
OK and are the correct length, but may be padded with spaces to that
length as per the interface spec.
This has arisen due to the way an old system that should have been
smashed with a big hammer produces data, i.e. all files are the same
length. The MPE gurus (that's how they describe themselves - it
says it all really) state that that's the way it is! So it's down to
me to sort out the corrupted headers and footers on the Unix side,
before transmission to a client.
Any help, or a really big hammer in the post, would be appreciated.
Rob.B
rob.bradford
4/26/2010 10:28:36 AM
Rob B wrote:
> I would appreciate any pointers on how to do this.
>
> I need to truncate the first line of a file to, for example, 10
> characters, and the last line to 20 characters. The lines in between are
> OK and are the correct length, but may be padded with spaces to that
> length as per the interface spec.
>
> This has arisen due to the way an old system that should have been
> smashed with a big hammer produces data, i.e. all files are the same
> length. The MPE gurus (that's how they describe themselves - it
> says it all really) state that that's the way it is! So it's down to
> me to sort out the corrupted headers and footers on the Unix side,
> before transmission to a client.
>
> Any help, or a really big hammer in the post, would be appreciated.
If I understand you correctly, you need to turn for example
longlonglonglonglonglonglongline
abc
....many lines here...
xxx
anotherlonglonglonglonglonglonglongline
into
longlonglo
abc
....many lines here...
xxx
anotherlonglonglongl
If that's correct, try this:
awk 'p{print p} {p = NR>1?$0:substr($0,1,10)}END{print substr(p,1,20)}' file
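For anyone who finds the one-liner dense, here is the same logic spread over several lines with comments. It is only a sketch: the wrapper name trim_ends is made up for illustration, and 10 and 20 are just the example widths from the question. It keeps the same tests as the one-liner, so it should behave identically.

```shell
# trim_ends FILE (hypothetical helper name, not part of the thread):
# truncate the first line to 10 characters and the last line to 20.
trim_ends() {
    awk '
        # Print the line saved on the previous record, if there is one.
        p { print p }
        # Save the current line; only line 1 is truncated at this point.
        { p = (NR > 1) ? $0 : substr($0, 1, 10) }
        # At end of input the saved line is the last line of the file.
        END { print substr(p, 1, 20) }
    ' "$1"
}
```

Calling trim_ends file writes the result to standard output, so redirect it to a new file rather than back onto the input.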
pk
4/26/2010 10:41:35 AM
On 26 Apr, 11:41, pk <p...@pk.invalid> wrote:
> If that's correct, try this:
>
> awk 'p{print p} {p = NR>1?$0:substr($0,1,10)}END{print substr(p,1,20)}' file
Thanks for the line of script, that is what I wanted. I could do
either the first or the last line, but couldn't string them together.
I'm now off to smash that HP3000!!!!!!!!!!!!!!
Rob.
Rob
4/26/2010 12:16:40 PM
On Apr 26, 5:41 am, pk <p...@pk.invalid> wrote:
> If that's correct, try this:
>
> awk 'p{print p} {p = NR>1?$0:substr($0,1,10)}END{print substr(p,1,20)}' file
I'd make that first test be for "NR>1{...}" rather than "p{...}" so
the script doesn't discard lines that are either all-blanks or zero.
Ed.
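To see what the change buys, here is a sketch of the one-liner with only the first test swapped to NR>1 (the function name trim_ends_nr is hypothetical). With the p test, a saved line that is empty, or that holds just 0, evaluates as false and is never printed; with NR>1 every saved line is printed whatever it contains.

```shell
# trim_ends_nr FILE (hypothetical helper name):
# same as the original one-liner, but the print test is NR > 1 instead of p.
trim_ends_nr() {
    awk '
        # From the second record on, always print the previously saved line.
        NR > 1 { print p }
        # Save the current line; only line 1 is truncated at this point.
        { p = (NR > 1) ? $0 : substr($0, 1, 10) }
        # The saved line at end of input is the last line of the file.
        END { print substr(p, 1, 20) }
    ' "$1"
}
```

A quick check is to feed it a file whose middle lines include an empty line and a line containing only 0, and confirm that both survive.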
Ed
4/26/2010 4:57:19 PM
Ed Morton wrote:
>> awk 'p{print p} {p = NR>1?$0:substr($0,1,10)}END{print substr(p,1,20)}' file
>
> I'd make that first test be for "NR>1{...}" rather than "p{...}" so
> the script doesn't discard lines that are either all-blanks or zero.
Good catch, thanks.
pk
4/26/2010 4:58:56 PM
In article <86f9a1f9-45ab-456d-b7e6-194cd85a8909@g23g2000yqn.googlegroups.com>,
Ed Morton <mortonspam@gmail.com> wrote:
....
>> awk 'p{print p} {p = NR>1?$0:substr($0,1,10)}END{print substr(p,1,20)}' file
>
>I'd make that first test be for "NR>1{...}" rather than "p{...}" so
>the script doesn't discard lines that are either all-blanks or zero.
>
> Ed.
I wonder if it might be more straightforward, for a beginner, and
assuming the file is not too big (these days, that means less than 10 GB),
to just do (and yes, this is OT, since I am giving a shell command, not
an AWK script):
gawk 'ARGIND == 1 {next} FNR == 1 {nr=NR-1;print substr($0,1,10);next}
    FNR == nr {print substr($0,1,20);next}1' file file
Assumes gawk, of course, but these days, if you're not using (at least)
GAWK, yer outta town! (Yes, TAWK has this functionality, too, but the
syntax is different; such is life)
The above introduces some interesting concepts (including ARGIND, which
is a very nice feature to become familiar with), and it eliminates
the need for saving the previous line in a variable - a technique which,
although I've used it several times, I've never been all that
comfortable with. I think it is a lot better if you can avoid it - that
is, always deal with the current line as you are reading it.
gazelle
4/26/2010 5:15:58 PM
On Apr 26, 12:15 pm, gaze...@shell.xmission.com (Kenny McCormack)
wrote:
> I wonder if it might be more straightforward, for a beginner, and
> assuming the file is not too big (these days, that means less than 10Gb),
> to just do (and yes, this is OT, since I am giving a shell command, not
> an AWK script):
>
> gawk 'ARGIND == 1 {next} FNR == 1 {nr=NR-1;print substr($0,1,10);next}
>     FNR == nr {print substr($0,1,20);next}1' file file
>
> Assumes gawk, of course, but these days, if you're not using (at least)
> GAWK, yer outta town! (Yes, TAWK has this functionality, too, but the
> syntax is different; such is life)
Or we could go with the oft-used NR==FNR and make it non-gawk-specific
since we're assuming non-empty files anyway:
awk '
  NR == FNR { nr++; next }
  FNR == 1  { $0 = substr($0,1,10) }
  FNR == nr { $0 = substr($0,1,20) }
  { print }
' file file
I made a couple of other tweaks, just style things that I think make
it a bit easier to read if we're catering to a beginner...
Ed.
Ed
4/26/2010 6:50:57 PM
On Apr 26, 1:50 pm, Ed Morton <mortons...@gmail.com> wrote:
> Or we could go with the oft-used NR==FNR and make it non-gawk-specific
> since we're assuming non-empty files anyway:
>
> awk '
>   NR == FNR { nr++; next }
>   FNR == 1  { $0 = substr($0,1,10) }
>   FNR == nr { $0 = substr($0,1,20) }
>   { print }
> ' file file
>
> I made a couple of other tweaks, just style things that I think make
> it a bit easier to read if we're catering to a beginner...
>
>   Ed.
Y'know, this may be one of those rare occasions when a getline loop
would be appropriate to tell us how many records are in the file:
awk '
BEGIN { while ( (getline dummy < ARGV[1]) > 0 ) nr++ }
NR == 1 { $0 = substr($0,1,10) }
NR == nr { $0 = substr($0,1,20) }
{ print }
' file
rather than forcing the user to pass in the file name twice and
muddying the logic of the main body of the script.
Ed.
Ed
4/26/2010 7:46:05 PM
Ed Morton wrote:
> Y'know, this may be one of those rare occasions when a getline loop
> would be appropriate to tell us how many records are in the file:
>
> awk '
> BEGIN { while ( (getline dummy < ARGV[1]) > 0 ) nr++ }
> NR == 1 { $0 = substr($0,1,10) }
> NR == nr { $0 = substr($0,1,20) }
> { print }
> ' file
>
> rather than forcing the user to pass in the file name twice and
> muddying the logic of the main body of the script.
There's also the possibility to duplicate the ARGV element in the BEGIN
section as an alternative to providing the file name explicitly twice.
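A minimal sketch of that ARGV idea, assuming a POSIX awk (where ARGV and ARGC may be changed in BEGIN); the wrapper name trim_ends_argv is hypothetical. The BEGIN block appends a second copy of the file name, so the two-pass NR==FNR logic still applies while the caller names the file only once.

```shell
# trim_ends_argv FILE (hypothetical helper name):
# two-pass approach, with the second pass queued from inside awk.
trim_ends_argv() {
    awk '
        # Queue the same file a second time so awk reads it twice.
        BEGIN     { ARGV[ARGC] = ARGV[1]; ARGC++ }
        # Pass 1: count the lines and print nothing.
        NR == FNR { nr++; next }
        # Pass 2: truncate the first and the last line, then print.
        FNR == 1  { $0 = substr($0, 1, 10) }
        FNR == nr { $0 = substr($0, 1, 20) }
        { print }
    ' "$1"
}
```

Like the version that names the file twice, this assumes a non-empty regular file that can be read a second time, so it will not work on a pipe.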
I don't quite like the last version with getline because at first
glance it's not apparent whether ARGV[1] will be implicitly closed after
reading EOF and re-opened a second time for the main awk loop.
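One way to take the guesswork out is to close the file explicitly once the counting loop has finished. The sketch below is the getline version with a close() call added; it may well be unnecessary in practice, so treat it as a precaution and not as something the getline version is known to need. The name trim_ends_getline is made up for the example.

```shell
# trim_ends_getline FILE (hypothetical helper name):
# count the lines with getline in BEGIN, then process the file normally.
trim_ends_getline() {
    awk '
        BEGIN {
            # Read the file once just to count its lines.
            while ((getline dummy < ARGV[1]) > 0) nr++
            # Explicitly close the stream that getline opened.
            close(ARGV[1])
        }
        NR == 1  { $0 = substr($0, 1, 10) }
        NR == nr { $0 = substr($0, 1, 20) }
        { print }
    ' "$1"
}
```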
Besides pk's fine one-pass solution, I'd even prefer a separate process
invocation to determine the size
awk -v nr=$( any_wc_or_awk_like_tool file ) '
NR == 1 { $0 = substr($0,1,10) }
NR == nr { $0 = substr($0,1,20) }
{ print }
' file
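where "any_wc_or_awk_like_tool file" is either awk 'END {print NR}' file
or wc -l < file (given a file name argument, wc -l would print the name
after the count, which would break the -v assignment).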
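As a concrete sketch of the separate-invocation idea, the function below (hypothetical name trim_ends_count) uses awk 'END { print NR }' as the counting tool and hands the result to the main script with -v. wc -l reading the file on standard input should serve as well, but it counts newline characters, so a file whose last line lacks a trailing newline would come out one line short.

```shell
# trim_ends_count FILE (hypothetical helper name):
# count the lines in a separate process, then truncate in a single pass.
trim_ends_count() {
    # The inner awk prints the number of records in the file.
    awk -v nr="$(awk 'END { print NR }' "$1")" '
        NR == 1  { $0 = substr($0, 1, 10) }
        NR == nr { $0 = substr($0, 1, 20) }
        { print }
    ' "$1"
}
```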
Janis
Janis
4/26/2010 10:11:55 PM