I have a 'blank' data file in which I'm using sed to replace each 'Day'
placeholder with a full date. Is there a way to do this in fewer steps?
File1 (%TEMP%.\_TEMP.txt):
Today-7
Today-6
Today-5....etc.
File2:
REPSET RULE ADD "F:\Returns\DATA\Day-7" INC,REC; (<Day-7> = Today-7)
REPSET RULE ADD "F:\Returns\DATA\Day-6" INC,REC; (<Day-6> = Today-6)
REPSET RULE ADD "F:\Returns\DATA\Day-5" INC,REC; (<Day-5> = Today-5....etc.)
Currently I loop, replacing one placeholder per pass, then move the temp
file over the 'input' file for the next pass of the loop:
FOR /F %%A in (%TEMP%.\_TEMP.txt) do (
    SET /A Blank+=1
    SED "s/Day-!Blank!/%%A/g" INPUT>TEMP
    MOVE /Y TEMP INPUT
)
[Posted by i_robot73, 8/21/2007 4:41:26 PM]

On Tue, 21 Aug 2007 09:41:26 -0700, i_robot73 wrote:
> Have a 'blank' datafile I'm using SED to replace 'Day' with full
> 'date'. Is there a way to do this in fewer steps?
>
> File1 (%TEMP%.\_TEMP.txt):
> Today-7
> Today-6
> Today-5....etc.
>
> File2:
> REPSET RULE ADD "F:\Returns\DATA\Day-7" INC,REC; (<Day-7> =
> Today-7)
> REPSET RULE ADD "F:\Returns\DATA\Day-6" INC,REC; (<Day-6> =
> Today-6)
> REPSET RULE ADD "F:\Returns\DATA\Day-5" INC,REC; (<Day-5> =
> Today-5....etc.)
>
> Currently I loop, changing one set of data. Move temp file into
> 'input' file for next step of loop
>
> FOR /F %%A in (%TEMP%.\_TEMP.txt) do (
> SET /A Blank+=1
> SED "s/Day-!Blank!/%%A/g" INPUT>TEMP
> MOVE /Y TEMP INPUT
> )
Do I understand that you want to replace the strings "Day-7" et al. with
the string "date-7" et al.? Or the string "Day-7" with today's date (in
an unspecified format) suffixed with the substring "-7"? Or is "Day-7"
not a string, but instead a metasyntactic variable containing unknown
strings? Is "-7" a subtraction or a substring? If a subtraction, what is
it subtracted from? What date format do you want (assuming a real date)?

As far as I know, sed doesn't allow you to modify an existing file (it's
risky anyway), but awk can be made to. String substitutions are trivial in
awk, so if you want to do away with the transient files, that's the way to
go ... but we have to know *exactly* what is needed (show literal examples).
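
To give an idea of what "trivial in awk" could mean here, below is a rough sketch, not a definitive answer. It assumes a hypothetical map file with one "placeholder replacement" pair per line (the post does not say the data looks like that), and the names map.txt and blank.txt are made up for the illustration. The first file is read into an array and the second is rewritten in a single pass, so there is no loop over sed and no intermediate file per placeholder.

```shell
# Hypothetical sample data, created only so the sketch is self-contained.
workdir=$(mktemp -d)
cat > "$workdir/map.txt" <<'EOF'
Day-7 08152007
Day-6 08162007
EOF
cat > "$workdir/blank.txt" <<'EOF'
REPSET RULE ADD "F:\Returns\DATA\Day-7" INC,REC;
REPSET RULE ADD "F:\Returns\DATA\Day-6" INC,REC;
EOF

result=$(awk '
    # First file (NR == FNR): remember placeholder -> replacement.
    NR == FNR { map[$1] = $2; next }
    # Second file: replace every known placeholder found on the line.
    # index() does a plain string search, so no regex escaping is needed.
    {
        for (key in map) {
            while ((pos = index($0, key)) > 0)
                $0 = substr($0, 1, pos - 1) map[key] substr($0, pos + length(key))
        }
        print
    }
' "$workdir/map.txt" "$workdir/blank.txt")

printf '%s\n' "$result"
rm -rf "$workdir"
```

The inner loop assumes a replacement never contains its own placeholder; with date strings that holds, but it would loop forever otherwise.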
--
T.E.D. (tdavis@umr.edu)
[Posted by Ted, 8/21/2007 8:54:08 PM]

Sorry if I wasn't specific.
In the 'blank' file, I need to replace
'Day-0' with, say, 08222007
'Day-1' with 08212007
....
'Day-7' with 08152007
Actual lines of the blank are:
REPSET RULE ADD "F:\Returns\DATA\Day-0" INC,REC;
REPSET RULE ADD "F:\Returns\DATA\Day-1" INC,REC;
....
REPSET RULE ADD "F:\Returns\DATA\Day-7" INC,REC;
The FOR loop from my first post works, but it's def. not my best coding =P
AWK can do in-file substitutions??
[Posted by i_robot73, 8/22/2007 12:30:27 PM]

On Wed, 22 Aug 2007 05:30:27 -0700, i_robot73 wrote:
Please do not top-post - the only logical place for material that comes
before something is physically before it.
> Sorry if I wasn't specific.
>
> In the 'blank' file, I need to substitute
> 'Day-0' with, say, 08222007
> 'Day-1' as 08212007
> ...
> 'Day-7' 08152007
>
> Actual lines of the blank are:
>
> REPSET RULE ADD "F:\Returns\DATA\Day-0" INC,REC;
> REPSET RULE ADD "F:\Returns\DATA\Day-1" INC,REC;
> ...
> REPSET RULE ADD "F:\Returns\DATA\Day-7" INC,REC;
>
<snip>
> AWK can do in-file substitutions??
>
Yes. A minimal, untested sketch is
{
    Array[ NR ] = $0
}
END{
    close( FILENAME )
    for( x = 1; x <= NR; x++ ) print Array[ x ] > FILENAME
}
The key concepts are that you have to store the lines in order, close the
file when done, then write the array back to the file in the same order.
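
For anyone who wants to try the store, close, write-back idea on a throwaway file first, here is a small self-contained sketch. The file name demo.txt is hypothetical, and toupper() merely stands in for whatever substitution is really wanted.

```shell
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$workdir/demo.txt"

awk '
    { Line[NR] = toupper($0) }          # keep each (modified) line, in order
    END {
        close(FILENAME)                 # input is fully read by now
        for (x = 1; x <= NR; x++)
            print Line[x] > FILENAME    # first write truncates, the rest append
        close(FILENAME)                 # flush the rewritten file
    }
' "$workdir/demo.txt"

result=$(cat "$workdir/demo.txt")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Two caveats: the whole file is held in memory, and if the run is interrupted between the truncation and the last write the original contents are gone, so keep a backup while experimenting.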
The awk code below does the entire job: invoke it with wildcards to specify
the files to process.
BEGIN{
    FileInWork = ""
    LineCount = 0
    Now = systime()
    SecondsPerDay = 60 * 60 * 24
    # Table[ x ] holds the date x days ago, formatted MMDDYYYY
    for( x = 0; x <= 7; x++ ) {
        Table[ x ] = strftime( "%m%d%Y", Now - ( SecondsPerDay * x ) )
    }
}
{
    # First line of a new file: write out the previous file, if any
    if( FNR == 1 ) EndFile()
    # gawk extension: Arr[ 0 ] is the whole match, Arr[ 1 ] the digit
    if( match( $0, /Day-([0-7])/, Arr ) ) {
        sub( Arr[ 0 ], Table[ Arr[ 1 ] ] )
    }
    Array[ ++LineCount ] = $0
}
END{
    EndFile()
}
function EndFile(    x ){
    if( FileInWork != "" ){
        close( FileInWork )
        for( x = 1; x <= LineCount; x++ ) print Array[ x ] > FileInWork
        close( FileInWork )    # flush before moving on to the next file
        delete Array
        LineCount = 0
    }
    FileInWork = FILENAME
}
I tested it using five copies of your data in test.txt through test4.txt
with the command
awk -fawktest.awk test?.txt
and all of the files then contained
REPSET RULE ADD "F:\Returns\DATA\08222007" INC,REC;
REPSET RULE ADD "F:\Returns\DATA\08212007" INC,REC;
REPSET RULE ADD "F:\Returns\DATA\08152007" INC,REC;
(today being 22 Aug here)
Note that with multiple files, a rather different framework for in-file
substitutions is required. I've tried to make it as clear as I can
consistent with also making it concise, but there may be some areas that
need elaboration - just ask. I'm particularly concerned about two things:
the text matched by the regular expression (Arr[ 0 ]) is reused as the
regular expression for sub(), and part of the match (Arr[ 1 ]) is used as
the index into Table - that is a bit esoteric. I'm not sure the code is
portable to all flavors of awk - I used gawk 3.1.3 for Windows from
<http://gnuwin32.sourceforge.net/packages/gawk.htm>.
--
T.E.D. (tdavis@umr.edu)
[Posted by Ted, 8/22/2007 2:13:57 PM]

I've been reading/playing a lot; unfort. my gawk and/or awk95 doesn't
seem to handle time (systime) functions well (if @ all) =S
I AM able to use doff.exe in a loop to fill an array.
So far I've gotten:
# Get tomorrow's date
# and 6 days prior
BEGIN {
    for (i = -5; i < 2; i++) {
        cmd = "Doff.exe mmddyyyy " i
        cmd | getline MD
        close(cmd)          # close each command so the next one runs fresh
        Blank[i] = MD
        print Blank[i]
    }
}
# Read up Data_Blank.txt file
# into an array. One line
# per array record
{
File[NR]=$0
}
Next is to somehow match the placeholders in the file against the Blank[x]
array to do the substitution
(e.g. Day-5 should become the date in Blank[-5], Day0 the date in Blank[0]).
That means changing my blank data file; I got NO problems with that =P
REPSET RULE ADD "F:\Returns\DATA\Day-5" INC,REC;
REPSET RULE ADD "F:\Returns\DATA\Day-4" INC,REC;
REPSET RULE ADD "F:\Returns\DATA\Day-3" INC,REC;
REPSET RULE ADD "F:\Returns\DATA\Day-2" INC,REC;
REPSET RULE ADD "F:\Returns\DATA\Day-1" INC,REC;
REPSET RULE ADD "F:\Returns\DATA\Day0" INC,REC;
REPSET RULE ADD "F:\Returns\DATA\Day1" INC,REC;
REPSET RULE ADD "F:\Returns\DATA\Day-5.*" INC,NONREC;
REPSET RULE ADD "F:\Returns\DATA\Day-4.*" INC,NONREC;
REPSET RULE ADD "F:\Returns\DATA\Day-3.*" INC,NONREC;
REPSET RULE ADD "F:\Returns\DATA\Day-2.*" INC,NONREC;
REPSET RULE ADD "F:\Returns\DATA\Day-1.*" INC,NONREC;
REPSET RULE ADD "F:\Returns\DATA\Day0.*" INC,NONREC;
REPSET RULE ADD "F:\Returns\DATA\Day1.*" INC,NONREC;
[Posted by i_robot73, 8/23/2007 5:19:47 PM]

On Thu, 23 Aug 2007 10:19:47 -0700, i_robot73 wrote:
> I've been readying/playing alot; unfort. my gawk and/or awk95 doesn't
> seem to
> handle time (systime) functions well (if @ all) =S
>
So use the latest version of gawk - it's free and open source:
<http://gnuwin32.sourceforge.net/packages/gawk.htm>
--
T.E.D. (tdavis@umr.edu)
[Posted by Ted, 8/23/2007 8:52:20 PM]

Interesting. I pulled down the latest they had from the link
provided.
Just FYI, this IS on a WinXP box. I ran
gawk "{ systime() }"
and it just HANGS...no output at all. I have to CTRL-C to kill it.
Any ideas??
[Posted by i_robot73, 8/24/2007 2:58:47 PM]

Must learn to READ better =P
gawk "BEGIN{Format = \"%Y%m%d%H%M %b\";print strftime( Format, systime() )}"
Spits out '200708241129 Aug' JUST fine in WinXP
[Posted by i_robot73, 8/24/2007 3:30:01 PM]

On Fri, 24 Aug 2007 07:58:47 -0700, i_robot73 wrote:
> Interesting. I pulled down the latest they had from the link
> provided.
>
> Just FYI, this IS on a WinXP box, ran the
>
> gawk "{ systime() }"
>
> And it just HANGS...no nothing. Have to CTRL-C to kill it
>
> Any ideas??
Two:
1) Do not top-post.
2) You told gawk to do nothing with the value systime() returns, and to
display nothing, once for each line entered from the console, so it sits
waiting for input. ^Z terminates the program cleanly.
I think what you wanted is
awk "BEGIN{ print systime()}"
This displays the number of seconds since 00:00:00 UTC on 1 January 1970.
The gawk.hlp command from the command line launches the Windows help
feature. I find the Index more useful than the menu.
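
The difference between the two commands can be seen without any time functions at all. The sketch below uses made-up messages: a BEGIN-only program runs once and never reads input, while a bare { ... } rule runs once per input line and therefore waits on the console when no file is given.

```shell
# A BEGIN-only program runs once and does not read any input.
begin_out=$(awk 'BEGIN { print "ran once" }')

# A main rule runs once per input line; here two lines are piped in,
# so the action fires twice. With no pipe and no file it would sit
# waiting for keyboard input instead.
main_out=$(printf 'a\nb\n' | awk '{ print "saw a line" }')

printf '%s\n%s\n' "$begin_out" "$main_out"
```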
--
T.E.D. (tdavis@umr.edu)
[Posted by Ted, 8/24/2007 8:42:23 PM]

For those who want something that works ;) (BIG thanks to T. Davis for
his coding)
Batch file to kick off the process:
gawk -f "<below>.awk" "Data_Blank.txt" > "New_Data.txt"
AWK file:
BEGIN{
    ShortM = strftime( "%b" )
    FullY = strftime( "%Y" )
    LineCount = 0
    Now = systime()
    SecondsPerDay = 60 * 60 * 24
    for( x = 0; x <= 7; x++ ) {
        Table[ x ] = strftime( "%m%d%Y", Now - (SecondsPerDay * x ) )
    }
}
{
    #
    # Replace 'Day-X' with actual dates
    #
    if( match( $0, /Day-([0-7])/, Arr ) ) {
        sub( Arr[ 0 ], Table[ Arr[ 1 ] ])
    }
    gsub( /<YEAR>/, FullY )   # Replace ALL '<YEAR>' instances with the actual year
    gsub( /MMM/, ShortM )     # Replace ALL occurrences of 'MMM' with the short month name
    Array[ ++LineCount ] = $0
}
END{
    close( FILENAME )   # File read into array, close it
                        # so it can be over-written if need be
    for( x = 1; x <= LineCount; x++ ) print Array[ x ]
    delete Array
}
Some of the 'Blank' file to see what's getting swapped out:
REPSET RULE ADD "F:\Bal_Sheets\Ckk_Collection\FY<YEAR>\<YEAR>*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\Ckk_Collection\FY<YEAR>\MMM\Day-6*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\Ckk_Collection\FY<YEAR>\MMM\Day-5*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\Ckk_Collection\FY<YEAR>\MMM\Day-4*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\Ckk_Collection\FY<YEAR>\MMM\Day-3*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\Ckk_Collection\FY<YEAR>\MMM\Day-2*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\Ckk_Collection\FY<YEAR>\MMM\Day-1*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\Ckk_Collection\FY<YEAR>\MMM\Day-0*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\IP_Balance\FY<YEAR>\MMM\+MMM*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\IP_Balance\FY<YEAR>\MMM\Day-6*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\IP_Balance\FY<YEAR>\MMM\Day-5*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\IP_Balance\FY<YEAR>\MMM\Day-4*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\IP_Balance\FY<YEAR>\MMM\Day-3*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\IP_Balance\FY<YEAR>\MMM\Day-2*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\IP_Balance\FY<YEAR>\MMM\Day-1*.xls" INC,NONREC;
REPSET RULE ADD "F:\Bal_Sheets\IP_Balance\FY<YEAR>\MMM\Day-0*.xls" INC,NONREC;
Again, thanks Ted for the code....took a few to read through & learn
what does what; but does it well =D
[Posted by i_robot73, 8/29/2007 3:35:42 PM]

i_robot73@hotmail.com wrote:
> For those who want something that works ;) (BIG thanks to T. Davis for
> his coding)
>
> Batch file to kick off the process:
>
> gawk -f "<below>.awk" "Data_Blank.txt" > "New_Data.txt"
>
>
> AWK file:
>
>
> BEGIN{
> ShortM = strftime( "%b" )
> FullY = strftime( "%Y" )
> LineCount = 0
> Now = systime()
> SecondsPerDay = 60 * 60 * 24
> for( x = 0; x <= 7; x++ ) {
> Table[ x ] = strftime( "%m%d%Y", Now - (SecondsPerDay
> * x ) )
> }
> }
>
> {
> #
> # Substitute 'Day-X' for acutal dates
> #
> if( match( $0, /Day-([0-7])/, Arr ) ) {
> sub( Arr[ 0 ], Table[ Arr[ 1 ] ])
> }
> gsub( /<YEAR>/, FullY ) # Substitute ALL '<YEAR>' instances for
> actual year
> gsub( /MMM/, ShortM ) # Substitute ALL occurrences of 'MMM'
> for short month notation
> Array[ ++LineCount ] = $0
> }
>
> END{
> close( FILENAME ) # File read into array, close it
> # so it can be over-written if need be
> for( x = 1; x <= LineCount; x++ ) print Array[ x ]
> delete Array
> }
You didn't post any context for this, so I don't know exactly what
you're trying to do, but just a couple of things to be aware of:
1) You don't need to close(FILENAME) at the end of your script.
2) You don't need to delete Array at the end of your script.
3) By calling strftime() and systime() in multiple places you could end
up calling them on either side of a day/month/year boundary and get
inconsistent results - just get "Now" up-front and pass it to each
strftime() call.
4) Awk arrays like Table normally start at "1", not "0".
5) The array argument to match() is a gawk extension (which is fine as
long as you don't mind being gawk-specific).
6) There doesn't seem to be any point to creating and later printing the
"Array" array rather than just printing each $0 as you process it.
7) There doesn't seem to be any point to using the LineCount variable
when it will always just have the value of NR.
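
Points 5 to 7 could be sketched roughly as follows. To keep the sketch free of gawk-only functions, it assumes the dates arrive from outside as a comma-separated list in a hypothetical variable named dates (first entry = Day-0); in the posted script they come from strftime() instead. The input lines are shortened, made-up samples.

```shell
# Hypothetical fixed dates, newest first: Day-0, Day-1, Day-2.
dates='08222007,08212007,08202007'

result=$(printf '%s\n' \
    '"F:\Returns\DATA\Day-0" INC,REC;' \
    '"F:\Returns\DATA\Day-2" INC,REC;' \
    'no placeholder here' |
awk -v dates="$dates" '
    BEGIN {
        n = split(dates, tmp, ",")
        for (i = 1; i <= n; i++) Table[i - 1] = tmp[i]   # Day-0 -> Table[0]
    }
    # POSIX match() sets RSTART and RLENGTH; no gawk array argument needed.
    match($0, /Day-[0-7]/) {
        day = substr($0, RSTART + 4, 1)        # the digit after "Day-"
        if (day in Table)
            $0 = substr($0, 1, RSTART - 1) Table[day] substr($0, RSTART + RLENGTH)
    }
    { print }                                  # print as we go; no array, no counter
')

printf '%s\n' "$result"
```

Because each line is printed as soon as it has been processed, output must go to stdout or to a different file, as in the posted gawk command; this form is not suitable for overwriting the input file.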
Ed.
[Posted by Ed, 9/3/2007 10:45:38 AM]

On Mon, 03 Sep 2007 05:45:38 -0500, Ed Morton wrote:
>
> 1) You don't need to close(FILENAME) at the end of your script.
> 6) There doesn't seem to be any point to creating and later printing the
> "Array" array rather than just printing each $0 as you process it.
I'm surprised to find that that does work in gawk - it didn't when I
developed the technique years ago on some other version.
--
T.E.D. (tdavis@umr.edu)
[Posted by Ted, 9/3/2007 2:53:27 PM]

In article <pan.2007.09.03.14.53.20.218000@umr.edu>,
Ted Davis <tdavis@umr.edu> wrote:
>On Mon, 03 Sep 2007 05:45:38 -0500, Ed Morton wrote:
>
>>
>> 1) You don't need to close(FILENAME) at the end of your script.
>
>> 6) There doesn't seem to be any point to creating and later printing the
>> "Array" array rather than just printing each $0 as you process it.
>
>I'm suprised to find that that does work in gawk - it didn't when I
>developed the technique years ago on some other version.
Of course it has nothing to do with gawk; it will work that way in any AWK.
It is just basic programming - really doesn't have anything to do with
AWK either.
The point is that you seem to be of the "formalistic" school of
programming (in the extreme case, this is also known as "cargo cult").
That is, you develop general forms of programming and use them in every
program regardless of whether they are needed or not. Here, the
examples of this are:
1) Building up an array of output rather than just printing it
directly.
2) Closing the input file in the END block. Your comment
explains why you do this, and, in situations where you are
re-writing a file in-place, it is, indeed, necessary to do
this.
3) Dumping out the array in the END block.
Now, there is nothing per se wrong with this approach, and many real
world shops encourage this style of programming. If the skills of the
employees are limited, it can be an efficient way to work. Essentially,
it turns programming from a creative art into basically word processing.
However, many of us aspire to better methods (which proponents of the
opposite view will call "wheel re-invention") and, especially in the
context of a newsgroup (as opposed to our real world jobs, in which we
may have to make concessions), we will always do so (that is, strive for
better).
[Posted by gazelle, 9/3/2007 3:42:16 PM]

On Mon, 03 Sep 2007 09:53:27 -0500, Ted Davis wrote:
> On Mon, 03 Sep 2007 05:45:38 -0500, Ed Morton wrote:
>
>>
>> 1) You don't need to close(FILENAME) at the end of your script.
>
>> 6) There doesn't seem to be any point to creating and later printing the
>> "Array" array rather than just printing each $0 as you process it.
>
> I'm suprised to find that that does work in gawk - it didn't when I
> developed the technique years ago on some other version.
Addendum: WARNING - this can trash files.
awk "{print $0 > FILENAME}" messages.1
truncated messages.1 (was 6,171,618 bytes) to 521 bytes.
--
T.E.D. (tdavis@umr.edu)
[Posted by Ted, 9/3/2007 4:21:15 PM]

On Mon, 03 Sep 2007 15:42:16 +0000, Kenny McCormack wrote:
> In article <pan.2007.09.03.14.53.20.218000@umr.edu>,
> Ted Davis <tdavis@umr.edu> wrote:
>>On Mon, 03 Sep 2007 05:45:38 -0500, Ed Morton wrote:
>>
>>>
>>> 1) You don't need to close(FILENAME) at the end of your script.
>>
>>> 6) There doesn't seem to be any point to creating and later printing the
>>> "Array" array rather than just printing each $0 as you process it.
>>
>>I'm suprised to find that that does work in gawk - it didn't when I
>>developed the technique years ago on some other version.
>
> Of course it has nothing to do with gawk; it will work that way in any AWK.
> It is just basic programming - really doesn't have anything to do with
> AWK either.
>
> The point is that you seem to be of the "formalistic" school of
> programming (in the extreme case, this is also known as "cargo cult")
> That is, you develop general forms of programming and use them in every
> program regardless of whether they are needed or not. Here, the
> examples of this are:
> 1) Building up an array of output rather than just printing it
> directly.
Which trashes large files in Windows - Windows is the OS of interest in
this thread.
> 2) Closing the input file in the END block. Your comment
> explains why you do this, and, in situations where you are
> re-writing a file in-place, it is, indeed, necessary to do
> this.
> 3) Dumping out the array in the END block.
> Now, there is nothing per se wrong with this approach, and many real
> world shops encourage this style of programming. If the skills of the
> employees are limited, it can be an efficient way to work. Essentially,
> it turns programming from a creative art into basically word processing.
It has the advantage of working in all the OSs I use, which is not the
case with the suggested alternatives.
>
> However, many of us aspire to better methods (which proponents of the
> opposite view will call "wheel re-invention") and, especially in the
> context of a newsgroup (as opposed to our real world jobs, in which we
> may have to make concessions), we will always do so (that is, strive for
> better).
"Better" is highly subjective. To me it's clearer, more readable, more
understandable (especially by my boss), and less likely to trash files.
When I reply to a question here, I assume the poster is not expert in the
language. I also assume, on somewhat shakier grounds, that he/she is
interested in the logical structure of the supplied solution. That is, I
am more interested in teaching than in showing advanced programming skills.
The Unix/Linux bias here is palpable, and is the main reason I reply
almost exclusively to questions that appear to be about Windows - I have
to use XP, and writing code that runs the same on multiple XP machines is
sometimes difficult, if not impossible. My coding style usually makes it
possible to write complex programs that will run properly under XP (at
least on some machines), Cygwin, and Linux. I have a program in
development that does that (except that some of the subprograms crash
most of the time on one specific XP machine and one fails under Cygwin
because I haven't upgraded its head utility yet), and runs as a command
line program, a scheduler task, or as a CGI process in all three
environments. It consists of one main script with a *lot* of overhead to
sort out the OS and environment, and over a hundred subprogram scripts -
if written solely for Linux (the environment it actually runs in as a cron
task except for testing), it could have been considerably less complex,
but too much of the Linux environment (and stability) is missing in
Windows, and too much that is similar is subtly different (or flaky) in
XP, not to mention the command line quoting and slash/backslash issues.
--
T.E.D. (tdavis@umr.edu)
[Posted by Ted, 9/3/2007 4:55:16 PM]

In article <pan.2007.09.03.16.21.08.991000@umr.edu>,
Ted Davis <tdavis@umr.edu> wrote:
>On Mon, 03 Sep 2007 09:53:27 -0500, Ted Davis wrote:
>
>> On Mon, 03 Sep 2007 05:45:38 -0500, Ed Morton wrote:
>>
>>>
>>> 1) You don't need to close(FILENAME) at the end of your script.
>>
>>> 6) There doesn't seem to be any point to creating and later printing the
>>> "Array" array rather than just printing each $0 as you process it.
>>
>> I'm suprised to find that that does work in gawk - it didn't when I
>> developed the technique years ago on some other version.
>
>
>Addenda: WARNING - this can trash files.
>
> awk "{print $0 > FILENAME}" messages.1
Man of straw. Nobody is suggesting this. That is completely not the point.
I suggest you re-read my previous response (this detail is in there,
albeit not in ALL CAPS, as is sometimes necessary).
>truncated messages.1 (was 6,171,618 bytes) to 521 bytes.
Huh?
[Posted by gazelle, 9/3/2007 5:19:47 PM]

Ted Davis wrote:
> On Mon, 03 Sep 2007 09:53:27 -0500, Ted Davis wrote:
>
>
>>On Mon, 03 Sep 2007 05:45:38 -0500, Ed Morton wrote:
>>
>>
>>>1) You don't need to close(FILENAME) at the end of your script.
>>
>>>6) There doesn't seem to be any point to creating and later printing the
>>>"Array" array rather than just printing each $0 as you process it.
>>
>>I'm suprised to find that that does work in gawk - it didn't when I
>>developed the technique years ago on some other version.
>
>
>
> Addenda: WARNING - this can trash files.
>
> awk "{print $0 > FILENAME}" messages.1
>
> truncated messages.1 (was 6,171,618 bytes) to 521 bytes.
Well, yes that would trash files, but then you'd never actually try to
write to the same file you're reading, unless you had a very specific
and unusual requirement to do that.
Ed.
[Posted by Ed, 9/3/2007 7:13:03 PM]

On Mon, 03 Sep 2007 14:13:03 -0500, Ed Morton wrote:
> Ted Davis wrote:
>> On Mon, 03 Sep 2007 09:53:27 -0500, Ted Davis wrote:
>>
>>
>>>On Mon, 03 Sep 2007 05:45:38 -0500, Ed Morton wrote:
>>>
>>>
>>>>1) You don't need to close(FILENAME) at the end of your script.
>>>
>>>>6) There doesn't seem to be any point to creating and later printing the
>>>>"Array" array rather than just printing each $0 as you process it.
>>>
>>>I'm suprised to find that that does work in gawk - it didn't when I
>>>developed the technique years ago on some other version.
>>
>>
>>
>> Addenda: WARNING - this can trash files.
>>
>> awk "{print $0 > FILENAME}" messages.1
>>
>> truncated messages.1 (was 6,171,618 bytes) to 521 bytes.
>
> Well, yes that would trash files, but then you'd never actually try to
> write to the same file you're reading, unless you had a very specific
> and unusual requirement to do that.
>
You wouldn't, and I wouldn't, but part of the original task was to avoid
using the usual temp file - that's in the Subject.
--
T.E.D. (tdavis@umr.edu)
[Posted by Ted, 9/3/2007 11:57:40 PM]

On Mon, 03 Sep 2007 17:19:47 +0000, Kenny McCormack wrote:
> In article <pan.2007.09.03.16.21.08.991000@umr.edu>,
> Ted Davis <tdavis@umr.edu> wrote:
>>On Mon, 03 Sep 2007 09:53:27 -0500, Ted Davis wrote:
>>
>>> On Mon, 03 Sep 2007 05:45:38 -0500, Ed Morton wrote:
>>>
>>>>
>>>> 1) You don't need to close(FILENAME) at the end of your script.
>>>
>>>> 6) There doesn't seem to be any point to creating and later printing the
>>>> "Array" array rather than just printing each $0 as you process it.
>>>
>>> I'm suprised to find that that does work in gawk - it didn't when I
>>> developed the technique years ago on some other version.
>>
>>
>>Addenda: WARNING - this can trash files.
>>
>> awk "{print $0 > FILENAME}" messages.1
>
> Man of straw. Nobody is suggesting this. That is completely not the point.
> I suggest you re-read my previous response (this detail is in there,
> albeit not in ALL CAPS, as is sometimes necessary).
Then what do those quotes, especially #6 from Ed's message, mean *in the
context of this thread*, and how is my example not an instance of those
recommendations?
>
>>truncated messages.1 (was 6,171,618 bytes) to 521 bytes.
>
> Huh?
521, and 4097 (on my Linux box) are indeed strange numbers: I would expect
a buffer size to be 512 or 4096, and that the command would trash the file
when it emptied the buffer. Even allowing for the different meaning of \n
in the original Linux file and the resulting XP file, I can still account
for only 518 bytes.
--
T.E.D. (tdavis@umr.edu)
[Posted by Ted, 9/4/2007 12:10:09 AM]

Ted Davis wrote:
> On Mon, 03 Sep 2007 14:13:03 -0500, Ed Morton wrote:
>
>
>>Ted Davis wrote:
>>
>>>On Mon, 03 Sep 2007 09:53:27 -0500, Ted Davis wrote:
>>>
>>>
>>>
>>>>On Mon, 03 Sep 2007 05:45:38 -0500, Ed Morton wrote:
>>>>
>>>>
>>>>
>>>>>1) You don't need to close(FILENAME) at the end of your script.
>>>>
>>>>>6) There doesn't seem to be any point to creating and later printing the
>>>>>"Array" array rather than just printing each $0 as you process it.
>>>>
>>>>I'm suprised to find that that does work in gawk - it didn't when I
>>>>developed the technique years ago on some other version.
>>>
>>>
>>>
>>>Addenda: WARNING - this can trash files.
>>>
>>> awk "{print $0 > FILENAME}" messages.1
>>>
>>>truncated messages.1 (was 6,171,618 bytes) to 521 bytes.
>>
>>Well, yes that would trash files, but then you'd never actually try to
>>write to the same file you're reading, unless you had a very specific
>>and unusual requirement to do that.
>>
>
>
> You wouldn't, and I wouldn't, but part of the original task was to avoid
> using the usual temp file - that's in the Subject.
There is no temp file in the script shown in the posting I responded to,
it just said that the desired command should do:
> gawk -f "<below>.awk" "Data_Blank.txt" > "New_Data.txt"
but maybe there was a temp file in part of what was snipped out of the OP.
Ed.
[Posted by Ed, 9/4/2007 2:08:44 AM]

In article <kN2dnbv1HPezI0HbnZ2dnUVZ_ozinZ2d@comcast.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
....
>There is no temp file in the script shown in the posting I responded to,
>it just said that the desired command should do:
>
>> gawk -f "<below>.awk" "Data_Blank.txt" > "New_Data.txt"
>
>but maybe there was a temp file in part of what was snipped out of the OP.
>
> Ed.
I think the issue here is that although the Subject line contains the
words "replace WITHIN file without TEMP file??", which usually triggers
the "build it up in an array, then dump it back out to the original
filename [*]" response in us, the posted "solution" by "i_robot" did not
employ that bit of trickery. Instead, it just used the normal:
[g]awk -f ... infile > outfile
syntax. So, as often happens, there was a disconnect between what the
OP put in his Subject line and what he puts in the body of the message.
In such cases, it is usually right to go with the text in the body.
[*] Incidentally, you should use ARGV[1] here, not FILENAME.
[Posted by gazelle, 9/4/2007 3:15:51 AM]

Kenny McCormack wrote:
> In article <kN2dnbv1HPezI0HbnZ2dnUVZ_ozinZ2d@comcast.com>,
> Ed Morton <morton@lsupcaemnt.com> wrote:
> ...
>
>>There is no temp file in the script shown in the posting I responded to,
>>it just said that the desired command should do:
>>
>>
>>>gawk -f "<below>.awk" "Data_Blank.txt" > "New_Data.txt"
>>
>>but maybe there was a temp file in part of what was snipped out of the OP.
>>
>> Ed.
>
>
> I think the issue here is that although the Subject line contains the
> words "replace WITHIN file without TEMP file??", which usually triggers
> the "build it up in an array, then dump it back out to the original
> filename [*]" response in us, the posted "solution" by "i_robot" did not
> employ that bit of trickery. Instead, it just used the normal:
>
> [g]awk -f ... infile > outfile
>
> syntax. So, as often happens, there was a disconnect between what the
> OP put in his Subject line and what he puts in the body of the message.
> In such cases, it is usually right to go with the text in the body.
>
> [*] Incidentally, you should use ARGV[1] here, not FILENAME.
>
OK, I'll bite - why? If you use ARGV[N] you need to keep track of which
arguments are file names rather than variable assignments, and change
the value of N for every file, so FILENAME is an easier variable to use
for "the current file" rather than ARGV[1] which is "the first argument
passed to awk which may or may not be the current file name and may or
may not be a file name at all". The only benefit I can think of to using
ARGV[N] is that if you use getline in your script and don't get it
precisely right, you could overwrite the value of FILENAME, but in my
mind that's just a plain old bug in your script.
Ed.
[Posted by Ed, 9/4/2007 5:09:27 PM]

In article <Od6dnU0GP9LXDEDbnZ2dnUVZ_qSonZ2d@comcast.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
....
>OK, I'll bite - why? If you use ARGV[N] you need to keep track of which
>arguments are file names rather than variable assignments, and change
>the value of N for every file, so FILENAME is an easier variable to use
>for "the current file" rather than ARGV[1] which is "the first argument
>passed to awk which may or may not be the current file name and may or
>may not be a file name at all". The only benefit I can think of to using
>ARGV[N] is that if you use getline in your script and don't get it
>precisely right, you could overwrite the value of FILENAME, but in my
>mind that's just a plain old bug in your script.
>
> Ed.
I guess I was thinking about the question of whether or not FILENAME is
necessarily set in the "END" block. Seems there is some room for
variation here. In general, I am less trusting of FILENAME (which might
get set in various ways, such as, as you mention, use of getline) than
of ARGV - where, once I get it right, I know it will stay right.
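
One possible middle ground, sketched here with made-up names (flag=1, data.txt, Current): take a private copy of FILENAME while the file is still being read, so nothing depends on what FILENAME holds in the END block, and without counting ARGV positions. The example also shows Ed's point that ARGV[1] need not be a file name at all.

```shell
workdir=$(mktemp -d)
printf 'x\n' > "$workdir/data.txt"

result=$(awk '
    FNR == 1 { Current = FILENAME }   # private copy, taken while reading
    END {
        # The first operand below is a variable assignment, yet it is ARGV[1].
        if (ARGV[1] == "flag=1") print "ARGV[1] is an assignment, not a file"
        if (Current == ARGV[2])  print "saved name matches ARGV[2]"
    }
' flag=1 "$workdir/data.txt")

printf '%s\n' "$result"
rm -rf "$workdir"
```

A later getline from another file could still change FILENAME, but it would not touch the saved copy.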
[Posted by gazelle, 9/8/2007 2:09:31 PM]