I am trying to read a comma-separated data file, match a 4-character
name, keep the date string that follows it, and add that to my output
file.
My input file format is something like this:
BBBB:2005/11/01,BBBC:2005/12/01,BBBD:2005/12/07,BBBB:2005/12/08
I want to read the file, match BBBB, save the <date string>, and output
a new 4-character name such as ZZZZ with the saved date string appended
for each match (in this case ":2005/11/01" and ":2005/12/08"), so that my
file would then read:
BBBB:2005/11/01,ZZZZ:2005/11/01,BBBC:2005/12/01,BBBD:2005/12/07,BBBB:2005/12/08,ZZZZ:2005/12/08
I have scoured the sed & awk and vi books by O'Reilly but cannot see a
suitable option other than the hold space in sed. Maybe Perl?
Thanks in advance,
Mike D

Posted by eeb4u (26) on 11/2/2005 2:19:01 AM

eeb4u@hotmail.com wrote:
> I am trying to read a comma-separated data file, match a 4-character
> name, keep the date string that follows it, and add that to my output
> file.
>
> My input file format is something like this:
> BBBB:2005/11/01,BBBC:2005/12/01,BBBD:2005/12/07,BBBB:2005/12/08
>
> I want to read the file, match BBBB, save the <date string>, and output
> a new 4-character name such as ZZZZ with the saved date string appended
> for each match (in this case ":2005/11/01" and ":2005/12/08"), so that my
> file would then read:
>
> BBBB:2005/11/01,ZZZZ:2005/11/01,BBBC:2005/12/01,BBBD:2005/12/07,BBBB:2005/12/08,ZZZZ:2005/12/08
>
> I have scoured the sed & awk and vi books by O'Reilly but cannot see a
> suitable option other than the hold space in sed. Maybe Perl?
>
> Thanks in advance,
>
> Mike D
>
It's not apparent whether your file consists of multiple lines like the
one in the example above or just a stream of characters on one line. If
it is a one-line stream, the following awk program does what you want:
BEGIN { ORS=RS="," ; OFS=FS=":" }
{ print $1,$2 ; if ($1 == "BBBB") print "ZZZZ",$2 }
Janis
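A minimal sketch of how the program above behaves on the single-line sample from the original post. The wrapper name `dup_stream` is hypothetical, and the sample is fed with `printf '%s'` on the assumption that the input has no trailing newline, since with RS="," a trailing newline would become part of the last record.

```shell
# Hypothetical wrapper around the awk program above (the name is illustrative).
dup_stream() {
    awk 'BEGIN { ORS = RS = ","; OFS = FS = ":" }
         { print $1, $2; if ($1 == "BBBB") print "ZZZZ", $2 }'
}

# Feed the sample without a trailing newline so the last record stays clean.
printf '%s' 'BBBB:2005/11/01,BBBC:2005/12/01,BBBD:2005/12/07,BBBB:2005/12/08' | dup_stream
```

Because ORS is written after every record, the output should end with a trailing comma that the input did not have.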

Posted by Janis on 11/2/2005 3:05:59 AM

The file consists of several lines, one record per line, where one
record can have many dates and publications (BBBB, BBBD, etc.). I showed
only one line, sorry!
This worked on all but one occurrence during my tests. Additionally, I
had to modify the script, as my input file actually contains quotes and
reads:
""BBBB:2005/10/31"",""BBBC:2005/11/01"",""BBBD:2005/11/01""
etc.
thus
BEGIN { ORS=RS="," ; OFS=FS=":" }
{ print $1,$2 ; if ($1 == "\"\"BBBB") print "\"\"ZZZZ",$2 }
The only instance that fails is if BBBB is the last record on the line
and isn't followed by the RS comma. The script fails to translate this
single record.
I am almost at the end of my shift. I will continue trying to get this
working or hopefully you may have a solution by tomorrow.
Thanks again for your excellent solution. You turned my 30 line ksh
script that took about 2 hours to run (on my sparc 20) into a
lightspeed one liner!
Mike D
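One way to see why names at line boundaries misbehave is to print the records awk actually reads when RS is a comma. This is only a diagnostic sketch: the helper name `show_records`, the `<NL>` marker, and the two-line sample data are all made up for illustration.

```shell
# Hypothetical helper: number each record awk sees when RS is a comma,
# replacing embedded newlines with a visible <NL> marker.
show_records() {
    awk 'BEGIN { RS = "," } { gsub(/\n/, "<NL>"); print NR ": " $0 }'
}

# Two input lines with two comma-separated items each.
printf '%s\n' 'A:1,B:2' 'C:3,D:4' | show_records
```

The second record should show B:2 and C:3 joined by the marker: with RS="," the newline is ordinary data, so the last item on one line and the first item on the next are handled as a single record.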

Posted by eeb4u on 11/2/2005 6:15:42 AM

A quick fix is to append a comma to each line using sed, run the awk
one-liner, and strip the comma off afterwards.
sed 's/"$/",/g' inputfile > appended.file
awk -f scriptfile appended.file > awked.file
sed 's/",$/"/g' awked.file > datemod.file
I am sure there is a much more elegant, efficient way to accomplish
this!
thanks again
Mike D

Posted by eeb4u on 11/2/2005 6:24:12 AM

eeb4u@hotmail.com wrote:
> a quick fix is to append each line with a comma using sed, run the awk
> one-liner and strip off the comma afterwards.
>
> sed 's/"$/",/g' inputfile > appended.file
> awk -f scriptfile appended.file > awked.file
> sed 's/",$/"/g' awked.file > datemod.file
>
> I am sure there is a much more elegant, efficient way to accomplish
> this!
>
> thanks again
>
> Mike D
>
Please read these before posting again:
http://cfaj.freeshell.org/google
http://en.wikipedia.org/wiki/Top-posting
http://en.wikipedia.org/wiki/Netiquette
Now, wrt your problem, does this do what you want:
$ cat file
""BBBB:2005/10/31"",""BBBC:2005/11/01"",""BBBD:2005/11/01""
""BBBA:2005/10/31"",""BBBC:2005/11/01"",""BBBB:2005/11/01""
$ awk 'BEGIN { OFS = FS = "," }
{ for (i = 1; i <= NF; i++)
    if ($i ~ /^""BBBB:/) { tmp = $i; sub(/BBBB/, "ZZZZ", tmp); $i = $i OFS tmp }
}
1' file
""BBBB:2005/10/31"",""ZZZZ:2005/10/31"",""BBBC:2005/11/01"",""BBBD:2005/11/01""
""BBBA:2005/10/31"",""BBBC:2005/11/01"",""BBBB:2005/11/01"",""ZZZZ:2005/11/01""
Regards,
Ed.
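The same line-by-line idea can be sketched with the two names passed in as awk variables instead of being hard-coded. The function name `dup_tag` and the variables `old`, `new`, `q` and `prefix` are hypothetical, and the sketch assumes every field is wrapped in doubled quotes as in the sample file above. It uses index() rather than a regex so the names are compared literally.

```shell
# Hypothetical helper: dup_tag OLD NEW [file...]
# After every field that starts with ""OLD: add a copy renamed to ""NEW:
dup_tag() {
    old=$1 new=$2
    shift 2
    awk -v old="$old" -v new="$new" '
    BEGIN { FS = OFS = ","; q = "\"\""; prefix = q old ":" }
    {
        for (i = 1; i <= NF; i++)
            if (index($i, prefix) == 1)      # field begins with ""OLD:
                $i = $i OFS q new substr($i, length(q old) + 1)
        print
    }' "$@"
}

printf '%s\n' '""BBBA:2005/10/31"",""BBBB:2005/11/01""' | dup_tag BBBB ZZZZ
```

Assigning to $i does not re-split the line, so NF stays fixed during the loop and a match in the last field of a line is handled like any other.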

Posted by Ed on 11/2/2005 12:09:13 PM

eeb4u@hotmail.com wrote:
> The file consists of several lines, one record per line where one
> record can have many dates and publications (BBBB, BBBD etc). I showed
> only one line, sorry!
>
> this worked on all but one occurrence during my tests. Additionally, I
> had to modify script as my input file actually contains quotes and
> reads:
>
> ""BBBB:2005/10/31"",""BBBC:2005/11/01"",""BBBD:2005/11/01""
>
> etc.
>
> thus
>
> BEGIN { ORS=RS="," ; OFS=FS=":" }
> { print $1,$2 ; if ($1 == "\"\"BBBB") print "\"\"ZZZZ",$2 }
>
> The only instance that fails is if BBBB is the last record on the line
> and isn't followed by the RS comma. The script fails to translate this
> single record.
>
> I am almost at the end of my shift. I will continue trying to get this
> working or hopefully you may have a solution by tomorrow.
>
> Thanks again for your excellent solution. You turned my 30 line ksh
> script that took about 2 hours to run (on my sparc 20) into a
> lightspeed one liner!
>
> Mike D
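BEGIN { FS = OFS = "\"\",\"\"" }       # fields are separated by "",""
{
    gsub( /^""|""$/, "" )              # strip the outer doubled quotes
    for (i = 1; i <= NF; i++)
        if ( $i ~ /^BBBB/ )
            $i = $i FS "ZZZZ" substr($i, 5)   # append a ZZZZ copy of the field
    print "\"\"" $0 "\"\""             # restore the outer doubled quotes
}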

Posted by William on 11/2/2005 7:49:49 PM

On 1 Nov 2005 22:24:12 -0800, eeb4u@hotmail.com wrote:
>a quick fix is to append each line with a comma using sed, run the awk
>one-liner and strip off the comma afterwards.
>
>sed 's/"$/",/g' inputfile > appended.file
>awk -f scriptfile appended.file > awked.file
>sed 's/",$/"/g' awked.file > datemod.file
>
>I am sure there is a much more elegant, efficient way to accomplish
>this!
Hi Mike,
You could just stay with one invocation of 'sed';
sed 's|BBBB:\([^,]*\)\(,*\)|BBBB:\1,ZZZZ:\1\2|g' datemod.file
Provided of course the fields are all in the format given
in the original post ( "BBBB:" can't be embedded in the data).
byefornow
laura
>
>thanks again
>
>Mike D
>
--
echo alru_aafriehdab@ittnreen.tocm |sed 's/\(.\)\(.\)/\2\1/g'
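For the doubled-quote format described earlier in the thread, the sed one-liner above would produce a ZZZZ field without its leading quotes, because the pattern starts at BBBB. A hedged variant, assuming every field looks like ""NAME:date"" and contains no embedded comma, is sketched below. The function name `dup_tag_sed` is hypothetical.

```shell
# Hypothetical helper: sed variant for fields wrapped in doubled quotes.
# & is the whole match; \1 is everything after the colon up to the next comma.
dup_tag_sed() {
    sed 's|""BBBB:\([^,]*\)|&,""ZZZZ:\1|g' "$@"
}

printf '%s\n' '""BBBA:2005/10/31"",""BBBB:2005/11/01""' | dup_tag_sed
```

Since sed works one line at a time and [^,]* simply stops at the end of the line, a BBBB field in the last position needs no special handling.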

Posted by run_signature_script on 11/2/2005 10:11:59 PM

"laura fairhead" <run_signature_script_for_my_email@INVALID.com> wrote in
message news:436937d8.36534764@news.btinternet.com...
> You could just stay with one invocation of 'sed';
>
> sed 's|BBBB:\([^,]*\)\(,*\)|BBBB:\1,ZZZZ:\1\2|g' datemod.file
>
> Provided of course the fields are all in the format given
> in the original post ( "BBBB:" can't be embedded in the data).
>
> byefornow
> laura
I will try the two new solutions tonight.
Thanks,
Mike

Posted by Mike on 11/4/2005 3:39:38 PM