multiple columns values in a single row of an uniq id...

  • Follow


Hi everybody,

I have data like below..

---------------------------------------
empid        itm   no     amount
---------------------------------------
01112655, 014,  6,    22365.00
01112654, 075,  1,     6205.00
01112654, 091, 24,    96200.00
01112654, 149,  2,    12217.00
01112654, 179, 14,   100645.00
01112654, 185,  5,    46172.00
01112654, 188,  8,   265000.00
01112654, 191,  4,   167000.00
01112654, 193,  1,    10000.00

01140654, 075,  1,     3151.00
01140654, 089,  2,     8236.00
01140654, 091,  9,    17662.00
01140654, 102,  3,     7568.00
01140654, 133,  1,      319.00
01140654, 149,  6,    25802.00
01140654, 179,  6,    18798.00
01140654, 188,  3,   100000.00
01140654, 191,  2,    40000.00
01140654, 193,  1,    10000.00
01140654, 195,  2,    49440.00

01187654, 014,  3,    11956.00
01187654, 075,  1,     3502.00
01187654, 091,  8,    14453.00
01187654, 149, 22,   149480.00
01187654, 159,  1,    18750.00
01187654, 188,  1,    10000.00
01187654, 193,  1,    10000.00
01187654, 195,  1,    50145.00

i want to produce above data in a columnar form  like
     emdpi         014                    075                   and so
on.......
--------------------------------------------------------------------------------------------------------
01112655   6    22365.00     1    6205.00           and so on.......
01140654   --     --                1    3151.00

or

i want a particular empid's all the $2 and $3 and $4 values in a one
row...
0
Reply nag 1/24/2009 12:53:55 PM

I don't understand the -- in the second line of your example, but otherwise,

BEGIN{ FS = ", +" }

NR < 4 { next }  # skip the header lines

NR == 4 { prev = $1; printf $1}  # don't want a break on the first line

$1 != prev {prev = $1; print ""; printf $1 }

{ i=1; while(i <= 4) printf "\t" $++i }

END { print "" }


nag wrote:
> Hi everybody,
> 
> I have data like below..
> 
> ---------------------------------------
> empid        itm   no     amount
> ---------------------------------------
> 01112655, 014,  6,    22365.00
> 01112654, 075,  1,     6205.00
> %< snip >%

> i want to produce above data in a columnar form  like
>      emdpi         014                    075                   and so
> on.......
> --------------------------------------------------------------------------------------------------------
> 01112655   6    22365.00     1    6205.00           and so on.......
> 01140654   --     --                1    3151.00
> 
> or
> 
> i want a particular empid's all the $2 and $3 and $4 values in a one
> row...
0
Reply Jim 1/25/2009 4:57:50 PM


"flattening" columnar data is a common pattern, whose structure is

1 - skip any header lines

2 - so as not to have a blank line at the top of the output, make an 
exception for the first data record:
   a - set a variable holding the key of the previous record to the key
       of the current record
   b - print the current key with no newline at the end

3- if the current key isn't equal to the previous key:
   a - set a variable holding the key of the previous record to the key
       of the current record
   b - print a newline to end the record for the previous key
   c - print the current key with no newline at the end

4 - on every record, for each non-key field:
   a - print the output column delimiter with no newline at the end
   b - print the field value with no newline at the end

5 - at the end, print a newline to close the last record

In your case, there are also blank lines to skip.

Note that this code requires an AWK version that can handle regular 
expressions as a field separator.


BEGIN{ FS = ", +" }

/^$/ { next }  # skip blank lines

NR < 4 { next }  # skip the header lines

NR == 4 { prev = $1; printf $1}  # don't want a break on the first data
                                  # line

$1 != prev {prev = $1; print ""; printf $1 } # break between keys

{ for(i=2;i <= 4; i++) printf "\t" $i } # print the data fields

END { print "" }  # close the last output line



> nag wrote:
>> Hi everybody,
>>
>> I have data like below..
>>
>> ---------------------------------------
>> empid        itm   no     amount
>> ---------------------------------------
>> 01112655, 014,  6,    22365.00
>> 01112654, 075,  1,     6205.00
>> %< snip >%
> 
>> i want to produce above data in a columnar form  like
>>      emdpi         014                    075                   and so
>> on.......
>> -------------------------------------------------------------------------------------------------------- 
>>
>> 01112655   6    22365.00     1    6205.00           and so on.......
>> 01140654   --     --                1    3151.00
>>
>> or
>>
>> i want a particular empid's all the $2 and $3 and $4 values in a one
>> row...
0
Reply Jim 1/25/2009 5:33:58 PM

Jim Hart wrote:
> [...]
> 
> Note that this code requires an AWK version that can handle regular 
> expressions as a field separator.

Note that the awk fieldseparator FS can always be used with a regular
expression. (Or is there any awk version that doesn't conform to that
standard?)

Janis

> 
> 
> [...]
0
Reply Janis 1/25/2009 5:49:55 PM

Snipped for terseness>
> BEGIN{ FS = ", +" }
>
> /^$/ { next }  # skip blank lines
>
> NR < 4 { next }  # skip the header lines
>
> NR == 4 { prev = $1; printf $1}  # don't want a break on the first data
>                                  # line
>
> $1 != prev {prev = $1; print ""; printf $1 } # break between keys
>
> { for(i=2;i <= 4; i++) printf "\t" $i } # print the data fields
>
> END { print "" }  # close the last output line
>
>
>
>> nag wrote:
>>> Hi everybody,
>>>
>>> I have data like below..
>>>
>>> ---------------------------------------
>>> empid        itm   no     amount
>>> ---------------------------------------
>>> 01112655, 014,  6,    22365.00
>>> 01112654, 075,  1,     6205.00
>>> %< snip >%
>>
>>> i want to produce above data in a columnar form  like
>>>      emdpi         014                    075                   and so
>>> on.......
>>> --------------------------------------------------------------------------------------------------------
>>>
>>> 01112655   6    22365.00     1    6205.00           and so on.......
>>> 01140654   --     --                1    3151.00
>>>
>>> or
>>>
>>> i want a particular empid's all the $2 and $3 and $4 values in a one
>>> row...

Looks like OP requires column 2 as header and the rest of the table be 
transposed.

Something like below?
BEGIN { FS = ", +" }
$1 + 0	{
	empid[$1]
	itm[$2]
	namt[$1, $2] = $3 "," $4
}
END {
	str = "EmpID"
	for (item in itm) str = str ",," item
	print str
	for (emp in empid) {
		str = emp
		for (item in itm) {
			if  (!(emp, item) in namt) namt[emp, item] = "--,--"
			str = str "," namt[emp, item]
		}
		print str
	}
} 

0
Reply Rajan 1/26/2009 2:45:15 AM

Thank you rajan...you gave me the exact result what i was looking
for...thank you alot..
i have re-written the output to make it a columnar output having each
column equal widths.

I can frankly tell that i am a beginner in gawk. I wrote so many progs
with simple awk functions...but i am poor in using arrays. Could you
please explain me your solution...line by line...from BEGIN to END.

One more thing the output header column 2 is not sorted
(164,194,091,014...) like it is coming.. Is it possible to sort the
header(i.e., $2).


0
Reply nag 1/26/2009 11:06:48 AM

Snipped
>>> nag wrote:
>>>> Hi everybody,
>>>>
>>>> I have data like below..
>>>>
>>>> ---------------------------------------
>>>> empid        itm   no     amount
>>>> ---------------------------------------
>>>> 01112655, 014,  6,    22365.00
>>>> 01112654, 075,  1,     6205.00
>>>> %< snip >%
>>>
>>>> i want to produce above data in a columnar form  like
>>>>      emdpi         014                    075                   and so
>>>> on.......
>>>> --------------------------------------------------------------------------------------------------------
>>>>
>>>> 01112655   6    22365.00     1    6205.00           and so on.......
>>>> 01140654   --     --                1    3151.00
>>>>
>>>> or
>>>>
>>>> i want a particular empid's all the $2 and $3 and $4 values in a one
>>>> row...
>
> Looks like OP requires column 2 as header and the rest of the table be 
> transposed.
>
> Something like below?
> BEGIN { FS = ", +" }
Split fields by ",<and spaces>"
> $1 + 0 {
> empid[$1]
> itm[$2]
> namt[$1, $2] = $3 "," $4
> }
create a key value pair with fields 1,2 as key and 3,4 as value
> END {
> str = "EmpID"
> for (item in itm) str = str ",," item
> print str
> for (emp in empid) {
> str = emp
> for (item in itm) {
> if  (!(emp, item) in namt) namt[emp, item] = "--,--"
> str = str "," namt[emp, item]
> }
> print str
> }
> }
print if there is a value in the array for itm else print "--"

for sorting, read "Which countries contribute most to the total" which is 
interesting 

0
Reply Rajan 1/30/2009 3:29:24 AM

On Jan 30, 8:29=A0am, "Rajan" <no-spam@rajan> wrote:
> Snipped
>
> >>> nag wrote:
> >>>> Hi everybody,
>
> >>>> I have data like below..
>
> >>>> ---------------------------------------
> >>>> empid =A0 =A0 =A0 =A0itm =A0 no =A0 =A0 amount
> >>>> ---------------------------------------
> >>>> 01112655, 014, =A06, =A0 =A022365.00
> >>>> 01112654, 075, =A01, =A0 =A0 6205.00
> >>>> %< snip >%
>
> >>>> i want to produce above data in a columnar form =A0like
> >>>> =A0 =A0 =A0emdpi =A0 =A0 =A0 =A0 014 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0=
 =A0 =A0075 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 and so
> >>>> on.......
> >>>> --------------------------------------------------------------------=
------------------------------------
>
> >>>> 01112655 =A0 6 =A0 =A022365.00 =A0 =A0 1 =A0 =A06205.00 =A0 =A0 =A0 =
=A0 =A0 and so on.......
> >>>> 01140654 =A0 -- =A0 =A0 -- =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A01 =A0 =A03=
151.00
>
> >>>> or
>
> >>>> i want a particular empid's all the $2 and $3 and $4 values in a one
> >>>> row...
>
> > Looks like OP requires column 2 as header and the rest of the table be
> > transposed.
>
> > Something like below?
> > BEGIN { FS =3D ", +" }
>
> Split fields by ",<and spaces>"> $1 + 0 {
> > empid[$1]
> > itm[$2]
> > namt[$1, $2] =3D $3 "," $4
> > }
>
> create a key value pair with fields 1,2 as key and 3,4 as value> END {
> > str =3D "EmpID"
> > for (item in itm) str =3D str ",," item
> > print str
> > for (emp in empid) {
> > str =3D emp
> > for (item in itm) {
> > if =A0(!(emp, item) in namt) namt[emp, item] =3D "--,--"
> > str =3D str "," namt[emp, item]
> > }
> > print str
> > }
> > }
>
> print if there is a value in the array for itm else print "--"
>
> for sorting, read "Which countries contribute most to the total" which is
> interesting


Sorry! I am not getting any clue to sort the header...any idea..Just
give me some hint.
0
Reply nag 2/21/2009 4:47:53 PM


"nag" <visitnag@gmail.com> wrote in message 
news:d7101352-50ee-4101-b6d2-5d5027998480@m29g2000prd.googlegroups.com...
> On Jan 30, 8:29 am, "Rajan" <no-spam@rajan> wrote:
>> Snipped
>>
>> >>> nag wrote:
>> >>>> Hi everybody,
>>
>> >>>> I have data like below..
>>
>> >>>> ---------------------------------------
>> >>>> empid itm no amount
>> >>>> ---------------------------------------
>> >>>> 01112655, 014, 6, 22365.00
>> >>>> 01112654, 075, 1, 6205.00
>> >>>> %< snip >%
>>
>> >>>> i want to produce above data in a columnar form like
>> >>>> emdpi 014 075 and so
>> >>>> on.......
>> >>>> --------------------------------------------------------------------------------------------------------
>>
>> >>>> 01112655 6 22365.00 1 6205.00 and so on.......
>> >>>> 01140654 -- -- 1 3151.00
>>
>> >>>> or
>>
>> >>>> i want a particular empid's all the $2 and $3 and $4 values in a one
>> >>>> row...
>>
>> > Looks like OP requires column 2 as header and the rest of the table be
>> > transposed.
>>
>> > Something like below?
>> > BEGIN { FS = ", +" }
>>
>> Split fields by ",<and spaces>"> $1 + 0 {
>> > empid[$1]
>> > itm[$2]
>> > namt[$1, $2] = $3 "," $4
>> > }
>>
>> create a key value pair with fields 1,2 as key and 3,4 as value> END {
>> > str = "EmpID"
>> > for (item in itm) str = str ",," item
>> > print str
>> > for (emp in empid) {
>> > str = emp
>> > for (item in itm) {
>> > if (!(emp, item) in namt) namt[emp, item] = "--,--"
>> > str = str "," namt[emp, item]
>> > }
>> > print str
>> > }
>> > }
>>
>> print if there is a value in the array for itm else print "--"
>>
>> for sorting, read "Which countries contribute most to the total" which is
>> interesting
>
>
> Sorry! I am not getting any clue to sort the header...any idea..Just
> give me some hint.

Look at asort and asorti functions. 

0
Reply Rajan 2/22/2009 9:51:01 AM

8 Replies
156 Views

(page loaded in 0.1 seconds)

Similiar Articles:













7/25/2012 3:36:54 PM


Reply: