printing unique values in 4th field

this is the code that i am using in an attempt to get all the unique values that occur in the fourth
field of each record in a file that i call from the command line.

#!  /bin/awk -f

BEGIN {RS = "\n"; FS = " "}

{
	array123[$4] = 1
	#assign a bogus value to the "key" of the 4th field

	for (i in array123) {print i}
	# print out each unique value
}

END {
print "processing done"
}

sample data file :

123 abc 987 widget1 873
138 deabc 987 widget221 983
1298 ajc 987 widget8 8731
249 abk 963 widget23 93
123 abc 917 widget1 983
120 abc 97 widget23 8573


the output i am getting is each of the widget numbers in the above data file, when what i want is
each of the unique values (widget1, widget221, widget8, and widget23).

suggestions?

thanks

joe

ps. this works : awk -f program_file_name data_file_name | sort | uniq 
but i would prefer to do all of it in awk so i can understand what i am doing wrong
Reply Joseph 6/11/2004 5:24:26 PM

On Fri, 11 Jun 2004 11:24:26 -0600, this message was posted :

> this is the code that i am using in an attempt to get all the unique values that occur in the
> fourth field of each record in a file that i call from the command line.

never mind.
i figured it out.
should have done the printing of unique values inside the END{} instead of the main part of the
code.
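
For reference, a minimal sketch of that fix: the array is still filled in the main block, but the loop that prints it runs once, in END. The shell function name unique_field4 is only an illustrative wrapper and is not part of the original script. The BEGIN block is dropped on the assumption that awk's default record and field separators are fine for this data.

```shell
# unique_field4 is a hypothetical wrapper name, used here only for illustration.
unique_field4() {
    awk '
    # Main block: runs once per record, only records the 4th field as a key.
    { seen[$4] = 1 }

    # END block: runs once after the last record, so each key prints once.
    END {
        for (key in seen) print key
        print "processing done"
    }' "$@"
}

# Usage (data_file_name as in the original post):
#   unique_field4 data_file_name
```

awk does not promise any particular order for "for (key in seen)", so pipe the result through sort if the order matters.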

joe
Reply Joseph 6/11/2004 6:01:11 PM



Joseph Paish wrote:
> this is the code that i am using in an attempt to get all the unique values that occur in the fourth
> field of each record in a file that i call from the command line.
<snip>
> ps. this works : awk -f program_file_name data_file_name | sort | uniq 
> but i would prefer to do all of it in awk so i can understand what i am doing wrong

No need for all of that, just this would work:

	cut -d' ' -f4 data_file_name | sort -u

Regards,
	
	Ed.

Reply Ed 6/11/2004 6:13:22 PM

In article <lVlyc.1164$0A.9011@localhost>,
Joseph Paish <jpaish@freenet.edmonton.ab.ca> wrote:
>this is the code that i am using in an attempt to get all the unique
>values that occur in the fourth
>field of each record in a file that i call from the command line.
>
>the output i am getting is each of the widget numbers in the above data
>file, when what i want is
>each of the unique values (widget1, widget221, widget8, and widget23).

awk 'a[$4]++==0 {print $4}' infile
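
In case the one-liner looks wrong at first glance, here is a longer sketch of what it appears to do, written out step by step. a[$4]++ is a post-increment: it yields the old count (an unset element compares equal to 0) and then adds one, so the pattern is true only the first time a given 4th field shows up. The function name first_seen_field4 and the array name count are made up for this illustration.

```shell
# first_seen_field4 is a hypothetical name; it spells out a[$4]++==0 {print $4}.
first_seen_field4() {
    awk '
    {
        if (count[$4] == 0)   # unset elements compare equal to 0
            print $4          # first time this value has been seen
        count[$4]++           # remember it so repeats are skipped
    }' "$@"
}

# Usage:
#   first_seen_field4 infile
```

Unlike a loop over the array in END, this prints each value as soon as it is first read, so the output keeps the order of first appearance in the file (widget1, widget221, widget8, widget23 for the sample data).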


Chuck Demas

-- 
  Eat Healthy        |   _ _   | Nothing would be done at all,
  Stay Fit           |   @ @   | If a man waited to do it so well,
  Die Anyway         |    v    | That no one could find fault with it.
  demas@theworld.com |  \___/  | http://world.std.com/~cpd
Reply demas 6/11/2004 6:35:05 PM

On Fri, 11 Jun 2004 12:01:11 -0600, Joseph Paish wrote:
> On Fri, 11 Jun 2004 11:24:26 -0600, this message was posted :
>> this is the code that i am using in an attempt to get all the unique values that occur in the
>> fourth field of each record in a file that i call from the command line.
>> 
>> #!  /bin/awk -f
[snipped script & data]

> never mind.
> i figured it out.
> should have done the printing of unique values inside the END{} instead of the main part of the
> code.

I suppose, if you want to (re)sort it in some particular way. I thought
Charles Demas's solution was really neat! (At first glance I thought it was wrong.)

-- 
Juhan Leemet
Logicognosis, Inc.

Reply Juhan 10/14/2004 10:07:53 PM
