
### Awk arrays and specific character matching


```
I am relatively new to awk. What I'm trying to do is take a list of
serial numbers and remove the duplicates, but I only want to treat
entries as duplicates if they match at certain character positions of
the serial number.
(or I could just use uniq)

It's a 17-digit serial number, but I only want to remove duplicates if
they match in digits 1-8 and 10-12, disregarding any
duplication in digit 9 or digits 13-17.

Is this possible with an awk array?

Or do I need to use something else to accomplish this?

Thanks
```

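```
# Compare 17-digit serial numbers based on digits 1-8 and 10-12,
# printing only the first line seen for each key.
{
    fixed = fix($0)
    if ( !(fixed in a) ) {
        print
        a[fixed]++
    }
}

# Build the comparison key: characters 1-8 followed by characters 10-12.
function fix(s)
{
    return substr(s, 1, 8) substr(s, 10, 3)
}

Does this do what you want?
```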
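The first-occurrence filter above is also commonly written as a one-line awk idiom. The following is a minimal sketch and not something posted in this thread: the wrapper name `dedupe_serials`, the array name `seen`, and the sample serial numbers are all made up for illustration.

```shell
# dedupe_serials: print only the first line seen for each key built from
# characters 1-8 and 10-12 (character 9 and characters 13-17 are ignored).
# The function name and the array name "seen" are illustrative assumptions.
dedupe_serials() {
    # seen[key]++ is 0 (false) the first time a key appears, so the
    # negation is true and awk's default action prints the line.
    awk '!seen[substr($0, 1, 8) substr($0, 10, 3)]++' "$@"
}

# Hypothetical sample data: the first two lines differ only in
# character 9 and characters 13-17, so the second one is dropped.
printf '%s\n' \
    '12345678A901BBBBB' \
    '12345678Z901CCCCC' \
    '12345678A902BBBBB' | dedupe_serials
```

Because the key is built with `substr()`, this form does not depend on how the awk implementation splits fields.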

```
GREG_D wrote:
> I am relatively new to awk. What I'm trying to do is take a list of
> serial numbers and remove the duplicates, but I only want to treat
> entries as duplicates if they match at certain character positions of
> the serial number.
> (or I could just use uniq)
>
> It's a 17-digit serial number, but I only want to remove duplicates if
> they match in digits 1-8 and 10-12, disregarding any
> duplication in digit 9 or digits 13-17.
>
> Is this possible with an awk array?
>
> Or do I need to use something else to accomplish this?

Something like this should do it if you want to keep the last occurrence
of the serial number:

awk 'BEGIN{FS=""}
{a[$1$2$3$4$5$6$7$8$10$11$12]=$0}
END{for (i in a) print a[i]}'

If you want to keep the first occurrence then it's:

awk 'BEGIN{FS=""}
{i=$1$2$3$4$5$6$7$8$10$11$12}
!(i in a){a[i]=$0}
END{for (i in a) print a[i]}'

Regards,

Ed.
```
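Two caveats may be worth keeping in mind with the scripts above. Splitting a record into single characters with `FS=""` is left unspecified by POSIX (gawk does it, some other awks treat the whole record as one field), and `for (i in a)` visits keys in an unspecified order, so the output need not follow the input order. The following is a hedged sketch of a keep-last variant that builds the key with `substr()` and remembers the order in which keys first appear. The names `keep_last_serials`, `last`, `order`, and `n`, and the sample data, are illustrative assumptions rather than anything from the thread.

```shell
# keep_last_serials: for each key (characters 1-8 and 10-12), print the
# last line seen, in the order the keys first appeared in the input.
# All names here are hypothetical and chosen for illustration.
keep_last_serials() {
    awk '
    {
        key = substr($0, 1, 8) substr($0, 10, 3)
        if (!(key in last))      # first time this key is seen:
            order[++n] = key     # remember its position in the input
        last[key] = $0           # later lines overwrite earlier ones
    }
    END {
        for (i = 1; i <= n; i++)
            print last[order[i]]
    }' "$@"
}

# Hypothetical sample data: lines 1 and 3 share a key, so line 3 wins
# and is printed in the slot where that key first appeared.
printf '%s\n' \
    '12345678A901BBBBB' \
    '12345678A902BBBBB' \
    '12345678Z901CCCCC' | keep_last_serials
# prints:
#   12345678Z901CCCCC
#   12345678A902BBBBB
```

Changing `last[key] = $0` so that it only runs inside the `if` block would give the keep-first behaviour with the same ordering guarantee.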

```
gregdodds@canada.com (GREG_D) wrote in message news:<f530cee6.0412022331.341444a3@posting.google.com>...
> I am relatively new to awk. What I'm trying to do is take a list of
> serial numbers and remove the duplicates, but I only want to treat
> entries as duplicates if they match at certain character positions of
> the serial number.
> (or I could just use uniq)
>
> It's a 17-digit serial number, but I only want to remove duplicates if
> they match in digits 1-8 and 10-12, disregarding any
> duplication in digit 9 or digits 13-17.
>
> Is this possible with an awk array?
>
> Or do I need to use something else to accomplish this?
>
> Thanks

Hey, thanks to both of you for all your help.

That worked great, just what I needed.
```


7/23/2012 9:37:35 AM