I have a problem using comm. I have a sample mail message, m5, and from
that I've saved a base64 encoded virus into the file virus, it's about
2000 lines long. I had the simple scheme in mind,
comm -12 m5 virus | cmp -2 virus
which would return 0 if the virus text is contained in m5, and 1 if not.
But the comm output starts somewhere in the middle of the virus file, so
the "common" output doesn't match what's in the virus file and I always
get a no-match.
I've tried it with smaller files and it works fine. And the "virus" file
is literally just a hunk pulled out of the very same file m5 that I'm
comparing it with. The first line in "virus" is unique, it appears in one
and only one place in "m5", although some other lines are duplicated in
various places, notably some rows of A. I've inspected "virus" manually,
and the beginning and end, at least, look the way they should.
What could my problem be?
--
"When the fool walks through the street, in his lack of understanding he
calls everything foolish." -- Ecclesiastes 10:3, New American Bible
glhansen (396), 12/7/2003 3:57:08 AM

In article <bqu8ek$8g8$1@hood.uits.indiana.edu>,
glhansen@steel.ucs.indiana.edu (Gregory L. Hansen) wrote:
> I have a problem using comm. I have a sample mail message, m5, and from
> that I've saved a base64 encoded virus into the file virus, it's about
> 2000 lines long. I had the simple scheme in mind,
>
> comm -12 m5 virus | cmp -2 virus
>
> which would return 0 if the virus text is contained in m5, and 1 if not.
> But the comm output starts somewhere in the middle of the virus file, so
> the "common" output doesn't match what's in the virus file and I always
> get a no-match.
>
> I've tried it with smaller files and it works fine. And the "virus" file
> is literally just a hunk pulled out of the very same file m5 that I'm
> comparing it with. The first line in "virus" is unique, it appears in one
> and only one place in "m5", although some other lines are duplicated in
> various places, notably some rows of A. I've inspected "virus" manually,
> and the beginning and end, at least, look the way they should.
>
> What could my problem be?
comm requires the files to be sorted. Are they?
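
A small sketch of what goes wrong, using hypothetical scratch files (the names unsorted, small, and sorted are made up for illustration). comm walks both files in step, like a merge, so a line that turns up "too late" in an unsorted file is never paired with its twin:

```sh
# Hypothetical scratch files; the names are illustrative only.
dir=$(mktemp -d)
printf '%s\n' b a c > "$dir/unsorted"
printf '%s\n' a     > "$dir/small"

# Unsorted input: comm compares "b" with "a", decides "a" exists only in
# the second file, and never pairs it with the later "a" in the first.
# Some implementations (GNU comm, for one) also warn on stderr and exit
# non-zero, hence the redirection and the "|| true".
common_unsorted=$(LC_ALL=C comm -12 "$dir/unsorted" "$dir/small" 2>/dev/null) || true

# Sorted input: the shared line is reported.
LC_ALL=C sort "$dir/unsorted" > "$dir/sorted"
common_sorted=$(LC_ALL=C comm -12 "$dir/sorted" "$dir/small")

printf 'unsorted: [%s]\nsorted:   [%s]\n' "$common_unsorted" "$common_sorted"
rm -rf "$dir"
```

The first result is empty and the second is "a", even though both input pairs share the same line.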
--
Barry Margolin, barmar@alum.mit.edu
Woburn, MA
barmar (5629), 12/7/2003 8:11:08 AM

In article <barmar-1DA37B.03110907122003@netnews.attbi.com>,
Barry Margolin <barmar@alum.mit.edu> wrote:
>In article <bqu8ek$8g8$1@hood.uits.indiana.edu>,
> glhansen@steel.ucs.indiana.edu (Gregory L. Hansen) wrote:
>
>> I have a problem using comm. I have a sample mail message, m5, and from
>> that I've saved a base64 encoded virus into the file virus, it's about
>> 2000 lines long. I had the simple scheme in mind,
>>
>> comm -12 m5 virus | cmp -2 virus
>>
>> which would return 0 if the virus text is contained in m5, and 1 if not.
>> But the comm output starts somewhere in the middle of the virus file, so
>> the "common" output doesn't match what's in the virus file and I always
>> get a no-match.
>>
>> I've tried it with smaller files and it works fine. And the "virus" file
>> is literally just a hunk pulled out of the very same file m5 that I'm
>> comparing it with. The first line in "virus" is unique, it appears in one
>> and only one place in "m5", although some other lines are duplicated in
>> various places, notably some rows of A. I've inspected "virus" manually,
>> and the beginning and end, at least, look the way they should.
>>
>> What could my problem be?
>
>comm requires the files to be sorted. Are they?
They're not. In this context, it doesn't make sense to sort the files.
But I've tried the above with small, deliberately unsorted files,
like

a
b
c        c
z        z
y  and   y
x        x
w        w
d
e
f

And it worked the way I wanted it to. I thought sorting
must be a precaution for something like two lists of
things that aren't in the same order, rather than for
comparing a small piece to a larger file.
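
The small test probably succeeds only because comm can skip a and b and then pair every remaining line in step. In a long base64 block, any earlier line of m5 that sorts after the first line of virus makes comm throw that virus line away, which would explain output that starts in the middle. A rough sketch of an order-preserving check that avoids comm altogether follows; the function name contains_block is made up for illustration, and it assumes both files end with a newline:

```sh
# contains_block HAYSTACK NEEDLE
# Succeeds (exit 0) if the lines of NEEDLE appear as one contiguous run
# in HAYSTACK, in the same order; fails (exit 1) otherwise.
contains_block() {
    haystack=$1
    needle=$2
    # Arithmetic expansion strips the padding some wc versions print.
    len=$(( $(wc -l < "$needle") ))
    first=$(head -n 1 "$needle")

    # Try every line of HAYSTACK that exactly equals NEEDLE's first line.
    for n in $(grep -n -x -F -e "$first" "$haystack" | cut -d: -f1); do
        # Compare the next $len lines of HAYSTACK with NEEDLE.
        if tail -n +"$n" "$haystack" | head -n "$len" | cmp -s - "$needle"; then
            return 0
        fi
    done
    return 1
}
```

Used as "contains_block m5 virus", it gives the 0-or-1 exit status the original pipeline was after, without sorting either file.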
--
"When the fool walks through the street, in his lack of understanding he
calls everything foolish." -- Ecclesiastes 10:3, New American Bible
glhansen (396), 12/7/2003 2:07:26 PM