f



Conversion List -> Multiset

[Please redirect me to some appropriate group if I am off topic here]

For some practical application I need (something similar to...) some
algorithm that does the following:

Given a

  list of size n containing pairs (l_i, 1)
  (where possibly l_i == l_j for some i != j)

transform this to a

  list containing pairs (u_j, card {i : l_i = u_j} ) 
  (with u_i != u_j for i != j)

[This is essentially counting the duplicated elements]

Now I know that I can simply sort the first list in time O(n log n), and
collect identical elements in O(n) afterwards. So this is O(n log n) total.

I wonder whether there could be some better algorithm as I don't need the
u_i _ordered_ in the end.

Hashing comes to mind, but the l_i are 'large', and for practical reasons
I cannot copy the original list to a complete new structure like a hash
map.

[In case numbers might help: n is about 10^4...10^8, and there are not
more than about 20 repetitions of any element in the original list]

Andre'
0
poenitz (46)
7/1/2003 10:19:19 AM
comp.theory 5139 articles. 1 followers. marty.musatov (1143) is leader. Post Follow

2 Replies
367 Views

Similar Articles

[PageSpeed] 36

Andr� P�nitz <poenitz@gmx.net> writes:

> [Please redirect me to some appropriate group if I am off topic here]
> 
> For some practical application I need (something similar to...) some
> algorithm that does the following:
> 
> Given a
> 
>   list of size n containing pairs (l_i, 1)
>   (where possibly l_i == l_j for some i != j)
> 
> transform this to a
> 
>   list containing pairs (u_j, card {i : l_i = u_j} ) 
>   (with u_i != u_j for i != j)
> 
> [This is essentially counting the duplicated elements]
> 
> Now I know that I can simply sort the first list in time O(n log n), and
> collect identical elements in O(n) afterwards. So this is O(n log n) total.
> 
> I wonder whether there could be some better algorithm as I don't need the
> u_i _ordered_ in the end.

Look up "semi-sorting".  Semi-sorting is making identical elements
adjacent, but with no requirements about the ordering of groups of
identical elements.  I believe linear-time algorithms for semi-sorting
exist, but the linearity may be in the range of possible numbers
rather than the length of the list.

	Torben Mogensen
0
torbenm265 (288)
7/1/2003 11:03:32 AM
Torben �gidius Mogensen <torbenm@diku.dk> wrote:
>> I wonder whether there could be some better algorithm as I don't need the
>> u_i _ordered_ in the end.
> 
> Look up "semi-sorting".
> Semi-sorting is making identical elements adjacent, but with no
> requirements about the ordering of groups of identical elements. 

This seems to be indeed what I am looking for.

Unfortunately, google is not too chatty on that subject, so I'll have to
dig deeper.

Thanks for the hint!

Andre'

-- 
Those who desire to give up Freedom in order to gain Security, will not have,
nor do they deserve, either one.     (T. Jefferson or B. Franklin or both...)
0
poenitz (46)
7/2/2003 7:49:34 AM
Reply: