[Please redirect me to some appropriate group if I am off topic here] For some practical application I need (something similar to...) some algorithm that does the following: Given a list of size n containing pairs (l_i, 1) (where possibly l_i == l_j for some i != j) transform this to a list containing pairs (u_j, card {i : l_i = u_j} ) (with u_i != u_j for i != j) [This is essentially counting the duplicated elements] Now I know that I can simply sort the first list in time O(n log n), and collect identical elements in O(n) afterwards. So this is O(n log n) total. I wonder whether there could be some better algorithm as I don't need the u_i _ordered_ in the end. Hashing comes to mind, but the l_i are 'large', and for practical reasons I cannot copy the original list to a complete new structure like a hash map. [In case numbers might help: n is about 10^4...10^8, and there are not more than about 20 repetitions of any element in the original list] Andre'
Andr� P�nitz <poenitz@gmx.net> writes: > [Please redirect me to some appropriate group if I am off topic here] > > For some practical application I need (something similar to...) some > algorithm that does the following: > > Given a > > list of size n containing pairs (l_i, 1) > (where possibly l_i == l_j for some i != j) > > transform this to a > > list containing pairs (u_j, card {i : l_i = u_j} ) > (with u_i != u_j for i != j) > > [This is essentially counting the duplicated elements] > > Now I know that I can simply sort the first list in time O(n log n), and > collect identical elements in O(n) afterwards. So this is O(n log n) total. > > I wonder whether there could be some better algorithm as I don't need the > u_i _ordered_ in the end. Look up "semi-sorting". Semi-sorting is making identical elements adjacent, but with no requirements about the ordering of groups of identical elements. I believe linear-time algorithms for semi-sorting exist, but the linearity may be in the range of possible numbers rather than the length of the list. Torben Mogensen
Torben �gidius Mogensen <torbenm@diku.dk> wrote: >> I wonder whether there could be some better algorithm as I don't need the >> u_i _ordered_ in the end. > > Look up "semi-sorting". > Semi-sorting is making identical elements adjacent, but with no > requirements about the ordering of groups of identical elements. This seems to be indeed what I am looking for. Unfortunately, google is not too chatty on that subject, so I'll have to dig deeper. Thanks for the hint! Andre' -- Those who desire to give up Freedom in order to gain Security, will not have, nor do they deserve, either one. (T. Jefferson or B. Franklin or both...)