STL implementation performance



Hello,

We are using standard containers in our code, except in task-specific
parts where we write our own containers that can handle huge data and/or
are optimized for specific algorithms. But there are places where
standard containers are used and the implementation could be faster.
Correct me if I'm wrong, but the Dinkumware and STLport implementations
of rb_tree::equal_range are both similar to this:

pair<iterator, iterator> equal_range(const key_type& x)
{
    return pair<iterator, iterator>(lower_bound(x), upper_bound(x));
}

which has double the cost for set/map and is not optimal for
multiset/multimap. Sure, for set/map the use of equal_range can be
replaced by lower_bound, but having a standard function with an obvious
flaw is not good practice.

What do you think?

Regards,
    Armen


Reply armen_grigoryan (4) 10/19/2005 8:48:40 AM



"Nemra�" <armen_grigoryan@hotmail.com> wrote in message
news:1129707345.606089.294840@o13g2000cwo.googlegroups.com...

> We are using standard containers in our code except task specific parts
> where we write our own specific containers that can handle huge data
> and/or optimized for specific algorithms. But there are places where
> some standard containers are used but implementation could be faster.
> Correct me if i'm wrong, but in both Dinkumware & STLPort
> implementations of rb_tree::equal_range are similar:
>
> pair<iterator,iterator> equal_range(const key_type& x)
> {
> return pair<iterator, iterator>(lower_bound(x), upper_bound(x));
> }
>
> which has double cost for set/map and is not optimal for
> multiset/multimap. Sure in case of set/map the usage of equal_range can
> be replaced by lower_bound, but having standard function with obvious
> flaw is not a good practice.
>
> What do you think?

I think you need to find a realistic case where the difference
amounts to a hill of beans. If you do, you already know how
to do it faster, don't you?

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com



Reply P 10/20/2005 11:25:43 AM

Nemra� wrote:
> Hello,
>
> We are using standard containers in our code except task specific parts
> where we write our own specific containers that can handle huge data
> and/or optimized for specific algorithms. But there are places where
> some standard containers are used but implementation could be faster.
> Correct me if i'm wrong, but in both Dinkumware & STLPort
> implementations of rb_tree::equal_range are similar:
>
> pair<iterator,iterator> equal_range(const key_type& x)
> {
> return pair<iterator, iterator>(lower_bound(x), upper_bound(x));
> }
>
> which has double cost for set/map and is not optimal for
> multiset/multimap. Sure in case of set/map the usage of equal_range can
> be replaced by lower_bound, but having standard function with obvious
> flaw is not a good practice.
>
> What do you think?

I think you could call the corresponding member functions of set/map
directly. Or provide your own my::equal_range that does the right thing
for you, or hack the library and overload std::equal_range() for
std::set/map.
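
For instance, a non-member helper along these lines (the name
my::equal_range and the exact signature are only a sketch, not code
taken from any library) does one lower_bound descent plus a single
comparison for the unique-key case:

#include <set>
#include <utility>

namespace my {
// Sketch: equal_range for a unique-key std::set, done as one lower_bound
// descent and at most one extra comparison instead of two full descents.
template <class T, class C, class A>
std::pair<typename std::set<T, C, A>::const_iterator,
          typename std::set<T, C, A>::const_iterator>
equal_range(const std::set<T, C, A>& s, const T& x)
{
    typename std::set<T, C, A>::const_iterator lo = s.lower_bound(x);
    typename std::set<T, C, A>::const_iterator hi = lo;
    if (hi != s.end() && !s.key_comp()(x, *hi))  // *hi is equivalent to x
        ++hi;                                    // unique keys: at most one match
    return std::make_pair(lo, hi);
}
} // namespace my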


Reply Maxim 10/20/2005 12:53:24 PM

Nemra� wrote:
> Hello,
>
> We are using standard containers in our code except task specific parts
> where we write our own specific containers that can handle huge data
> and/or optimized for specific algorithms. But there are places where
> some standard containers are used but implementation could be faster.
> Correct me if i'm wrong, but in both Dinkumware & STLPort
> implementations of rb_tree::equal_range are similar:
>
> pair<iterator,iterator> equal_range(const key_type& x)
> {
> return pair<iterator, iterator>(lower_bound(x), upper_bound(x));
> }
>
> which has double cost for set/map and is not optimal for
> multiset/multimap. Sure in case of set/map the usage of equal_range can
> be replaced by lower_bound, but having standard function with obvious
> flaw is not a good practice.
>
> What do you think?

I think that describing the equal_range implementation above as
"obviously flawed" is not justified. On the contrary, the
implementation is obviously correct, combining as it does its two
logical sub-operations. And correctness has tremendous value,
especially in a library.

I would be disappointed, though, by any routine that is the logical
combination of other routines yet fails to take advantage of any obvious
optimizations that the combination makes possible. To keep the
discussion abstract and free of reference to any existing library, I
will choose as an example a different routine, say
std::string::replace. Calling string::replace logically deletes and then
inserts characters in a std::string. But I would not expect an
implementation of std::string simply to call erase() and insert().
For one, an obvious opportunity for optimizing this operation would
have been lost, and worse, there is no further opportunity left to the
client to optimize this operation outside of std::string itself. In
other words, unless the optimization is made in the library, it
cannot be made at all.

But there is a complication. Even though the library author is often in
the best position to make optimizations, it is the application author
who knows best which optimizations are needed for a particular
program. So there has to be some information flow back to the library
author from the library's clients to ensure that useful optimizations
can be made in the appropriate places.

So as a client of the library I would point out (or even supply code
with the suggested improvement) to the library author where a useful
optimization could be made. I would be careful to do so in a
constructive manner, especially when the existing implementation is
obviously correct, if not optimally efficient.

Greg


Reply Greg 10/21/2005 9:12:50 AM

Greg Herlihy wrote:

    [...]
> I would be disappointed though with any routine, that is the
> logical combination of other routines, not to take advantage
> of any obvious optimizations that the combination makes
> possible.  To keep the discussion abstract and free of
> reference to any existing library, I will choose as an
> example, a different routine, say std::string::replace.  Now
> calling string::replace logically deletes then inserts
> characters into a std::string. But I would not expect that an
> implementation of std::string simply to call erase() and
> insert().  For one, an obvious opportunity for optimizing this
> operation would have been lost, and worse, there is no further
> opportunity left to the client to optimize this operation
> outside of std::string itself.  In other words, unless the
> optimization is made in the library, then it cannot be made at
> all.

Not really relevant to the current discussion (unless to say
that you chose a bad example), but it would seem more reasonable
for erase and insert to be implemented in terms of replace.
After all, erase() is just replace() with an empty string, and
insert() is a replace() of a substring of length 0.
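
A small illustration of that relationship, using std::string itself and
made-up free-function names rather than any real implementation's
internals:

#include <cassert>
#include <cstddef>
#include <string>

// erase is just replace with an empty string; insert is just replace of
// a zero-length substring.
std::string& erase_via_replace(std::string& s, std::size_t pos, std::size_t len)
{
    return s.replace(pos, len, "");
}

std::string& insert_via_replace(std::string& s, std::size_t pos,
                                const std::string& t)
{
    return s.replace(pos, 0, t);
}

int main()
{
    std::string s("hello world");
    erase_via_replace(s, 5, 6);            // "hello"
    insert_via_replace(s, 5, ", there");   // "hello, there"
    assert(s == "hello, there");
}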

I don't know about current implementations, but that's the way
my pre-standard string was implemented.

(For the rest of what you said, concerning library correctness
and ways of communicating with library implementors, I can only
concur.)

--
James Kanze                                           GABI Software
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Reply kanze 10/21/2005 4:15:59 PM

Even if multiset is not the most used container, here nobody can make
the search faster without hacking the library code. So its use in
"heavy" cases will lead to a performance problem with no easy solution.
As I already wrote, we replaced equal_range with lower_bound because we
know that we are using std::set, but not every user knows that the two
can differ by a factor of two in some libraries. Besides, we need to
replace this simple code:

std::pair<SetIt, SetIt> range = someSet.equal_range(someVal);
if (range.first != range.second)
{
     range.first = someSet.insert(range.first, someVal);
}
// use range.first which points to the required value

With the following:

SetIt it = someSet.lower_bound(someVal);
if (it == someSet.end() || someSet.key_comp()(*it, someVal) ||
    someSet.key_comp()(someVal, *it))
{
     it = someSet.insert(it, someVal);
}
// use it which points to the required value

I'd prefer using the first one...

Regards,
     Armen


Reply iso 10/22/2005 1:49:19 PM

"Greg Herlihy" <greghe@pacbell.net> wrote in message 
news:1129850301.991311.94290@g14g2000cwa.googlegroups.com...

> But there is a complication. Even though the library author is often in
> the best position to make optimizations, it is the application author
> who will know best which optimizations are needed for a particular
> program. So there has to be some information flow back to the library
> author from the library clients to the ensure that useful optimizations
> can be made in the appropriate places.
>
> So as a client of the library I would point out (or even supply code
> with the suggested improvement) to the library author where a useful
> optimization could be made. I would be careful to do so in a
> constructive manner, especially when the existing implementation is
> obviously correct, if not optimally efficient.

And the library author might politely decline to make the
"obvious" improvement, in some cases, particularly  if there
is any chance that improving one use of the library might
deoptimize another use. Library authors must continually
guess, on the basis of incomplete information, what are
the best tradeoffs to make.

In the case originally cited in this thread, the OP presumed
that equal_range is suboptimal if it merely calls lower_bound
and upper_bound. But that is often the best approach, even
for a set or map with unique elements. If a user has reason
to believe that a *particular* set or map can find an
equal range faster -- say by calling lower_bound and testing
the element it designates (if any) -- and if the user has
reason to believe that the difference in performance *warrants*
writing, debugging, and maintaining a bespoke function,
then the user is of course free to do so. But there are
enough "if"s here to dissuade most library writers from
tinkering much with equal_range.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com



Reply P 10/22/2005 1:56:52 PM

On 22 Oct 2005 09:49:19 -0400, "Nemra�" <armen_grigoryan@hotmail.com>
wrote:

> As i already wrote we replaced equal_range by lower_bound because we
> know that we are using std::set, but not every user knows that they
> differ in a time twice for some libraries. Becides we need to replace
> this simple code:

> std::pair<SetIt, SetIt> range = someSet.equal_range(someVal);
> if (range.first != range.second)
> {
>      range.first = someSet.insert(range.first, someVal);
> }
> // use range.first which points to the required value

Since it does nothing, it is a good thing to replace it.  I'm sure the
real code must have == rather than !=.

> With the following:

> SetIt it = someSet.lower_bound(someVal);
> if (it == someSet.end() || someSet.key_comp(*it, someVal) ||
> someSet.key_comp(someVal, *it))

Some knowledge of lower_bound would help.  It returns end, something
larger, or the target.  Something less is impossible because that would
not be a lower bound.  Remove the redundant middle test.

> {
>      it = someSet.insert(it, someVal);
> }
> // use it which points to the required value

> I'd prefer using first one...

I can't imagine why.  The logic of the code makes no sense for a
multiset because it prevents insertion of duplicates.  The best
possible code for equal_range in a set would be to determine lower_bound
and then make the above test to determine whether upper_bound is the
same or the next item.  Now your code adds an extra test on top of that,
not to mention returning two iterators when one was enough.  Using
equal_range on a set makes no sense.
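
The corrected form of your second snippet, with the redundant middle
test removed, would look something like this (the helper name is mine,
and this is only a sketch):

#include <set>

// Find someVal in a std::set, inserting it if absent: one lower_bound
// descent and at most one extra comparison.
template <class T>
typename std::set<T>::iterator
find_or_insert(std::set<T>& someSet, const T& someVal)
{
    typename std::set<T>::iterator it = someSet.lower_bound(someVal);
    // lower_bound yields end(), something greater, or someVal itself;
    // one comparison distinguishes the last two cases.
    if (it == someSet.end() || someSet.key_comp()(someVal, *it))
        it = someSet.insert(it, someVal);   // hinted insert
    return it;                              // points to the required value
}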

John

Reply John 10/24/2005 12:16:09 AM

John, thanks for pointing out the mistakes!
My statement was that equal_range should either be optimized or be well
documented, because it is easy for a user who doesn't know the library
implementation well to run into a performance problem. Concerning
multiset/multimap: it might be hard to imagine a use of equal_range that
is hard to replace with different processing, but again we come to the
same conclusion: if equal_range has to be replaced for performance, then
why do we need it at all?

Regards,
    Armen


Reply iso 10/24/2005 9:01:21 AM

In article <1129707345.606089.294840@o13g2000cwo.googlegroups.com>,
 "Nemra�" <armen_grigoryan@hotmail.com> wrote:

> Hello,
> 
> We are using standard containers in our code except task specific parts
> where we write our own specific containers that can handle huge data
> and/or optimized for specific algorithms. But there are places where
> some standard containers are used but implementation could be faster.
> Correct me if i'm wrong, but in both Dinkumware & STLPort
> implementations of rb_tree::equal_range are similar:
> 
> pair<iterator,iterator> equal_range(const key_type& x)
> {
> return pair<iterator, iterator>(lower_bound(x), upper_bound(x));
> }
> 
> which has double cost for set/map and is not optimal for
> multiset/multimap. Sure in case of set/map the usage of equal_range can
> be replaced by lower_bound, but having standard function with obvious
> flaw is not a good practice.
> 
> What do you think?

Here's a test case I'm running:

#include <iostream>
#include <vector>
#include <numeric>
#include <algorithm>
#include <time.h>

template <class F>
float
test(F f)
{
    std::vector<float> t;
    int i;
    for (i = 0; i < 10; ++i)
        t.push_back(f());
    double total_time = std::accumulate(t.begin(), t.end(), 0.0);
    while (i < 1000 && total_time < 1.0)
    {
        t.push_back(f());
        total_time += t.back();
        ++i;
    }
    std::sort(t.begin(), t.end());
    t.resize(t.size() * 9 / 10);
    return (float)(std::accumulate(t.begin(), t.end(), 0.0) / t.size());
}

const int M = 5;
const int N = 1000000;
const int F = N-1;

#include <set>

std::multiset<int> s;

void load(std::multiset<int>& s)
{
    for (int j = 0; j < N; ++j)
        for (int i = 0; i < M; ++i)
            s.insert(j);
}

float
time_multiset_equal_range()
{
    clock_t t = clock();
    {
    std::pair<std::multiset<int>::iterator,
              std::multiset<int>::iterator> p = s.equal_range(F);
    }
    t = clock() - t;
    return (float)(t/(double)CLOCKS_PER_SEC);
}

float
time_multiset_lower_upper()
{
    clock_t t = clock();
    {
    std::pair<std::multiset<int>::iterator,
              std::multiset<int>::iterator> p;
    p.first = s.lower_bound(F);
    p.second = s.upper_bound(F);
    }
    t = clock() - t;
    return (float)(t/(double)CLOCKS_PER_SEC);
}

int main()
{
    load(s);
    float t1 = test(time_multiset_equal_range);
    float t2 = test(time_multiset_lower_upper);
    std::cout << t2/t1 << '\n';
}

For me it prints out (approximately):

1.24239

YMMV.

The point of the test is to compare equal_range to calls to 
lower/upper_bound.  In the implementation I'm running, something has 
obviously been done to consistently increase the speed of equal_range 
over calls to lower/upper_bound by about 24%.  Maybe that is significant 
for you, maybe it isn't.  Obviously the number you get is going to vary 
with the tweaking parameters M, N and F.  But if you start consistently 
getting a number very close to 1 (and you can define "very close" 
yourself), perhaps you and your vendor should have a little talk.  If 
the number I've reported here doesn't amount to a hill of beans for you, 
then no action at all is warranted.

-Howard

Reply Howard 10/25/2005 12:45:34 PM

On 24 Oct 2005 05:01:21 -0400, "Nemra�" <armen_grigoryan@hotmail.com>
wrote:

> My statement was that either equal_range should be optimized or it have
> to be well documented, because it is easy for user who doesnt know the
> library implementation well to get a performance flaw. Concerning
> multi_set/multi_map: it might be hard to imagine where equal_range can
> be used in a way that it is hard to replace it with different
> processing, but again we come to the same conclusion: if equal_range
> should be replaced for performance then why we need it at all?

Well, we need it when it makes sense.  It might make sense for a
multi-set/map or other sorted container with duplicates.  Your real
question is about the associatives not using a more efficient algorithm.
My implementation has an efficient <algorithm> and your less than
efficient <tree>.  Since many libraries are derived from the original
HP/SGI library, yours may be the same.

Here is a little test program which compares the <algorithm> to the
<tree>.  The <algorithm> version is very inefficient, but is used to
show the number of compares which is the unit to count for search.

#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <iterator>
#include <set>
using namespace std;
template <class N>
struct LinearProgression {
    N val;
    N inc;
    LinearProgression (N const& val = 0, N const& inc = 1)
        : val(val - inc), inc(inc) {
        }
    N operator() () {
        return val += inc;
        }
    };
struct I {
    int i;
    I (int i = 0) : i(i) {
        }
    static int compares;
    };
int I::compares;
bool operator< (I lhs, I rhs) {
    ++ I::compares;
    return lhs.i < rhs.i;
    }
struct Test {
    multiset<I> const& s;
    int ca, cs;
    Test (multiset<I> const& s) : s(s), ca(0), cs(0) {
        }
    void operator() (I t) {
        I::compares = 0;
        equal_range(s.begin(), s.end(), t);
        ca += I::compares;
        I::compares = 0;
        s.equal_range(t);
        cs += I::compares;
        }
    ostream& print (ostream& os) const {
        return os << "algo " << ca << " member " << cs << endl;
        }
    };
int main (int, char** argv) {
    int max(atoi(argv[1]));
    max |= 1;
    multiset<I> data;
    for (int x(1); x != max; x += 2)
        for (int y = x; y & 1; y /= 2)
            data.insert(x);
    set<I> test;
    generate_n(inserter(test, test.end()), max / 2,
            LinearProgression<int>(1, 2));
    for_each(test.begin(), test.end(),
            Test(data)).print(cout << "Hit  ");
    test.clear();
    generate_n(inserter(test, test.end()), max / 2 + 1,
            LinearProgression<int>(0, 2));
    for_each(test.begin(), test.end(),
            Test(data)).print(cout << "Miss ");
    test.clear();
    generate_n(inserter(test, test.end()), max,
            LinearProgression<int>(0, 1));
    for_each(test.begin(), test.end(),
            Test(data)).print(cout << "All  ");
    }

My results for 1000 were:

Hit  algo 8183 member 10362
Miss algo 7526 member 10388
All  algo 15709 member 20750

The efficient algorithm is only 25% better than the inefficient member.
Is the difference worth changing the <tree> algorithm?  I wonder why the
two algorithms are not the same in a library.  History?

Maybe Code Warrior beats this with an efficient <tree>?

Maybe nobody cares?

John

Reply John 10/25/2005 1:21:58 PM

On 25 Oct 2005 08:45:34 -0400, Howard Hinnant <howard.hinnant@gmail.com>
wrote:

> Here's a test case I'm running:

[snip]

> const int M = 5;
> const int N = 1000000;
> const int F = N-1;

[snip]

> float
> time_multiset_equal_range()
> {
>     clock_t t = clock();
>     {
>     std::pair<std::set<int>::iterator, std::set<int>::iterator> p =
> s.equal_range(F);

Is your implementation that slow or your clock that fast?

>     }
>     t = clock() - t;
>     return (float)(t/(double)CLOCKS_PER_SEC);
> }

[snip]

> int main()
> {
>     load(s);
>     float t1 = test(time_multiset_equal_range);
>     float t2 = test(time_multiset_lower_upper);
>     std::cout << t2/t1 << '\n';
> }

> For me it prints out (approximately):

> 1.24239

Knowing that it is your library, I will assume that you remember or
checked that equal_range in the tree really does do something other than
call lower and upper like most libraries.  However, I wonder what you
really measured with this test.

For me it prints NaNq because nothing is ever measured.  Adding a for
loop between the two calls of clock and executing 100 times gives
results.  I kicked it up to 100000 to get meaningful results.
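
Something along these lines (modifying your function; the repetition
count here is arbitrary):

float
time_multiset_equal_range()
{
    const int reps = 100000;    // repeat so clock() can resolve the time
    clock_t t = clock();
    for (int i = 0; i < reps; ++i)
    {
        std::pair<std::multiset<int>::iterator,
                  std::multiset<int>::iterator> p = s.equal_range(F);
        (void)p;                // silence the unused-variable warning
    }
    t = clock() - t;
    return (float)(t / (double)CLOCKS_PER_SEC);
}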

Now I get 0.99 to 1.01 as expected for the gcc implementation.

> The point of the test is to compare equal_range to calls to
> lower/upper_bound.  In the implementation I'm running, something has
> obviously been done to consistently increase the speed of equal_range
> over calls to lower/upper_bound by about 24%.  Maybe that is significant
> for you, maybe it isn't.  Obviously the number you get is going to vary
> with the tweaking parameters M, N and F.  But if you start consistently
> getting a number very close to 1 (and you can define "very close"
> yourself), perhaps you and your vendor should have a little talk.  If
> the number I've reported here doesn't amount to a hill of beans for you,
> then no action at all is warranted.

Replacing the multiset with vector and using the general algorithms
where I know that equal_range is done well gives nice results.  Kicked
the loop up to 1000000 to get meaningful results.  Considering that you
used a search for N-1 which should go to a leaf before a match, I would
expect equal_range to be about twice as fast as the two calls.

I get 1.62 to 1.68.

If that 1.24 is really measuring something other than random clock ticks,
maybe you should have a talk with your vendor.  :)

John

Reply John 10/26/2005 1:07:27 PM

On 25 Oct 2005 08:45:34 -0400, Howard Hinnant <howard.hinnant@gmail.com>
wrote:

> For me it prints out (approximately):

> 1.24239

I modified the gcc library to do a "smart" equal_range.  Using your
tests with the loops to get meaningful results, I got about 2.  This is
a best case.  Changing F to 0 which searches down the other side making
two compares at each step, I got about 1.3.  This is a worst case.  On
average, we should expect about 1.5.

With the built-in < for int, turning on optimization gives more
improvement because the two compares are reduced to one.  The worst
case changed to about 1.7.  The unoptimized results above are realistic
for a complex compare.
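
The idea, in outline: descend the tree once, and when a node equivalent
to the key is reached, finish the lower_bound search in its left subtree
and the upper_bound search in its right subtree rather than starting
both from the root.  A rough sketch over a bare node structure
(illustrative only, not the actual patch):

#include <utility>

// Node and the helpers below are illustrative only.
struct Node { int value; Node* left; Node* right; };

// Leftmost node in subtree n whose value is not less than x; fallback if none.
Node* lower_from(Node* n, int x, Node* fallback)
{
    while (n)
    {
        if (n->value < x)
            n = n->right;
        else
        {
            fallback = n;
            n = n->left;
        }
    }
    return fallback;
}

// Leftmost node in subtree n whose value is greater than x; fallback if none.
Node* upper_from(Node* n, int x, Node* fallback)
{
    while (n)
    {
        if (x < n->value)
        {
            fallback = n;
            n = n->left;
        }
        else
            n = n->right;
    }
    return fallback;
}

// Single-descent equal_range: walk down once; on finding a node equivalent
// to x, the lower bound can only be in its left subtree and the upper bound
// in its right subtree, so the two partial searches start from there.
std::pair<Node*, Node*> equal_range(Node* root, int x)
{
    Node* n = root;
    Node* bound = 0;    // smallest node greater than x seen so far (0 == end)
    while (n)
    {
        if (n->value < x)
            n = n->right;
        else if (x < n->value)
        {
            bound = n;
            n = n->left;
        }
        else    // equivalent key found: split the remaining work
            return std::make_pair(lower_from(n->left, x, n),
                                  upper_from(n->right, x, bound));
    }
    return std::make_pair(bound, bound);    // no match: empty range
}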

John

Reply John 10/26/2005 1:16:57 PM

In article <sJx7f.113$2y.40@newsread2.news.atl.earthlink.net>,
 John Potter <jpotter@lhup.edu> wrote:

> Is your implementation that slow or your clock that fast?

The clock is that fast.

> > For me it prints out (approximately):
> 
> > 1.24239
> 
> Knowing that it is your library, I will assume that you remember or
> checked that equal_range in the tree really does do something other than
> call lower and upper like most libraries.

<nod>

> However, I wonder what you
> really measured with this test.
> 
> For me it prints NaNq because nothing is ever measured.  Adding a for
> loop between the two calls of clock and executing 100 times gives
> results.  I kicked it up to 100000 to get meaningful results.

I should have mentioned the fast clock.

> Replacing the multiset with vector and using the general algorithms
> where I know that equal_range is done well gives nice results.  Kicked
> the loop up to 1000000 to get meaningful results.  Considering that you
> used a search for N-1 which should go to a leaf before a match, I would
> expect equal_range to be about twice as fast as the two calls.
> 
> I get 1.62 to 1.68.
> 
> If that 1.24 is really measuring something other than random clock tics,
> maybe you should have a talk with your vendor.  :)

To tell you the truth I was expecting something closer to what you got 
too.  But ...

In article <N4B7f.224$2y.141@newsread2.news.atl.earthlink.net>,
 John Potter <jpotter@lhup.edu> wrote:

> I modified the gcc library to do a "smart" equal_range.  Using your
> tests with the loops to get meaningful results, I got about 2.  This is
> a best case.  Changing F to 0 which searches down the other side making
> two compares at each step, I got about 1.3.  This is a worst case.  On
> average, we should expect about 1.5.
> 
> With the built-in < for int, turning on optimization gives more
> improvement because the two compares are reduced to one.  The worst
> case changed to about 1.7.  The unoptimized results above are realistic
> for a complex compare.

I'll just trust you to check that in, and thanks. :-)

-Howard

Reply Howard 10/27/2005 9:29:25 AM

Hello,

Thanks for benchmarking! As you can see, the difference can be big; even
20% is a big number when working with a long-running algorithm.
Anyway, it's up to the library implementors to weigh an efficient
solution against an easy-to-maintain one, so let's hope they will do us
a favour...

Regards,
     Armen


Reply iso 10/27/2005 10:36:30 AM

Nemra� wrote:
> Hello,
> 
> Thanks for benchmarking! As you can see, the difference can be big; even
> 20% is a big number when working with a long-running algorithm.
> Anyway, it's up to the library implementors to weigh an efficient
> solution against an easy-to-maintain one, so let's hope they will do us
> a favour...
> 

Both STLport and libstdc++-v3 (the implementation of the C++ standard
library in g++) are open source. I know libstdc++-v3 is always happy to
consider patches which improve the library, and I'm sure STLport is
similar. Remember, a reasonable number of the "library implementors"
are just C++ users who decided to start hacking at whatever
implementation of the standard library they happened to have lying around :)

Chris

Reply chris 10/27/2005 4:40:43 PM