Benchmark C vs C++

  • Follow


Consider this benchmark

--------------------------------------------------------C++
#include <iostream>
#include <list>
using namespace std;

// Simple example uses type int
#define MAXIT 50000000
int main(void)
{
    list<int> L;
    int i;

    for (i=0; i<MAXIT;i++)
        L.push_back(i);              // Insert a new element at the end
    list<int>::iterator it;
    int sum=0;
    for (it=L.begin(); it != L.end(); ++it)
         sum += *it;
    printf("%d\n",sum);
    return 0;
}

--------------------------------------------------------C
/* compile with list.c and containererror.c */
#include "containers.h"
#include <stdio.h>
int main(void)
{
#define MAX_IT 50000000

	List *l = newList(sizeof(int));
	list_element *le;
	int i,sum=0;
	for (i=0; i<MAX_IT; i++) {
		l->lpVtbl->Add(l,&i); // Insert at the end
	}
	for (le = GetFirst(l); le != NULL; le = GetNext(le)) {
		sum += *(int *)le->Data;
	}
	printf("%d\n",sum);
	return 0;
}

---------------------------------------------------------

I think it would be interesting to see the results in as much
machines as possible.

Thanks
0
Reply jacob24 (973) 10/4/2009 9:59:19 PM

In article <hab5re$16u$1@aioe.org>, jacob navia  <jn@nospam.org> wrote:
>Consider this benchmark
>
>--------------------------------------------------------C++
>#include <iostream>
>#include <list>
>using namespace std;
>
>// Simple example uses type int
>#define MAXIT 50000000
>int main(void)
>{
>    list<int> L;
>    int i;
>
>    for (i=0; i<MAXIT;i++)
>        L.push_back(i);              // Insert a new element at the end
>    list<int>::iterator it;
>    int sum=0;
>    for (it=L.begin(); it != L.end(); ++it)
>         sum += *it;
>    printf("%d\n",sum);
>    return 0;
>}
>
>--------------------------------------------------------C
>/* compile with list.c and containererror.c */
>#include "containers.h"
>#include <stdio.h>
>int main(void)
>{
>#define MAX_IT 50000000
>
>	List *l = newList(sizeof(int));
>	list_element *le;
>	int i,sum=0;
>	for (i=0; i<MAX_IT; i++) {
>		l->lpVtbl->Add(l,&i); // Insert at the end
>	}
>	for (le = GetFirst(l); le != NULL; le = GetNext(le)) {
>		sum += *(int *)le->Data;
>	}
>	printf("%d\n",sum);
>	return 0;
>}
>
>---------------------------------------------------------
>
>I think it would be interesting to see the results in as much
>machines as possible.

On a SUNW,UltraSPARC-IIIi, using the Sun C++ 5.9 and Sun C 5.9 compilers,
the C++ version runs in 8.1 seconds CPU and needs 622 MB RAM,
the C version (with list.c and containererror.c) runs in 16.4 seconds CPU
and needs 1065 MB RAM.

0
Reply ike5 (222) 10/4/2009 10:33:00 PM


For a PC with intel 8 core CPU+12GB RAM
I see:

cl -Ox -EHsc l1.cpp
timethis l1.cpp
TimeThis :  Command Line :  l1
TimeThis :    Start Time :  Mon Oct 05 00:38:01 2009

1283106752

TimeThis :  Command Line :  l1
TimeThis :    Start Time :  Mon Oct 05 00:38:01 2009
TimeThis :      End Time :  Mon Oct 05 00:38:05 2009
TimeThis :  Elapsed Time :  00:00:04.167

For the C version I see:
timethis listtest

TimeThis :  Command Line :  listtest
TimeThis :    Start Time :  Mon Oct 05 00:36:46 2009

1283106752

TimeThis :  Command Line :  listtest
TimeThis :    Start Time :  Mon Oct 05 00:36:46 2009
TimeThis :      End Time :  Mon Oct 05 00:36:50 2009
TimeThis :  Elapsed Time :  00:00:04.049

That version is compiled with lcc-win (lcc -O)
0
Reply jacob24 (973) 10/4/2009 10:41:40 PM

On Sun, 04 Oct 2009 23:59:19 +0200, jacob navia <jacob@nospam.org> wrote:
> Consider this benchmark
>
> --------------------------------------------------------C++
> #include <iostream>
> #include <list>
> using namespace std;
>
> // Simple example uses type int
> #define MAXIT 50000000
> int main(void)
> {
>    list<int> L;
>    int i;
>
>    for (i=0; i<MAXIT;i++)
>        L.push_back(i);              // Insert a new element at the end
>    list<int>::iterator it;
>    int sum=0;
>    for (it=L.begin(); it != L.end(); ++it)
>         sum += *it;
>    printf("%d\n",sum);
>    return 0;
> }
>
> --------------------------------------------------------C
> /* compile with list.c and containererror.c */
> #include "containers.h"
> #include <stdio.h>
> int main(void)
> {
> #define MAX_IT 50000000
>
> 	List *l = newList(sizeof(int));
> 	list_element *le;
> 	int i,sum=0;
> 	for (i=0; i<MAX_IT; i++) {
> 		l->lpVtbl->Add(l,&i); // Insert at the end
> 	}
> 	for (le = GetFirst(l); le != NULL; le = GetNext(le)) {
> 		sum += *(int *)le->Data;
> 	}
> 	printf("%d\n",sum);
> 	return 0;
> }
>
> ---------------------------------------------------------
>
> I think it would be interesting to see the results in as much
> machines as possible.

Hi Jacob,

Can you please post a link where we can fetch the container files too?
I've seen the related thread in c.l.c a few days ago, but the posts have
expired from my newsreader cache and are no longer easy to find.

A download link with a ZIP or plain tar(1) archive would be very nice.

0
Reply keramida (459) 10/4/2009 11:14:16 PM

> Hi Jacob,
> 
> Can you please post a link where we can fetch the container files too?
> I've seen the related thread in c.l.c a few days ago, but the posts have
> expired from my newsreader cache and are no longer easy to find.
> 
> A download link with a ZIP or plain tar(1) archive would be very nice.
> 

I posted the files again 4-5 hours ago.

They should still be  there.

jacob
0
Reply jacob24 (973) 10/4/2009 11:43:02 PM

Ike Naar a �crit :
> In article <hab5re$16u$1@aioe.org>, jacob navia  <jn@nospam.org> wrote:
>> Consider this benchmark
>>
>> --------------------------------------------------------C++
>> #include <iostream>
>> #include <list>
>> using namespace std;
>>
>> // Simple example uses type int
>> #define MAXIT 50000000
>> int main(void)
>> {
>>    list<int> L;
>>    int i;
>>
>>    for (i=0; i<MAXIT;i++)
>>        L.push_back(i);              // Insert a new element at the end
>>    list<int>::iterator it;
>>    int sum=0;
>>    for (it=L.begin(); it != L.end(); ++it)
>>         sum += *it;
>>    printf("%d\n",sum);
>>    return 0;
>> }
>>
>> --------------------------------------------------------C
>> /* compile with list.c and containererror.c */
>> #include "containers.h"
>> #include <stdio.h>
>> int main(void)
>> {
>> #define MAX_IT 50000000
>>
>> 	List *l = newList(sizeof(int));
>> 	list_element *le;
>> 	int i,sum=0;
>> 	for (i=0; i<MAX_IT; i++) {
>> 		l->lpVtbl->Add(l,&i); // Insert at the end
>> 	}
>> 	for (le = GetFirst(l); le != NULL; le = GetNext(le)) {
>> 		sum += *(int *)le->Data;
>> 	}
>> 	printf("%d\n",sum);
>> 	return 0;
>> }
>>
>> ---------------------------------------------------------
>>
>> I think it would be interesting to see the results in as much
>> machines as possible.
> 
> On a SUNW,UltraSPARC-IIIi, using the Sun C++ 5.9 and Sun C 5.9 compilers,
> the C++ version runs in 8.1 seconds CPU and needs 622 MB RAM,
> the C version (with list.c and containererror.c) runs in 16.4 seconds CPU
> and needs 1065 MB RAM.
> 

The benchmark makes 50 million calls to malloc.

If malloc is well done, this is the same as C++. If not, it will be quite
bad performance.

The performance of the C version hasn't been optimized at all.
0
Reply jacob24 (973) 10/4/2009 11:44:23 PM

jacob navia <jacob@nospam.org> writes:

> Consider this benchmark

C++:
> #include <list>

C:
> #include "containers.h"

Your list is singly-linked is it not?  The STL's list is a
doubly-linked list and is therefore more flexible.  Comparing them for
speed alone is maybe a little biased.

-- 
Ben.
0
Reply ben.usenet (6515) 10/5/2009 12:59:29 AM

On Mon, 05 Oct 2009 01:43:02 +0200, jacob navia <jacob@nospam.org> wrote:
>> Hi Jacob,
>> Can you please post a link where we can fetch the container files
>> too?  I've seen the related thread in c.l.c a few days ago, but the
>> posts have expired from my newsreader cache and are no longer easy to
>> find.
>>
>> A download link with a ZIP or plain tar(1) archive would be very nice.
>
> I posted the files again 4-5 hours ago.
> They should still be  there.

Ok, I tracked these down and I just finished running 30 iterations of
each test program on my laptop:

: Machine: Thinkpad X61s
: OS:      FreeBSD 9.0-CURRENT [svn rev /head@197735]
: Arch:    i386
: -- dmesg output --
: CPU: Intel(R) Core(TM)2 Duo CPU     L7700  @ 1.80GHz (1795.51-MHz
:     686-class CPU)
:   Origin = "GenuineIntel"  Id = 0x6fb  Stepping = 11
:   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
:   Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
:   AMD Features=0x20100000<NX,LM>
:   AMD Features2=0x1<LAHF>
:   TSC: P-state invariant
: real memory  = 3221225472 (3072 MB)
: avail memory = 3119816704 (2975 MB)

Comparing the real/user/sys times of each program with the `ministat'
utility shows that there is a marked difference in wall clock times and
userland/user-level times:

:                                Real Times
:                                ==========
:
: x real-cpp
: + real-stdc
: +------------------------------------------------------------------------------+
: |  ++                                                   x                      |
: |  ++                                                  xx                      |
: |  ++                                                  xx                      |
: |  ++                                                  xx                      |
: |  ++                                                  xx                      |
: |  ++                                                  xx                      |
: |  ++                                                  xx                      |
: |  ++                                                  xx                      |
: | +++                                                  xx                      |
: | +++                                                  xx                      |
: | +++                                                  xxx                     |
: | ++++            +                                    xxx            x       x|
: ||_MA_|                                             |___MA____|                |
: +------------------------------------------------------------------------------+
:     N           Min           Max        Median           Avg        Stddev
: x  30        12.116        13.127         12.17       12.2203    0.20618943
: +  30         9.833        10.526         9.907     9.9256667    0.11781205
: Difference at 95.0% confidence
:         -2.29463 +/- 0.0867998
:         -18.7772% +/- 0.710292%
:         (Student's t, pooled s = 0.167919)


:                                User Times
:                                ==========
:
: x user-cpp
: + user-stdc
: +------------------------------------------------------------------------------+
: |       +                                                                      |
: |       +                                                                      |
: |      ++                                                                      |
: |      ++                                                   x                  |
: |      +++                                                  xx                 |
: |      +++                                                 xxx                 |
: |      +++++                                            x  xxxx                |
: |      +++++                                        x   x xxxxxx               |
: |+    ++++++           +                            x  xx xxxxxxx             x|
: |    |__MA__|                                          |____A___|              |
: +------------------------------------------------------------------------------+
:     N           Min           Max        Median           Avg        Stddev
: x  30        10.241        11.347        10.579     10.569833     0.1915676
: +  30         8.044          8.98          8.35     8.3697333    0.14194388
: Difference at 95.0% confidence
:         -2.2001 +/- 0.0871474
:         -20.8149% +/- 0.824491%
:         (Student's t, pooled s = 0.168592)

But a much smaller difference in system/kernel times:

:                               System Times
:                               ============
:
: x sys-cpp
: + sys-stdc
: +------------------------------------------------------------------------------+
: |     +                 ++     +                                               |
: |     +           x   + ++ x   ++ + xx                x                        |
: |x    +* x +   +* x xx**+* **++** + xx*  x  +x       xxx   x          +      xx|
: |            |_|_________MA_____M_A____|_____________|                         |
: +------------------------------------------------------------------------------+
:     N           Min           Max        Median           Avg        Stddev
: x  30         1.279         1.786         1.484     1.4974333    0.12388724
: +  30         1.309         1.735         1.438        1.4421   0.084948808
: Difference at 95.0% confidence
:         -0.0553333 +/- 0.0549054
:         -3.69521% +/- 3.66663%
:         (Student's t, pooled s = 0.106218)

0
Reply keramida (459) 10/5/2009 1:39:26 AM

jacob navia wrote:
(code elided)

> I think it would be interesting to see the results in as much
> machines as possible.

It might be interesting, but I don't think its really particularly 
useful at all to do these sort of comparisons (except as some sort of 
exercise in zealotry or to start flame-wars).

The performance depends too much on the compiler, optimiser, compiler 
settings chosen, what else is running on the box, hardware spec etc.


-- 
Mark McIntyre

CLC FAQ <http://c-faq.com/>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>
0
Reply markmcintyre2 (407) 10/5/2009 6:54:18 PM

On Oct 5, 12:44=A0am, jacob navia <ja...@nospam.org> wrote:
>
> The benchmark makes 50 million calls to malloc.
>
> If malloc is well done, this is the same as C++. If not, it will be quite
> bad performance.

What happens in the case of memory exhaustion?
In the C++ case, push_back() throws std::bad_alloc.
For similar semantics, do you need to check a return value from
l->lpVtbl->Add() ?
0
Reply gwowen (520) 10/6/2009 11:23:15 AM

gwowen a �crit :
> On Oct 5, 12:44 am, jacob navia <ja...@nospam.org> wrote:
>> The benchmark makes 50 million calls to malloc.
>>
>> If malloc is well done, this is the same as C++. If not, it will be quite
>> bad performance.
> 
> What happens in the case of memory exhaustion?
> In the C++ case, push_back() throws std::bad_alloc.
> For similar semantics, do you need to check a return value from
> l->lpVtbl->Add() ?

There are two mechanisms.
First, a callback will be called with the name of the function that failed, and
a numeric code indicating which error happened.

Second, (if the callback returns) a NULL value will be returned.

0
Reply jacob24 (973) 10/6/2009 11:31:27 AM

In article <habc0c$790$2@aioe.org>, jacob navia  <jn@nospam.org> wrote:
>Ike Naar a �crit :
>> On a SUNW,UltraSPARC-IIIi, using the Sun C++ 5.9 and Sun C 5.9 compilers,
>> the C++ version runs in 8.1 seconds CPU and needs 622 MB RAM,
>> the C version (with list.c and containererror.c) runs in 16.4 seconds CPU
>> and needs 1065 MB RAM.
>
>The benchmark makes 50 million calls to malloc.
>If malloc is well done, this is the same as C++. If not, it will be quite
>bad performance.

In the above implementation, the C++ Standard Library uses malloc
under the hood.
0
Reply ike5 (222) 10/6/2009 9:08:19 PM

Ike Naar a �crit :
> In article <habc0c$790$2@aioe.org>, jacob navia  <jn@nospam.org> wrote:
>> Ike Naar a �crit :
>>> On a SUNW,UltraSPARC-IIIi, using the Sun C++ 5.9 and Sun C 5.9 compilers,
>>> the C++ version runs in 8.1 seconds CPU and needs 622 MB RAM,
>>> the C version (with list.c and containererror.c) runs in 16.4 seconds CPU
>>> and needs 1065 MB RAM.
>> The benchmark makes 50 million calls to malloc.
>> If malloc is well done, this is the same as C++. If not, it will be quite
>> bad performance.
> 
> In the above implementation, the C++ Standard Library uses malloc
> under the hood.

Sure, but not 50 million times. I am implementing a simple heap manager
(that will be used by all  containers) to avoid calling malloc 50
million times.

It was very instructive to see where the weak points are. For instance
I tried to use a little memory as possible, but since I am allocating
50 million small blocks of 8 bytes, overhead for malloc goes up to 100%!

It is better to allocate bigger blocks and not try to optimize memory as
this version does.

Thanks for your interest

jacob
0
Reply jacob24 (973) 10/6/2009 9:19:41 PM

12 Replies
33 Views

(page loaded in 0.334 seconds)

Similiar Articles:


















7/2/2012 5:37:36 PM


Reply: