Consider this benchmark
--------------------------------------------------------C++
#include <cstdio>   // printf
#include <list>
using namespace std;

// Simple example uses type int
#define MAXIT 50000000
int main(void)
{
    list<int> L;
    int i;

    for (i=0; i<MAXIT;i++)
        L.push_back(i); // Insert a new element at the end
    list<int>::iterator it;
    // The true total (1249999975000000) exceeds a 32-bit int, so sum overflows;
    // the runs reported in this thread print the wrapped value 1283106752.
    int sum=0;
    for (it=L.begin(); it != L.end(); ++it)
        sum += *it;
    printf("%d\n",sum);
    return 0;
}
--------------------------------------------------------C
/* compile with list.c and containererror.c */
#include "containers.h"
#include <stdio.h>
int main(void)
{
#define MAX_IT 50000000
    List *l = newList(sizeof(int));
    list_element *le;
    int i,sum=0;
    for (i=0; i<MAX_IT; i++) {
        l->lpVtbl->Add(l,&i); // Insert at the end
    }
    for (le = GetFirst(l); le != NULL; le = GetNext(le)) {
        sum += *(int *)le->Data;
    }
    printf("%d\n",sum);
    return 0;
}
---------------------------------------------------------
I think it would be interesting to see the results on as many
machines as possible.
Thanks
jacob24 (973)
10/4/2009 9:59:19 PM

In article <hab5re$16u$1@aioe.org>, jacob navia <jn@nospam.org> wrote:
>Consider this benchmark
>(code elided)
>I think it would be interesting to see the results in as much
>machines as possible.
On a SUNW,UltraSPARC-IIIi, using the Sun C++ 5.9 and Sun C 5.9 compilers,
the C++ version runs in 8.1 seconds CPU and needs 622 MB RAM,
the C version (with list.c and containererror.c) runs in 16.4 seconds CPU
and needs 1065 MB RAM.
ike5 (222)
10/4/2009 10:33:00 PM

On a PC with an 8-core Intel CPU and 12 GB of RAM,
I see:
cl -Ox -EHsc l1.cpp
timethis l1.cpp
TimeThis : Command Line : l1
TimeThis : Start Time : Mon Oct 05 00:38:01 2009
1283106752
TimeThis : Command Line : l1
TimeThis : Start Time : Mon Oct 05 00:38:01 2009
TimeThis : End Time : Mon Oct 05 00:38:05 2009
TimeThis : Elapsed Time : 00:00:04.167
For the C version I see:
timethis listtest
TimeThis : Command Line : listtest
TimeThis : Start Time : Mon Oct 05 00:36:46 2009
1283106752
TimeThis : Command Line : listtest
TimeThis : Start Time : Mon Oct 05 00:36:46 2009
TimeThis : End Time : Mon Oct 05 00:36:50 2009
TimeThis : Elapsed Time : 00:00:04.049
That version is compiled with lcc-win (lcc -O)
jacob24 (973)
10/4/2009 10:41:40 PM

On Sun, 04 Oct 2009 23:59:19 +0200, jacob navia <jacob@nospam.org> wrote:
> Consider this benchmark
> (code elided)
> I think it would be interesting to see the results in as much
> machines as possible.
Hi Jacob,
Can you please post a link where we can fetch the container files too?
I've seen the related thread in c.l.c a few days ago, but the posts have
expired from my newsreader cache and are no longer easy to find.
A download link with a ZIP or plain tar(1) archive would be very nice.
keramida (459)
10/4/2009 11:14:16 PM

> Hi Jacob,
>
> Can you please post a link where we can fetch the container files too?
> I've seen the related thread in c.l.c a few days ago, but the posts have
> expired from my newsreader cache and are no longer easy to find.
>
> A download link with a ZIP or plain tar(1) archive would be very nice.
>
I posted the files again 4-5 hours ago.
They should still be there.
jacob
jacob24 (973)
10/4/2009 11:43:02 PM

Ike Naar wrote:
> In article <hab5re$16u$1@aioe.org>, jacob navia <jn@nospam.org> wrote:
>> Consider this benchmark
>> (code elided)
>> I think it would be interesting to see the results in as much
>> machines as possible.
>
> On a SUNW,UltraSPARC-IIIi, using the Sun C++ 5.9 and Sun C 5.9 compilers,
> the C++ version runs in 8.1 seconds CPU and needs 622 MB RAM,
> the C version (with list.c and containererror.c) runs in 16.4 seconds CPU
> and needs 1065 MB RAM.
>
The benchmark makes 50 million calls to malloc.
If malloc is well implemented, performance should be the same as C++. If not,
performance will be quite bad.
The C version hasn't been optimized at all.
jacob24 (973)
10/4/2009 11:44:23 PM

jacob navia <jacob@nospam.org> writes:
> Consider this benchmark
C++:
> #include <list>
C:
> #include "containers.h"
Your list is singly-linked is it not? The STL's list is a
doubly-linked list and is therefore more flexible. Comparing them for
speed alone is maybe a little biased.
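To make the flexibility point concrete, here is a minimal sketch in C. The node layouts and function names are hypothetical (they are not taken from containers.h or from any STL implementation), and whether the C list really is singly-linked is only Ben's guess. It shows that a doubly-linked node can be unlinked in constant time given only the node, while a singly-linked node needs a walk from the head to find its predecessor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layouts, for illustration only. */
typedef struct snode { struct snode *next; int value; } snode;
typedef struct dnode { struct dnode *prev, *next; int value; } dnode;

/* Unlink n from a doubly-linked list in O(1), given only the node.
   Assumes n is currently linked; *head is updated if n was first. */
void dnode_unlink(dnode **head, dnode *n)
{
    assert(head != NULL && n != NULL);
    if (n->prev != NULL)
        n->prev->next = n->next;
    else
        *head = n->next;
    if (n->next != NULL)
        n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

/* Singly-linked equivalent: walks from the head to find the link that
   points at n, so it is O(n). Returns 1 if n was found and unlinked. */
int snode_unlink(snode **head, snode *n)
{
    assert(head != NULL && n != NULL);
    for (snode **link = head; *link != NULL; link = &(*link)->next) {
        if (*link == n) {
            *link = n->next;
            n->next = NULL;
            return 1;
        }
    }
    return 0;
}
```

The extra pointer also makes each doubly-linked node larger, which matters when 50 million nodes are allocated.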
--
Ben.
ben.usenet (6515)
10/5/2009 12:59:29 AM

On Mon, 05 Oct 2009 01:43:02 +0200, jacob navia <jacob@nospam.org> wrote:
>> Hi Jacob,
>> Can you please post a link where we can fetch the container files
>> too? I've seen the related thread in c.l.c a few days ago, but the
>> posts have expired from my newsreader cache and are no longer easy to
>> find.
>>
>> A download link with a ZIP or plain tar(1) archive would be very nice.
>
> I posted the files again 4-5 hours ago.
> They should still be there.
Ok, I tracked these down and I just finished running 30 iterations of
each test program on my laptop:
: Machine: Thinkpad X61s
: OS: FreeBSD 9.0-CURRENT [svn rev /head@197735]
: Arch: i386
: -- dmesg output --
: CPU: Intel(R) Core(TM)2 Duo CPU L7700 @ 1.80GHz (1795.51-MHz
: 686-class CPU)
: Origin = "GenuineIntel" Id = 0x6fb Stepping = 11
: Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
: Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
: AMD Features=0x20100000<NX,LM>
: AMD Features2=0x1<LAHF>
: TSC: P-state invariant
: real memory = 3221225472 (3072 MB)
: avail memory = 3119816704 (2975 MB)
Comparing the real/user/sys times of each program with the `ministat'
utility shows a marked difference in wall-clock and user-level times,
with the C version about 19% faster in real time and about 21% faster
in user time:
: Real Times
: ==========
:
: x real-cpp
: + real-stdc
: [ministat ASCII histogram omitted: the real-stdc (+) samples cluster
:  entirely to the left of the real-cpp (x) samples, with no overlap]
:     N           Min           Max        Median           Avg        Stddev
: x  30        12.116        13.127         12.17       12.2203    0.20618943
: +  30         9.833        10.526         9.907     9.9256667    0.11781205
: Difference at 95.0% confidence
:         -2.29463 +/- 0.0867998
:         -18.7772% +/- 0.710292%
:         (Student's t, pooled s = 0.167919)
: User Times
: ==========
:
: x user-cpp
: + user-stdc
: [ministat ASCII histogram omitted: the user-stdc (+) samples cluster
:  entirely to the left of the user-cpp (x) samples, with no overlap]
:     N           Min           Max        Median           Avg        Stddev
: x  30        10.241        11.347        10.579     10.569833     0.1915676
: +  30         8.044          8.98          8.35     8.3697333    0.14194388
: Difference at 95.0% confidence
:         -2.2001 +/- 0.0871474
:         -20.8149% +/- 0.824491%
:         (Student's t, pooled s = 0.168592)
But a much smaller difference in system/kernel times:
: System Times
: ============
:
: x sys-cpp
: + sys-stdc
: [ministat ASCII histogram omitted: the sys-cpp (x) and sys-stdc (+)
:  samples overlap across most of the range]
:     N           Min           Max        Median           Avg        Stddev
: x  30         1.279         1.786         1.484     1.4974333    0.12388724
: +  30         1.309         1.735         1.438        1.4421   0.084948808
: Difference at 95.0% confidence
:         -0.0553333 +/- 0.0549054
:         -3.69521% +/- 3.66663%
:         (Student's t, pooled s = 0.106218)
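For readers who want to check the summary figures, the following C sketch recomputes two of them from the N, Avg and Stddev columns. It uses the textbook pooled-variance formula and a plain percentage difference. The function names are made up for this example, and this is not ministat's own code.

```c
#include <assert.h>

/* Pooled variance of two samples with sizes n1, n2 and
   standard deviations s1, s2 (textbook formula). */
double pooled_variance(int n1, double s1, int n2, double s2)
{
    assert(n1 > 1 && n2 > 1);
    return ((n1 - 1) * s1 * s1 + (n2 - 1) * s2 * s2) / (n1 + n2 - 2);
}

/* Difference of two means, as a percentage of the baseline mean. */
double percent_difference(double baseline_mean, double other_mean)
{
    assert(baseline_mean != 0.0);
    return 100.0 * (other_mean - baseline_mean) / baseline_mean;
}
```

For the real times, `pooled_variance(30, 0.20618943, 30, 0.11781205)` gives about 0.028197, whose square root is about 0.16792, in line with the reported pooled s of 0.167919. `percent_difference(12.2203, 9.9256667)` gives about -18.777, matching the reported -18.7772%.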
keramida (459)
10/5/2009 1:39:26 AM

jacob navia wrote:
(code elided)
> I think it would be interesting to see the results in as much
> machines as possible.
It might be interesting, but I don't think it's particularly
useful at all to do this sort of comparison (except as some sort of
exercise in zealotry or to start flame wars).
The performance depends too much on the compiler, optimiser, compiler
settings chosen, what else is running on the box, hardware spec etc.
--
Mark McIntyre
CLC FAQ <http://c-faq.com/>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>
markmcintyre2 (407)
10/5/2009 6:54:18 PM

On Oct 5, 12:44 am, jacob navia <ja...@nospam.org> wrote:
>
> The benchmark makes 50 million calls to malloc.
>
> If malloc is well done, this is the same as C++. If not, it will be quite
> bad performance.
What happens in the case of memory exhaustion?
In the C++ case, push_back() throws std::bad_alloc.
For similar semantics, do you need to check a return value from
l->lpVtbl->Add() ?
gwowen (520)
10/6/2009 11:23:15 AM

gwowen wrote:
> On Oct 5, 12:44 am, jacob navia <ja...@nospam.org> wrote:
>> The benchmark makes 50 million calls to malloc.
>>
>> If malloc is well done, this is the same as C++. If not, it will be quite
>> bad performance.
>
> What happens in the case of memory exhaustion?
> In the C++ case, push_back() throws std::bad_alloc.
> For similar semantics, do you need to check a return value from
> l->lpVtbl->Add() ?
There are two mechanisms.
First, a callback is called with the name of the function that failed and
a numeric code indicating which error happened.
Second, if the callback returns, a NULL value is returned to the caller.
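A rough sketch of that two-step scheme in C follows. Every name here (`set_error_handler`, `add_element`, `ERR_NOMEMORY`, the demo helpers) is invented for illustration and is not the containers.h API. The allocator is passed in as a parameter only so that a failure can be simulated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical API: illustrative names, not those of containers.h. */
typedef void (*error_fn)(const char *function_name, int error_code);

enum { ERR_NOMEMORY = 1 };          /* made-up numeric error code */

static error_fn current_handler;    /* NULL means no callback installed */

void set_error_handler(error_fn fn) { current_handler = fn; }

/* Copy `size` bytes of `data` into a block obtained from `alloc`.
   On failure: first invoke the callback, then, if it returns, return NULL. */
void *add_element(const void *data, size_t size, void *(*alloc)(size_t))
{
    assert(data != NULL && alloc != NULL);
    void *block = alloc(size);
    if (block == NULL) {
        if (current_handler != NULL)
            current_handler("add_element", ERR_NOMEMORY);
        return NULL;
    }
    memcpy(block, data, size);
    return block;
}

/* Demo helpers: an allocator that always fails, and a handler that
   records what it was told. */
void *failing_alloc(size_t size) { (void)size; return NULL; }

const char *last_failed_function;
int last_error_code;

void record_error(const char *function_name, int error_code)
{
    last_failed_function = function_name;
    last_error_code = error_code;
}
```

A caller that wants exception-like behaviour could install a handler that never returns (for example one that calls `exit`). Otherwise it has to test the return value for NULL after every insertion.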
jacob24 (973)
10/6/2009 11:31:27 AM

In article <habc0c$790$2@aioe.org>, jacob navia <jn@nospam.org> wrote:
>Ike Naar wrote:
>> On a SUNW,UltraSPARC-IIIi, using the Sun C++ 5.9 and Sun C 5.9 compilers,
>> the C++ version runs in 8.1 seconds CPU and needs 622 MB RAM,
>> the C version (with list.c and containererror.c) runs in 16.4 seconds CPU
>> and needs 1065 MB RAM.
>
>The benchmark makes 50 million calls to malloc.
>If malloc is well done, this is the same as C++. If not, it will be quite
>bad performance.
In the above implementation, the C++ Standard Library uses malloc
under the hood.
ike5 (222)
10/6/2009 9:08:19 PM

Ike Naar wrote:
> In article <habc0c$790$2@aioe.org>, jacob navia <jn@nospam.org> wrote:
>> Ike Naar wrote:
>>> On a SUNW,UltraSPARC-IIIi, using the Sun C++ 5.9 and Sun C 5.9 compilers,
>>> the C++ version runs in 8.1 seconds CPU and needs 622 MB RAM,
>>> the C version (with list.c and containererror.c) runs in 16.4 seconds CPU
>>> and needs 1065 MB RAM.
>> The benchmark makes 50 million calls to malloc.
>> If malloc is well done, this is the same as C++. If not, it will be quite
>> bad performance.
>
> In the above implementation, the C++ Standard Library uses malloc
> under the hood.
Sure, but not 50 million times. I am implementing a simple heap manager
(that will be used by all containers) to avoid calling malloc 50
million times.
It was very instructive to see where the weak points are. For instance,
I tried to use as little memory as possible, but since I am allocating
50 million small blocks of 8 bytes, the overhead for malloc goes up to 100%!
It is better to allocate bigger blocks and not try to optimize memory as
this version does.
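The idea of carving list nodes out of bigger blocks could be sketched as below. This is only an illustration under stated assumptions, not the heap manager being written for the library. All names are hypothetical, the chunk size of 4096 nodes is arbitrary, and nodes are never freed individually (the whole pool is released at once).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; nothing here comes from containers.h. */
enum { NODES_PER_CHUNK = 4096 };    /* arbitrary chunk size for the sketch */

typedef struct node {
    struct node *next;
    int value;
} node;

typedef struct chunk {
    struct chunk *prev;             /* previously allocated chunk, for cleanup */
    size_t used;                    /* nodes already handed out from this chunk */
    node nodes[NODES_PER_CHUNK];
} chunk;

typedef struct pool {
    chunk *current;                 /* chunk currently being carved up */
    size_t malloc_calls;            /* number of malloc calls, for illustration */
} pool;

/* Hand out one node, calling malloc only when the current chunk is full.
   Returns NULL if malloc fails. */
node *pool_alloc(pool *p)
{
    assert(p != NULL);
    if (p->current == NULL || p->current->used == NODES_PER_CHUNK) {
        chunk *c = malloc(sizeof *c);
        if (c == NULL)
            return NULL;
        c->prev = p->current;
        c->used = 0;
        p->current = c;
        p->malloc_calls++;
    }
    return &p->current->nodes[p->current->used++];
}

/* Release every chunk at once; individual nodes are never freed. */
void pool_destroy(pool *p)
{
    assert(p != NULL);
    while (p->current != NULL) {
        chunk *prev = p->current->prev;
        free(p->current);
        p->current = prev;
    }
}
```

With 4096 nodes per chunk, 50 million insertions would need 12,208 calls to malloc instead of 50 million, and the per-block malloc overhead is paid once per chunk rather than once per node.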
Thanks for your interest
jacob
jacob24 (973)
10/6/2009 9:19:41 PM