I am looking for a fast memcpy() that will work well on
32-bit machines for both Linux and Windows.
Peter | 4/27/2010 4:20:23 AM

On 04/27/10 04:20 PM, Peter Olcott wrote:
> I am looking for a fast memcpy() that will work well on
> 32-bit machines for both Linux and Windows.
What's wrong with the standard one?
--
Ian Collins
Ian | 4/27/2010 4:24:37 AM

On Apr 27, 7:20 am, "Peter Olcott" <NoS...@OCR4Screen.com> wrote:
> I am looking for a fast memcpy() that will work well on
> 32-bit machines for both Linux and Windows.
That is perhaps a question for the comp.lang.asm.x86 group. Scan the posts
there; no doubt someone has published something. We just discuss safe
and conformant C++ programming techniques here.
ISO | 4/27/2010 5:12:45 AM

"Ian Collins" <ian-news@hotmail.com> wrote in message
news:83n785Fb0lU1@mid.individual.net...
> On 04/27/10 04:20 PM, Peter Olcott wrote:
>> I am looking for a fast memcpy() that will work well on
>> 32-bit machines for both Linux and Windows.
>
> What's wrong with the standard one?
>
> --
> Ian Collins
Is it smart enough to copy 32 bits at a time, even when the
memory is not aligned on a 32 bit boundary?
Peter | 4/27/2010 2:07:25 PM

On 2010-04-27, Peter Olcott <NoSpam@OCR4Screen.com> wrote:
>
> "Ian Collins" <ian-news@hotmail.com> wrote in message
> news:83n785Fb0lU1@mid.individual.net...
>> On 04/27/10 04:20 PM, Peter Olcott wrote:
>>> I am looking for a fast memcpy() that will work well on
>>> 32-bit machines for both Linux and Windows.
>>
>> What's wrong with the standard one?
>>
>> --
>> Ian Collins
>
> Is it smart enough to copy 32 bits at a time, even when the
> memory is not aligned on a 32 bit boundary?
>
If you aren't sure, it would be trivial to write a wrapper
that copies the leading bytes manually, up to the next 32-bit
boundary, then copies the remaining (32-bit aligned) memory
using memcpy.
But I would expect any library memcpy to already do this.
--
Andrew Poelstra
http://www.wpsoftware.net/andrew
Andrew | 4/27/2010 2:42:12 PM

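The wrapper Andrew describes might look roughly like the sketch below. It is an illustration only, not code from anyone in this thread: the name align_then_memcpy is made up, a 4-byte alignment target is assumed, and only the destination pointer is aligned (the source may still be unaligned afterwards). As Andrew says, a library memcpy is expected to do this kind of thing internally already.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (the name is illustrative): copy bytes one at a time
// until the destination sits on a 4-byte boundary, then hand the rest of
// the block to the library memcpy.
void* align_then_memcpy(void* dst, const void* src, std::size_t n)
{
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);

    // Number of bytes needed to reach the next 4-byte boundary (0..3).
    std::size_t head = (4 - (reinterpret_cast<std::uintptr_t>(d) & 3)) & 3;
    if (head > n)
        head = n;

    for (std::size_t i = 0; i < head; ++i)
        d[i] = s[i];

    // The destination is now 32-bit aligned; the source may not be.
    std::memcpy(d + head, s + head, n - head);
    return dst;
}
```

Whether this helps at all is something to measure; on a platform whose memcpy already handles the unaligned head, the extra loop is pure overhead.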
On Tue, 2010-04-27, Peter Olcott wrote:
>
> "Ian Collins" <ian-news@hotmail.com> wrote in message
> news:83n785Fb0lU1@mid.individual.net...
>> On 04/27/10 04:20 PM, Peter Olcott wrote:
>>> I am looking for a fast memcpy() that will work well on
>>> 32-bit machines for both Linux and Windows.
>>
>> What's wrong with the standard one?
> Is it smart enough to copy 32 bits at a time, even when the
> memory is not aligned on a 32 bit boundary?
Why don't you check for yourself?
The one in OpenBSD is; I've read it.
Also, don't forget the one built into your compiler: it knows
a whole lot more about the arguments than a library function does.
Depending on what you're doing, even a perfect library function can be
much slower.
/Jorgen
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Jorgen | 4/27/2010 2:44:06 PM

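To illustrate Jorgen's point about the compiler's built-in memcpy: when the size is a compile-time constant, the compiler can see it at the call site. The sketch below uses hypothetical helper names (load_u32, store_u32). With optimization enabled, compilers such as GCC, Clang and MSVC are generally able to turn these fixed-size calls into plain load/store instructions instead of a library call, but that depends on the compiler and its flags, so check the generated code rather than assume it.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers (names are illustrative). The copy size is the
// compile-time constant sizeof(std::uint32_t), so the compiler knows
// exactly how many bytes are moved and may inline the copy.

// Read a 32-bit value from a buffer that need not be 4-byte aligned.
std::uint32_t load_u32(const unsigned char* p)
{
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Write a 32-bit value to a buffer that need not be 4-byte aligned.
void store_u32(unsigned char* p, std::uint32_t v)
{
    std::memcpy(p, &v, sizeof v);
}
```

A hand-written replacement called through an ordinary function boundary gives the compiler less to work with, which is one way a "perfect" library routine can still lose.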
Peter Olcott wrote:
> I am looking for a fast memcpy() that will work well on
> 32-bit machines for both Linux and Windows.
>
>
Use the library one.
It's going to be memory bus limited on any modern system, even if all it
does is load and store single bytes. (BTW the Windows one is actually
pretty good)
Andy
Andy | 4/27/2010 8:03:47 PM

On 27-04-2010 06:20, Peter Olcott wrote:
> I am looking for a fast memcpy() that will work well on
> 32-bit machines for both Linux and Windows.
>
>
I tested various memory copy routines during my studies. One of the best
approaches was mixing fixed and floating point instructions: 3 times
faster than a standard loop. It is worth trying the SSE unit.
Regards
Marek
Marek | 4/27/2010 8:39:53 PM

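For anyone who wants to try the SSE unit as Marek suggests, here is a minimal sketch of a copy loop that moves 16 bytes at a time through SSE2 registers. It is not Marek's routine, which was not posted; the function name sse2_copy and the HAVE_SSE2_COPY macro are made up, and the code falls back to plain memcpy when SSE2 is not enabled at compile time (on 32-bit GCC that typically means building with -msse2). No speed claim is implied; it has to be measured against the library memcpy.

```cpp
#include <cstddef>
#include <cstring>

// Use SSE2 intrinsics only when the compiler says SSE2 is enabled.
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2_COPY 1
#else
#define HAVE_SSE2_COPY 0
#endif

// Hypothetical helper (the name is illustrative): copy in 16-byte chunks
// using unaligned SSE2 loads and stores, then let memcpy finish the tail.
void* sse2_copy(void* dst, const void* src, std::size_t n)
{
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);

#if HAVE_SSE2_COPY
    while (n >= 16) {
        // The "u" variants accept pointers with any alignment.
        __m128i chunk = _mm_loadu_si128(reinterpret_cast<const __m128i*>(s));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(d), chunk);
        s += 16;
        d += 16;
        n -= 16;
    }
#endif

    // Remaining 0..15 bytes (or everything, when SSE2 is unavailable).
    std::memcpy(d, s, n);
    return dst;
}
```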
On Apr 27, 11:42 pm, Andrew Poelstra <apoels...@localhost.localdomain>
wrote:
> On 2010-04-27, Peter Olcott <NoS...@OCR4Screen.com> wrote:
> > "Ian Collins" <ian-n...@hotmail.com> wrote in message
> >news:83n785Fb0lU1@mid.individual.net...
> >> On 04/27/10 04:20 PM, Peter Olcott wrote:
> >>> I am looking for a fast memcpy() that will work well on
> >>> 32-bit machines for both Linux and Windows.
>
> >> What's wrong with the standard one?
>
> >> --
> >> Ian Collins
>
> > Is it smart enough to copy 32 bits at a time, even when the
> > memory is not aligned on a 32 bit boundary?
>
> If you aren't sure, it would be trivial to write a wrapper
> that would copy the first %32 bits manually then copy the
> remaining (32-bit aligned) memory using memcpy.
>
> But I would expect any library memcpy to already do this.
>
> Andrew Poelstra
> http://www.wpsoftware.net/andrew
Exactly... people have thought of all these issues decades ago... not
just 32 bits at a time, but various things that can only be done in
machine code inexpressible in C++, as well as loop unrolling and
inlining, using known alignment and size information etc.... Best
idea, forget all about it - especially if this probably ties in to
your other thread which I got the impression was about trying to use
memcpy() to outperform compiler-generated copy constructors or
something...?
Cheers,
Tony
tonydee | 4/28/2010 8:50:28 AM

I once beat std::string by a factor of 125, so I expect
that improvements can be made to memcpy(), at least on
my platform.
"tonydee" <tony_in_da_uk@yahoo.co.uk> wrote in message
news:a310a9d7-0bf0-4178-afb5-cb4b6adc9d5e@j36g2000prj.googlegroups.com...
> On Apr 27, 11:42 pm, Andrew Poelstra <apoels...@localhost.localdomain>
> wrote:
>> On 2010-04-27, Peter Olcott <NoS...@OCR4Screen.com> wrote:
>>> "Ian Collins" <ian-n...@hotmail.com> wrote in message
>>> news:83n785Fb0lU1@mid.individual.net...
>>>> On 04/27/10 04:20 PM, Peter Olcott wrote:
>>>>> I am looking for a fast memcpy() that will work well on
>>>>> 32-bit machines for both Linux and Windows.
>>>>
>>>> What's wrong with the standard one?
>>>>
>>>> --
>>>> Ian Collins
>>>
>>> Is it smart enough to copy 32 bits at a time, even when the
>>> memory is not aligned on a 32 bit boundary?
>>
>> If you aren't sure, it would be trivial to write a wrapper
>> that would copy the first %32 bits manually then copy the
>> remaining (32-bit aligned) memory using memcpy.
>>
>> But I would expect any library memcpy to already do this.
>>
>> Andrew Poelstra
>> http://www.wpsoftware.net/andrew
>
> Exactly... people have thought of all these issues decades ago... not
> just 32 bits at a time, but various things that can only be done in
> machine code inexpressible in C++, as well as loop unrolling and
> inlining, using known alignment and size information etc.... Best
> idea, forget all about it - especially if this probably ties in to
> your other thread which I got the impression was about trying to use
> memcpy() to outperform compiler-generated copy constructors or
> something...?
>
> Cheers,
> Tony
Peter | 4/28/2010 2:20:38 PM

"Andy Champ" <no.way@nospam.invalid> wrote in message
news:IOCdnbbiCq283krWnZ2dnUVZ8nVi4p2d@eclipse.net.uk...
> Peter Olcott wrote:
>> I am looking for a fast memcpy() that will work well on
>> 32-bit machines for both Linux and Windows.
>
> Use the library one.
>
> It's going to be memory bus limited on any modern system,
> even if all it does is load and store single bytes. (BTW
> the Windows one is actually pretty good)
>
> Andy
This one looks pretty good, and it is standard C/C++
http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Peter | 4/28/2010 2:47:42 PM

Peter Olcott wrote:
> "Andy Champ" <no.way@nospam.invalid> wrote in message
> news:IOCdnbbiCq283krWnZ2dnUVZ8nVi4p2d@eclipse.net.uk...
>> Peter Olcott wrote:
>>> I am looking for a fast memcpy() that will work well on
>>> 32-bit machines for both Linux and Windows.
>>
>> Use the library one.
>>
>> It's going to be memory bus limited on any modern system,
>> even if all it does is load and store single bytes. (BTW
>> the Windows one is actually pretty good)
>>
>> Andy
>
> This one looks pretty good, and it is standard C/C++
> http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
And you believe that nobody on the compiler team has already read
this? In the 1980s?
We aren't doing Motorola 68000 anymore.
Bo Persson
Bo | 4/28/2010 6:33:32 PM

"Bo Persson" <bop@gmb.dk> wrote in message
news:83rdbsFpjlU1@mid.individual.net...
> Peter Olcott wrote:
>> "Andy Champ" <no.way@nospam.invalid> wrote in message
>> news:IOCdnbbiCq283krWnZ2dnUVZ8nVi4p2d@eclipse.net.uk...
>>> Peter Olcott wrote:
>>>> I am looking for a fast memcpy() that will work well on
>>>> 32-bit machines for both Linux and Windows.
>>>
>>> Use the library one.
>>>
>>> It's going to be memory bus limited on any modern
>>> system,
>>> even if all it does is load and store single bytes.
>>> (BTW
>>> the Windows one is actually pretty good)
>>>
>>> Andy
>>
>> This one looks pretty good, and it is standard C/C++
>> http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
>
> And you believe that nobody on the compiler team hasn't
> already read this? In the 1980s?
Yes, I am sure that no one read this February 2010 article in
the 1980s.
>
>
> We aren't doing Motorola 68000 anymore.
>
>
> Bo Persson
>
>
Peter | 4/28/2010 7:20:30 PM

On Apr 28, 5:47 pm, "Peter Olcott" <NoS...@OCR4Screen.com> wrote:
> "Andy Champ" <no....@nospam.invalid> wrote in message
>
> > Use the library one.
>
> This one looks pretty good, and it is standard C/C++
>    http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
This may be faster than the memcpy on some embedded system with a junky
C runtime library. You had better run some tests and compare it with the
C++ assignment operator compiled with a modern compiler on Windows and
Linux, the platforms you originally asked about. I bet it loses.
ISO | 4/28/2010 7:32:23 PM

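The kind of test suggested above could start from a small timing harness like the sketch below. Everything in it is an assumption made for illustration: the names copy_fn, library_copy, byte_copy and seconds_per_copy are made up, byte_copy is just a stand-in for whatever candidate routine is being evaluated, and a serious benchmark would also vary block sizes and alignments and repeat the runs.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Common signature so different copy routines can be timed the same way.
typedef void* (*copy_fn)(void*, const void*, std::size_t);

// The library routine, wrapped so it can be passed as a copy_fn.
void* library_copy(void* dst, const void* src, std::size_t n)
{
    return std::memcpy(dst, src, n);
}

// Stand-in candidate: a naive byte-by-byte loop.
void* byte_copy(void* dst, const void* src, std::size_t n)
{
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < n; ++i)
        d[i] = s[i];
    return dst;
}

// Average wall-clock seconds per call when copying `size` bytes `reps` times.
double seconds_per_copy(copy_fn fn, std::size_t size, int reps)
{
    if (size == 0 || reps <= 0)
        return 0.0;

    std::vector<unsigned char> src(size, 0x5A), dst(size, 0);
    volatile unsigned char sink = 0;  // keeps the copies observable

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) {
        src[0] = static_cast<unsigned char>(i);  // change the input each pass
        fn(dst.data(), src.data(), size);
        sink = dst[size - 1];
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;

    return std::chrono::duration<double>(stop - start).count() / reps;
}
```

Calling seconds_per_copy(library_copy, 1 << 20, 1000) and seconds_per_copy(byte_copy, 1 << 20, 1000) with optimization enabled, on each of the target platforms, gives numbers to compare. A struct assignment can be wrapped in a function with the same signature to include it in the comparison.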
On Apr 28, 11:20 pm, "Peter Olcott" <NoS...@OCR4Screen.com> wrote:
> I beat std::string by a factor of 125-fold once, thus I
> expect that improvements can be made to memcpy() at least on
> my platform.
It's not a meaningful comparison. The design compromises in strings
are massively more complicated than a memcpy (e.g. dynamic memory
allocation vs embedded short strings or better suited capacity
resizing defaults, reference counting). You won't have outperformed
std::string because anything was simply overlooked in its
implementation.
Cheers,
Tony
tonydee | 4/30/2010 1:50:23 AM