PI program again

Hi.

I noticed I haven't gotten many more answers on the thread "Fast pi
program?" about the pi program. I'd really like to know whether the
multiplication routines specifically can be made faster than what's
already in there, since according to the profiler they seem to take
up most of the time (not that I'm surprised). The source code file is
still available for download.

Reply mike3 9/4/2007 6:49:52 PM

Any answer?

Reply mike3 9/7/2007 6:29:56 PM


On Sep 7, 11:29 am, mike3 <mike4...@yahoo.com> wrote:
> Any answer?

It's a fairly interesting topic.  But I found your license confusing
and the first time I tried it, there were too many problems to
continue.

Just profile it, and speed up the hot spots.

Reply user923005 9/7/2007 11:11:26 PM

On Sep 7, 5:11 pm, user923005 <dcor...@connx.com> wrote:
> It's a fairly interesting topic.  But I found your license confusing
> and the first time I tried it, there were too many problems to
> continue.
>
> Just profile it, and speed up the hot spots.

My license agreement was confusing? Could you explain,
please? I was just saying that you shouldn't redistribute
the program or any modified version without my permission,
as I was just releasing it for help with the speed, not a "full"
release. If I do go with a full release, then I will probably
release under a more relaxed license.

What problems did you run into when you tried it? You were
compiling with GCC, weren't you? Also, did you get the most
recent download, which does *not* use the time zone library
called "libtz"? If not, and that is related to your problem,
then you can get the new download here:

http://www.mediafire.com/?9mzltzjyizn

By the way, I already profiled with gprof, and the hot spots
seem to be the multiplication routines, with the FFTs/NTTs
and all that. I'd also like some advice on the disk math
routines, as I'm not sure whether their performance could
be improved.
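
For readers unfamiliar with the technique, here is a minimal sketch of NTT-based multiplication in C. It is not taken from the posted source: the modulus `P`, the root `G`, and the names `mulmod`, `powmod`, `ntt`, and `convolve` are all assumptions chosen for illustration, and the real program may be organized quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define P 998244353u   /* 119 * 2^23 + 1, an NTT-friendly prime (assumed) */
#define G 3u           /* a primitive root modulo P */

/* Modular multiply through a 64-bit intermediate. */
static uint32_t mulmod(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) % P);
}

/* Modular exponentiation by repeated squaring. */
static uint32_t powmod(uint32_t b, uint32_t e)
{
    uint32_t r = 1;
    while (e != 0) {
        if (e & 1u)
            r = mulmod(r, b);
        b = mulmod(b, b);
        e >>= 1;
    }
    return r;
}

/* In-place radix-2 NTT of length n (a power of two, at most 2^23).
   A nonzero invert computes the inverse transform, including 1/n. */
static void ntt(uint32_t *a, size_t n, int invert)
{
    assert(n != 0 && (n & (n - 1)) == 0 && n <= ((size_t)1 << 23));

    /* Bit-reversal permutation. */
    for (size_t i = 1, j = 0; i < n; i++) {
        size_t bit = n >> 1;
        for (; j & bit; bit >>= 1)
            j ^= bit;
        j ^= bit;
        if (i < j) {
            uint32_t t = a[i];
            a[i] = a[j];
            a[j] = t;
        }
    }

    /* Butterfly passes with doubling block length. */
    for (size_t len = 2; len <= n; len <<= 1) {
        uint32_t w_len = powmod(G, (P - 1) / (uint32_t)len);
        if (invert)
            w_len = powmod(w_len, P - 2);   /* inverse root via Fermat */
        for (size_t i = 0; i < n; i += len) {
            uint32_t w = 1;
            for (size_t k = 0; k < len / 2; k++) {
                uint32_t u = a[i + k];
                uint32_t v = mulmod(a[i + k + len / 2], w);
                a[i + k] = (u + v >= P) ? u + v - P : u + v;
                a[i + k + len / 2] = (u >= v) ? u - v : u + P - v;
                w = mulmod(w, w_len);
            }
        }
    }

    if (invert) {
        uint32_t n_inv = powmod((uint32_t)n, P - 2);
        for (size_t i = 0; i < n; i++)
            a[i] = mulmod(a[i], n_inv);
    }
}

/* Cyclic convolution of a and b (both of length n); result is left in a,
   and b is overwritten with its transform. */
static void convolve(uint32_t *a, uint32_t *b, size_t n)
{
    ntt(a, n, 0);
    ntt(b, n, 0);
    for (size_t i = 0; i < n; i++)
        a[i] = mulmod(a[i], b[i]);
    ntt(a, n, 1);
}
```

Padding both inputs with zeros to at least twice their length turns the cyclic convolution into an ordinary product. The coefficients come out reduced modulo `P`, so a single prime is only enough while they stay below `P`; with full-size limbs one would typically run several primes and recombine the results.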

Reply mike3 9/8/2007 12:05:42 AM

I can build it with gcc:
dcorbit@DCORBIT64 /c/junk/pisrc
$ make
gcc -g -pg -O3 -Wall -ffast-math -funroll-loops -c nttxfm.c
gcc -g -pg -O3 -Wall -ffast-math -funroll-loops -c crt.c
primes.h:18: warning: 'NTTroots' defined but not used
gcc -g -pg -O3 -Wall -ffast-math -funroll-loops -c bigmul.c
crtmath.h:22: warning: 'crtcopy32' defined but not used
crtmath.h:53: warning: 'crtmulbsm32' defined but not used
crtmath.h:108: warning: 'crtmod32' defined but not used
primes.h:18: warning: 'NTTroots' defined but not used
gcc -g -pg -O3 -Wall -ffast-math -funroll-loops -c blockint.c
gcc -g -pg -O3 -Wall -ffast-math -funroll-loops -c diskint.c
gcc -g -pg -O3 -Wall -ffast-math -funroll-loops -c newton.c
gcc -g -pg -O3 -Wall -ffast-math -funroll-loops -c agm.c
gcc -g -pg -O3 -Wall -ffast-math -funroll-loops -c pib26.c
gcc -g -pg -o pib26 nttxfm.o crt.o bigmul.o blockint.o diskint.o newton.o agm.o pib26.o -lm


but my best profiler tools are Windows based (Intel's VTUNE and
Microsoft's Profiler that comes with the Enterprise version of their
tool set).

Since your file is chock full of inline assembly in GAS syntax, there
is little hope of compiling it successfully using the Intel or MSVC++
compilers.

The things I could find with the gprof are the same things that you
found so I doubt that I can be of any help.

Reply user923005 9/8/2007 1:53:49 AM

On Sep 7, 7:53 pm, user923005 <dcor...@connx.com> wrote:
> but my best profiler tools are Windows based (Intel's VTUNE and
> Microsoft's Profiler that comes with the Enterprise version of their
> tool set).
>
> Since your file is chock full of inline assembly in GAS syntax, there
> is little hope of compiling it successfully using the Intel or MSVC++
> compilers.
>

I suppose I could rewrite the AT&T syntax (it's not called "GAS"
syntax) assembler in Intel syntax, but since I was working with gcc,
I did not do it. I use gcc since I do not have the money to buy those
other compilers you mentioned.
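
As an aside on portability: one way to sidestep the AT&T versus Intel syntax question for the modular multiply is plain C with a 64-bit intermediate, which any C99 compiler should accept. The sketch below is an assumption about what such a replacement could look like; `mulmod32` and `powmod32` are hypothetical names, not routines from the posted source, and whether this matches the speed of hand-written assembly would have to be measured.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable replacement for an inline-assembly modular
   multiply: the 32x32 -> 64-bit product is formed in a uint64_t and
   then reduced modulo m. */
static uint32_t mulmod32(uint32_t a, uint32_t b, uint32_t m)
{
    assert(m != 0);
    return (uint32_t)(((uint64_t)a * b) % m);
}

/* Modular exponentiation by repeated squaring, built on mulmod32. */
static uint32_t powmod32(uint32_t base, uint32_t exp, uint32_t m)
{
    uint32_t result = 1 % m;   /* handles m == 1 */
    base %= m;
    while (exp != 0) {
        if (exp & 1u)
            result = mulmod32(result, base, m);
        base = mulmod32(base, base, m);
        exp >>= 1;
    }
    return result;
}
```

On 32-bit x86, compilers commonly turn the widening multiply into a single `mul` instruction, but the `%` on a 64-bit value may become a library call, so the generated code is worth inspecting before dropping the assembly version.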

> The things I could find with the gprof are the same things that you
> found so I doubt that I can be of any help.

So you don't think gprof is a good enough profiler, then? And I'd
bet that Enterprise edition of the Microsoft stuff would probably
cost sweet amounts of money I don't have.

And why couldn't the results from gprof be of any help,
anyways?


Reply mike3 9/8/2007 2:50:33 AM

On Sep 7, 7:50 pm, mike3 <mike4...@yahoo.com> wrote:
> So you don't think gprof is a good enough profiler, then?

Actually, it might be good enough.  I don't use it unless there is no
other choice, so my lack of experience with that profiler may be the
real limiting factor rather than the capability of the profiler.
Among other things, the high-end profilers give you suggestions about
better formulations and show you what the bottleneck in the process is
(much more salient than where the time is going).

> And I'd
> bet that Enterprise edition of the Microsoft stuff would probably
> cost sweet amounts of money I don't have.

They do cost a bazillion dollars.  The Intel profiler is cheaper than
the MS profiler and just as good (but it is Intel specific and
disables much of the really excellent functionality if you try to use
it on AMD).

I think you can download the Intel compiler for Linux for free.  Maybe
you can get the profiler also.  You might check this stuff out:
http://www.intel.com/cd/software/products/asmo-na/eng/download/eval/219690.htm

I guess that you will get 20% faster just by using the Intel compiler
instead of GCC.

> And why couldn't the results from gprof be of any help,
> anyways?

Well you have them.  Did they help?

Reply user923005 9/10/2007 7:41:18 PM

On Sep 10, 1:41 pm, user923005 <dcor...@connx.com> wrote:
> > So you don't think gprof is a good enough profiler, then?
>
> Actually, it might be good enough.  I don't use it unless there is no
> other choice, so my lack of experience with that profiler may be the
> real limiting factor rather than the capability of the profiler.
> Among other things, the high-end profilers give you suggestions about
> better formulations and show you what the bottleneck in the process is
> (much more salient than where the time is going).
>

gprof simply tells, at least as far as I know, where the time
is going, into what routines. Although I have not had a huge
amount of experience with it either.

> > And I'd
> > bet that Enterprise edition of the Microsoft stuff would probably
> > cost sweet amounts of money I don't have.
>
> They do cost a bazillion dollars.  The Intel profiler is cheaper than
> the MS profiler and just as good (but it is Intel specific and
> disables much of the really excellent functionality if you try to use
> it on AMD).
>

How did you get this stuff, then? You must make a lot
of money.

> I think you can download the Intel compiler for Linux for free.  Maybe
> you can get the profiler also.  You might check this stuff out:
> http://www.intel.com/cd/software/products/asmo-na/eng/download/eval/2...
>

Looks like you can get a non-commercial version of both
items for free. Since this program is not a commercial
venture in any way, I might give this a try.

> I guess that you will get 20% faster just by using the Intel compiler
> instead of GCC.
>
> > And why couldn't the results from gprof be of any help,
> > anyways?
>
> Well you have them.  Did they help?

They told me what routines ate most of the time. It
appears the two NTT routines take up the most,
followed by the routine that emits digits with the
Chinese Remainder Theorem. (All are used to
multiply the big numbers.)
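
To make the CRT step concrete, here is a small two-prime sketch in C using Garner's formula. It is only an illustration under stated assumptions: the primes `P1` and `P2` and the names `powmod_p2` and `crt2` are hypothetical, and the actual routine in the program presumably handles more primes and emits multi-word digits.

```c
#include <assert.h>
#include <stdint.h>

#define P1 998244353u  /* 119 * 2^23 + 1 (assumed NTT prime) */
#define P2 469762049u  /* 7 * 2^26 + 1   (assumed NTT prime) */

/* b^e mod P2 by repeated squaring. */
static uint32_t powmod_p2(uint32_t b, uint32_t e)
{
    uint32_t r = 1;
    b %= P2;
    while (e != 0) {
        if (e & 1u)
            r = (uint32_t)(((uint64_t)r * b) % P2);
        b = (uint32_t)(((uint64_t)b * b) % P2);
        e >>= 1;
    }
    return r;
}

/* Garner's form of the Chinese Remainder Theorem for two primes:
   returns the unique x in [0, P1 * P2) whose remainder is r1 modulo P1
   and r2 modulo P2. The product P1 * P2 fits in 64 bits. */
static uint64_t crt2(uint32_t r1, uint32_t r2)
{
    assert(r1 < P1 && r2 < P2);

    /* Inverse of P1 modulo P2 via Fermat's little theorem. */
    uint32_t p1_inv = powmod_p2(P1 % P2, P2 - 2);

    /* k = (r2 - r1) * P1^(-1) mod P2, kept non-negative. */
    uint32_t r1_mod = r1 % P2;
    uint32_t diff = (r2 >= r1_mod) ? r2 - r1_mod : r2 + P2 - r1_mod;
    uint32_t k = (uint32_t)(((uint64_t)diff * p1_inv) % P2);

    return (uint64_t)r1 + (uint64_t)P1 * k;
}
```

In a real inner loop the inverse `p1_inv` would be computed once and cached rather than recomputed on every call, which is one of the cheaper things to check if this routine shows up high in the profile.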

Perhaps someone else here could offer some more
help?

Reply mike3 9/10/2007 11:18:23 PM

"mike3" <mike4ty4@yahoo.com> wrote in message 
news:1189466303.056643.157520@o80g2000hse.googlegroups.com...
> On Sep 10, 1:41 pm, user923005 <dcor...@connx.com> wrote:

> Perhaps someone else here could offer some more
> help?

I always think that the book _Pi and the AGM_ by the Canadian mathematicians
Borwein and Borwein is relevant to this topic.  The Canadians are the
current record-holders for the number of digits of pi.  User10^6 is always
relevant for speed.
-- 
Wade Ward 


Reply Wade 9/11/2007 6:54:32 AM

On Sep 10, 5:18 pm, mike3 <mike4...@yahoo.com> wrote:
> Perhaps someone else here could offer some more
> help?

Any answers? Looks like someone sent a response, but I can't
get the text of the response to show up here on Google.

Reply mike3 9/15/2007 1:57:43 AM
