Is there any way to identify whether gfortran inlined a particular
function? I am working on a performance issue where I strongly suspect
that inlining may make a substantial difference. However, I cannot find
any mechanism for optimization reporting in gfortran at all. Something
like the "-opt-report" options of Intel Fortran would fit the bill, but
perhaps I have missed something.
Keith Refson
Keith | 12/5/2010 12:17:38 PM
Keith Refson wrote:
> Is there any way to identify whether gfortran inlined a particular
> function?
You could try -fdump-tree-original and inspect the file. Or use
-fdump-ipa-inline.
> I am working on a performance issue where I strongly suspect
> that inlining may make a substantial difference.
Regarding inlining: Except for internal functions, GCC < 4.6 did not do
much inlining; this has changed with GCC 4.6, which should inline many
more functions, though that might not always lead to better code. The
default value for -finline-limit= seems to be a bit too low for gfortran.
You could play with that parameter, e.g. setting it to -finline-limit=400.
Additionally, you can try "-flto -fwhole-program" (on GCC 4.6), which
might also help. (LTO = link-time optimization, i.e. optimization
across different files; -fwhole-program = the single file (or with LTO:
all given files) encompasses the complete program.)
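As a rough illustration of how the flags mentioned in this reply might be tried out, here is a minimal sketch. The file name inline_demo.f90, the module kernels and the function scale_add are all hypothetical and exist only for this example. The commands are given as comments; -fdump-tree-optimized is an additional GCC dump flag not mentioned in the thread, and the exact dump file names vary between GCC versions.

```fortran
! inline_demo.f90: hypothetical file name, used only for illustration.
!
! Possible ways to check whether scale_add was inlined:
!   gfortran -O2 -fdump-ipa-inline -c inline_demo.f90
!       writes an IPA dump file (its name typically ends in ".inline");
!       search it for "scale_add" to see what the inliner decided
!   gfortran -O2 -fdump-tree-optimized -c inline_demo.f90
!       writes the final tree dump (name typically ends in ".optimized");
!       if the loop in the main program no longer contains a call to
!       scale_add, the call was inlined (or otherwise optimized away)
! Flags to experiment with, as suggested in the reply:
!   gfortran -O2 -finline-limit=400 -c inline_demo.f90
!   gfortran -O2 -flto -fwhole-program inline_demo.f90 -o inline_demo
module kernels
  implicit none
contains
  ! Small function of the kind one would hope to see inlined.
  function scale_add(a, x, y) result(r)
    real, intent(in) :: a, x, y
    real :: r
    r = a*x + y
  end function scale_add
end module kernels

program inline_demo
  use kernels, only: scale_add
  implicit none
  integer, parameter :: n = 1000
  real :: x(n), y(n), z(n)
  integer :: i

  x = 1.0
  y = 2.0
  do i = 1, n
     z(i) = scale_add(3.0, x(i), y(i))   ! candidate call site for inlining
  end do
  print *, sum(z)
end program inline_demo
```

Comparing the dumps with and without a given flag is probably the most direct way to see whether it changed the inlining decision for one particular call site.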
Out of curiosity: Which gfortran version do you use, and do you have the
feeling that there is too little, too much or the wrong kind of inlining?
Tobias
Tobias | 12/5/2010 12:41:18 PM
On 05/12/10 12:41, Tobias Burnus wrote:
> Keith Refson wrote:
>> Is there any way to identify whether gfortran inlined a particular
>> function?
>
> You could try -fdump-tree-original and inspect the file. Or use
> -fdump-ipa-inline.
Thanks. I will try that.
>> I am working on a performance issue where I strongly suspect
>> that inlining may make a substantial difference.
>
> Regarding inlining: Except for internal functions, GCC < 4.6 did not do
> much inlining; this has changed with GCC 4.6, which should inline many
> more functions, though that might not always lead to better code. The
> default value for -finline-limit= seems to be a bit too low for gfortran.
> You could play with that parameter, e.g. setting it to -finline-limit=400.
> Additionally, you can try "-flto -fwhole-program" (on GCC 4.6), which
> might also help. (LTO = link-time optimization, i.e. optimization
> across different files; -fwhole-program = the single file (or with LTO:
> all given files) encompasses the complete program.)
>
> Out of curiosity: Which gfortran version do you use, and do you have the
> feeling that there is too little, too much or the wrong kind of inlining?
I am using 4.5.1, and the platform is "powerpc64-unknown-linux-gnu".
This is one I compiled myself, but I have been unable to build with
ppl and cloog on this platform - the bootstrap fails with
incomprehensible C++ link-time error messages. I am therefore unable to
try LTO, which is a shame as it makes a considerable performance
improvement on x86_64 platforms.
There's a particular small function called repeatedly inside a loop
and with other compilers inlining has given a substantial speedup.
Inlining should allow scope for better loop optimisation in the caller.
Keith Refson
>
> Tobias
Keith | 12/5/2010 3:24:14 PM
On 12/5/2010 7:24 AM, Keith Refson wrote:
> On 05/12/10 12:41, Tobias Burnus wrote:
>> Keith Refson wrote:
>>> Is there any way to identify whether gfortran inlined a particular
>>> function?
>>
>> You could try -fdump-tree-original and inspect the file. Or use
>> -fdump-ipa-inline.
>
> Thanks. I will try that.
>
>
>>> I am working on a performance issue where I strongly suspect
>>> that inlining may make a substantial difference.
>>
>> Regarding inlining: Except for internal functions, GCC < 4.6 did not do
>> much inlining; this has changed with GCC 4.6, which should inline many
>> more functions, though that might not always lead to better code. The
>> default value for -finline-limit= seems to be a bit too low for gfortran.
>> You could play with that parameter, e.g. setting it to -finline-limit=400.
>> Additionally, you can try "-flto -fwhole-program" (on GCC 4.6), which
>> might also help. (LTO = link-time optimization, i.e. optimization
>> across different files; -fwhole-program = the single file (or with LTO:
>> all given files) encompasses the complete program.)
>>
>> Out of curiosity: Which gfortran version do you use, and do you have the
>> feeling that there is too little, too much or the wrong kind of inlining?
>
> I am using 4.5.1, and the platform is "powerpc64-unknown-linux-gnu".
> This is one I compiled myself, but I have been unable to build with
> ppl and cloog on this platform - the bootstrap fails with
> incomprehensible C++ link-time error messages. I am therefore unable to
> try LTO, which is a shame as it makes a considerable performance
> improvement on x86_64 platforms.
>
> There's a particular small function called repeatedly inside a loop
> and with other compilers inlining has given a substantial speedup.
> Inlining should allow scope for better loop optimisation in the caller.
>
> Keith Refson
>
>
If the function is really small and repeatedly called inside a loop, why
not just manually copy the function code into the loop and be done with it?
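To make the suggestion concrete, here is a minimal sketch of manual inlining. The program name manual_inline and the function poly are hypothetical stand-ins, since the thread does not show the actual function. The first loop calls the function on every iteration; the second has the function body pasted into the loop by hand.

```fortran
program manual_inline
  implicit none
  integer, parameter :: n = 8
  real :: x(n), y1(n), y2(n)
  integer :: i

  do i = 1, n
     x(i) = real(i)
  end do

  ! Version 1: the loop calls a small function on every iteration.
  do i = 1, n
     y1(i) = poly(x(i))
  end do

  ! Version 2: the body of poly is copied into the loop by hand.
  do i = 1, n
     y2(i) = 2.0*x(i)*x(i) + 1.0
  end do

  ! The two versions compute the same expression; print how far apart
  ! the results are as a sanity check.
  print *, "max difference:", maxval(abs(y1 - y2))

contains
  ! Hypothetical small function standing in for the one in the thread.
  function poly(t) result(p)
    real, intent(in) :: t
    real :: p
    p = 2.0*t*t + 1.0
  end function poly
end program manual_inline
```

The obvious cost is that the expression now lives in two places, so any later change to the function has to be repeated at every hand-inlined call site.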
baf | 12/5/2010 4:22:45 PM
In article <8m1sijFvk1U1@mid.individual.net>, baf <baf@nowhere.net>
wrote:
> > There's a particular small function called repeatedly inside a loop
> > and with other compilers inlining has given a substantial speedup.
> > Inlining should allow scope for better loop optimisation in the caller.
> >
> If the function is really small and repeatedly called inside a loop, why
> not just manually copy the function code into the loop and be done with it?
This may be the best solution. Also, is the function external,
module, or internal? Does the function have an explicit interface?
It might make a difference to efficiency, with the internal
function being the most likely either to be inlined or to have the
best performance. If you have control over the arguments, then it
is sometimes possible to make minor changes in the declarations that
have a significant impact on efficiency. For example, are you
passing an assumed-shape array actual argument to either an
assumed-size or explicit-shape array dummy argument? Do you have
unnecessary pointer or target attributes on the arguments? Should
some arguments be declared INTENT(IN) that have some other attribute
(including no INTENT at all)? Is the function declared as RECURSIVE
when it is not used recursively? Should the function have the PURE
attribute, or could you make minor changes so that it qualifies
for it?
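A minimal sketch of the kind of declarations these questions point at is given below. All names (vec_ops, dot3, norm2sq, interface_demo) are hypothetical. It shows a module function and an internal function, both with explicit interfaces, both PURE, with INTENT(IN) arguments and no POINTER, TARGET or RECURSIVE attributes. Whether any of this changes what gfortran inlines in a given case is something to check in the compiler dumps rather than assume.

```fortran
module vec_ops
  implicit none
contains
  ! Module function: the explicit interface comes from the USE statement.
  ! PURE with INTENT(IN) arguments tells the compiler there are no side effects.
  pure function dot3(a, b) result(d)
    real, intent(in) :: a(3), b(3)   ! explicit-shape dummy arguments
    real :: d
    d = a(1)*b(1) + a(2)*b(2) + a(3)*b(3)
  end function dot3
end module vec_ops

program interface_demo
  use vec_ops, only: dot3
  implicit none
  real :: u(3), v(3)

  u = [1.0, 2.0, 3.0]
  v = [4.0, 5.0, 6.0]
  ! u and v are whole, contiguous arrays, so they match both the
  ! explicit-shape dummies of dot3 and the assumed-shape dummy of norm2sq.
  print *, dot3(u, v), norm2sq(u)

contains
  ! Internal function: defined after CONTAINS in the caller itself.
  pure function norm2sq(a) result(s)
    real, intent(in) :: a(:)         ! assumed-shape dummy argument
    real :: s
    s = sum(a*a)
  end function norm2sq
end program interface_demo
```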
$.02 -Ron Shepard
Ron | 12/5/2010 8:13:56 PM
baf <baf@nowhere.net> wrote:
(snip)
> If the function is really small and repeatedly called inside a loop, why
> not just manually copy the function code into the loop and be done with it?
Or use statement functions. Compilers should inline them, though
I don't actually know what current compilers do.
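For readers who have not met statement functions, here is a minimal sketch. The names stmt_func_demo, f, t and total are hypothetical. A statement function is a one-line definition placed after the declarations and before the first executable statement. Note that the feature is marked obsolescent in Fortran 95 and later standards, so gfortran may warn about it when a strict -std= option is used.

```fortran
program stmt_func_demo
  implicit none
  integer :: i
  real :: t, f, total

  ! Statement function: f and its dummy argument t are typed above;
  ! the definition must precede all executable statements.
  f(t) = 2.0*t*t + 1.0

  total = 0.0
  do i = 1, 10
     total = total + f(real(i))   ! used like any other function
  end do
  print *, total
end program stmt_func_demo
```

An internal function after CONTAINS is the modern replacement and gives the compiler the same visibility of the function body.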
-- glen
glen | 12/5/2010 8:35:28 PM