// some code here
{
int a; // beginning of block
goto next;
int b;
}
next:
// and some code here also...:)
In above code, how much size of stack are allocated at the beginning
of the block? 4 or 8(assume that int is 4 byte) byte?
Is it more efficient if I move up >>int b;<< to next line of >>int
a;<<?
thanks.
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
0
|
|
|
|
Reply
|
hongseok.yoon (19)
|
3/26/2009 3:16:47 AM |
|
On Mar 26, 5:16 am, hongseok.y...@gmail.com wrote:
> // some code here
>
> {
> int a; // beginning of block
> goto next;
> int b;}
>
> next:
>
> // and some code here also...:)
>
> In above code, how much size of stack are allocated at the beginning
> of the block? 4 or 8(assume that int is 4 byte) byte?
> Is it more efficient if I move up >>int b;<< to next line of >>int
> a;<<?
Uninitialized, unused variables have no behavior, and may or may not
be optimized away entirely. It either uses 8 bytes of stack space
(optimizer removing nothing), 4 (optimizer removing dead code), or 0
(removing unused variables and dead code), depending on the compiler.
Without initialization, no work is done, and there is nothing to gain
by moving the declaration of b. For that matter, the goto is
needless, as there is nothing to skip. But as it stands, a smart
optimizer might generate nothing for the code you have shown, since it
doesn't do anything. You might also get warnings for unused
variables.
Chris
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
0
|
|
|
|
Reply
|
Chris
|
3/26/2009 1:35:48 PM
|
|
hongseok.yoon@gmail.com wrote:
> // some code here
>
> {
> int a; // beginning of block
> goto next;
> int b;
> }
> next:
>
> // and some code here also...:)
>
> In above code, how much size of stack are allocated at the beginning
> of the block? 4 or 8(assume that int is 4 byte) byte?
> Is it more efficient if I move up >>int b;<< to next line of >>int
> a;<<?
>
1. Any good compiler will complain about b never being used.
2. Does it matter? Have you benchmarked to determine that this
particular construct is actually your bottleneck? Please
allow me to repeat Knuth's Law: "Premature optimization
is the root of all evil." And I have just now invented
red floyd's Corollary: "99% of all micro-optimizations are
better performed by the compiler."
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
0
|
|
|
|
Reply
|
red
|
3/26/2009 1:36:40 PM
|
|
On 26 mar, 10:16, hongseok.y...@gmail.com wrote:
> // some code here
>
> {
> int a; // beginning of block
> goto next;
> int b;}
>
> next:
>
> // and some code here also...:)
Why are you using gotos?
> In above code, how much size of stack are allocated at the beginning
> of the block? 4 or 8(assume that int is 4 byte) byte?
Well, since here b can never be accessed anyway, it won't exist at
all.
> Is it more efficient if I move up >>int b;<< to next line of >>int
> a;<<?
That's not how you design things.
You declare a variable when you can give it a proper value and make
the scope as restricted as possible for its usage.
The compiler will do whatever smart thing there is to do.
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
0
|
|
|
|
Reply
|
Mathias
|
3/26/2009 1:37:04 PM
|
|
hongseok.yoon@gmail.com wrote in news:0293620c-7040-4dd1-887b-
d0649c6a7acd@j18g2000prm.googlegroups.com:
> In above code, how much size of stack are allocated at the beginning
> of the block? 4 or 8(assume that int is 4 byte) byte?
> Is it more efficient if I move up >>int b;<< to next line of >>int
> a;<<?
The compiler will typically pre-allocate the memory for everything in the
function (ie. add a constant to the stack pointer one time at the top) but
only invoke constructors and destructors as the declarations are reached.
A good compiler will identify the maximum amount of storage needed and
share stack space between objects in different non-nested blocks.
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
0
|
|
|
|
Reply
|
Kenneth
|
3/26/2009 1:41:04 PM
|
|
On Mar 26, 4:16 am, hongseok.y...@gmail.com wrote:
> // some code here
Other posters have pointed out that your example is weird in a few
ways. I'll try to answer the questions I think you're asking.
// Assuming main() won't be inlined.
int main()
{
// Just for fun.
int c;
functionCompiledAndLinkedCompletelySeparately(c);
// Inner block.
if (true)
{
// Assuming you meant for 'a' to be something the compiler
// couldn't optimize away entirely:
int a;
functionCompiledAndLinkedCompletelySeparately(&a);
// Assuming you meant for 'b' to be something trivially
// discarded by an optimizing compiler:
if (false)
{
int b;
}
}
functionCompiledAndLinkedCompletelySeparately(0);
}
> In above code, how much size of stack are allocated at the beginning
> of the block? 4 or 8(assume that int is 4 byte) byte?
In my experience, usually 'a' would simply increase the size of the
fixed stack allocation made on entry to main() by sizeof(int). It
would probably live right after 'c' on the stack. This small increase
is not going to cost even a single additional instruction.
> Is it more efficient if I move up >>int b;<< to next line of >>int
> a;<<?
Probably not, because 'b' will be discarded altogether. Even if it
couldn't be, it would probably be put in the same fixed allocation as
'a'.
All this is just speculation based on experience. Your compiler may
very well behave differently. For example, rumor has it that in the
old days, some compilers couldn't optimize any function that had even
a single goto in it.
If you really need to care about that kind of thing, write it in
assembler.
- Marsh
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
0
|
|
|
|
Reply
|
Marsh
|
3/27/2009 2:05:30 AM
|
|
> hongseok.yoon@gmail.com wrote:
>> // some code here
>>
>> {
>> int a; // beginning of block
>> goto next;
>> int b;
>> }
>> next:
'goto's are evil. Avoid them (*).
(*) There are many discussions and rationales about this on the internet.
>> // and some code here also...:)
>>
>> In above code, how much size of stack are allocated at the beginning
>> of the block? 4 or 8(assume that int is 4 byte) byte?
>> Is it more efficient if I move up >>int b;<< to next line of >>int
>> a;<<?
Why do you assume that the compiler will allocate any stack space for a
or b? A good compiler (with optimization turned on) would only allocate
space if the variables are used,
and --if they were used-- it would most probably store the values in
registers only when possible, so still no stack space.
But all this depends on the compiler quality, the architecture and
everything else.
red floyd wrote:
> 1. Any good compiler will complain about b never being used.
a and b are not used in this example.
> 2. Does it matter? Have you benchmarked to determine that this
> particular construct is actually your bottleneck? Please
> allow me to repeat Knuth's Law: "Premature optimization
> is the root of all evil." And I have just now invented
> red floyd's Corollary: "99% of all micro-optimizations are
> better performed by the compiler."
Thomas' co-corollary: "_all_ micro-optimizations are better performed by
the compiler, except the ones the compiler can't handle."
--
Thomas
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
0
|
|
|
|
Reply
|
Thomas
|
3/27/2009 2:07:04 AM
|
|
On Mar 26, 12:41 pm, Kenneth Porter <shiva.blackl...@sewingwitch.com>
wrote:
> hongseok.y...@gmail.com wrote in news:0293620c-7040-4dd1-887b-
> d0649c6a7...@j18g2000prm.googlegroups.com:
>
> > In above code, how much size of stack are allocated at the beginning
> > of the block? 4 or 8(assume that int is 4 byte) byte?
> > Is it more efficient if I move up >>int b;<< to next line of >>int
> > a;<<?
>
> The compiler will typically pre-allocate the memory for everything in the
> function (ie. add a constant to the stack pointer one time at the top) but
> only invoke constructors and destructors as the declarations are reached.
>
> A good compiler will identify the maximum amount of storage needed and
> share stack space between objects in different non-nested blocks.
Actually, I suspect no. This is the knapsack problem (or something
very much like it), so I hope a compiler isn't brute forcing the
solution to a NP complete problem. Now, you can get pretty close with
a minor pessimizations and do it P time, but not the actual optimal
solution. (The pessimization is just allocate storage for the stack
objects lifetime.)
For example:
int foo(int a, int b, int c, int d)
{ int m = a + b;
int n = m + c;
int o = n + d;
return o;
}
I expect most compilers to allocate space so that a stack variable
will have storage for its lifetime. A stack frame for this function
would look like:
0-3 a
4-7 b
8-11 c
12-15 d
16-19 m
20-23 n
24-27 o
However, a really smart compiler could see that the following stack
frame would suffice (using the "as-if" rule):
0-3 a
4-7 b
8-11 c
12-15 d
16-19 m, o
20-23 n
And this entire discussion is partially moot because we're getting
into register allocation schemes and lots of hardware dependent
details.
However, I don't know. Do compilers actually solve this NP-complete
problem when compiling? It might not be that bad as it's a NP problem
with input size of the number of stack variables which isn't that
large most of the time. I suspect no.
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
0
|
|
|
|
Reply
|
joshuamaurice
|
3/27/2009 2:07:51 AM
|
|
On 27 Mar, 08:07, joshuamaur...@gmail.com wrote:
> However, I don't know. Do compilers actually solve this NP-complete
> problem when compiling? It might not be that bad as it's a NP problem
> with input size of the number of stack variables which isn't that
> large most of the time. I suspect no.
They certainly attempt to solve some NP-complete problems. For
instance, the optimal placing of blocks is equivalent to the
travelling salesman problem. G++ used to have an option -ftsp-ordering
flag to actually solve (I assume heuristically) this problem.
We're getting outside the scope of this thread, and I suspect
comp.compilers is a better place for further discussion.
-Ed
--
(You can't go wrong with psycho-rats.)(http://mi.eng.cam.ac.uk/~er258)
/d{def}def/f{/Times s selectfont}d/s{11}d/r{roll}d f 2/m{moveto}d -1
r 230 350 m 0 1 179{ 1 index show 88 rotate 4 mul 0 rmoveto}for/s 12
d f pop 235 420 translate 0 0 moveto 1 2 scale show showpage
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
0
|
|
|
|
Reply
|
Edward
|
3/28/2009 2:30:45 AM
|
|
joshuamaurice@gmail.com wrote:
> On Mar 26, 12:41 pm, Kenneth Porter <shiva.blackl...@sewingwitch.com>
> wrote:
>> hongseok.y...@gmail.com wrote in news:0293620c-7040-4dd1-887b-
>> d0649c6a7...@j18g2000prm.googlegroups.com:
>>
>> > In above code, how much size of stack are allocated at the
>> > beginning of the block? 4 or 8(assume that int is 4 byte) byte?
>> > Is it more efficient if I move up >>int b;<< to next line of >>int
>> > a;<<?
>>
>> The compiler will typically pre-allocate the memory for everything in
>> the function (ie. add a constant to the stack pointer one time at the
>> top) but only invoke constructors and destructors as the declarations
>> are reached.
>>
>> A good compiler will identify the maximum amount of storage needed
>> and share stack space between objects in different non-nested blocks.
>
> Actually, I suspect no. This is the knapsack problem (or something
> very much like it), so I hope a compiler isn't brute forcing the
> solution to a NP complete problem.
No. Most optimising compilers will only re-use memory storage for
objects with dis-joint lifetimes.
The register-allocation scheme is usually more sophisticated, in that it
can recognise the situation where an object allocated to a register is
used as operand now and not ever afterwards. Then the compiler will
typically re-use that register for the result of the operation without
storing the old value to memory.
> Now, you can get pretty close with
> a minor pessimizations and do it P time, but not the actual optimal
> solution. (The pessimization is just allocate storage for the stack
> objects lifetime.)
>
> For example:
> int foo(int a, int b, int c, int d)
> { int m = a + b;
> int n = m + c;
> int o = n + d;
> return o;
> }
>
> I expect most compilers to allocate space so that a stack variable
> will have storage for its lifetime. A stack frame for this function
> would look like:
> 0-3 a
> 4-7 b
> 8-11 c
> 12-15 d
> 16-19 m
> 20-23 n
> 24-27 o
I would expect an optimising compiler to use 16 or 20 bytes of stack and
one register. 16 bytes for the parameters a,b,c and d and optionally 4
bytes for the result (if that is passed through the stack).
This assumes a calling convention where all parameters are passed on the
stack. Return values could be on the stack or in a register.
>
> However, a really smart compiler could see that the following stack
> frame would suffice (using the "as-if" rule):
> 0-3 a
> 4-7 b
> 8-11 c
> 12-15 d
> 16-19 m, o
> 20-23 n
>
> And this entire discussion is partially moot because we're getting
> into register allocation schemes and lots of hardware dependent
> details.
>
> However, I don't know. Do compilers actually solve this NP-complete
> problem when compiling? It might not be that bad as it's a NP problem
> with input size of the number of stack variables which isn't that
> large most of the time. I suspect no.
To my knowledge, yes compilers do solve this problem. Somehow, I am even
not convinced it is NP-complete.
If I were to write a mechanism for this, I would combine it with
register allocation in a four-pass algorithm:
1. Determine for each used object it's usage scope (where within its
lifetime the object is last used; including the call to a non-trivial
destructor)
2. Perform register allocation (using the usage information from the
previous pass). Keep track of which objects can not be eliminated
completely from memory.
3. going through all the blocks, calculate how much memory is needed to
store all objects (that need storing in memory) at the end of
leaf-blocks in the nesting tree. The maximum of these calculations is
the amount of stack space needed for the function.
4. Now you can use a simple stack mechanism to allocate memory locations
on the stack to each object.
Except for the register allocation, each step is clearly P. I am not
familiar enough with the algorithms for register allocation to tell if
they are P or NP.
Bart v Ingen Schenau
--
a.c.l.l.c-c++ FAQ: http://www.comeaucomputing.com/learn/faq
c.l.c FAQ: http://c-faq.com/
c.l.c++ FAQ: http://www.parashift.com/c++-faq-lite/
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
0
|
|
|
|
Reply
|
Bart
|
3/28/2009 3:00:05 AM
|
|
Bart van Ingen Schenau wrote:
> joshuamaurice@gmail.com wrote:
>
>> On Mar 26, 12:41 pm, Kenneth Porter <shiva.blackl...@sewingwitch.com>
>> wrote:
>>> hongseok.y...@gmail.com wrote in news:0293620c-7040-4dd1-887b-
>>> d0649c6a7...@j18g2000prm.googlegroups.com:
>>>
>>>> In above code, how much size of stack are allocated at the
>>>> beginning of the block? 4 or 8(assume that int is 4 byte) byte?
>>>> Is it more efficient if I move up >>int b;<< to next line of >>int
>>>> a;<<?
>>> The compiler will typically pre-allocate the memory for everything in
>>> the function (ie. add a constant to the stack pointer one time at the
>>> top) but only invoke constructors and destructors as the declarations
>>> are reached.
>>>
>>> A good compiler will identify the maximum amount of storage needed
>>> and share stack space between objects in different non-nested blocks.
>> Actually, I suspect no. This is the knapsack problem (or something
>> very much like it), so I hope a compiler isn't brute forcing the
>> solution to a NP complete problem.
>
> No. Most optimising compilers will only re-use memory storage for
> objects with dis-joint lifetimes.
Last time I looked, VC++ and DMC++ did not do that, whereas g++ did. I
just ran this test with g++:
#include <iostream>
using namespace std;
void* fun(bool f)
{
if (f) {
int a;
return &a;
} else {
int b;
return &b;
}
}
int main(int argc, char **argv)
{
void * p1 = fun(false);
void * p2 = fun(true);
if (p1 == p2) cout << "Yah\n";
else cout << "Nah\n";
}
In no-flag compilation p or up to -O2 compilation, the answer was "Yah".
In -O3 compilation, the answer was "Nah". Not that that proves anything
of substance.
> The register-allocation scheme is usually more sophisticated, in that it
> can recognise the situation where an object allocated to a register is
> used as operand now and not ever afterwards. Then the compiler will
> typically re-use that register for the result of the operation without
> storing the old value to memory.
>
>> Now, you can get pretty close with
>> a minor pessimizations and do it P time, but not the actual optimal
>> solution. (The pessimization is just allocate storage for the stack
>> objects lifetime.)
>>
>> For example:
>> int foo(int a, int b, int c, int d)
>> { int m = a + b;
>> int n = m + c;
>> int o = n + d;
>> return o;
>> }
>>
>> I expect most compilers to allocate space so that a stack variable
>> will have storage for its lifetime. A stack frame for this function
>> would look like:
>> 0-3 a
>> 4-7 b
>> 8-11 c
>> 12-15 d
>> 16-19 m
>> 20-23 n
>> 24-27 o
>
> I would expect an optimising compiler to use 16 or 20 bytes of stack and
> one register. 16 bytes for the parameters a,b,c and d and optionally 4
> bytes for the result (if that is passed through the stack).
> This assumes a calling convention where all parameters are passed on the
> stack. Return values could be on the stack or in a register.
>
>> However, a really smart compiler could see that the following stack
>> frame would suffice (using the "as-if" rule):
>> 0-3 a
>> 4-7 b
>> 8-11 c
>> 12-15 d
>> 16-19 m, o
>> 20-23 n
>>
>> And this entire discussion is partially moot because we're getting
>> into register allocation schemes and lots of hardware dependent
>> details.
>>
>> However, I don't know. Do compilers actually solve this NP-complete
>> problem when compiling? It might not be that bad as it's a NP problem
>> with input size of the number of stack variables which isn't that
>> large most of the time. I suspect no.
>
> To my knowledge, yes compilers do solve this problem. Somehow, I am even
> not convinced it is NP-complete.
> If I were to write a mechanism for this, I would combine it with
> register allocation in a four-pass algorithm:
> 1. Determine for each used object it's usage scope (where within its
> lifetime the object is last used; including the call to a non-trivial
> destructor)
> 2. Perform register allocation (using the usage information from the
> previous pass). Keep track of which objects can not be eliminated
> completely from memory.
> 3. going through all the blocks, calculate how much memory is needed to
> store all objects (that need storing in memory) at the end of
> leaf-blocks in the nesting tree. The maximum of these calculations is
> the amount of stack space needed for the function.
> 4. Now you can use a simple stack mechanism to allocate memory locations
> on the stack to each object.
>
> Except for the register allocation, each step is clearly P. I am not
> familiar enough with the algorithms for register allocation to tell if
> they are P or NP.
The Internet is always helpful. In a fraction of the time needed to
think of and write down a scheme from scratch, a google search leads to
e.g. http://en.wikipedia.org/wiki/Register_allocation.
Andrei
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
0
|
|
|
|
Reply
|
Andrei
|
3/28/2009 8:17:24 PM
|
|
|
10 Replies
103 Views
(page loaded in 0.484 seconds)
|