C++, unions, and type punning

After seeing the discussion in <http://blog.regehr.org/archives/959>, I
was motivated to try to track down the legality of various
type-punning paradigms in the various C and C++ specifications. The
union trick in particular, reproduced below, is one whose
well-definedness I had trouble tracking down:

uint32_t getFloatRepr(float f) {
   union {
     float a;
     uint32_t b;
   } u;
   u.a = f;
   return u.b;
}

In C99 and C11 [they have almost identical text in the relevant
sections], the semantics outright state in a footnote (to 6.5.2.3) that:
> If the member used to access the contents of a union object is not
> the same as the member last used to store a value in the object, the
> appropriate part of the object representation of the value is
> reinterpreted as an object representation in the new type as
> described in 6.2.6 (a process sometimes called “type punning”). This
> might be a trap representation.

So in modern C dialects, up to the inherent unspecified nature of the
representation of types, the above code segment is legal and
well-defined. There are also explicit further notes that:
1. Padding bytes in aggregates have unspecified values
2. The bytes that correspond to union members other than the one last
stored into take unspecified values
3. If all members of a union share the same common struct fields at the
beginning, then those fields can be viewed at any time [this seems
redundant given other operative text]
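
To illustrate the third point, here is a minimal sketch. The names
IntEvent, FloatEvent, Event and kindOf are made up for illustration and
come from neither standard; the sketch assumes both structs are
standard-layout and begin with the same int field:

```cpp
// Hypothetical types, for illustration only.
struct IntEvent   { int kind; int   value; };
struct FloatEvent { int kind; float value; };

union Event {
    IntEvent   i;
    FloatEvent f;
};

// 'kind' is the shared leading field of both structs, so it may be
// inspected through either member, whichever one was last stored.
int kindOf(const Event& e) {
    return e.i.kind;
}
```

Only the shared leading field is covered by this rule; reading
e.i.value after storing e.f is the type-punning case in question.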

But in C++11, I was unable to find any confirmation one way or the other
when it came to validity of this type-punning trick. The operative text
on unions doesn't contain any information on this, except for a
reproduction of the third point from C and this line:
> In a union, at most one of the non-static data members can be active
> at any time, that is, the value of at most one of the non-static
> data members can be stored in a union at any time.

The only other thing which comes close to mentioning validity is the
strict aliasing rules, which explicitly permit:
> an aggregate or union type that includes one of the aforementioned
> types among its elements or nonstatic data members (including,
> recursively, an element or non-static data member of a subaggregate
> or contained union),

So it is still unclear to me whether this kind of type-punning is
well-defined in C++11.

-- 
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth


      [ See http://www.gotw.ca/resources/clcm.htm for info about ]
      [ comp.lang.c++.moderated.    First time posters: Do this! ]
Posted by Joshua Cranmer, 6/14/2013 8:41:36 PM

On Friday, 14 June 2013 23:41:36 UTC+3, Joshua Cranmer 🐧  wrote:
> After seeing the discussion in
> <http://blog.regehr.org/archives/959>, I was motivated to try to
> track down what the legality of various type-punning paradigms in
> the various C and C++ specifications. The union trick in particular,
> reproduced below is one whose well-definedness I had trouble
> tracking down:
>
> uint32_t getFloatRepr(float f) {
>    union {
>      float a;
>      uint32_t b;
>    } u;
>    u.a = f;
>    return u.b;
> }

+1. I have always wondered the same thing about unions.

What you get in practice from the above code in C++ is a working
program and maybe a warning. I have not met tests that fail. However,
by the letter of the law it looks illegal. We can certainly use 'memcpy'
or write that part in C, but the status of the union is murky.
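
For reference, a minimal sketch of the 'memcpy' alternative. The name
getFloatReprMemcpy is made up, and the static_assert encodes the
assumption that float is 32 bits wide, which neither standard
guarantees:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical counterpart to getFloatRepr from the original post.
std::uint32_t getFloatReprMemcpy(float f) {
    // Assumption: float and uint32_t have the same size on this platform.
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    // Copy the object representation byte by byte; no union member or
    // pointer of the wrong type is ever read through.
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On an implementation that uses IEEE 754 single precision,
getFloatReprMemcpy(1.0f) yields 0x3F800000.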

That is a pity, since 'union' feels like it exists for compatibility
with C. Classes with improved type safety (like 'boost::variant') are a
lot better for the rest of the use-cases in C++.

I remember that compatibility with existing code was the major
argument of "dark forces" against readable keywords (and for
introducing Klingon elements like [[squaredugliness]], 'ovrdecl' or
'finaldecl') when it was discussed here. That makes me wonder what
the major argument is for such an odd behavioral incompatibility with C?



Posted 6/15/2013 3:30:22 AM

On Fri, 14 Jun 2013 13:41:36 -0700 (PDT)
Joshua Cranmer  <Pidgeot18@verizon.invalid> wrote:

> In C99 and C11 [they have almost identical text in the relative
> sections], the semantics outright state in a footnote (to 6.5.2.3)
> that:
> > If the member used to access the contents of a union object is not
> > the same as the member last used to store a value in the object,
> > the appropriate part of the object representation of the value is
> > reinterpreted as an object representation in the new type as
> > described in 6.2.6 (a process sometimes called “type
> > punning”). This might be a trap representation.
>
> So in modern C dialects, up to the inherent unspecified nature of
> the representation of types, the above code segment is legal and
> well-defined.

Hmm, but what does "might be a trap representation" mean, then?  That
the compiler will treat the memory as the "new type" but that the
hardware might object?  Not everyone would call that well defined!

> But in C++11, I was unable to find any confirmation one way or the
> other when it came to validity of this type-punning trick. The
> operative text on unions doesn't contain any information on this,
> except for a reproduction of the third point from C and this line:
>
> > In a union, at most one of the non-static data members can be
> > active at any time, that is, the value of at most one of the
> > non-static data members can be stored in a union at any time.

As I read it, that says a union holds only what you put into it.
That's the active member.  If you read from an inactive member, Thor
will smite you for a thousand generations.

IMO this is unfortunate, part of the tut-tutting that's crept into the
language in the past decade.  Time was when "type-punning" had no name
and was common practice; certainly it's clearer notation than any
alternative.  I hear a rumor that explicitly allowing it is being
entertained for C++17.  Hope springs eternal.

--jkl



Posted by James K. Lowden, 6/15/2013 3:33:11 AM

On 15/06/2013 10:33, James K. Lowden wrote:
>
> On Fri, 14 Jun 2013 13:41:36 -0700 (PDT)
> Joshua Cranmer  <Pidgeot18@verizon.invalid> wrote:
>
>> In C99 and C11 [they have almost identical text in the relative
>> sections], the semantics outright state in a footnote (to 6.5.2.3)
>> that:
>>> If the member used to access the contents of a union object is not
>>> the same as the member last used to store a value in the object,
>>> the appropriate part of the object representation of the value is
>>> reinterpreted as an object representation in the new type as
>>> described in 6.2.6 (a process sometimes called “type
>>> punning”). This might be a trap representation.
>>
>> So in modern C dialects, up to the inherent unspecified nature of
>> the representation of types, the above code segment is legal and
>> well-defined.
>
> Hmm, but what does "might be a trap representation" mean, then?
> That the compiler will treat the memory as the "new type" but that
> the hardware might object?  Not everyone would call that well
> defined!

Indeed, it is the potential existence of trap values that has
historically meant that this kind of type punning has undefined
behaviour. It is non-portable (and that is one of the reasons that
both C and C++ classify some code as having undefined behaviour). For
the overwhelming majority of cases, even on platforms where there are
potential trap values, the code will behave as expected (for some value
of 'expected'), but it is potentially unsafe and the safety cannot be
generally checked at compile time because it will depend on runtime
values (and sometimes on hardware settings).

It is little different to:

int i, j, k;
// code setting runtime values for i and j
k = i+j;  // might trap on overflow

except in this case it is possible, though cumbersome, for the
programmer to pre-check the values in i and j and not execute the
statement if overflow will occur.
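
A minimal sketch of such a pre-check, using a made-up helper name
(checkedAdd) and assuming plain int operands as in the snippet above:

```cpp
#include <climits>

// Hypothetical helper: returns true and stores i + j in *out only when
// the sum is representable in int; otherwise leaves *out untouched.
bool checkedAdd(int i, int j, int* out) {
    if (j > 0 && i > INT_MAX - j) return false;  // would overflow upward
    if (j < 0 && i < INT_MIN - j) return false;  // would overflow downward
    *out = i + j;                                // safe: result fits in int
    return true;
}
```

The comparisons themselves never overflow, because INT_MAX - j is only
evaluated for positive j and INT_MIN - j only for negative j.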

Francis





Posted by Francis, 6/15/2013 9:24:56 AM