{ The question concerns the two C++0x types char16_t and char32_t. -mod }
Hi All,
Any idea when these two types will be commonly supported across major
compilers?
2010? 2011? Later?
My primary development environment is VS2008 on Windows.
TIA,
-Le Chaud Lapin-
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
jaibuduvin (188) | 1/5/2010 2:40:39 AM

On 5 Jan, 02:35, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> { The question concerns the two C++0x types char16_t and char32_t. -mod }
>
> Hi All,
>
> Any idea when these two types will be commonly supported across major
> compilers?
>
> 2010? 2011? Later?
No idea. I didn't even know this was being considered. Having been in
Javaland recently, I do miss Unicode. IMO it is a bit of a hole in C++,
especially when it comes to handling XML. I reckon explicit Unicode
support would be far more useful than char16_t and char32_t. I think
talking about char16_t and char32_t is too low-level; the higher-level
concepts are more useful. I can't see it happening for C++
though.
>
> My primary development environment is VS2008 on Windows.
Mine is also VS at the moment. I am using the C++ version of Xerces
for XML work and it is a right pain to convert back and forth between
C-style const char* and XMLCh*, which is Xerces' way of doing Unicode.
char16_t and/or char32_t would be no help there.
Regards,
Andrew Marlow
Andrew | 1/5/2010 7:20:50 PM

On Jan 5, 2:35 am, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> { The question concerns the two C++0x types char16_t and char32_t. -mod }
>
> Hi All,
>
> Any idea when these two types will be commonly supported across major
> compilers?
The types themselves are of little use.
What you want are Unicode character and string literals. I know GCC
supports them since version 4.5, no idea about MSVC.
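As an illustration of the literals mentioned above, here is a minimal sketch, assuming a compiler that already implements the C++0x u"..." and U"..." prefixes. The names kUtf16 and kUtf32 and the chosen characters are made up for this example.

```cpp
#include <cstddef>
#include <type_traits>

// u"..." yields an array of const char16_t (UTF-16 code units),
// U"..." yields an array of const char32_t (UTF-32 code units).
// Hypothetical names, chosen only for this sketch.
constexpr char16_t kUtf16[] = u"\u00E9\U0001F600";  // U+00E9 plus one non-BMP character
constexpr char32_t kUtf32[] = U"\u00E9\U0001F600";

// Number of code units, not counting the terminating zero.
constexpr std::size_t kUtf16Units = sizeof(kUtf16) / sizeof(kUtf16[0]) - 1;
constexpr std::size_t kUtf32Units = sizeof(kUtf32) / sizeof(kUtf32[0]) - 1;

// The character literals carry the new types as well.
static_assert(std::is_same<decltype(u'a'), char16_t>::value, "u'a' is a char16_t");
static_assert(std::is_same<decltype(U'a'), char32_t>::value, "U'a' is a char32_t");
```

The non-BMP character takes two code units (a surrogate pair) in the UTF-16 string and one in the UTF-32 string, which is why the two counts differ.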
Mathias | 1/8/2010 6:50:28 AM

On Jan 8, 12:48 am, Mathias Gaunard <loufo...@gmail.com> wrote:
> On Jan 5, 2:35 am, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
>
> > { The question concerns the two C++0x types char16_t and char32_t. -mod }
>
> > Hi All,
>
> > Any idea when these two types will be commonly supported across major
> > compilers?
>
> The types themselves are of little use.
> What you want are Unicode character and string literals. I know GCC
> supports them since version 4.5, no idea about MSVC.
Then what are they for, and why have they been included in C++0x?
-Le Chaud Lapin-
Le | 1/8/2010 1:51:00 PM

On Jan 8, 1:40 pm, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> On Jan 8, 12:48 am, Mathias Gaunard <loufo...@gmail.com> wrote:
> > On Jan 5, 2:35 am, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> > > Any idea when these two types will be commonly supported
> > > across major compilers?
> > The types themselves are of little use.
> > What you want are Unicode character and string literals. I know GCC
> > supports them since version 4.5, no idea about MSVC.
> Then what are they for, and why have they been included in C++0x?
Because you can't implement UTF-16 and UTF-32 character and
string literals without them. (Except that I disagree with
regards to their utility. I find it rather a step forward to
have a type which I know can hold a UTF-16 or a UTF-32 element.)
--
James Kanze
James | 1/12/2010 12:01:34 AM

James Kanze <james.kanze@gmail.com> wrote in news:fc2b2c43-3e94-4f6d-aaa4-
2b7c9d856ad2@d20g2000yqh.googlegroups.com:
> On Jan 8, 1:40 pm, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
>
>> On Jan 8, 12:48 am, Mathias Gaunard <loufo...@gmail.com> wrote:
>>
>> Then what are they for, and why have they been included in C++0x?
>
> Because you can't implement UTF-16 and UTF-32 character and
> string literals without them.
Perhaps this is a FAQ, but I'm wondering what role wchar_t has in C++0x now
that we have types char16_t and char32_t. Is wchar_t now considered "legacy
support only" or is there some feature that it provides that is not
provided by the new types?
Peter
Peter | 1/12/2010 2:17:51 AM

On 11 jan, 22:01, James Kanze <james.ka...@gmail.com> wrote:
> Because you can't implement UTF-16 and UTF-32 character and
> string literals without them. (Except that I disagree with
> regards to their utility. I find it rather a step forward to
> have a type which I know can hold a UTF-16 or a UTF-32 element.)
uint_least16_t and uint_least32_t work just as well for that.
Sure, they don't have the semantics attached to them, but that's not
needed to hold elements.
Mathias | 1/16/2010 9:26:51 PM

On 16 Jan., 20:26, Mathias Gaunard <loufo...@gmail.com> wrote:
> On 11 jan, 22:01, James Kanze <james.ka...@gmail.com> wrote:
>
> > I find it rather a step forward to have a type
> > which I know can hold a UTF-16 or a UTF-32 element.)
>
> uint_least16_t and uint_least32_t work just as well for that.
> Sure, it doesn't have the semantic attached to it, but that's not
> needed to hold elements.
Though, there is a difference between uint_least16_t and char16_t. The
first is just an alias of some integer type while the second is a
distinct type. I'm not sure how important this is in reality but it
affects overloading, for example. Just wanted to mention this.
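A minimal sketch of the overloading difference, assuming a compiler where char16_t and char32_t are real built-in types. The function name kind and its return values are hypothetical, picked only for this example.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical overload set; the return values are arbitrary tags.
constexpr int kind(std::uint_least16_t) { return 0; }   // ordinary integer
constexpr int kind(char16_t)            { return 16; }  // UTF-16 code unit
constexpr int kind(char32_t)            { return 32; }  // UTF-32 code unit

// Holds only where char16_t is a real built-in type. If it were a typedef
// for an integer type, the first two overloads above would collide.
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value,
              "char16_t must not be an alias of uint_least16_t");
```

With this in place, kind(u'A') picks the char16_t overload while kind(std::uint_least16_t(65)) picks the integer one, even though both arguments have the same size and value.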
Cheers,
SG
SG | 1/18/2010 4:49:14 PM

On Jan 18, 8:49 am, SG <s.gesem...@gmail.com> wrote:
> On 16 Jan., 20:26, Mathias Gaunard <loufo...@gmail.com> wrote:
> > uint_least16_t and uint_least32_t work just as well for that.
> > Sure, it doesn't have the semantic attached to it, but that's not
> > needed to hold elements.
>
> Though, there is a difference between uint_least16_t and char16_t. The
> first is just an alias of some integer type while the second is a
> distinct type. I'm not sure how important this is in reality but it
> affects overloading, for example. Just wanted to mention this.
Which is one of the reasons I need the real char16_t and char32_t.
My code is type-driven, and being network-oriented, numerical type
codes for char16_t and char32_t must remain invariant from one network
node to another.
What disturbs me is that, within the next few months, I will be forced
to commit. I will have to choose wchar_t or char16_t, but as char16_t
is not available, and there is no indication of when it might become
available, I must use wchar_t for now, but will not be able to change
to char16_t when it becomes available without a total recall of all
deployed network software.
Patchwork-trickery of various kinds that one might imagine to
circumvent this issue do not seem to work in my heavily type-driven
code. :(
-Le Chaud Lapin-
Le | 1/18/2010 9:59:55 PM

In article <20b7032a-896f-4702-b9b8-62d164ec5474
@h9g2000yqa.googlegroups.com>, jaibuduvin@gmail.com says...
>
> { The question concerns the two C++0x types char16_t and char32_t. -mod }
>
> Hi All,
>
> Any idea when these two types [char16_t and char32_t] will be
> commonly supported across major compilers?
They are currently supported in gcc and the beta version of VS/VC++
2010. I'd expect that most compilers that don't support them already
will probably add that support quite quickly.
--
Later,
Jerry.
Jerry | 1/19/2010 8:45:08 AM

Le Chaud Lapin wrote:
> On Jan 18, 8:49 am, SG <s.gesem...@gmail.com> wrote:
>> On 16 Jan., 20:26, Mathias Gaunard <loufo...@gmail.com> wrote:
>
>>> uint_least16_t and uint_least32_t work just as well for that.
>>> Sure, it doesn't have the semantic attached to it, but that's not
>>> needed to hold elements.
>> Though, there is a difference between uint_least16_t and char16_t. The
>> first is just an alias of some integer type while the second is a
>> distinct type. I'm not sure how important this is in reality but it
>> affects overloading, for example. Just wanted to mention this.
>
> Which is one of the reasons I need the real char16_t and char32_t.
> My code is type-driven, and being network-oriented, numerical type
> codes for char16_t and char32_t must remain invariant from one network
> node to another.
>
What do you mean by "numerical type codes"? Is this something compiler
specific?
cheers,
Martin
Martin | 1/20/2010 8:13:39 AM

On Jan 19, 12:45 am, Jerry Coffin <jerryvcof...@yahoo.com> wrote:
> In article <20b7032a-896f-4702-b9b8-62d164ec5474
> @h9g2000yqa.googlegroups.com>, jaibudu...@gmail.com says...
> > { The question concerns the two C++0x types char16_t and char32_t. -mod }
>
> > Hi All,
>
> > Any idea when these two types [char16_t and char32_t] will be
> > commonly supported across major compilers?
>
> They are currently supported in gcc and the beta version of VS/VC++
> 2010. I'd expect that most compilers that don't support them already
> will probably add that support quite quickly.
Thanks Jerry.
I just had one of my engineers check to see if VS2010 beta actually
supports char16_t versus simply making it an alias for something
else, the obvious choice being wchar_t, and it appears that Microsoft,
right now, is simply making it an alias:
/* uchar PROPERTIES */
#if _HAS_CHAR16_T_LANGUAGE_SUPPORT
#else /* _HAS_CHAR16_T_LANGUAGE_SUPPORT */
#if !defined(_CHAR16T)
#define _CHAR16T
typedef unsigned short char16_t;
typedef unsigned int char32_t;
#endif /* !defined(_CHAR16T) */
#endif /* _HAS_CHAR16_T_LANGUAGE_SUPPORT */
My system relies on char16_t and char32_t being distinct types, not
aliases for anything else, so unfortunately, I will have to stay with
wchar_t for now.
Also, after a bit of musing a few weeks ago about the feasibility/
appropriateness of adding char16_t/char32_t to C++ as distinct types,
I arrived at the conclusion that it was not as trivial as it might
seem for the compiler developer.
Unfortunately, I cannot recall the exact thought process that led me
to this conclusion. I think it had to do with hard choices regarding
policy. But it does not surprise me that Microsoft has deferred, at
least for the time being, on making these bona fide distinct types.
-Le Chaud Lapin-
Le | 1/20/2010 8:14:33 AM

On 18 jan, 19:59, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> What disturbs me is that, within the next few months, I will be forced
> to commit. I will have to choose wchar_t or char16_t, but as char16_t
> is not available, and there is no indication of when it might become
> available, I must use wchar_t for now, but will not be able to change
> to char16_t when it becomes available without a total recall of all
> deployed network software.
I think the types themselves are supported since GCC 4.3 or 4.4, which
are production quality.
Mathias | 1/20/2010 8:15:24 AM

On Jan 20, 8:13 am, "Martin B." <0xCDCDC...@gmx.at> wrote:
> Le Chaud Lapin wrote:
> > Which is one of the reasons I need the real char16_t and char32_t.
> > My code is type-driven, and being network-oriented, numerical type
> > codes for char16_t and char32_t must remain invariant from one network
> > node to another.
>
> What do you mean by "numerical type codes"? Is this something compiler
> specific?
No, just my thing.
I generate a code from 1 to 16 for each of the C++ scalar arithmetic
types, and use these codes in many places (serialization, for example). My
Unicode string class is currently based on wchar_t, and I would rather
use char16_t. For this to work under my model, though, char16_t must be a
distinct type, not simply an alias for, say, wchar_t; otherwise the
assigned code for char16_t would be the same as that for wchar_t,
creating confusion throughout my system.
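A rough sketch of how such a scheme might look, to show where an alias breaks it. This is not the poster's actual code: the TypeCode template, the type_code_of helper and the numeric values are all invented for illustration.

```cpp
#include <cstdint>

// Hypothetical type-code table: one explicit specialization per scalar type.
// The primary template is left undefined so unknown types fail to compile.
template <typename T> struct TypeCode;

template <> struct TypeCode<char>           { static constexpr std::uint8_t value = 1; };
template <> struct TypeCode<wchar_t>        { static constexpr std::uint8_t value = 2; };
template <> struct TypeCode<char16_t>       { static constexpr std::uint8_t value = 3; };
template <> struct TypeCode<char32_t>       { static constexpr std::uint8_t value = 4; };
template <> struct TypeCode<unsigned short> { static constexpr std::uint8_t value = 5; };
template <> struct TypeCode<unsigned int>   { static constexpr std::uint8_t value = 6; };

// Look up the code from a value's static type (hypothetical helper).
template <typename T>
constexpr std::uint8_t type_code_of(T) { return TypeCode<T>::value; }
```

If char16_t were only a typedef for unsigned short, TypeCode<char16_t> and TypeCode<unsigned short> would name the same specialization, so the table above would be rejected as a redefinition and the two types could never carry different codes on the wire.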
-Le Chaud Lapin-
Le | 1/20/2010 12:32:32 PM

Le Chaud Lapin wrote:
> On Jan 19, 12:45 am, Jerry Coffin <jerryvcof...@yahoo.com> wrote:
>> In article <20b7032a-896f-4702-b9b8-62d164ec5474
>> @h9g2000yqa.googlegroups.com>, jaibudu...@gmail.com says...
>>> { The question concerns the two C++0x types char16_t and char32_t. -mod }
>>> Hi All,
>>> Any idea when these two types [char16_t and char32_t] will be
>>> commonly supported across major compilers?
>> They are currently supported in gcc and the beta version of VS/VC++
>> 2010. I'd expect that most compilers that don't support them already
>> will probably add that support quite quickly.
>
> Thanks Jerry.
> ...
> else, the obvious choice being wchar_t, and it appears that Microsoft,
> right now, is simply making it an alias:
> ...
> typedef unsigned short char16_t;
> typedef unsigned int char32_t;
> ...
>
> ... I arrived at the conclusion that it was not as trivial as it might
> seem for the compiler developer.
>
> Unfortunately, I cannot recall the exact thought process that led me
> to this conclusion. I think it had to do with hard choices regarding
> policy. But it does not surprise me that Microsoft has deferred, at
> least for the time being, on making these bona fide distinct types.
>
But the standard mandates these being distinct types?
So it's the whole "treat xyzchar_t as builtin type: Yes/No" mess all
over again? *Sigh* :-)
br,
Martin
Martin | 1/21/2010 4:47:01 AM

Martin B. wrote:
> Le Chaud Lapin wrote:
>> On Jan 19, 12:45 am, Jerry Coffin <jerryvcof...@yahoo.com> wrote:
>>> In article <20b7032a-896f-4702-b9b8-62d164ec5474
>>> @h9g2000yqa.googlegroups.com>, jaibudu...@gmail.com says...
>>>> { The question concerns the two C++0x types char16_t and
>>>> char32_t. -mod } Hi All,
>>>> Any idea when these two types [char16_t and char32_t] will be
>>>> commonly supported across major compilers?
>>> They are currently supported in gcc and the beta version of
>>> VS/VC++ 2010. I'd expect that most compilers that don't support
>>> them already will probably add that support quite quickly.
>>
>> Thanks Jerry.
>> ...
>> else, the obvious choice being wchar_t, and it appears that
>> Microsoft, right now, is simply making it an alias:
>> ...
>> typedef unsigned short char16_t;
>> typedef unsigned int char32_t;
>> ...
>>
>> ... I arrived at the conclusion that it was not as trivial as it
>> might seem for the compiler developer.
>>
>> Unfortunately, I cannot recall the exact thought process that
>> led me to this conclusion. I think it had to do with hard choices
>> regarding policy. But it does not surprise me that Microsoft has
>> deferred, at least for the time being, on making these bona fide
>> distinct types.
>
> But the standard mandates these being distinct types?
> So it's the whole "treat xyzchar_t as builtin type: Yes/No" mess all
> over again? *Sigh* :-)
>
No.
The library has support for the new char types, but the compiler does
not. What can you do, except add a couple of (temporary) typedefs?
Bo Persson
Bo | 1/21/2010 8:16:33 PM

On 8 Jan, 14:40, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> On Jan 8, 12:48 am, Mathias Gaunard <loufo...@gmail.com> wrote:
>
> > On Jan 5, 2:35 am, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
>
> > > { The question concerns the two C++0x types char16_t and char32_t. -mod }
>
> > > Hi All,
>
> > > Any idea when these two types will be commonly supported across major
> > > compilers?
>
> > The types themselves are of little use.
> > What you want are Unicode character and string literals. I know GCC
> > supports them since version 4.5, no idea about MSVC.
>
> Then what are they for, and why have they been included in C++0x?
You need types with explicitly defined sizes to access
binary files.
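One caveat worth sketching: the draft gives char16_t the same size as uint_least16_t, so it is at least 16 bits wide but not necessarily exactly 16. A portable reader therefore assembles code units from bytes rather than copying raw memory. The helper name utf16le_unit below is hypothetical, and little-endian byte order in the file is an assumption of this example.

```cpp
#include <climits>  // CHAR_BIT

// char16_t is only guaranteed to be at least 16 bits wide.
static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "char16_t holds a 16-bit code unit");

// Combine two bytes stored in little-endian order into one UTF-16 code unit
// (hypothetical helper for this sketch).
constexpr char16_t utf16le_unit(unsigned char lo, unsigned char hi) {
    return static_cast<char16_t>(lo | (hi << 8));
}
```

For example, the bytes 0x41 0x00 read from a UTF-16LE file give utf16le_unit(0x41, 0x00), which compares equal to u'A'.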
Rune
Rune | 1/22/2010 12:57:24 AM

In article <b9f8fcbb-2abb-4642-a191-5dc940e6f941
@m25g2000yqc.googlegroups.com>, jaibuduvin@gmail.com says...
[ ... ]
> I just had one of my engineers check to see if VS2010 beta actually
> supports char16_t versus simply making it an alias for something
> else, the obvious choice being wchar_t, and it appears that Microsoft,
> right now, is simply making it an alias:
Yup -- this is what I get for trusting their blog. Doing an actual
test confirms that char16_t is simply an alias for unsigned short.
One thing that really would be nice is the "strong typedef" that's
been proposed (and even accepted, if memory serves), which would
produce a new type just like an existing one, but still
distinguishable for purposes like function overloading.
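Until something like that exists, one possible workaround is a scoped enumeration with a fixed underlying type. This is only a sketch, and it assumes a compiler with C++0x enum class support; the names Utf16Unit, Utf32Unit and width are invented for the example.

```cpp
#include <cstdint>

// Scoped enums with a fixed underlying type: same representation as the
// integer type, but distinct types as far as overload resolution goes.
enum class Utf16Unit : std::uint_least16_t {};
enum class Utf32Unit : std::uint_least32_t {};

// Hypothetical overload set; the return values are arbitrary tags.
constexpr int width(Utf16Unit)           { return 16; }
constexpr int width(Utf32Unit)           { return 32; }
constexpr int width(std::uint_least16_t) { return 0; }  // plain integer
```

The price is that conversions to and from the integer type need an explicit static_cast, and there are no matching literals, which is part of what the built-in char16_t and char32_t are meant to provide.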
--
Later,
Jerry.
Jerry | 1/22/2010 7:49:46 PM

On Jan 22, 12:57 am, Rune Allnor <all...@tele.ntnu.no> wrote:
> On 8 Jan, 14:40, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> > > > Any idea when these two types will be commonly supported across major
> > > > compilers?
>
> > > The types themselves are of little use.
> > > What you want are Unicode character and string literals. I know GCC
> > > supports them since version 4.5, no idea about MSVC.
>
> > Then what are they for, and why have they been included in C++0x?
>
> You need types with explicitly defined sizes to access
> binary files.
As I can no longer wait for char16_t and char32_t, I guess I will
commit to wchar_t being the fundamental character type of my String
class.
How often does it occur that wchar_t < 16 bits?
-Le Chaud Lapin-
Le | 1/26/2010 4:11:14 AM

On 26 Jan., 11:11, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> On Jan 22, 12:57 am, Rune Allnor <all...@tele.ntnu.no> wrote:
>
> > On 8 Jan, 14:40, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> > > > > Any idea when these two types will be commonly supported across major
> > > > > compilers?
>
> > > > The types themselves are of little use.
> > > > What you want are Unicode character and string literals. I know GCC
> > > > supports them since version 4.5, no idea about MSVC.
>
> > > Then what are they for, and why have they been included in C++0x?
>
> > You need types with explicitly defined sizes to access
> > binary files.
>
> As I can no longer wait for char16_t and char32_t, I guess I will
> commit to wchar_t being the fundamental character type of my String
> class.
>
> How often does it occur that wchar_t < 16 bits?
I don't know of any compiler that does so. If you want to be sure
you may test for the new C99 #define:
__STDC_ISO_10646__ :
An integer constant of the form yyyymmL (for example, 199712L). If
this symbol is defined, then every character in the Unicode required
set, when stored in an object of type wchar_t, has the same value as
the short identifier of that character. The Unicode required set
consists of all the characters that are defined by ISO/IEC 10646,
along with all amendments and technical corrigenda, as of the
specified year and month.
If this is not available you may add a compile-time test that checks
std::numeric_limits<wchar_t>::digits, combined with its sign
information, to see whether it meets the bit constraints, or
just check WCHAR_MIN and WCHAR_MAX for the expected
minimum ranges.
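The numeric_limits variant of that test might look roughly like this. It is a sketch that assumes static_assert is available; the names wchar_bits and wchar_is_iso10646 are made up for the example.

```cpp
#include <limits>

// Value bits of wchar_t, plus one for the sign bit if it is a signed type.
constexpr int wchar_bits =
    std::numeric_limits<wchar_t>::digits +
    (std::numeric_limits<wchar_t>::is_signed ? 1 : 0);

// Refuse to compile where wchar_t cannot hold a 16-bit code unit.
static_assert(wchar_bits >= 16, "wchar_t is narrower than 16 bits");

// Record whether the implementation promises ISO/IEC 10646 values in wchar_t.
#ifdef __STDC_ISO_10646__
constexpr bool wchar_is_iso10646 = true;
#else
constexpr bool wchar_is_iso10646 = false;
#endif
```

Note that an implementation may leave __STDC_ISO_10646__ undefined even when wchar_t is wide enough, so the width check and the macro check answer different questions.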
HTH & Greetings from Bremen,
Daniel Krügler
Daniel Krügler | 1/26/2010 4:14:46 PM

In article <3c77f886-7ca2-42d2-b132-741d6430f8f9
@c34g2000yqn.googlegroups.com>, jaibuduvin@gmail.com says...
[ ... ]
> How often does it occur that wchar_t < 16 bits?
The standards allow wchar_t to be an 8-bit type, but I'm not sure
I've ever seen or heard of a compiler that actually used less than
16.
--
Later,
Jerry.
Jerry | 1/26/2010 10:49:46 PM