
### Questions on character constants


There are 2 points in Sec. 6.4.4.4, describing character constants,
that are not entirely clear to me. It may be that I have not read the
text well, or that I have not correctly understood the issues of
character encodings.

In p10 there is the sentence "The value of an integer character
constant containing a single character that maps to a single-byte
execution character is the numerical value of the representation of
the mapped character interpreted as an integer".
This confirms that a single character of the source set may be mapped
to multiple bytes in the execution character set (and this is
consistent with other parts of the standard). But still in p10 there
is the sentence "If an integer character constant contains a single
character or escape sequence, its value is the one that results when
an object with type char whose value is that of the single character
or escape sequence is converted to type int". This sentence seems to
imply that the value corresponding to a single character (or escape
sequence) can fit into a single object of type char, i.e., into a
single byte. Doesn't the latter sentence contradict the former (and
other parts of the standard)?

In p11 there is the sentence "The value of a wide character constant
containing a single multibyte character that maps to a member of the
extended execution character set is the wide character corresponding
to that multibyte character, as defined by the mbtowc function, with
an implementation-defined current locale."
This sentence suggests to me that the function mbtowc maps the
multibyte encoding of a character of the *source* character set to a
wide character.
I find this surprising for the following reasons:
1) the second parameter of mbtowc is a char *, so a pointer to bytes
in the execution environment
2) wctomb operates at runtime, so I think it converts a wide character
to a multibyte encoding in the execution environment; I would expect
wctomb and mbtowc to be inverses of each other

One more question: a byte is (sec. 3.3.6) a unit of data storage of
the execution environment.
Isn't it possible that the host environment has units of data storage
with a different number of bits?

On 12/12/2010 10:24 AM, Luca Forlizzi wrote:
> There are 2 points in Sec. 6.4.4.4, describing character constants,
> that are not entirely clear to me. It may be that I have not read the
> text well, or that I have not correctly understood the issues of
> character encodings.
>
> In p10 there is the sentence "The value of an integer character
> constant containing a single character that maps to a single-byte
> execution character is the numerical value of the representation of
> the mapped character interpreted as an integer".
> This confirms that a single character of the source set may be mapped
> to multiple bytes in the execution character set (and this is
> consistent with other parts of the standard). But still in p10 there
> is the sentence "If an integer character constant contains a single
> character or escape sequence, its value is the one that results when
> an object with type char whose value is that of the single character
> or escape sequence is converted to type int". This sentence seems to
> imply that the value corresponding to a single character (or escape
> sequence) can fit into a single object of type char, i.e., into a
> single byte. Doesn't the latter sentence contradict the former (and
> other parts of the standard)?

"The escape sequence" refers to the source-code escape sequences,
multi-source-character sequences like '\n' or '\xFF'.  When you write
the resulting character to a stream, the implementation might use an
encoding scheme like Shift JIS that employs "escape sequences" of its
own, but these "escape sequences" are not the source-level constructs
described in 6.4.4.4.

(I'll pass on your question about mbtowc() et al. because I have
used them only a few times, and even then without real understanding.)

> One more question: a byte is (sec. 3.3.6) a unit of data storage of

ITYM 3.6.

> the execution environment.
> Isn't it possible that the host environment has units of data storage
> with a different
> number of bits?

Yes, and in fact it's quite common.  Very many C platforms support
"units" of many sizes: bytes, halfwords, words, doublewords, pages, ...
The crucial requirement in 3.6 is not only that this unit exist, but
that it be an "addressable unit," because in C's view nearly every data
object can be treated as an array of bytes.  Even if this treatment is
not "natural" for the underlying platform, the C implementation must
somehow make the array-of-bytes view work.  For example, the original
DEC Alpha supported 32- and 64-bit units, and used shifts and masks
to simulate byte access within those larger blobs.

--
Eric Sosman
esosman@ieee-dot-org.invalid
 0

On Dec 12, 9:24 am, Luca Forlizzi <luca.forli...@gmail.com> wrote:
> There are 2 points in Sec. 6.4.4.4, describing character constants,
> that are not entirely clear to me. It may be that I have not read the
> text well, or that I have not correctly understood the issues of
> character encodings.
>
> In p10 there is the sentence "The value of an integer character
> constant containing a single character that maps to a single-byte
> execution character is the numerical value of the representation of
> the mapped character interpreted as an integer".
> This confirms that a single character of the source set may be mapped
> to multiple bytes in the execution character set (and this is
> consistent with other parts of the standard).
[snip]

It may be helpful, in learning C, to shift your perspective from bytes
to words. One of C's original, primary purposes was to be fast. This
means that the compiler is less concerned with the spatial arrangement
of code and data in core memory, and more focused on the temporal
arrangement of instructions as they will be executed.

So when the standard talks about 'integer character constant's, it's
dead fucking serious. This constant is not a char. It's an int. This is
because when it gets loaded into a register, it'll be an integer-sized
register. It doesn't matter if there are separately-addressable
byte-sized BH and BL registers, it's gonna use EBX.

to circle or highlight the nouns with all their adjectives and

So in this case,

"The
value
[-- perhaps they could've said 'value yielded in an expression' --]

of an

integer character constant
[-- this is just the term being defined, so we don't get to assume

containing a

single character that maps to a single-byte execution character
[-- so it's telling us absolutely nothing about characters that
do not map to a 'single-byte execution character', whatever that
may mean --]

is the

numerical value of the representation of the mapped character
interpreted as an integer"
[-- remember we're talking about the 'value' of this creature.
Its value is a number. It's whatever number it needs to be
to match the 'representation of the mapped character' if you
had to give the most obvious number to it. --]

lxt
--
Hopefully this comes off more useful than patronizing.