I've seen a lot of functions in the standard library that deal with
characters, but a lot of them return or take parameters of type int
(usually the integer that represents the character code in the
character set). I know that C automatically converts between int and
char in those cases without problems (or could the signed/unsigned
issue cause problems?), and that character constants actually have
type int. But what is the reason for some functions in the standard
library to return type int (or take parameters of type int) when one
is supposed to be dealing with characters instead of numbers?
Thanks,
Sebastian
s0suk3 (372)
7/4/2008 7:49:45 AM
On Jul 4, 12:49 pm, s0s...@gmail.com wrote:
> I've seen a lot of functions in the standard library that deal with
> characters, but a lot of them return/take parameters of type int (which
> is usually the integer that represents the character code in the
> character set). I know that C automatically converts between int and
> char in those cases without problems (or could the signed/unsigned
> issue cause problems?), and that character constants have actually
> type int, but, what is the reason for some functions in the standard
> library to return type int (or take parameters of type int) when one
> is supposed to be dealing with characters instead of numbers?
>
> Thanks,
> Sebastian
Consider fgetc(). It actually returns an unsigned integer cast to an
int. Casting to int is required because EOF is defined as a negative
int (generally, but not necessarily, -1). So, other than EOF, we
get an unsigned int.
rahulsinner (151)
7/4/2008 7:53:14 AM
On 4 Jul, 09:59, Richard Heathfield <r...@see.sig.invalid> wrote:
> rahul said:
>
> <snip>
>
> > Consider fgetc(). It actually returns an unsigned integer cast to an
> > int.
>
> Since chars are integers, you are correct.
>
> > Casting to the int is required because EOF is defined as negative
> > int(generally, but not necessarily, -1). So, other than the EOF, we
> > get an unsigned int.
>
> No, other than the EOF we get an int that is the result of a conversion of
> an unsigned char to an int.
Don't you mean "a conversion of a char to an int" ?
gw7rib (462)
7/4/2008 8:58:58 AM
rahul said:
<snip>
> Consider fgetc(). It actually returns an unsigned integer cast to an
> int.
Since chars are integers, you are correct.
> Casting to the int is required because EOF is defined as negative
> int(generally, but not necessarily, -1). So, other than the EOF, we
> get an unsigned int.
No, other than the EOF we get an int that is the result of a conversion of
an unsigned char to an int.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
rjh (10789)
7/4/2008 8:59:58 AM
gw7rib@aol.com said:
> On 4 Jul, 09:59, Richard Heathfield <r...@see.sig.invalid> wrote:
>> rahul said:
>>
>> <snip>
>>
>> > Consider fgetc(). It actually returns an unsigned integer cast to an
>> > int.
>>
>> Since chars are integers, you are correct.
>>
>> > Casting to the int is required because EOF is defined as negative
>> > int(generally, but not necessarily, -1). So, other than the EOF, we
>> > get an unsigned int.
>>
>> No, other than the EOF we get an int that is the result of a conversion
>> of an unsigned char to an int.
>
> Don't you mean "a conversion of a char to an int" ?
No, I don't mean that.
--
Richard Heathfield <http://www.cpax.org.uk>
rjh (10789)
7/4/2008 9:09:31 AM
On 4 Jul, 10:09, Richard Heathfield <r...@see.sig.invalid> wrote:
> gw7...@aol.com said:
> > On 4 Jul, 09:59, Richard Heathfield <r...@see.sig.invalid> wrote:
> >> rahul said:
> >> > Casting to the int is required because EOF is defined as negative
> >> > int(generally, but not necessarily, -1). So, other than the EOF, we
> >> > get an unsigned int.
>
> >> No, other than the EOF we get an int that is the result of a conversion
> >> of an unsigned char to an int.
>
> > Don't you mean "a conversion of a char to an int" ?
>
> No, I don't mean that.
Then could you explain, please? My understanding is that a type char
is provided, for storing, well, characters, in the system's favourite
format. Functions such as fgetc() are used for reading in characters,
but the output they give has to embrace both all real characters and
have something different to indicate EOF. Hence they return an int. So
I would have assumed that the values they can return, other than EOF,
are the values of characters which in turn are the values a char can
have. I don't see where you have got "unsigned" from. AFAIAA you
should use unsigned char rather than char when reading memory
locations storing a different type but I don't see how that is
relevant here.
Paul.
gw7rib (462)
7/4/2008 9:22:30 AM
s0suk3@gmail.com writes:
> I've seen a lot of functions in the standard library that deal with
> characters, but a lot of them return/take parameters of type int (which
> is usually the integer that represents the character code in the
> character set). I know that C automatically converts between int and
> char in those cases without problems (or could the signed/unsigned
> issue cause problems?), and that character constants have actually
> type int, but, what is the reason for some functions in the standard
> library to return type int (or take parameters of type int) when one
> is supposed to be dealing with characters instead of numbers?
In very old versions of C (pre-1989), function prototypes did not
exist. An expression of type char passed as an argument would always
be promoted to int, and a function with no visible declaration was
assumed to return int.
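The same promotion still applies today to arguments passed in the
variable part of a variadic function. A minimal sketch, assuming a
hypothetical helper named sum_chars (it is not a standard library
function):

```c
#include <stdarg.h>

/* Hypothetical helper: adds up `count` character arguments.
 * Arguments in the variable part undergo the default argument
 * promotions, so each char arrives as an int and has to be fetched
 * with va_arg(ap, int), never va_arg(ap, char). */
int sum_chars(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);   /* promoted char read back as int */
    va_end(ap);

    return total;
}
```

Calling sum_chars(2, a, b) with two char variables a and b returns the
sum of their character codes as an int.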
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
kst-u (21460)
7/4/2008 9:31:08 AM
gw7rib@aol.com writes:
> On 4 Jul, 10:09, Richard Heathfield <r...@see.sig.invalid> wrote:
>> gw7...@aol.com said:
>> > On 4 Jul, 09:59, Richard Heathfield <r...@see.sig.invalid> wrote:
>> >> rahul said:
>> >> > Casting to the int is required because EOF is defined as negative
>> >> > int(generally, but not necessarily, -1). So, other than the EOF, we
>> >> > get an unsigned int.
>>
>> >> No, other than the EOF we get an int that is the result of a conversion
>> >> of an unsigned char to an int.
>>
>> > Don't you mean "a conversion of a char to an int" ?
>>
>> No, I don't mean that.
>
> Then could you explain, please? My understanding is that a type char
> is provided, for storing, well, characters, in the system's favourite
> format. Functions such as fgetc() are used for reading in characters,
> but the output they give has to embrace both all real characters and
> have something different to indicate EOF. Hence they return an int. So
> I would have assumed that the values they can return, other than EOF,
> are the values of characters which in turn are the values a char can
> have. I don't see where you have got "unsigned" from. AFAIAA you
> should use unsigned char rather than char when reading memory
> locations storing a different type but I don't see how that is
> relevant here.
That's what the standard says.
C99 7.19.7.1:
    If the end-of-file indicator for the input stream pointed to by
    _stream_ is not set and a next character is present, the _fgetc_
    function obtains that character as an _unsigned char_ converted to
    an _int_ and advances the associated file position indicator for
    the stream (if defined).
Plain char may be either signed or unsigned. Forcing the characters
read from a file to be treated as unsigned char rather than plain char
ensures that no valid character can appear as EOF.
Typically EOF is -1. On a system with 8-bit bytes, where plain char
is signed (two's-complement), a byte with all bits set will be read as
255. If it were interpreted as a plain char, it would be read as -1.
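A small sketch of that last point, assuming 8-bit bytes. The two helper
names are made up for illustration, and the conversion to signed char
is implementation-defined, so the second result is only what one would
typically see on a two's-complement machine:

```c
#include <stdio.h>

/* The route fgetc takes: the byte is treated as an unsigned char and
 * then converted to int, so the result is never negative. */
int as_fgetc_would(unsigned char byte)
{
    return (int)byte;
}

/* Hypothetical alternative: the same byte pushed through signed char.
 * The out-of-range conversion is implementation-defined; on common
 * two's-complement systems 0xFF becomes -1, the usual value of EOF. */
int as_signed_char_would(unsigned char byte)
{
    return (int)(signed char)byte;
}
```

With these, as_fgetc_would(0xFF) gives 255, which cannot compare equal
to EOF, whereas as_signed_char_would(0xFF) typically gives -1.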
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
kst-u (21460)
7/4/2008 9:39:40 AM
gw7rib@aol.com said:
> On 4 Jul, 10:09, Richard Heathfield <r...@see.sig.invalid> wrote:
>> gw7...@aol.com said:
>> > On 4 Jul, 09:59, Richard Heathfield <r...@see.sig.invalid> wrote:
>> >> rahul said:
>> >> > Casting to the int is required because EOF is defined as negative
>> >> > int(generally, but not necessarily, -1). So, other than the EOF, we
>> >> > get an unsigned int.
>>
>> >> No, other than the EOF we get an int that is the result of a
>> >> conversion of an unsigned char to an int.
>>
>> > Don't you mean "a conversion of a char to an int" ?
>>
>> No, I don't mean that.
>
> Then could you explain, please? My understanding is that a type char
> is provided, for storing, well, characters, in the system's favourite
> format.
Right. Unfortunately, things aren't quite that neat.
> Functions such as fgetc() are used for reading in characters,
> but the output they give has to embrace both all real characters and
> have something different to indicate EOF. Hence they return an int.
Right so far.
> So
> I would have assumed that the values they can return, other than EOF,
> are the values of characters which in turn are the values a char can
> have. I don't see where you have got "unsigned" from.
4.9.7.1 The fgetc function

Synopsis

         #include <stdio.h>
         int fgetc(FILE *stream);

Description

   The fgetc function obtains the next character (if present) as an
unsigned char converted to an int, from the input stream pointed to
by stream, and advances the associated file position indicator for
the stream (if defined).
So - assuming for the moment that we're not at the end of the file... the
fgetc function reads one byte from a stream, and interprets that byte as
if its bit pattern represents an unsigned char. (Whether or not that
interpretation is appropriate is neither here nor there as far as fgetc is
concerned.) It then converts that value into an int, and returns the int.
You should pick the value up using an int. If that int is != EOF (or, on
some of the more esoteric platforms, if feof(fp) and ferror(fp) both yield
0), you can safely store that value in an unsigned char. If you would
rather store it in a char, that's entirely up to you, but from now on that
value might no longer be representable as an unsigned char and so might be
unsuitable for passing to functions such as is* and to* without a cast.
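As an illustration, here is a minimal sketch of that advice.
count_alpha is a hypothetical helper, not a library function; it picks
the value up in an int, stores it in a plain char, and casts back to
unsigned char before calling isalpha:

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: counts the alphabetic characters left in fp. */
long count_alpha(FILE *fp)
{
    long n = 0;
    int c;                          /* int, so EOF stays distinguishable */

    while ((c = fgetc(fp)) != EOF) {
        char stored = (char)c;      /* may be negative if char is signed */

        /* The is* functions expect an unsigned char value (or EOF),
         * so the stored char is cast before the call. */
        if (isalpha((unsigned char)stored))
            n++;
    }
    return n;
}
```

Declaring c as char instead of int would either make a 0xFF byte look
like EOF or make the loop never terminate, depending on whether plain
char is signed.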
--
Richard Heathfield <http://www.cpax.org.uk>
rjh (10789)
7/4/2008 9:48:46 AM
On Jul 4, 12:53 pm, rahul <rahulsin...@gmail.com> wrote:
>
> Consider fgetc(). It actually returns an unsigned integer cast to an
> int. Casting to the int is required because EOF is defined as negative
> int(generally, but not necessarily, -1). So, other than the EOF, we
> get an unsigned int.
My mistake; I meant an unsigned char converted to an int so that it
can accommodate EOF.
rahulsinner (151)
7/4/2008 11:57:08 AM
On 4 Jul, 10:48, Richard Heathfield <r...@see.sig.invalid> wrote:
[Snip detailed explanation including...]
>          int fgetc(FILE *stream);
>
> Description
>
>    The fgetc function obtains the next character (if present) as an
> unsigned char converted to an int, from the input stream pointed to
> by stream, and advances the associated file position indicator for
> the stream (if defined).
Ah. You live and learn. Thanks very much to you and to Keith for
filling me in on this point.
Paul.
gw7rib (462)
7/4/2008 2:30:44 PM
s0suk3@gmail.com wrote:
> I've seen a lot of functions in the standard library that deal with
> characters, but a lot of them return/take parameters of type int (which
> is usually the integer that represents the character code in the
> character set). I know that C automatically converts between int and
> char in those cases without problems (or could the signed/unsigned
> issue cause problems?), and that character constants have actually
> type int, but, what is the reason for some functions in the standard
> library to return type int (or take parameters of type int) when one
> is supposed to be dealing with characters instead of numbers?
The type of ('a') is int.
    strchr(s, 'a');
makes more sense with
    char *strchr(const char *s, int c);
than it does with
    char *strchr(const char *s, char c);
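A short sketch of both points, with hypothetical helper names: in C the
constant 'a' has the size of an int, and a char argument is simply
converted to int when it is passed to strchr:

```c
#include <string.h>

/* Returns 1 when a character constant is int-sized, which is the case
 * in C (in C++ the constant has type char, so this would return 0
 * wherever int is wider than char). */
int char_constant_is_int_sized(void)
{
    return sizeof 'a' == sizeof(int);
}

/* Hypothetical helper: index of the first occurrence of c in s,
 * or -1 if c does not occur. */
long index_of(const char *s, char c)
{
    const char *p = strchr(s, c);   /* c is converted to int at the call */

    return p ? (long)(p - s) : -1L;
}
```

For example, index_of("hello", 'l') gives 2 and index_of("hello", 'z')
gives -1.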
--
pete
pfiland (6613)
7/4/2008 2:39:31 PM
On Jul 4, 9:39 am, pete <pfil...@mindspring.com> wrote:
> s0s...@gmail.com wrote:
> > I've seen a lot of functions in the standard library that deal with
> > characters, but a lot of them return/take parameters of type int (which
> > is usually the integer that represents the character code in the
> > character set). I know that C automatically converts between int and
> > char in those cases without problems (or could the signed/unsigned
> > issue cause problems?), and that character constants have actually
> > type int, but, what is the reason for some functions in the standard
> > library to return type int (or take parameters of type int) when one
> > is supposed to be dealing with characters instead of numbers?
>
> The type of ('a') is type int.
>
> strchr(s, 'a');
>
> makes more sense with
>
> char *strchr(const char *s, int c);
>
> than it does with
>
> char *strchr(const char *s, char c);
>
That's true to some extent. But one doesn't always pass a character
constant to a character-handling function; in fact, I'd say one rarely
does. The points rahul, Keith Thompson and Richard Heathfield made
seem more reasonable, though. Thanks to everybody.
Sebastian
s0suk3 (372)
7/5/2008 8:58:02 AM