sscanf error checking

I got the impression, according to the 'while-loop reading from file' 
thread (where I was strongly chastised for being lax with 
error-handling), that 'sscanf' was /the/ way to detect errors in 
numerical input.

But then I come across this:

   double x;
   int status;
   char* str = "123..456";

   status = sscanf(str,"%lf",&x);

   printf("Status = %d\n",status);
   printf("X      = %f\n",x);

Output:

   Status  = 1
   X       = 123.000000

The same when scanning "123GARBAGE". It's just stopping when it sees 
something that can't be part of the number without reporting anything 
amiss. Which is not far off why I do.

Am I doing something wrong or is sscanf not as great at verifying input 
as is made out?

Doing strtof("123..456",0) returns 123.0000.

If I do float("123..456") in Python it reports an error.

-- 
Bartc
BartC
12/14/2016 3:42:43 PM
comp.lang.c

In <o2rp71$3e7$1@dont-email.me> BartC <bc@freeuk.com> writes:

> The same when scanning "123GARBAGE". It's just stopping when it sees 
> something that can't be part of the number without reporting anything 
> amiss.

Why would it report anything amiss?  You told it to look for a number,
which it found.

-- 
John Gordon                   A is for Amy, who fell down the stairs
gordon@panix.com              B is for Basil, assaulted by bears
                                -- Edward Gorey, "The Gashlycrumb Tinies"

John
12/14/2016 3:58:26 PM
In article <o2rp71$3e7$1@dont-email.me>, BartC  <bc@freeuk.com> wrote:
....
>Output:
>
>   Status  = 1
>   X       = 123.000000
>
>The same when scanning "123GARBAGE". It's just stopping when it sees 
>something that can't be part of the number without reporting anything 
>amiss. Which is not far off why I do.
>
>Am I doing something wrong or is sscanf not as great at verifying input 
>as is made out?
>
>Doing strtof("123..456",0) returns 123.0000.
>
>If I do float("123..456") in Python it reports an error.

You're supposed to use %n to verify that it scanned everything it was
supposed to scan.

I changed your program to:

#include <stdio.h>
#include <string.h>

int main(void)
{
    double x;
    int status,n;
    char* str = "123..456";

    status = sscanf(str,"%lf%n",&x,&n);

    printf("Status = %d, n = %d, length = %zu\n",status,n,strlen(str));
    printf("X      = %f\n",x);
}

And you can see that the value of 'n' and the length of the string are not equal.
That indicates that something went wrong.

-- 
The randomly chosen signature file that would have appeared here is more than 4
lines long.  As such, it violates one or more Usenet RFCs.  In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
	http://user.xmission.com/~gazelle/Sigs/BestCLCPostEver
gazelle
12/14/2016 4:11:07 PM
On Wednesday December 14 2016 10:58, in comp.lang.c, "John Gordon"
<gordon@panix.com> wrote:

> In <o2rp71$3e7$1@dont-email.me> BartC <bc@freeuk.com> writes:
> 
>> The same when scanning "123GARBAGE". It's just stopping when it sees
>> something that can't be part of the number without reporting anything
>> amiss.
> 
> Why would it report anything amiss?  You told it to look for a number,
> which it found.

More to the point, BartC /didn't/ tell sscanf() to look for anything after the
number.


-- 
Lew Pitcher
"In Skills, We Trust"
PGP public key available upon request

Lew
12/14/2016 4:14:39 PM
BartC <bc@freeuk.com> writes:

> I got the impression, according to the 'while-loop reading from file'
> thread (where I was strongly chastised for being lax with
> error-handling), that 'sscanf' was /the/ way to detect errors in
> numerical input.
>
> But then I come across this:
>
>   double x;
>   int status;
>   char* str = "123..456";
>
>   status = sscanf(str,"%lf",&x);
>
>   printf("Status = %d\n",status);
>   printf("X      = %f\n",x);
>
> Output:
>
>   Status  = 1
>   X       = 123.000000
>
> The same when scanning "123GARBAGE". It's just stopping when it sees
> something that can't be part of the number without reporting anything
> amiss. Which is not far off why I do.

Yes, it's similar to what you did in your recent example.  The main
difference is that sscanf reports success or failure and you discard that
information.

> Am I doing something wrong or is sscanf not as great at verifying
> input as is made out?

You need to check the input.  sscanf just tells you it succeeded in
finding a number.  It can't report a failure here because it does not
know what you want.  You might want to pick the number off the front of
some alphanumeric code, or you may want to treat anything following the
number as a sign of failure.  If you want the latter semantics, you can
use %n:

  sscanf(str, "%lf%n", &result, &len) == 1 && str[len] == 0

(Another strategy is to read a character and examine it, but that's more
fussy to write.)

But the strtoX functions are better for something this simple (much
better, in that they will detect overflow).
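
A minimal sketch of the %n check described above, wrapped in a helper.
The name parse_whole_double is made up for illustration; it is not
from the thread. As noted elsewhere in the thread, sscanf still has
undefined behaviour if the number is out of range for double, so this
only addresses the trailing-garbage problem.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 and stores the value in *out only if
   the whole of str is one number; returns 0 otherwise. */
int parse_whole_double(const char *str, double *out)
{
    double value;
    int len = 0;

    /* %n stores the number of characters consumed so far; it does not
       count towards the value returned by sscanf. */
    if (sscanf(str, "%lf%n", &value, &len) != 1)
        return 0;               /* no number at the start */
    if (str[len] != '\0')
        return 0;               /* something follows the number */

    *out = value;
    return 1;
}
```

Leading white space is still accepted, because %lf skips it; trailing
white space is rejected by the str[len] test.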

-- 
Ben.
Ben
12/14/2016 4:40:57 PM
BartC <bc@freeuk.com> writes:
> I got the impression, according to the 'while-loop reading from file' 
> thread (where I was strongly chastised for being lax with 
> error-handling), that 'sscanf' was /the/ way to detect errors in 
> numerical input.

Your impression is incorrect.

> But then I come across this:
>
>    double x;
>    int status;
>    char* str = "123..456";
>
>    status = sscanf(str,"%lf",&x);
>
>    printf("Status = %d\n",status);
>    printf("X      = %f\n",x);
>
> Output:
>
>    Status  = 1
>    X       = 123.000000

Your input starts with a valid floating-point constant, "123.".
sscanf successfully parses it.  It uses a greedy algorithm (see
the standard for the precise specification), so it doesn't care
that there's invalid data following the valid data.

If I change the format string to "%lf%lf", I get two results,
123.0 (scanned from "123.") and 0.456 (scanned from ".456").
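
A small sketch of that experiment, assuming a made-up wrapper called
scan_two_doubles that just forwards sscanf's return value:

```c
#include <stdio.h>

/* Hypothetical wrapper: scans up to two doubles from str and returns
   the number of items sscanf converted. */
int scan_two_doubles(const char *str, double *a, double *b)
{
    return sscanf(str, "%lf%lf", a, b);
}
```

Called on "123..456" this converts two items, the first from "123."
and the second from ".456".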

If you want to require spaces between the two constants, you can insert
a "%*[ ]" format specifier, or "%*[ \t\v\r\f]" to accept any white space
characters other than newline.  (Please don't assume that I'm arguing
this isn't more complicated than it should be.  I'm describing how it
works, not defending it.)
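
As a sketch of the separator idea (the function name is invented for
this example), a suppressed scanset between the two %lf conversions
forces at least one blank or tab to be present:

```c
#include <stdio.h>

/* Hypothetical wrapper: like "%lf%lf", but requires at least one space
   or tab between the two numbers.  %*[...] matches and discards the
   separator and does not count towards the return value. */
int scan_two_separated(const char *str, double *a, double *b)
{
    return sscanf(str, "%lf%*[ \t]%lf", a, b);
}
```

With "123..456" the scanset fails to match the second '.', so only one
item is converted and the caller can tell the input was malformed.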

If you're doing this kind of thing, you might consider using a
(non-standard) regular expression library, or perhaps lex/flex.

> The same when scanning "123GARBAGE". It's just stopping when it sees 
> something that can't be part of the number without reporting anything 
> amiss.

Yes.

>        Which is not far off why I do.

What?

> Am I doing something wrong or is sscanf not as great at verifying input 
> as is made out?

Made out by whom?
     
I've pointed out serious problems with the *scanf() functions before.

Calling fgets() (or perhaps something that can handle arbitrarily long
lines) followed by sscanf() is generally better than calling scanf().
For one thing, scanf() skips whitespace before each parsed item
*including newline* so it can be bad for line-oriented input.  If you're
trying to scan 3 items per line, but one line has only 2 items, scanf()
will get out of synch.  Using fgets() to read a line at a time avoids
that.
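
A rough sketch of the fgets()-then-sscanf() pattern for the "3 items
per line" case. The function names and the 256-byte line buffer are
assumptions for illustration only; a line longer than the buffer would
be split across reads.

```c
#include <stdio.h>

/* Hypothetical helper: true only if the line starts with three numbers. */
int parse_three(const char *line, double v[3])
{
    return sscanf(line, "%lf%lf%lf", &v[0], &v[1], &v[2]) == 3;
}

/* Reads fp one line at a time.  Returns the number of lines that held
   three numbers and stores the number of other lines in *bad.  A short
   line affects only itself; the next line starts afresh. */
int count_good_lines(FILE *fp, int *bad)
{
    char line[256];             /* assumed maximum line length */
    double v[3];
    int good = 0;

    *bad = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_three(line, v))
            good++;
        else
            (*bad)++;
    }
    return good;
}
```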

But it doesn't avoid the other problems with *scanf().

If you want to parse "123..456" to get "123." *and* an indication that
there's garbage after the valid data, you can do something like:

    char rest[100];
    status = sscanf(str, "%lf%[^\n]", &x, rest);

to capture the rest of the string (`rest` has to be big enough); you can
then examine the contents of rest to decide what to do.
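
A slightly more defensive sketch of the same idea, with a field width
so that `rest` cannot be overrun. The helper name and the 100-byte
contract are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: rest must point to at least 100 bytes.
   Returns sscanf's result: 2 if something followed the number,
   1 if the number was all there was, 0 or EOF if no number was found. */
int scan_with_rest(const char *str, double *x, char *rest)
{
    rest[0] = '\0';             /* %[ stores nothing if it fails to match */
    return sscanf(str, "%lf%99[^\n]", x, rest);
}
```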

But none of this solves what is in my opinion the very worst feature of
the *scanf() functions.  On numeric input, if the input item is
syntactically correct but outside the range of the target type, the
behavior is undefined.  N1570 7.21.6.2p10:

    If this object does not have an appropriate type, or if the
    result of the conversion cannot be represented in the object,
    the behavior is undefined.

If you're reading numeric input from a data source that you don't
control, it's impossible to use *scanf() safely.

I would love to see a future C standard change this, so that numeric
overflow is treated as a matching failure or in some well defined
manner.

> Doing strtof("123..456",0) returns 123.0000.
>
> If I do float("123..456") in Python it reports an error.

Sure, if you pass 0 (a null pointer) as the second argument to
strtof(), you're discarding information.  strtof() (or strtod()
if you're using type double, as you did in the original example)
will tell you exactly how much of the input it consumed.  It also
has well defined behavior on numeric overflow.  It's IMHO more
complicated than it should be, but it works.
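
A minimal sketch of using strtod() with a non-null second argument.
The enum and function names are invented for this example; only
strtod(), errno and ERANGE come from the standard library.

```c
#include <errno.h>
#include <stdlib.h>

enum parse_status { PARSE_OK, PARSE_NO_NUMBER, PARSE_TRAILING, PARSE_RANGE };

/* Hypothetical helper: converts the whole of str to a double. */
enum parse_status parse_double(const char *str, double *out)
{
    char *end;
    double value;

    errno = 0;                  /* strtod only sets errno on error */
    value = strtod(str, &end);

    if (end == str)
        return PARSE_NO_NUMBER; /* nothing was consumed */
    if (*end != '\0')
        return PARSE_TRAILING;  /* e.g. "123..456" stops at the 2nd '.' */
    if (errno == ERANGE)
        return PARSE_RANGE;     /* overflow; underflow may also set this */

    *out = value;
    return PARSE_OK;
}
```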

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
Keith
12/14/2016 5:34:25 PM
On 14/12/2016 16:11, Kenny McCormack wrote:
> In article <o2rp71$3e7$1@dont-email.me>, BartC  <bc@freeuk.com> wrote:
> ...
>> Output:
>>
>>   Status  = 1
>>   X       = 123.000000
>>
>> The same when scanning "123GARBAGE". It's just stopping when it sees
>> something that can't be part of the number without reporting anything
>> amiss. Which is not far off why I do.
>>
>> Am I doing something wrong or is sscanf not as great at verifying input
>> as is made out?

> You're supposed to use %n to verify that it scanned everything it was
> supposed to scan.
>
> I changed your program to:
>
> #include <stdio.h>
> #include <string.h>
>
> int main(void)
> {
>     double x;
>     int status,n;
>     char* str = "123..456";
>
>     status = sscanf(str,"%lf%n",&x,&n);
>
>     printf("Status = %d, n = %d, length = %lu\n",status,n,strlen(str));
>     printf("X      = %f\n",x);
> }
>
> And you can see that the value of 'n' and the length of the string are not equal.
> That indicates that something went wrong.
>

OK, thanks. I'll be using sscanf on a string that already contains a 
trimmed, single item that either ought to contain only a number, or that 
I'm checking contains only a number.

(Actually, up to now I've only used sscanf in one place and that /was/ 
with a "%lf%n" format, but that was to be able to step to the next item 
within a larger string.)

-- 
Bartc
BartC
12/14/2016 5:51:14 PM
In article <o2s0o1$fb$1@dont-email.me>, BartC  <bc@freeuk.com> wrote:
....
>OK, thanks. I'll be using sscanf on a string that already contains a 
>trimmed, single item that either ought to contain only a number, or that 
>I'm checking contains only a number.
>
>(Actually, up to now I've only used sscanf in one place and that /was/ 
>with a "%lf%n" format, but that was to be able to step to the next item 
>within a larger string.)

Or you could just use strtod() and friends.
These are set up to convert one string to a number, and have good error
reporting facilities.
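
For the integer member of that family, a short sketch along the same
lines (parse_long is a made-up name; base 10 is an assumption):

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out only if
   str is entirely one decimal integer that fits in a long. */
int parse_long(const char *str, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(str, &end, 10);
    if (end == str || *end != '\0' || errno == ERANGE)
        return 0;               /* empty, trailing junk, or out of range */

    *out = value;
    return 1;
}
```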

-- 
In politics and in life, ignorance is not a virtue.
-- Barack Obama --
gazelle
12/14/2016 6:09:14 PM
BartC <bc@freeuk.com> writes:

> On 14/12/2016 16:11, Kenny McCormack wrote:
>> In article <o2rp71$3e7$1@dont-email.me>, BartC  <bc@freeuk.com> wrote:
>> ...
>>> Output:
>>>
>>>   Status  = 1
>>>   X       = 123.000000
>>>
>>> The same when scanning "123GARBAGE". It's just stopping when it sees
>>> something that can't be part of the number without reporting anything
>>> amiss. Which is not far off why I do.
>>>
>>> Am I doing something wrong or is sscanf not as great at verifying input
>>> as is made out?
>
>> You're supposed to use %n to verify that it scanned everything it was
>> supposed to scan.
>>
>> I changed your program to:
>>
>> #include <stdio.h>
>> #include <string.h>
>>
>> int main(void)
>> {
>>     double x;
>>     int status,n;
>>     char* str = "123..456";
>>
>>     status = sscanf(str,"%lf%n",&x,&n);
>>
>>     printf("Status = %d, n = %d, length = %lu\n",status,n,strlen(str));
>>     printf("X      = %f\n",x);
>> }
>>
>> And you can see that the value of 'n' and the length of the string are not equal.
>> That indicates that something went wrong.
>
> OK, thanks.

(It's slightly simpler to test that str[n] is 0.)

<snip>
> (Actually, up to now I've only used sscanf in one place and that /was/
> with a "%lf%n" format, but that was to be able to step to the next
> item within a larger string.)

The code that (I thought) you wrote for your line input exercise has
sscanf(s,"%lf",&x) in the strtofloat function.

-- 
Ben.
Ben
12/14/2016 7:49:41 PM
On 14/12/2016 19:49, Ben Bacarisse wrote:
> BartC <bc@freeuk.com> writes:

>>>     status = sscanf(str,"%lf%n",&x,&n);
>>>
>>>     printf("Status = %d, n = %d, length = %lu\n",status,n,strlen(str));
>>>     printf("X      = %f\n",x);
>>> }
>>>
>>> And you can see that the value of 'n' and the length of the string are not equal.
>>> That indicates that something went wrong.
>>
>> OK, thanks.
>
> (It's slightly simpler to test that str[n] is 0.)

(I already know the length of the string.)

> <snip>
>> (Actually, up to now I've only used sscanf in one place and that /was/
>> with a "%lf%n" format, but that was to be able to step to the next
>> item within a larger string.)
>
> The code that (I thought) you wrote for your line input exercise has
> sscanf(s,"%lf",&x) in the strtofloat function.

Yes it was. But my interpreter apparently uses %lf%n to read a float 
then step along the buffer.
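
A sketch of that "read a float, then step along the buffer" use of
%lf%n. This is not the interpreter's code; read_doubles is a
hypothetical name used only to illustrate the technique.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to max doubles from buf into out and
   returns how many were read.  Stops at the first item that is not a
   number. */
int read_doubles(const char *buf, double *out, int max)
{
    int count = 0;
    int used = 0;

    while (count < max && sscanf(buf, "%lf%n", &out[count], &used) == 1) {
        buf += used;            /* step past what was just consumed */
        count++;
    }
    return count;
}
```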

(I posted an example in that language today that used 'readln @f,a,b,c'
to read a line of input then read 3 floats. Actually it reads ints 
unless otherwise specified. I'm changing that so that each read item is 
set to an int, float or string according to what is encountered. That 
will make my example accurate.

Otherwise the example would have to be 'readln @f,a:"r",b:"r",c:"r"'
which is not quite as short and sweet compared to the C. However the 
change makes emulating that in a static language harder.)

-- 
Bartc
BartC
12/14/2016 8:43:31 PM
On Wed, 14 Dec 2016 15:42:43 +0000, BartC <bc@freeuk.com> wrote:

>I got the impression, according to the 'while-loop reading from file' 
>thread (where I was strongly chastised for being lax with 
>error-handling), that 'sscanf' was /the/ way to detect errors in 
>numerical input.
>
>But then I come across this:
>
>   double x;
>   int status;
>   char* str = "123..456";
>
>   status = sscanf(str,"%lf",&x);
>
>   printf("Status = %d\n",status);
>   printf("X      = %f\n",x);
>
>Output:
>
>   Status  = 1
>   X       = 123.000000
>
>The same when scanning "123GARBAGE". It's just stopping when it sees 
>something that can't be part of the number without reporting anything 
>amiss. Which is not far off why I do.
>
>Am I doing something wrong or is sscanf not as great at verifying input 
>as is made out?
>
>Doing strtof("123..456",0) returns 123.0000.
>
>If I do float("123..456") in Python it reports an error.


IMO, the *scanf() and ato*() functions are pretty much of no use
outside of toy problems, the lack of meaningful error handling being
the biggest issue.  YMMV.
Robert
12/14/2016 8:52:24 PM
On Wednesday, December 14, 2016 at 2:52:04 PM UTC-6, robert...@yahoo.com wrote:
> IMO, the *scanf() and ato*() functions are pretty much of no use
> outside of toy problems, the lack of meaningful error handling being
> the biggest issue.  YMMV.

Many programs are intended only for use upon validated input.  Unless an
implementation decides to jump the rails when given invalid input (something
which the Standard doesn't forbid, but which I don't think it ever intended
to encourage) the semantics of scanf may be adequate for use in such contexts.
supercat
12/15/2016 3:24:06 PM
supercat@casperkitty.com writes:
> On Wednesday, December 14, 2016 at 2:52:04 PM UTC-6, robert...@yahoo.com wrote:
>> IMO, the *scanf() and ato*() functions are pretty much of no use
>> outside of toy problems, the lack of meaningful error handling being
>> the biggest issue.  YMMV.
>
> Many programs are intended only for use upon validated input.

Many programs are poorly written.  That's often why malware is able
to work.

>                                                                Unless
> an implementation decides to jump the rails when given invalid input
> (something which the Standard doesn't forbid, but which I don't think
> it ever intended to encourage) the semantics of scanf may be adequate
> for use in such contexts.

scanf() in particular reads only from standard input, which is
typically under the control of the user, not of the programmer.

Even sscanf() and fscanf() allow you to check for errors *other than*
numeric overflow, and users are routinely advised that they should
always check the value returned by any of the *scanf() functions
rather than assuming valid input.  Why should there be no way to
detect or handle numeric overflow in particular?

I think the strto*() functions are probably newer than the *scanf()
functions.  I speculate that checking for numeric overflow was
considered an excessive burden on implementers when the *scanf()
functions were first implemented and specified.  It is my strong
opinion that that excuse no longer applies.

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
Keith
12/15/2016 7:26:41 PM
supercat@casperkitty.com wrote:

> On Wednesday, December 14, 2016 at 2:52:04 PM UTC-6, robert...@yahoo.com wrote:
> > IMO, the *scanf() and ato*() functions are pretty much of no use
> > outside of toy problems, the lack of meaningful error handling being
> > the biggest issue.  YMMV.
> 
> Many programs are intended only for use upon validated input.

There is no such thing as validated input; there is only _temporarily_
validated input.

A program using scanf()# and ato*() is only safe if:
- it is used with a single set of input;
- that input is validated _instantly_ before it is fed to the program;
- the program is deleted or incapacitated immediately afterward.

(A program using gets() is, of course, unsafe even then.)

Richard

# sscanf() and fscanf() are _slightly_ better.
raltbos
12/15/2016 8:20:24 PM
raltbos@xs4all.nl (Richard Bos) writes:
> supercat@casperkitty.com wrote:
>> On Wednesday, December 14, 2016 at 2:52:04 PM UTC-6, robert...@yahoo.com wrote:
>> > IMO, the *scanf() and ato*() functions are pretty much of no use
>> > outside of toy problems, the lack of meaningful error handling being
>> > the biggest issue.  YMMV.
>> 
>> Many programs are intended only for use upon validated input.
>
> There is no such thing as validated input; there is only _temporarily_
> validated input.
>
> A program using scanf()# and ato*() is only safe if:
> - it is used with a single set of input;
> - that input is validated _instantly_ before it is fed to the program;
> - the program is deleted or incapacitated immediately afterward.

scanf() can, with some care, be used safely for non-numeric input.
But the hassle of doing so can easily exceed the cost of using
an alternative.

> (A program using gets() is, of course, unsafe even then.)

In the circumstances you suggest, gets() *could* be used safely.  As
long as the input line doesn't exceed the size of the buffer, it has
well defined behavior.  Of course the conditions you suggest are
(deliberately) unrealistic, and it's not at all difficult to write code
that doesn't depend on them (if nothing else, fgets() followed by
deleting any trailing '\n').
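
A minimal sketch of that fgets()-based replacement. Both function
names are invented for this example; a line longer than the buffer is
simply returned in pieces, which a real program would need to handle.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: cuts the string at the first '\n', if any. */
void strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}

/* Hypothetical helper: reads one line into buf (at most size - 1
   characters) and removes the trailing newline.  Returns buf, or NULL
   on end of file or error. */
char *read_line(char *buf, size_t size, FILE *fp)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return NULL;
    strip_newline(buf);
    return buf;
}
```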

> # sscanf() and fscanf() are _slightly_ better.

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
Keith
12/15/2016 8:33:19 PM
On Thursday, December 15, 2016 at 7:25:50 PM UTC, Keith Thompson wrote:
> supercat@casperkitty.com writes:
> > On Wednesday, December 14, 2016 at 2:52:04 PM UTC-6, robert...@yahoo.com wrote:
> >> IMO, the *scanf() and ato*() functions are pretty much of no use
> >> outside of toy problems, the lack of meaningful error handling being
> >> the biggest issue.  YMMV.
> >
> > Many programs are intended only for use upon validated input.
> 
> Many programs are poorly written.  That's often why malware is able
> to work.
> 
Many programs are given to the user in source code form, and he is
expected to have the skills to use a C compiler to generate an
executable and run it.

So there's not much point in putting in security measures.
Malcolm
12/15/2016 11:35:56 PM
Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
> On Thursday, December 15, 2016 at 7:25:50 PM UTC, Keith Thompson wrote:
>> supercat@casperkitty.com writes:
>> > On Wednesday, December 14, 2016 at 2:52:04 PM UTC-6, robert...@yahoo.com wrote:
>> >> IMO, the *scanf() and ato*() functions are pretty much of no use
>> >> outside of toy problems, the lack of meaningful error handling being
>> >> the biggest issue.  YMMV.
>> >
>> > Many programs are intended only for use upon validated input.
>> 
>> Many programs are poorly written.  That's often why malware is able
>> to work.
>> 
> Many programs are given to the user in source code form, and he is
> expected to have the skills to use a C compiler to generate an
> executable and run it.

Most aren't.

> So there's not much point in putting in security measures.

Huh?

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
Keith
12/16/2016 12:28:21 AM
On Friday, December 16, 2016 at 12:29:53 AM UTC, Keith Thompson wrote:
> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
>
> > So there's not much point in putting in security measures.
> 
> Huh?
> 
If the environment is such that user can write, compile and run 
arbitrary C code with the same privileges as your program, 
then there's no point in having any security against a malicious
user. Against malicious inputs that may be provided by someone
else, yes.
Malcolm
12/16/2016 12:37:02 AM
On Thursday, December 15, 2016 at 1:25:50 PM UTC-6, Keith Thompson wrote:
> I speculate that checking for numeric overflow was
> considered an excessive burden on implementers when the *scanf()
> functions were first implemented and specified.  It is my strong
> opinion that that excuse no longer applies.

I think it's more likely that in the days before the Standard,
implementations may have done a variety of different things in an
effort to be useful (raise signals, set errno, etc.), while others
might have behaved in ways that were generally benign but not quite
predictable enough to qualify as "Implementation-Defined".  The
only ways the authors of the Standard could avoid forcing implementations
to change what might be useful behaviors were to leave the behavior as
undefined, or define a new category of behavior for things which
implementations need not define, but should define when practical.  Of
course, if they had subdivided Undefined Behavior on such a basis, there
would have been many other actions which should have also qualified for
that new category.
supercat
12/16/2016 12:38:36 AM
If you don't like the way the scanf family converts its arguments, it
is easy to write your own that does not have that bug: something that
returns the number of chars the output string has, that takes one
argument for the length of the output string, and that also says how
many conversions succeeded, etc.
I remember I wrote such an sscanf family, but it did not follow the
standard C sscanf family, and it was less rich in predefined formats
too.
asetofsymbols
12/16/2016 9:49:35 AM
Possibly I was confusing it with the printf family when I spoke of
the output string.
asetofsymbols
12/16/2016 9:53:19 AM