security coding guidelines for C/C++

I am Aravind. Could someone provide me with a list of specific
guidelines for secure programming in C/C++? I would like to use those
guidelines for developing a security application to deal with issues
like buffer overflows, memory leaks, user input validation, etc.
Aravind
Reply arvind_c_98 (5) 5/24/2004 1:04:40 PM

Aravind wrote:
> 
> I am Aravind. Could someone provide me with a list of specific
> guidelines for secure programming in C/C++? I would like to use those
> guidelines for developing a security application to deal with issues
> like buffer overflows, memory leaks, user input validation, etc.

No.  Nobody here ever heard of the language C/C++.  On the other
hand, the C++ group is down the hall to the right, and we do deal
with C here.

-- 
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Reply cbfalconer (19183) 5/24/2004 2:40:01 PM


"Aravind" <arvind_c_98@yahoo.com> wrote in message
>
> I am Aravind. Could someone provide me with a list of specific
> guidelines for secure programming in C/C++?
> I would like to use those guidelines for developing a security
> application to deal with issues like buffer overflows, memory
> leaks, user input validation, etc.
>
Security is a real problem for C programs, and it is not easy to write tools
to check for it.
The worst problem is when user input overflows an "auto" (stack) array, on
systems where this corrupts the return stack. An attacker can use this to
induce a jump to a location of his choosing, and thus introduce malicious
code.
It is also possible to overflow the stack. For instance the code

double eval( char *expr)
{
  ...
  if(*expr == '9')
    temp = eval(expr+1);
  ...
}

can be caused to crash by inputting a huge number of open parentheses.

You simply have to be careful to call malloc() with the right size, not
overstep the array, check the return value, and free memory after you have
done with it. The good news is that there is little the user can do to wreck
things here. (To test, a good technique is to provide a version of malloc()
that fails periodically).
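
A minimal sketch of the "malloc() that fails periodically" idea, in C.
The names test_malloc, test_malloc_fail_every and dup_string are
hypothetical and exist only for this illustration; a real project would
more likely route allocations through its own wrapper or a macro.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical test hook: fail every Nth allocation. 0 disables failures. */
static unsigned long fail_every = 0;
static unsigned long alloc_count = 0;

void test_malloc_fail_every(unsigned long n)
{
    fail_every = n;
    alloc_count = 0;
}

/* Stand-in for malloc() in the code under test. */
void *test_malloc(size_t size)
{
    ++alloc_count;
    if (fail_every != 0 && alloc_count % fail_every == 0)
        return NULL;                /* simulated out-of-memory */
    return malloc(size);
}

/* Example caller: it must report failure instead of crashing. */
char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;       /* right size: include the '\0' */
    char *copy = test_malloc(n);

    if (copy == NULL)               /* check the return value */
        return NULL;
    memcpy(copy, s, n);             /* copies exactly n bytes, no overstep */
    return copy;                    /* caller must free() the result */
}
```

Calling test_malloc_fail_every(3) before running the normal test suite
makes every third allocation fail, which quickly shows up callers that
forget to check for NULL.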

For user input, be aware that the user can type anything, and assume he is
trying to wreck your program and has a copy of the source.
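
As a rough illustration of treating input as hostile, here is a sketch
that accepts a decimal integer only if the whole string parses and the
value lies in a caller-supplied range. The function name
parse_int_in_range is hypothetical; strtol() and errno are standard C.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse text as a decimal integer in [lo, hi].
 * Returns 0 and stores the value in *out on success, -1 otherwise.
 * Note that strtol() itself skips leading white space and accepts a sign. */
int parse_int_in_range(const char *text, long lo, long hi, long *out)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return -1;

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text)                /* no digits at all */
        return -1;
    if (*end != '\0')               /* trailing junk such as "12abc" */
        return -1;
    if (errno == ERANGE)            /* value did not fit in a long */
        return -1;
    if (value < lo || value > hi)   /* outside the range the program expects */
        return -1;

    *out = value;
    return 0;
}
```

The point of the design is that every rejection path is explicit: the
caller never sees a partially parsed or out-of-range value.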


Reply Malcolm 5/24/2004 8:00:08 PM

On Mon, 24 May 2004 21:00:08 +0100, Malcolm wrote:

> 
> "Aravind" <arvind_c_98@yahoo.com> wrote in message
>>
>> I am Aravind. Could someone provide me with a list of specific
>> guidelines for secure programming in C/C++?
>> I would like to use those guidelines for developing a security
>> application to deal with issues like buffer overflows, memory
>> leaks, user input validation, etc.
>>
> Security is a real problem for C programs, and it is not easy to write tools
> to check for it.
> The worst problem is when user input overflows an "auto" (stack) array, on
> systems where this corrupts the return stack. An attacker can use this to
> induce a jump to a location of his choosing, and thus introduce malicious
> code.

This is largely obviated by either fgets() or the much better
fgetc()/realloc() method, or something very similar adapted to your
specific input environment.

For a good example of the fgetc()/realloc() method, see the fggets()
implementation by CBFalconer:

http://cbfalconer.home.att.net/download/ggets.zip
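
For readers without that archive to hand, the general fgetc()/realloc()
technique can be sketched as below. This is not the ggets/fggets code
and does not follow its interface; read_line is a hypothetical name
used only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line of any length from fp into a freshly allocated string.
 * The newline is not stored. Returns NULL at end-of-file with nothing
 * read, or on allocation failure. The caller must free() the result. */
char *read_line(FILE *fp)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    int ch;

    if (buf == NULL)
        return NULL;

    while ((ch = fgetc(fp)) != EOF && ch != '\n') {
        if (len + 1 >= cap) {               /* keep room for the '\0' */
            char *tmp;

            if (cap > (size_t)-1 / 2) {     /* doubling would wrap around */
                free(buf);
                return NULL;
            }
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {              /* old block is still valid */
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }

    if (ch == EOF && len == 0) {            /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Because the buffer grows with the input, there is no fixed-size array
for a long line to overrun; the cost is that the caller owns the memory
and a hostile input can still make the program allocate a lot of it.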

> It is also possible to overflow the stack. For instance the code
> 
> double eval( char *expr)
> {
>   ...
>   if(*expr == '9')
>     temp = eval(expr+1);
>   ...
> }
> 
> can be caused to crash by inputting a huge number of open parentheses.

Or, in your case, a large number of nines. ;-)

This is probably the most serious problem if you fail to plan ahead.
Scanning the string beforehand to see if all of the parens match and if
the nesting is too deep could be valuable, or you could give eval a static
int that says how many layers deep the recursion has gone. If the int gets
too big, you return badly.
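
A sketch of the depth-limit idea, assuming a toy grammar
(expr := '(' expr ')' | digit) rather than the real eval() above. It
passes the depth as a parameter instead of using a static int, which
keeps the function reentrant; MAX_DEPTH and eval_depth are hypothetical
names, and 64 is an arbitrary limit chosen for the example.

```c
#include <ctype.h>

#define MAX_DEPTH 64    /* arbitrary example limit; tune for the platform */

/* Evaluate  expr := '(' expr ')' | digit  starting at *p.
 * Returns 0 and stores the value in *out on success, -1 on a syntax
 * error or when the nesting is deeper than MAX_DEPTH. */
static int eval_depth(const char **p, int depth, double *out)
{
    if (depth > MAX_DEPTH)
        return -1;                  /* refuse instead of recursing further */

    if (**p == '(') {
        ++*p;
        if (eval_depth(p, depth + 1, out) != 0)
            return -1;
        if (**p != ')')             /* unbalanced parenthesis */
            return -1;
        ++*p;
        return 0;
    }
    if (isdigit((unsigned char)**p)) {
        *out = **p - '0';
        ++*p;
        return 0;
    }
    return -1;
}

int eval(const char *expr, double *out)
{
    const char *p = expr;

    if (eval_depth(&p, 0, out) != 0)
        return -1;
    return *p == '\0' ? 0 : -1;     /* reject trailing garbage */
}
```

With this in place, ten thousand open parentheses produce an error
return rather than a crash.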

Of course, how deep is too deep is somewhat up in the air. Do any of the
standards specify the minimum recursion depth an implementation must
support?

> 
> You simply have to be careful to call malloc() with the right size, not
> overstep the array, check the return value, and free memory after you have
> done with it. The good news is that there is little the user can do to wreck
> things here.

And, of course, always #include <stdlib.h> and never cast malloc()'s return value.

> (To test, a good technique is to provide a version of malloc()
> that fails periodically).

That, or begin developing big programs for small machines. ;-)

> 
> For user input, be aware that the user can type anything, and assume he is
> trying to wreck your program and has a copy of the source.

Indeed. This is the best possible route to defensive programming, and I
wish more Microsoft programmers (for example) thought this way.

-- 
yvoregnevna gjragl-guerr gjb-gubhfnaq guerr ng lnubb qbg pbz
To email me, rot13 and convert spelled-out numbers to numeric form.
"Makes hackers smile" makes hackers smile.

Reply see84 (117) 5/25/2004 6:16:56 AM

In article <pan.2004.05.25.06.16.54.597513@sig.now>,
 August Derleth <see@sig.now> wrote:

> On Mon, 24 May 2004 21:00:08 +0100, Malcolm wrote:
> > 
> > For user input, be aware that the user can type anything, and assume he is
> > trying to wreck your program and has a copy of the source.
> 
> Indeed. This is the best possible route to defensive programming, and I
> wish more Microsoft programmers (for example) thought this way.

I think it is too weak. 

For input, assume the input doesn't come from a user, but from a 
completely unscrupulous attacker who is paid serious money to cause as 
much damage as possible.

Assume that the attacker has both the source code and the assembler 
code; the source code is useful to find where your program has undefined 
behaviour according to the C Standard, the assembler code tells the 
attacker what your implementation actually does in the case of undefined 
behavior. 

Assume that the attacker's goal is not just to crash your program, but 
to take control of your computer, and any undefined behavior in your 
program could give the attacker the means to achieve this. Just 
assume that a successful attacker will force the computer to send _your_ 
name, address, date of birth, mother's maiden name, social security, 
passport, driver license and credit card numbers, bank details and so on 
to the attacker, leaving a ghastly amount of child pornography on your 
computer in exchange which will be discovered by your 
employer/wife/girlfriend, leading to job loss/loss of important body 
parts and several years of jailtime. That should keep you motivated, and 
motivation is the most important thing for secure programming.
Reply christian.bau (880) 5/25/2004 8:56:11 AM

The problems of buffer overflows could soon become less relevant anyway,
with Intel and AMD's new chips at least (not that I'm suggesting in any way
that they shouldn't still be protected against!). And NX does require OS
support as well.
http://www.internet-security.ca/internet-security-news-005/intel-prescott-cpu-will-end-buffer-overflows-security-problems.html
What a godsend for Microsoft: easier to include NX support in a service
pack than to correct all the 50m lines of code, I suppose. It will be
interesting to see which path they take!

The Fat Man

"Aravind" <arvind_c_98@yahoo.com> wrote in message
news:5eee70d3.0405240504.43a2d44@posting.google.com...
> I am Aravind. Could someone provide me with a list of specific
> guidelines for secure programming in C/C++? I would like to use those
> guidelines for developing a security application to deal with issues
> like buffer overflows, memory leaks, user input validation, etc.
> Aravind


Reply m.jakeman (6) 5/25/2004 9:08:56 AM

On Tue, 25 May 2004, Matthew Jakeman wrote:
>
[re: discussion of security in C/C++ programs]
>
> The problems of buffer overflows could soon become less relevant anyway,
> with Intel and AMD's new chips at least (not that I'm suggesting in any way
> that they shouldn't still be protected against!). And NX does require OS
> support as well.
> http://www.internet-security.ca/internet-security-news-005/intel-prescott-cpu-will-end-buffer-overflows-security-problems.html
> What a godsend for Microsoft: easier to include NX support in a service
> pack than to correct all the 50m lines of code, I suppose. It will be
> interesting to see which path they take!

  The press release doesn't say much about the technology, but from
what I can gather, it's basically making the stack segment
read-and-write-only, no execution privileges.  I thought Intel machines
had segments you couldn't execute from already, like the data segment?
No?
  And if it is what I think it is, then it doesn't protect against
an attacker's taking over your machine: it only protects against an
attacker's being able to take over your machine with code he wrote
himself and inserted into the stack frame.  A static-buffer overflow
*followed by* a stack-buffer overflow executing a jump into the
overflowed static buffer would be just as devastating as the old kind;
and stack-buffer overflows could still make the computer crash or
execute the wrong code.
  So is the "Magic Lamp to end buffer overflow exploits" tagline just
normal media hype, or am I really missing something WRT the capabilities
of the "new" technology?

(Fup-to: comp.programming.  comp.lang.c will thank you.)

-Arthur
Reply ajo (1601) 5/25/2004 1:15:21 PM

Arthur J. O'Dwyer wrote:

> On Tue, 25 May 2004, Matthew Jakeman wrote:
> 
> [re: discussion of security in C/C++ programs]
> 
>>The problems of buffer overflows could soon become less relevant anyway,
>>with Intel and AMD's new chips at least (not that I'm suggesting in any way
>>that they shouldn't still be protected against!). And NX does require OS
>>support as well.
>>http://www.internet-security.ca/internet-security-news-005/intel-prescott-cpu-will-end-buffer-overflows-security-problems.html
>>What a godsend for Microsoft: easier to include NX support in a service
>>pack than to correct all the 50m lines of code, I suppose. It will be
>>interesting to see which path they take!
> 
> 
>   The press release doesn't say much about the technology, but from
> what I can gather, it's basically making the stack segment
> read-and-write-only, no execution privileges.  I thought Intel machines
> had segments you couldn't execute from already, like the data segment?
> No?
>   And if it is what I think it is, then it doesn't protect against
> an attacker's taking over your machine: it only protects against an
> attacker's being able to take over your machine with code he wrote
> himself and inserted into the stack frame.  A static-buffer overflow
> *followed by* a stack-buffer overflow executing a jump into the
> overflowed static buffer would be just as devastating as the old kind;
> and stack-buffer overflows could still make the computer crash or
> execute the wrong code.
>   So is the "Magic Lamp to end buffer overflow exploits" tagline just
> normal media hype, or am I really missing something WRT the capabilities
> of the "new" technology?
> 
> (Fup-to: comp.programming.  comp.lang.c will thank you.)
> 
> -Arthur

I read it as media hype, Arthur.  It makes it harder, but does
not eliminate it.  IIRC, SUN has a compiler(linker?) switch something
like "NO_STACK_EXECUTE" which seems to provide the same
functionality.  Whatever happened to the days of separate
I&D space?  Would it solve the problem now?  I just don't
know.

Nick L.


-- 
"It is impossible to make anything foolproof
because fools are so ingenious"
  - A. Bloch
Reply hukolau3 (292) 5/26/2004 12:00:30 AM

Nick Landsberg <hukolau@NOSPAM.att.net> wrote in
news:y6Rsc.67934$hH.1212500@bgtnsc04-news.ops.worldnet.att.net: 

> I read it as media hype, Arthur.  It makes it harder, but does
> not eliminate it.  IIRC, SUN has a compiler(linker?) switch something
> like "NO_STACK_EXECUTE" which seems to provide the same
> functionality.  Whatever happened to the days of separate
> I&D space?  Would it solve the problem now?  I just don't
> know.

And whatever happened to computer languages that perform run-time
array bounds checking? How about languages that also perform run-time
data validation, such as ensuring that a numeric value is always
within a specified range?

Application security, no matter what the hardware or language used,
requires strict and accurate data validation. Any program that 
allows the user to enter harmful data is an insecure program.

Jim Rogers
Reply jimmaureenrogers (283) 5/26/2004 12:43:06 AM

James Rogers wrote:
> Nick Landsberg <hukolau@NOSPAM.att.net> wrote in
> news:y6Rsc.67934$hH.1212500@bgtnsc04-news.ops.worldnet.att.net: 
> 
> 
>>I read it as media hype, Arthur.  It makes it harder, but does
>>not eliminate it.  IIRC, SUN has a compiler(linker?) switch something
>>like "NO_STACK_EXECUTE" which seems to provide the same
>>functionality.  Whatever happened to the days of separate
>>I&D space?  Would it solve the problem now?  I just don't
>>know.
> 
> 
> And whatever happened to computer languages that perform run-time
> array bounds checking? How about languages that also perform run-time
> data validation, such as ensuring that a numeric value is always
> within a specified range?

This is the job of the programmer writing the code. It can and should 
all be done at source-code level.

Now you may say that not everyone will and that this policy should be 
applied automatically on all software running on the machine. However 
computers are stupid and when the code is compiled to machine code the 
intentions of the programmer are obfuscated.

Thus the only way this can be done automatically AFAICS would be to have 
the source-code available at runtime. Unfortunately many companies are 
unwilling to release this.

> Application security, no matter what the hardware or language used,
> requires strict and accurate data validation. Any program that 
> allows the user to enter harmful data is an insecure program.

True words. That should also be expanded to *any* sort of input, whether 
by humans or computers but especially from networks.

-- 
Ben M.
Reply saint_abroadremove (144) 5/26/2004 10:24:31 AM

In article <Pine.LNX.4.58-035.0405250906360.1193@unix45.andrew.cmu.edu>, "Arthur J. O'Dwyer" <ajo@nospam.andrew.cmu.edu> writes:
> 
>   The press release doesn't say much about the technology, but from
> what I can gather, it's basically making the stack segment
> read-and-write-only, no execution privileges.  I thought Intel machines
> had segments you couldn't execute from already, like the data segment?
> No?

No.  x86 CPUs have read and write page permissions only.  The same is
true of PowerPC, which means the two dominant CPU architectures for
general-purpose computers lack separate page-execute permissions.

This comes up over and over again in Bugtraq and similar forums.  It's
odd how much confusion there is even in the security community about
MMU capabilities.

>   And if it is what I think it is, then it doesn't protect against
> an attacker's taking over your machine: it only protects against an
> attacker's being able to take over your machine with code he wrote
> himself and inserted into the stack frame.

Or any other non-executable page.

> A static-buffer overflow
> *followed by* a stack-buffer overflow executing a jump into the
> overflowed static buffer would be just as devastating as the old kind;
> and stack-buffer overflows could still make the computer crash or
> execute the wrong code.

Yes; so-called "return into libc" exploits still work fine with
separate-execute permissions.

And, of course, there are plenty of buffer overflow exploits that
don't depend on executing the wrong code at all; the "right" code
will generally do the wrong thing when given the wrong data, and many
overflows exploit that.

>   So is the "Magic Lamp to end buffer overflow exploits" tagline just
> normal media hype, or am I really missing something WRT the capabilities
> of the "new" technology?

It's normal media hype, compounded by the general lack of security
knowledge.

-- 
Michael Wojcik                  michael.wojcik@microfocus.com

The antics which have been drawn together in this book are huddled here
for mutual protection like sheep.  If they had half a wit apiece each
would bound off in many directions, to unsimplify the target.
 -- Walt Kelly
Reply mwojcik (1874) 5/26/2004 4:13:13 PM

Ben Measures <saint_abroadremove@removehotmail.com> wrote in 
news:zf_sc.2943$Wb2.29826051@news-text.cableinet.net:

> James Rogers wrote:
>> Nick Landsberg <hukolau@NOSPAM.att.net> wrote in
>> news:y6Rsc.67934$hH.1212500@bgtnsc04-news.ops.worldnet.att.net: 
>> 
>> 
>>>I read it as media hype, Arthur.  It makes it harder, but does
>>>not eliminate it.  IIRC, SUN has a compiler(linker?) switch something
>>>like "NO_STACK_EXECUTE" which seems to provide the same
>>>functionality.  Whatever happened to the days of separate
>>>I&D space?  Would it solve the problem now?  I just don't
>>>know.
>> 
>> 
>> And whatever happened to computer languages that perform run-time
>> array bounds checking? How about languages that also perform run-time
>> data validation, such as ensuring that a numeric value is always
>> within a specified range?
> 
> This is the job of the programmer writing the code. It can and should 
> all be done at source-code level.
> 
> Now you may say that not everyone will and that this policy should be 
> applied automatically on all software running on the machine. However 
> computers are stupid and when the code is compiled to machine code the 
> intentions of the programmer are obfuscated.

I agree that this is the responsibility of the programmer writing the code.
There are languages that simplify this responsibility.

My personal favorite language for illustrating such simplification is Ada.
I can define a numeric type with an explicit range as below:

type My_Int is range -20..231;

Ada is a strongly typed, statically typed, language. The compiler keeps
track of scalar type range constraints. It also automatically generates
range checking code as a default behavior. Any range violation is greeted
with a Constraint_Error exception.

Thus, the following code will raise Constraint_Error:

foo : My_Int := -20;

foo := foo - 1;

The second line would result in an out-of-range value for the variable foo.
Constraint_Error is raised and must be handled or the program will
terminate.

Similarly, Ada provides a generic I/O package for integer types. When
instantiated for a specific type, the package will generate all 
necessary validation checks for I/O with that integer type.

package My_IO is new Ada.Text_IO.Integer_IO(My_Int);

My_IO.Get(Item => foo);

The Get procedure will read standard input and place the value
read in foo. The convenience is that the exception Data_Error is
raised if the input is anything except a number in the range specified
for My_Int.

Ada does not provide any implicit conversions between types. This
prevents the kinds of problems found in C when a negative
signed integer is converted to an unsigned integer. Such problems have
been used in security attacks to overflow memory. If a negative signed
integer value is passed to the C malloc function it is automatically
converted to a very large unsigned value with the same bit pattern.
The result is the allocation of much more memory than the programmer
ever expected, even if the programmer checked for an upper limit for
the value passed to malloc.
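
A minimal C sketch of the defence against that conversion, assuming the
element count arrives as a signed long (for example, parsed from a file
header). alloc_items and MAX_ITEMS are hypothetical names and the limit
of 1000 is arbitrary.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_ITEMS 1000  /* hypothetical application limit */

/* Allocate room for count ints. Returns NULL if count is out of range
 * or if malloc() fails. */
int *alloc_items(long count)
{
    /* Checking only the upper bound is the classic mistake: a negative
     * count passes "count > MAX_ITEMS" and is then converted to a huge
     * size_t. In C, (size_t)-1 equals SIZE_MAX. */
    if (count <= 0 || count > MAX_ITEMS)
        return NULL;

    /* count is now known to be positive and small, so neither the
     * conversion nor the multiplication can wrap around. */
    return malloc((size_t)count * sizeof(int));
}
```

The check is done while the value is still signed, before any
conversion to size_t takes place.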

> 
> Thus the only way this can be done automatically AFAICS would be to have 
> the source-code available at runtime. Unfortunately many companies are 
> unwilling to release this.

The source code need not be available at run time if the executable 
contains the correct validation expressions at run time.

Reply jimmaureenrogers (283) 5/27/2004 12:35:55 AM

James Rogers wrote:
> Ben Measures <saint_abroadremove@removehotmail.com> wrote in 
> news:zf_sc.2943$Wb2.29826051@news-text.cableinet.net:
>>James Rogers wrote:
>>>And whatever happened to computer languages that perform run-time
>>>array bounds checking? How about languages that also perform run-time
>>>data validation, such as ensuring that a numeric value is always
>>>within a specified range?
>>
>>This is the job of the programmer writing the code. It can and should 
>>all be done at source-code level.
>>
>>Now you may say that not everyone will and that this policy should be 
>>applied automatically on all software running on the machine. However 
>>computers are stupid and when the code is compiled to machine code the 
>>intentions of the programmer are obfuscated.
> 
> 
> I agree that this is the responsibility of the programmer writing the code.
> There are languages that simplify this responsibility.
> 
> My personal favorite language for illustrating such simplification is Ada.
> I can define a numeric type with an explicit range as below:
> 
> type My_Int is range -20..231;
> 
> Ada is a strongly typed, statically typed, language. The compiler keeps
> track of scalar type range constraints. It also automatically generates
> range checking code as a default behavior. Any range violation is greeted
> with a Constraint_Error exception.

This is simply the compiler generating extra machine code that checks 
the ranges are within that specified in the source code. This is indeed 
very helpful but can still be done by writing source code to do these 
checks. In the end, it is the programmer that has to define these ranges 
and make sure the checks are performed.
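
As a rough illustration of writing such checks by hand, here is a C
sketch of a value restricted to the same -20..231 range as the Ada
example. The names my_int, my_int_set and my_int_add are hypothetical;
unlike Ada, nothing forces callers to go through these functions or to
look at the result.

```c
#include <stdbool.h>

#define MY_INT_MIN (-20)
#define MY_INT_MAX 231

/* Wrapped in a struct so that a plain int is not accepted by accident. */
typedef struct {
    int value;
} my_int;

/* Store v in *dst if it is in range. Returns false, leaving *dst
 * unchanged, on a range violation. */
bool my_int_set(my_int *dst, long long v)
{
    if (v < MY_INT_MIN || v > MY_INT_MAX)
        return false;
    dst->value = (int)v;
    return true;
}

/* Add delta to *x with the same range check. The sum is formed in
 * long long, which is at least 64 bits wide, so it cannot overflow for
 * a value already in range plus any int. */
bool my_int_add(my_int *x, int delta)
{
    long long r = (long long)x->value + delta;
    return my_int_set(x, r);
}
```

Starting from -20, a call such as my_int_add(&foo, -1) returns false
and leaves foo unchanged, which is the hand-written counterpart of the
Constraint_Error raised in the Ada version.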

> Ada does not provide any implicit conversions between types. This
> prevents the kinds of problems found in C when a negative
> signed integer is converted to an unsigned integer. Such problems have
> been used in security attacks to overflow memory. If a negative signed
> integer value is passed to the C malloc function it is automatically
> converted to a very large unsigned value with the same bit pattern.
> The result is the allocation of much more memory than the programmer
> ever expected, even if the programmer checked for an upper limit for
> the value passed to malloc.

I agree that this is a problem that does need care to avoid (in languages 
with implicit conversions).

>>Thus the only way this can be done automatically AFAICS would be to have 
>>the source-code available at runtime. Unfortunately many companies are 
>>unwilling to release this.
> 
> The source code need not be available at run time if the executable 
> contains the correct validation expressions at run time.

It doesn't matter how the executable contains validations (whether by 
compiler or by explicit source code), either way the programmer took the 
care to define them.

By "automatically" I mean external software (such as an OS kernel) 
examining the machine code of running programs (written by careless 
people that do not necessarily have validations in place) and inserting 
extra validations appropriately. For this to have any chance of 
succeeding the programmer's intentions must be as clear as possible, and 
the source code (hopefully) contains such intentions.

-- 
Ben M.
Reply saint_abroadremove (144) 5/27/2004 1:17:44 PM

On Thu, 27 May 2004, Ben Measures wrote:
>
> James Rogers wrote:
> > Ben Measures <saint_abroadremove@removehotmail.com> wrote...
> >>James Rogers wrote:
> >>>And whatever happened to computer languages that perform run-time
> >>>array bounds checking? How about languages that also perform run-time
> >>>data validation, such as ensuring that a numeric value is always
> >>>within a specified range?
> >>
> >>This is the job of the programmer writing the code. It can and should
> >>all be done at source-code level.
> >
> > I agree that this is the responsibility of the programmer writing the
> > code.  There are languages that simplify this responsibility.
    [ Like Ada, of course. ;-) ]

> This is simply the compiler generating extra machine code that checks
> the ranges are within that specified in the source code. This is indeed
> very helpful but can still be done by writing source code to do these
> checks. In the end, it is the programmer that has to define these ranges
> and make sure the checks are performed.

  You may not be missing the point, but you're making it sound like you
are.  "The programmer" in the C or C++ case is the one guy writing the
program.  "The programmer" in the Ada case is the guy writing the
program, *plus* several dozen other guys who wrote the compiler and
runtime system that generates and enforces these compile-time and run-time
sanity checks.  Sure, human beings have to do everything in this world;
but it's a question of how many human beings, and how often.  James is
absolutely right that in Ada, the answer is "fewer and less often."

  This doesn't mean Ada is better than C, of course! ;-)  But it's
true that there *are* paradigms in which the programmer doesn't need
to be on his toes *all* the time, because the language helps him
out in ways C and friends generally don't.


> > The source code need not be available at run time if the executable
> > contains the correct validation expressions at run time.
>
> It doesn't matter how the executable contains validations (whether by
> compiler or by explicit source code), either way the programmer took the
> care to define them.

  Again: The programmer does not define the validations; the language
does.  (Read: "the compiler-and-runtime-support programmers did.")
No "care" required.

> By "automatically" I mean external software (such as an OS kernel)
> examining the machine code of running programs (written by careless
> people that do not necessarily have validations in place) and inserting
> extra validations appropriately. For this to have any chance of
> succeeding the programmer's intentions must be as clear as possible, and
> the source code (hopefully) contains such intentions.

  And the executable does not (because optimization involves the removal
of redundancy).  And the executable is the only thing the average OS
can see (although open-source software and ubiquitous Internet may one
day change that, </hype>).

-Arthur
Reply ajo (1601) 5/27/2004 2:27:29 PM

"Arthur J. O'Dwyer" <ajo@nospam.andrew.cmu.edu> wrote:
> On Tue, 25 May 2004, Matthew Jakeman wrote:
>  [re: discussion of security in C/C++ programs]
> >
> > The problems of buffer overflows could soon become less relevant anyway,
> > with Intel and AMD's new chips at least (not that I'm suggesting in any way
> > that they shouldn't still be protected against!). And NX does require OS
> > support as well.
> > http://www.internet-security.ca/internet-security-news-005/intel-prescott-cpu-will-end-buffer-overflows-security-problems.html
> > What a godsend for Microsoft: easier to include NX support in a service
> > pack than to correct all the 50m lines of code, I suppose. It will be
> > interesting to see which path they take!
> 
>   The press release doesn't say much about the technology, but from
> what I can gather, it's basically making the stack segment
> read-and-write-only, no execution privileges.  I thought Intel machines
> had segments you couldn't execute from already, like the data segment?
> No?

No.  First of all, this is a feature that comes from AMD, and has
already been implemented in the Athlon-64 CPUs.  It's called the
No-Execute bit, and it's a mode that's just been missing in the x86
architecture before AMD added it to AMD64.  The OpenBSD folks have
added it to their architecture in anticipation of this being enabled
in x86s.

>   And if it is what I think it is, then it doesn't protect against
> an attacker's taking over your machine: it only protects against an
> attacker's being able to take over your machine with code he wrote
> himself and inserted into the stack frame.

Right.  But this is a very common problem and exploit.  For example,
one of the software-based Xbox mods relies on exactly this flaw in the
initial dashboard background image loader.

> [...] A static-buffer overflow
> *followed by* a stack-buffer overflow executing a jump into the
> overflowed static buffer would be just as devastating as the old kind;
> and stack-buffer overflows could still make the computer crash or
> execute the wrong code.

Yes, you can still crash the machine.  Making it execute code you've
crafted yourself though -- you'll have to rely on other holes.

>   So is the "Magic Lamp to end buffer overflow exploits" tagline just
> normal media hype, or am I really missing something WRT the capabilities
> of the "new" technology?

It solves one manifestation of the problem.  I've thought about this
problem for a little bit, and I am not sure it will be ended by using
this mechanism -- just that the attacker needs to be a lot more creative.
I.e., to spur a program to arbitrary action you have to prime it by
forcing it to jump into *its own* code segment, where the *parameters*
on the stack may be arbitrarily modified.  (So for example if the code
itself has a call in it to turn the NX bit on and off, then you can
call that code with parameters that say turn it off, *then* do the
same exploit, etc.)
 
> (Fup-to: comp.programming.  comp.lang.c will thank you.)

It's not off topic for comp.lang.c.  Remember C, and C alone is the
language that creates this problem.

--
Paul Hsieh
http://www.pobox.com/~qed/
Reply qed (328) 5/27/2004 8:11:08 PM

Arthur J. O'Dwyer wrote:
> On Thu, 27 May 2004, Ben Measures wrote:
> 
>>James Rogers wrote:
>>
>>>Ben Measures <saint_abroadremove@removehotmail.com> wrote...
>>>
>>>>James Rogers wrote:
>>>>
>>>>>And whatever happened to computer languages that perform run-time
>>>>>array bounds checking? How about languages that also perform run-time
>>>>>data validation, such as ensuring that a numeric value is always
>>>>>within a specified range?
>>>>
>>>>This is the job of the programmer writing the code. It can and should
>>>>all be done at source-code level.
>>>
>>>I agree that this is the responsibility of the programmer writing the
>>>code.  There are languages that simplify this responsibility.
> 
>     [ Like Ada, of course. ;-) ]
> 
> 
>>This is simply the compiler generating extra machine code that checks
>>the ranges are within that specified in the source code. This is indeed
>>very helpful but can still be done by writing source code to do these
>>checks. In the end, it is the programmer that has to define these ranges
>>and make sure the checks are performed.
> 
> 
>   You may not be missing the point, but you're making it sound like you
> are.  "The programmer" in the C or C++ case is the one guy writing the
> program.  "The programmer" in the Ada case is the guy writing the
> program, *plus* several dozen other guys who wrote the compiler and
> runtime system that generates and enforces these compile-time and run-time
> sanity checks.  Sure, human beings have to do everything in this world;
> but it's a question of how many human beings, and how often.  James is
> absolutely right that in Ada, the answer is "fewer and less often."
> 
>   This doesn't mean Ada is better than C, of course! ;-)  But it's
> true that there *are* paradigms in which the programmer doesn't need
> to be on his toes *all* the time, because the language helps him
> out in ways C and friends generally don't.
> 
> 
> 
>>>The source code need not be available at run time if the executable
>>>contains the correct validation expressions at run time.
>>
>>It doesn't matter how the executable contains validations (whether by
>>compiler or by explicit source code), either way the programmer took the
>>care to define them.
> 
> 
>   Again: The programmer does not define the validations; the language
> does.  (Read: "the compiler-and-runtime-support programmers did.")
> No "care" required.
> 
> 
>>By "automatically" I mean external software (such as an OS kernel)
>>examining the machine code of running programs (written by careless
>>people that do not necessarily have validations in place) and inserting
>>extra validations appropriately. For this to have any chance of
>>succeeding, the programmer's intentions must be as clear as possible, and
>>the source code (hopefully) contains such intentions.
> 
> 
>   And the executable does not (because optimization involves the removal
> of redundancy).  And the executable is the only thing the average OS
> can see (although open-source software and ubiquitous Internet may one
> day change that, </hype>).
> 
> -Arthur

I agree fully with everything you said there (it actually echoes what I 
was trying to put across) with one exception: [in Ada] "The programmer 
does not define the validations".

I'm not familiar with Ada (other than its history) but doesn't the 
programmer have to write something similar to:
type My_Int is range -20..231;
thereby "defining" the validation (that the compiler inserts)?

-- 
Ben M.
0
Reply saint_abroadremove (144) 5/27/2004 9:40:13 PM

Ben Measures <saint_abroadremove@removehotmail.com> wrote in
news:YTltc.4110$1C6.39576762@news-text.cableinet.net: 

> By "automatically" I mean external software (such as an OS kernel) 
> examining the machine code of running programs (written by careless 
> people that do not necessarily have validations in place) and
> inserting extra validations appropriately. For this to have any chance
> of succeeding, the programmer's intentions must be as clear as possible,
> and the source code (hopefully) contains such intentions.

One problem is that many languages, including C, provide no syntax for
the programmer to make his or her intentions clear without actually
creating the checks themselves.

Part of the problem is simply the richness of a language's syntax.
C has traditionally had a minimalist design for its syntax.
This makes the syntax very simple. It also makes a full expression
of the programmer's intentions sometimes very complex. The complexity
works on two levels. First the application programmer must carefully
create all the checks and error indicators manually. Second, the
compiler cannot be reasonably expected to optimize out the checks
in the cases where they are unnecessary.
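
To illustrate the first level, here is a minimal sketch of a hand-written
range type in C. The names MY_INT_MIN, MY_INT_MAX and my_int_set are made
up for this example, and the range 0 through 9 is an assumption; nothing
like this exists in standard C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bounds for illustration only. */
enum { MY_INT_MIN = 0, MY_INT_MAX = 9 };

/* Store value in *out only if it lies in MY_INT_MIN..MY_INT_MAX.
   Returns true on success; on failure *out is left untouched. */
static bool my_int_set(int *out, long value)
{
    assert(out != NULL);            /* caller error, not a data error */
    if (value < MY_INT_MIN || value > MY_INT_MAX)
        return false;
    *out = (int)value;
    return true;
}
```

Every assignment has to go through my_int_set, and the compiler will not
complain about a plain "x = 42;" that bypasses it. That is the manual work
referred to above.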

Ada provides a more complex syntax than C. This becomes a problem for
the compiler designers. On the other hand, it allows the application
programmer to produce simpler code. Because of the richness of the
syntax the compiler can even optimize out unnecessary checks.

For example, let's look at an Ada "for" loop. Ada's "for" loop 
iterates through a range of values. Ada also provides attributes for
types including the minimum value, the maximum value, and the range
of values. Combining these features for iterating through an array
looks like the following:

type My_Int is range 0..9;
type My_Array is array(My_Int) of float;

foo : My_Array := (Others => 10.0);
Total : Float := 0.0;

for index in My_Int'Range loop
   Total := Total + foo(index);
end loop;

The first line simply defines an integer type with a valid range from
0 through 9. The second line defines an array type indexed by the type
My_Int (therefore having index values from 0 through 9) of type float.
The third line declares a variable "foo" of type My_Array and initializes
all its values to 10.0. The fourth line declares a float variable 
initialized to 0.0.

The "for" loop iterates through all values of My_Int starting at 0 and
ending at 9. Each element of the foo array is added to total.

The compiler recognizes the array initialization syntax to indicate that
it shall iterate through all values of the array and set each value to
10.0. There is no possibility of overrunning the array index. The compiler
can optimize out the array index range checking for this operation.
Within the "for" loop the index value is of the same type as the array
index value. Again, it is not possible to iterate outside the range of
the array index. The compiler can optimize out the index range checking
within the for loop.

It would be very difficult for an application programmer to provide all
the needed range checks and omit all the unneeded range checks for each
situation. This is particularly true when iterating through loops. The
C looping syntax cannot be easily manipulated such that it is impossible
to encounter a buffer overflow, no matter how the size of an array changes
through the maintenance lifetime of the program.
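
For comparison, here is a rough C sketch of the same loop. The closest C
gets is deriving the loop bound from the array itself. ARRAY_LEN is a
common convention, not part of the language, and foo_at is a hypothetical
helper added only to show a hand-written check.

```c
#include <assert.h>
#include <stddef.h>

/* Works only on true arrays; applied to a pointer (for example an
   array parameter) it silently yields the wrong count. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

static float foo[10];

/* Set every element of foo to 10.0, then return the sum. */
static float total_foo(void)
{
    float total = 0.0f;
    size_t i;

    for (i = 0; i < ARRAY_LEN(foo); i++)
        foo[i] = 10.0f;
    for (i = 0; i < ARRAY_LEN(foo); i++)
        total += foo[i];
    return total;
}

/* Outside such a loop the index check has to be written by hand.
   The assert aborts on a bad index unless NDEBUG is defined. */
static float foo_at(size_t i)
{
    assert(i < ARRAY_LEN(foo));
    return foo[i];
}
```

The loops stay correct if the size of foo changes, but only because the
programmer chose to write the bound that way. Nothing stops a later
maintainer from writing "i <= 10".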

C syntax is intended to provide the opportunity to create highly efficient
executables. In many cases it succeeds. In the case demonstrated above
C will consistently produce less efficient code than Ada, while requiring
the application programmer to do a lot more work for equivalent safety.

Jim Rogers
0
Reply jimmaureenrogers (283) 5/28/2004 4:30:08 AM

Ben Measures <saint_abroadremove@removehotmail.com> wrote in
news:1fttc.4547$pe2.42966253@news-text.cableinet.net: 

> I agree fully with everything you said there (it actually echoes what I
> was trying to put across) with one exception: [in Ada] "The programmer
> does not define the validations".
> 
> I'm not familiar with Ada (other than its history) but doesn't the 
> programmer have to write something similar to:
> type My_Int is range -20..231;
> thereby "defining" the validation (that the compiler inserts)?

You are correct. The programmer does need to specify his intentions.

As I mentioned in an earlier post, the difference is the amount of
work the application programmer must do, and the amount of 
optimization available to the compiler. 

I do not see how an OS could divine what a programmer intended for
range limits if the programmer does not specify those limits in some
manner. If a programmer expects an input value in the range of
1 through 10 but never specifies or checks for any particular range,
how will the OS know what the expected range is?
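
In C the only place that range can be stated is in a check the programmer
writes where the input is read. A minimal sketch, assuming the input
arrives as a string; parse_in_range is a hypothetical helper, while strtol
and errno are the standard library facilities.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse text as a decimal integer and accept it only if it lies in
   lo..hi. Returns true and stores the value in *out on success. */
static bool parse_in_range(const char *text, long lo, long hi, long *out)
{
    char *end;
    long value;

    assert(text != NULL && out != NULL);
    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0')   /* no digits, or trailing junk */
        return false;
    if (errno == ERANGE)               /* does not fit in a long */
        return false;
    if (value < lo || value > hi)      /* outside the specified range */
        return false;
    *out = value;
    return true;
}
```

A call such as parse_in_range(input, 1, 10, &n) is where the 1 through 10
expectation becomes visible. Without it there is nothing in the executable
for an OS to find.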

Specifying data ranges is a two-edged sword. It can be used to
provide proper data validation. This can improve program safety.
If, however, the programmer does not specify the correct range 
the result can be disastrous. Correct range specification
requires a very detailed understanding of the true system
requirements. In other words, to make code safe and secure one
must not only specify ranges, but also ensure the specified
ranges are correct. This is not always as simple as it might
seem.

Ada provides a simple range specification syntax that can be
associated with a scalar data type. Eiffel provides pre- and
postconditions for methods that enforce data ranges. It is
important to specify the correct ranges in each language.
Invalid pre- and postconditions will lead to improper program
behavior when the program behavior is validated against the
requirements.
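
C has no built-in contracts, but assert() gives a rough approximation of
Eiffel's require and ensure clauses. The sketch below is only an
illustration: percent_to_byte and its ranges are invented for the example,
and the checks disappear when NDEBUG is defined.

```c
#include <assert.h>

/* Hypothetical example: scale a percentage (0..100) to 0..255. */
static int percent_to_byte(int percent)
{
    int result;

    assert(percent >= 0 && percent <= 100);   /* precondition */
    result = percent * 255 / 100;             /* integer division truncates */
    assert(result >= 0 && result <= 255);     /* postcondition */
    return result;
}
```

If the requirements actually allowed values above 100, the precondition
would be the bug, which is the point made above about choosing the ranges
correctly.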

Jim Rogers 
0
Reply jimmaureenrogers (283) 5/28/2004 4:44:19 AM

In article <796f488f.0405271211.1b0148a7@posting.google.com>, qed@pobox.com (Paul Hsieh) writes:
> "Arthur J. O'Dwyer" <ajo@nospam.andrew.cmu.edu> wrote:
> 
> >   So is the "Magic Lamp to end buffer overflow exploits" tagline just
> > normal media hype, or am I really missing something WRT the capabilities
> > of the "new" technology?
> 
> It solves one manifestation of the problem.  I've thought about this
> problem for a little bit, and I am not sure it will be ended by using
> this mechanism -- just that it needs to be a lot more creative.

I largely agree with what Paul had to say in his post, but I'd like
to point out that while return-into-program exploits no doubt are
more difficult than return-into-buffer ones, they're well-understood,
there's some good technical information on them and practical advice
on writing them available, and they've been found "in the wild".
(See, e.g., the Bugtraq archives, or past issues of Phrack, for more
information.)

In other words, attackers already know how to exploit return-into-
program buffer overflow holes, so patching return-into-buffer (with a
non-exec page bit) really just eliminates some of the low-hanging
fruit.

> > (Fup-to: comp.programming.  comp.lang.c will thank you.)
> 
> It's not off topic for comp.lang.c.  Remember, C, and C alone, is the
> language that creates this problem.

C programs may be the worst offenders, and language features have
certainly contributed to that.  But it's hardly alone.  For one thing,
there are plenty of poorly-written C++ programs that suffer from the
same problems.

-- 
Michael Wojcik                  michael.wojcik@microfocus.com

Although he was an outsider, and excluded from their rites, they were
always particularly charming to him at this time; he and his household
received small courtesies and presents, just because he was outside.
  -- E M Forster
0
Reply mwojcik (1874) 5/28/2004 5:15:28 PM
