Below is some historic source code of a function written by me which is part of
some internal library 'we' are using:
---------------
static int grow_evr_msg_to(struct evr_msg *evr_msg, unsigned need)
{
uint8_t *p;
unsigned have, want, ofs;
have = evr_msg->e - evr_msg->p;
want = evr_msg->e - evr_msg->s;
do {
have += want;
want *= 2;
} while (have < need);
ofs = evr_msg->p - evr_msg->s;
p = realloc(evr_msg, sizeof(*evr_msg) + want);
if (!p) return -1;
evr_msg = (void *)p;
evr_msg->s = (void *)(evr_msg + 1);
evr_msg->p = evr_msg->s + ofs;
evr_msg->e = evr_msg->s + want;
return 0;
}
--------------
This has been used 'in production' since some time in August, until it
finally blew up on a customer appliance this Wednesday
(2011/11/16). The error is especially ghastly because the code will
work 'most of the time' and when it doesn't, the invoker will happily
write over whatever memory management data structures malloc uses
internally with aribtrarily damageing consequences. This is the source
code equivalent of something known in German as 'Augenblicksversagen'
(blink of an eye failure) when it leads to (usually disastrous) car
crashes on motorways: Despite knowing better and having applied this
better knowledge countless times, someone suddenly does something
completely stupid (in a hurry) because his brain just chose to have a
hiccup. When writing code, it is therefore prudent to assume that
everyone will make stupid mistakes, at least once in a while, and -
despite this isn't exactly popular in "many circles", that nothing
should ever be moved from development to production without a prior
code review conducted by the person who wrote it after some time has
passed and ideally, additionally by second person who hasn't been
involved with it so far.
Humans are going to make stupid mistakes and some strategy to detect
and mitigate these early should be employed (and the guy who is
convinced that HE would never do something likes shouldn't[*]).
[*] I've surely written my share of code with deadlocks. But this is
just an ordinary programming error, not some kind of the
horrid-monster-which-lurks-in-the-deeps, and the remedy is simple (fix
the code to use the proper locking order).
|
|
0
|
|
|
|
Reply
|
rweikusat (2678)
|
11/25/2011 6:24:36 PM |
|
Rainer Weikusat <rweikusat@mssgmbh.com> wrote:
> despite this isn't exactly popular in "many circles", that nothing
> should ever be moved from development to production without a prior
> code review conducted by the person who wrote it after some time has
> passed and ideally, additionally by second person who hasn't been
> involved with it so far.
>=20
Amen.
Unfortunately, in the real world, most of the source code is never
reviewed.
--=20
Andr=C3=A9 Gillibert
|
|
0
|
|
|
|
Reply
|
UTF
|
11/25/2011 6:46:23 PM
|
|
On 11/25/2011 1:24 PM, Rainer Weikusat wrote:
>[...]
> static int grow_evr_msg_to(struct evr_msg *evr_msg, unsigned need)
> {
> uint8_t *p;
> unsigned have, want, ofs;
>
> have = evr_msg->e - evr_msg->p;
> want = evr_msg->e - evr_msg->s;
> do {
> have += want;
> want *= 2;
> } while (have< need);
>
> ofs = evr_msg->p - evr_msg->s;
> p = realloc(evr_msg, sizeof(*evr_msg) + want);
> if (!p) return -1;
>
> evr_msg = (void *)p;
> evr_msg->s = (void *)(evr_msg + 1);
> evr_msg->p = evr_msg->s + ofs;
> evr_msg->e = evr_msg->s + want;
> return 0;
> }
>[...]
> [*] I've surely written my share of code with deadlocks. But this is
> just an ordinary programming error, not some kind of the
> horrid-monster-which-lurks-in-the-deeps, and the remedy is simple (fix
> the code to use the proper locking order).
Perhaps I'm missing something, but this doesn't look like a
locking problem. Rather, it looks like a design problem: If the
function returns 0, the caller knows that all its old pointers to
the struct are untrustworthy, but can't discover what new pointer
value it should replace them with.
Maybe a little more context would be helpful.
--
Eric Sosman
esosman@ieee-dot-org.invalid
|
|
0
|
|
|
|
Reply
|
esosman2 (2945)
|
11/25/2011 7:24:48 PM
|
|
Eric Sosman <esosman@ieee-dot-org.invalid> writes:
> On 11/25/2011 1:24 PM, Rainer Weikusat wrote:
>>[...]
>> static int grow_evr_msg_to(struct evr_msg *evr_msg, unsigned need)
>> {
>> uint8_t *p;
>> unsigned have, want, ofs;
>>
>> have = evr_msg->e - evr_msg->p;
>> want = evr_msg->e - evr_msg->s;
>> do {
>> have += want;
>> want *= 2;
>> } while (have< need);
>>
>> ofs = evr_msg->p - evr_msg->s;
>> p = realloc(evr_msg, sizeof(*evr_msg) + want);
>> if (!p) return -1;
>>
>> evr_msg = (void *)p;
>> evr_msg->s = (void *)(evr_msg + 1);
>> evr_msg->p = evr_msg->s + ofs;
>> evr_msg->e = evr_msg->s + want;
>> return 0;
>> }
>>[...]
>> [*] I've surely written my share of code with deadlocks. But this is
>> just an ordinary programming error, not some kind of the
>> horrid-monster-which-lurks-in-the-deeps, and the remedy is simple (fix
>> the code to use the proper locking order).
>
> Perhaps I'm missing something,
Presumably, the text you deleted.
|
|
0
|
|
|
|
Reply
|
rweikusat (2678)
|
11/25/2011 7:36:02 PM
|
|
Rainer Weikusat <rweikusat@mssgmbh.com> writes:
> Eric Sosman <esosman@ieee-dot-org.invalid> writes:
>> On 11/25/2011 1:24 PM, Rainer Weikusat wrote:
>>>[...]
>>> static int grow_evr_msg_to(struct evr_msg *evr_msg, unsigned need)
>>> {
>>> uint8_t *p;
>>> unsigned have, want, ofs;
>>>
>>> have = evr_msg->e - evr_msg->p;
>>> want = evr_msg->e - evr_msg->s;
>>> do {
>>> have += want;
>>> want *= 2;
>>> } while (have< need);
>>>
>>> ofs = evr_msg->p - evr_msg->s;
>>> p = realloc(evr_msg, sizeof(*evr_msg) + want);
>>> if (!p) return -1;
>>>
>>> evr_msg = (void *)p;
>>> evr_msg->s = (void *)(evr_msg + 1);
>>> evr_msg->p = evr_msg->s + ofs;
>>> evr_msg->e = evr_msg->s + want;
>>> return 0;
>>> }
>>>[...]
>>> [*] I've surely written my share of code with deadlocks. But this is
>>> just an ordinary programming error, not some kind of the
>>> horrid-monster-which-lurks-in-the-deeps, and the remedy is simple (fix
>>> the code to use the proper locking order).
>>
>> Perhaps I'm missing something,
>
> Presumably, the text you deleted.
.... which was about 'humans making stupid mistakes', such as
reallocating some memory area inside a subroutine and not returning
the realloc return value to the caller ...
|
|
0
|
|
|
|
Reply
|
rweikusat (2678)
|
11/25/2011 7:39:49 PM
|
|
On 11/26/11 07:24 AM, Rainer Weikusat wrote:
> Below is some historic source code of a function written by me which is part of
> some internal library 'we' are using:
>
> ---------------
> static int grow_evr_msg_to(struct evr_msg *evr_msg, unsigned need)
> {
> uint8_t *p;
> unsigned have, want, ofs;
>
> have = evr_msg->e - evr_msg->p;
> want = evr_msg->e - evr_msg->s;
> do {
> have += want;
> want *= 2;
> } while (have< need);
>
> ofs = evr_msg->p - evr_msg->s;
> p = realloc(evr_msg, sizeof(*evr_msg) + want);
> if (!p) return -1;
>
> evr_msg = (void *)p;
> evr_msg->s = (void *)(evr_msg + 1);
> evr_msg->p = evr_msg->s + ofs;
> evr_msg->e = evr_msg->s + want;
> return 0;
> }
> --------------
>
> This has been used 'in production' since some time in August, until it
> finally blew up on a customer appliance this Wednesday
> (2011/11/16). The error is especially ghastly because the code will
> work 'most of the time' and when it doesn't, the invoker will happily
> write over whatever memory management data structures malloc uses
> internally with aribtrarily damageing consequences. This is the source
> code equivalent of something known in German as 'Augenblicksversagen'
> (blink of an eye failure) when it leads to (usually disastrous) car
> crashes on motorways: Despite knowing better and having applied this
> better knowledge countless times, someone suddenly does something
> completely stupid (in a hurry) because his brain just chose to have a
> hiccup. When writing code, it is therefore prudent to assume that
> everyone will make stupid mistakes, at least once in a while, and -
> despite this isn't exactly popular in "many circles", that nothing
> should ever be moved from development to production without a prior
> code review conducted by the person who wrote it after some time has
> passed and ideally, additionally by second person who hasn't been
> involved with it so far.
This looks like a case where unit tests (written before the code) would
have saved your bacon. I'm sure only writing production code in order
to pass tests has saved me form many similar embarrassing moments!
--
Ian Collins
|
|
0
|
|
|
|
Reply
|
ian-news (9873)
|
11/25/2011 7:47:24 PM
|
|
On 11/25/2011 2:39 PM, Rainer Weikusat wrote:
> Rainer Weikusat<rweikusat@mssgmbh.com> writes:
>> Eric Sosman<esosman@ieee-dot-org.invalid> writes:
>>> On 11/25/2011 1:24 PM, Rainer Weikusat wrote:
>>>> [...]
>>>> static int grow_evr_msg_to(struct evr_msg *evr_msg, unsigned need)
>>>> {
>>>> uint8_t *p;
>>>> unsigned have, want, ofs;
>>>>
>>>> have = evr_msg->e - evr_msg->p;
>>>> want = evr_msg->e - evr_msg->s;
>>>> do {
>>>> have += want;
>>>> want *= 2;
>>>> } while (have< need);
>>>>
>>>> ofs = evr_msg->p - evr_msg->s;
>>>> p = realloc(evr_msg, sizeof(*evr_msg) + want);
>>>> if (!p) return -1;
>>>>
>>>> evr_msg = (void *)p;
>>>> evr_msg->s = (void *)(evr_msg + 1);
>>>> evr_msg->p = evr_msg->s + ofs;
>>>> evr_msg->e = evr_msg->s + want;
>>>> return 0;
>>>> }
>>>> [...]
>>>> [*] I've surely written my share of code with deadlocks. But this is
>>>> just an ordinary programming error, not some kind of the
>>>> horrid-monster-which-lurks-in-the-deeps, and the remedy is simple (fix
>>>> the code to use the proper locking order).
>>>
>>> Perhaps I'm missing something,
>>
>> Presumably, the text you deleted.
>
> ... which was about 'humans making stupid mistakes', such as
> reallocating some memory area inside a subroutine and not returning
> the realloc return value to the caller ...
Yes, I saw that part -- which said only that there had been a
mistake, and said nothing about its nature. What I did not see (and
did not snip) was anything that might have explained why "the remedy
is simple (fix the code to use the proper locking order)." Two
questions remain: What is the locking issue, and how does using "the
proper locking order" get the realloc() result back to the caller?
--
Eric Sosman
esosman@ieee-dot-org.invalid
|
|
0
|
|
|
|
Reply
|
esosman2 (2945)
|
11/25/2011 10:02:04 PM
|
|
Eric Sosman <esosman@ieee-dot-org.invalid> writes:
> On 11/25/2011 2:39 PM, Rainer Weikusat wrote:
>> Rainer Weikusat<rweikusat@mssgmbh.com> writes:
>>> Eric Sosman<esosman@ieee-dot-org.invalid> writes:
>>>> On 11/25/2011 1:24 PM, Rainer Weikusat wrote:
>>>>> [...]
>>>>> static int grow_evr_msg_to(struct evr_msg *evr_msg, unsigned need)
>>>>> {
>>>>> uint8_t *p;
>>>>> unsigned have, want, ofs;
>>>>>
>>>>> have = evr_msg->e - evr_msg->p;
>>>>> want = evr_msg->e - evr_msg->s;
>>>>> do {
>>>>> have += want;
>>>>> want *= 2;
>>>>> } while (have< need);
>>>>>
>>>>> ofs = evr_msg->p - evr_msg->s;
>>>>> p = realloc(evr_msg, sizeof(*evr_msg) + want);
>>>>> if (!p) return -1;
>>>>>
>>>>> evr_msg = (void *)p;
>>>>> evr_msg->s = (void *)(evr_msg + 1);
>>>>> evr_msg->p = evr_msg->s + ofs;
>>>>> evr_msg->e = evr_msg->s + want;
>>>>> return 0;
>>>>> }
>>>>> [...]
>>>>> [*] I've surely written my share of code with deadlocks. But this is
>>>>> just an ordinary programming error, not some kind of the
>>>>> horrid-monster-which-lurks-in-the-deeps, and the remedy is simple (fix
>>>>> the code to use the proper locking order).
>>>>
>>>> Perhaps I'm missing something,
>>>
>>> Presumably, the text you deleted.
>>
>> ... which was about 'humans making stupid mistakes', such as
>> reallocating some memory area inside a subroutine and not returning
>> the realloc return value to the caller ...
>
> Yes, I saw that part -- which said only that there had been a
> mistake, and said nothing about its nature. What I did not see (and
> did not snip) was anything that might have explained why "the remedy
> is simple (fix the code to use the proper locking order)." Two
> questions remain: What is the locking issue, and how does using "the
> proper locking order" get the realloc() result back to the caller?
This was supposed to be a reference to Boltar's conjecture the me
referring to 'deadlocks' as 'simple problems' must mean that I'd
ludicrously believe to never write code which deadlocks at run time.
But I actually referred to them as simple because fixing them is
simple (provided that it is known what the locking order should have
been and determining this locking order is a required design step for
'proper' multithreaded applications complicated enough to need one)
and 'program stops as soon as the problem occurs', kindly preserving
all information required for fixing it until someone notices that, is
a lot easier to deal with than errors where the program continues to
run, but either works with corrupted data or even runs along codepaths
which manifested themselves out of thin air because some halfway
overwritten function pointer caused a control transfer somewhere into
the middle of an instruction (for CPUs where such things are
possible) or so.
And there are far worse things than that. Something I had to do in the
past was to modify the Linux 2.4 ARM head.s code to work with an Intel
IXP425 processor in little-endian mode. Apparently because "big-endian
makes so much more sense" someone at Intel(!) originally decided that
nobody could possibly want to run such a CPU in a different
configuration and consequently, the memory controller was designed to
be big-endian only. In order to allow a CPU running in little-endian
mode to work with that, there's a hardware byte-swapper sitting
between the processor and the bus and this hardware byte-swapper swaps
bytes provided the MMU is enabled and the page table entry for the
page in question has a certain flag set. By the time this code runs,
the processor is in big-endian mode and a kernel has been loaded into
RAM by the bootloader. It then needs to setup enough of a page table
to provide access to the kernel after the MMU has been enabled,
byteswap the entire contents of the RAM except the the code performing
the procedure itself, enable the MMU and reconfigure the CPU into
little-endian mode and hope for the best because every error in the
(machine) code which does all this causes the whole bastard thing to
go into a state where 'code execution' is not only not anyhow
observable except with a 'hardware debugger' I didn't have but where
it is also impossible to generate any kind of diagnostic output, be it
as primitive as lighting an LED via GPIO pin.
|
|
0
|
|
|
|
Reply
|
rweikusat (2678)
|
11/25/2011 10:51:48 PM
|
|
|
7 Replies
41 Views
(page loaded in 0.156 seconds)
Similiar Articles: Poker hand evaluator - comp.lang.javascriptThat remark is both rude and stupid. If you knew anything about the subject, you would realise that one cannot cheat people out of their money merely by using a hand ... '?'s in Rom Firmware Kernal dumps - comp.sys.cbm... their documents for a FULLY-commented, FULLY explained ROM dump? This is stupid. ... Your laughable "6502 is a baby" remark was like > shooting yourself in the foot. top 10 uses for random data compression?? anyone? - comp ...Hardly any superb armed library remarks skins according to Marwan's immense mate. ... He should desert entirely if Ahmed's shit isn't stupid. I narrow once, improve ... improve strlen - comp.lang.asm.x86You think that I am so stupid, that I think *I* am creating something new, when I am ... you might have noticed.. but that's my problem, thank you for not making funny remark ... basic questions about the leapsecond - comp.protocols.time.ntp ...Then I don't ask so many stupid questions. At least hopefully I don't ask the same ... >> There is also the remark: "If the top level NTP server is unable to obtain >> the ... DSP Job opening in Sacramento, CA - comp.dspWhat a vicious racist remark! I'll have you know Robert, there are hundreds of ... of this: The U.S. is such a great > country that if a U.S. citizen is stupid (can't ... renderbuffer objects - comp.graphics.api.opengl... GL_FRAMEBUFFER_EXT,GL_DEPTH_ATTACHMENT_EXT,GL_TEXTURE_2D, depthTex, 0); (*REMARK ... depth-buffer-writes are both enabled > ... my apologies if that might sound stupid ... [comp.publish.cdrom] CD-Recordable FAQ, Part 1/4 - comp.publish ...Archive-name: cdrom/cd-recordable/part1 Posting-Frequency: monthly Last-modified: 2008/10/09 Version: 2.71 Send corrections and updates to And... .g64 images - comp.sys.cbmIf it comes to plain (sometimes stupid) hardcore applications, DOS is the OS of ... way but never tried it with msdos. Who knows... Anyway, thank you for your remarks. CP/M 3 DRI manual under construction - comp.os.cpmI appreciate your considerations of my remarks. Thanks again for restoring the CP/M ... not empirical knowledge) that the bigpond mailserver isn't just being lazy or stupid... stupidity - Definition of stupiditythe quality or condition of being stupid; pl. stupidities something stupid; foolish remark, irrational act, etc. Obama Stands by "Stupid" Remark - YouTubeObama stands by "Stupid" remark. Given the chance to water down or retract he remarks about the Cambridge Police acting stupidly, he attempts to change the ... 7/20/2012 7:47:27 AM
|