Hi,

I have to deal with a multi-threaded program that has, as one of its threads, a "watchdog thread" that, when it doesn't notice some variable getting set within a certain time, is supposed to stop the whole program (at any cost, no worries about data loss). It attempts to shut down the program by calling exit(). Now, all the references I have consulted (TLPI, APUE 3rd ed. etc.) claim that when one of the threads calls exit() the program will be ended. SUSv4 just mentions in addition that the end of the program might be delayed if there are outstanding asynchronous I/O operations that can't be cancelled (which I guess I don't have).

This did work with a 3.4 Linux kernel. But after switching to a 4.4 kernel it suddenly doesn't work reliably anymore. If it fails, one thread seems to run amok, using about 50% of the CPU time, the other 50% being used by ksoftirqd. The whole thing can't be stopped in any way (not even with 'kill -SIGKILL'). I've also tried to replace the exit() call with kill(getpid(), SIGKILL), but with no luck. Attaching with gdb fails as well (it hangs indefinitely). Looks like a real zombie: dead and very active at the same time :-(

Does that ring a bell with any of you? One of the threads is rather likely to make a lot of epoll() calls.

Please keep in mind that I can't simply change the whole architecture - this is an embedded system already out in the field, and my role in this is to get a new kernel version to work, not to upset a more or less working application (unless I can come up with very convincing arguments ;-)

Best regards, Jens
-- 
 \   Jens Thoms Toerring  ___      jt@toerring.de
  \__________________________      http://toerring.de
Jens Thoms Toerring <jt@toerring.de> wrote: > Hi, > > I've to deal with a multi-threaded program that has, as > one of its threads a "watchdog thread" that, when it doesn't > notice some variable getting set within a certain time, > is supposed to stop the whole program (at any cost, no > worries about data lost). It does attempt to shut down the > program by calling exit(). <snip> > This did work with a 3.4 Linux kernel. But after switching > to a 4.4 kernel it suddenly doesn't work reliably anymore. > If it fails one thread seems to run amok, using about 50% > of the CPU time, the other 50% being used by ksoftirqd. The > whole thing can't be stopped in any way (not even with 'kill > -SIGKILL'). I've also tried to replace the exit() call with > a kill(getpid(), SIGKILL) but also with no luck. Attaching > with gdb fails as well (hangs indefinitely). Looks like a > real zombie: dead and very active at the same time:-( A shot in the dark: is the application using robust mutexes? That's the first thing that comes to mind. Robust mutexes require the kernel, when destroying a thread, to walk a userspace linked-list data structure.
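A minimal sketch of the mechanism William is describing (the function names here are invented for illustration, not taken from the application): with a robust mutex, a thread that dies while holding the lock causes the kernel, at thread exit, to walk the process's robust list and mark the futex, so the next locker gets EOWNERDEAD instead of deadlocking.

```c
/* Robust-mutex sketch: the owning thread exits without unlocking;
   the kernel's robust-list walk lets the next locker recover. */
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t m;

static void *die_holding_lock(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&m);
    return NULL;            /* thread exits while still owning the mutex */
}

int lock_after_owner_died(void)
{
    pthread_mutexattr_t a;
    pthread_t tid;
    int r;

    pthread_mutexattr_init(&a);
    pthread_mutexattr_setrobust(&a, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&m, &a);

    pthread_create(&tid, NULL, die_holding_lock, NULL);
    pthread_join(tid, NULL);

    r = pthread_mutex_lock(&m);      /* EOWNERDEAD, not a hang */
    if (r == EOWNERDEAD)
        pthread_mutex_consistent(&m); /* declare the state recovered */
    pthread_mutex_unlock(&m);
    return r;
}
```

A non-robust mutex in the same situation would simply deadlock the second locker; the robust-list walk at thread destruction is exactly the kernel-side bookkeeping mentioned above.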
william@wilbur.25thandclement.com wrote:
> Jens Thoms Toerring <jt@toerring.de> wrote:
<snip>
> A shot in the dark: is the application using robust mutexes? That's the
> first thing that comes to mind. Robust mutexes require the kernel, when
> destroying a thread, to walk a userspace linked-list data structure.

Unfortunately, I can't say (and the term "robust mutex" was new to me, admittedly). There are several libraries involved that create their own threads (libevent, libusb etc.) about which I can't say much. The rest of the threads in the application itself usually use pipes for basic communication, apart from very simple boolean values, defined as volatile sig_atomic_t, for certain state information. But, as far as I can see (though this may change as I delve deeper into the application), there are no mutex locks that might lead to some kind of deadlock. But then it's 150 kloc of code I'm not too familiar with... I'll definitely look at this aspect!

Could something like that keep a program alive that sends itself a SIGKILL (or calls exit() or _exit())? Those are all things I've tried. The only result was that the chance of it getting stuck in that strange busy, non-killable state seemed to change (and each test run until the problem appears can take an hour or more, making things somewhat annoying ;-)

Thank you and best regards, Jens
-- 
 \   Jens Thoms Toerring  ___      jt@toerring.de
  \__________________________      http://toerring.de
On Monday December 12 2016 18:35, in comp.unix.programmer, "william@wilbur.25thandClement.com" <william@wilbur.25thandClement.com> wrote: > Jens Thoms Toerring <jt@toerring.de> wrote: >> Hi, >> >> I've to deal with a multi-threaded program that has, as >> one of its threads a "watchdog thread" that, when it doesn't >> notice some variable getting set within a certain time, >> is supposed to stop the whole program (at any cost, no >> worries about data lost). It does attempt to shut down the >> program by calling exit(). > <snip> >> This did work with a 3.4 Linux kernel. But after switching >> to a 4.4 kernel it suddenly doesn't work reliably anymore. >> If it fails one thread seems to run amok, using about 50% >> of the CPU time, the other 50% being used by ksoftirqd. The >> whole thing can't be stopped in any way (not even with 'kill >> -SIGKILL'). I've also tried to replace the exit() call with >> a kill(getpid(), SIGKILL) but also with no luck. Attaching >> with gdb fails as well (hangs indefinitely). Looks like a >> real zombie: dead and very active at the same time:-( > > A shot in the dark: is the application using robust mutexes? That's the > first thing that comes to mind. Robust mutexes require the kernel, when > destroying a thread, to walk a userspace linked-list data structure. Another shot in the dark: Did the C runtime library (glibc or local equivalent) change? If so, was it compiled so as to use the exit_group(2) syscall in the exit(3) function? According to various Linux kernel docs, since the introduction of NPTL, exit(2) only terminates the calling thread, leaving all other threads in the "process" active. To terminate /all/ threads at once, use exit_group(2). Since glibc v2.3, the exit(3) call has invoked exit_group(2) instead of exit(2). Perhaps your newer version of the runtime library has reverted back to calling exit(2). -- Lew Pitcher "In Skills, We Trust" PGP public key available upon request
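The behaviour Lew describes can be sketched in a few lines (a hypothetical demo, not code from the application): a child process hangs its main thread in pause() while a second "watchdog" thread calls exit(). If exit() only performed the raw exit(2) syscall, the child's main thread would live on and waitpid() would block forever; because glibc's exit() invokes exit_group(2), the whole process ends and the parent sees the watchdog's exit status.

```c
/* Demo: exit() called from any thread terminates every thread,
   because glibc (>= 2.3) maps it to exit_group(2). */
#define _GNU_SOURCE
#include <pthread.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

static void *watchdog(void *arg)
{
    (void) arg;
    usleep(100000);          /* pretend the watchdog timeout expired */
    exit(42);                /* exit_group(2): kills all threads */
}

int exit_status_from_watchdog(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        pthread_t tid;
        pthread_create(&tid, NULL, watchdog, NULL);
        for (;;)
            pause();         /* main thread blocks "forever" */
    }
    int status;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Replacing the exit(42) with syscall(SYS_exit, 42) would terminate only the watchdog thread, leaving the paused main thread running, which is the failure mode Lew is speculating about.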
Jens Thoms Toerring <jt@toerring.de> wrote: > william@wilbur.25thandclement.com wrote: >> Jens Thoms Toerring <jt@toerring.de> wrote: >> > Hi, >> > >> > I've to deal with a multi-threaded program that has, as >> > one of its threads a "watchdog thread" that, when it doesn't >> > notice some variable getting set within a certain time, >> > is supposed to stop the whole program (at any cost, no >> > worries about data lost). It does attempt to shut down the >> > program by calling exit(). >> <snip> >> > This did work with a 3.4 Linux kernel. But after switching >> > to a 4.4 kernel it suddenly doesn't work reliably anymore. >> > If it fails one thread seems to run amok, using about 50% >> > of the CPU time, the other 50% being used by ksoftirqd. The >> > whole thing can't be stopped in any way (not even with 'kill >> > -SIGKILL'). I've also tried to replace the exit() call with >> > a kill(getpid(), SIGKILL) but also with no luck. Attaching >> > with gdb fails as well (hangs indefinitely). Looks like a >> > real zombie: dead and very active at the same time:-( > >> A shot in the dark: is the application using robust mutexes? That's the >> first thing that comes to mind. Robust mutexes require the kernel, when >> destroying a thread, to walk a userspace linked-list data structure. > > Unfortunately, I can't say (and the term "robust mutex" was new > to me, admittedly). There are several libraries involved that > create their own threads (libevent, libusb etc.) about which I > can't say much. USB, embedded... I switch my vote to a USB driver issue ;) <snip> > Could something like that keep a program alive that sends it- > self a SIGKILL (or does exit() or _exit())? That are all things > I've tried. 
> The only result was that the chance that it got
> stuck in that strange busy, non-killable state seemed to change
> (and each test run until the problem appears can take an hour
> or more, making things somewhat annoying;-)

Theoretically the kernel shouldn't have a problem if the linked list is corrupted or if any of the memory it points to has weird permissions. However, the Linux kernel is quite complex and has more than its fair share of bugs. The ksoftirqd load made me think of some kind of pathological page-faulting behavior occurring from kernel context as it tears the process down (see exit_robust_list in kernel/futex.c). But I don't even know if ksoftirqd handles page faults at all.

Don't put much stock in my comments. I haven't personally run into issues with robust mutexes, beyond bugs in glibc[1]. That the locking doesn't stand out to you would make me look elsewhere.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=12683
On Mon, 2016-12-12, Jens Thoms Toerring wrote: > Hi, > > I've to deal with a multi-threaded program that has, as > one of its threads a "watchdog thread" that, when it doesn't > notice some variable getting set within a certain time, > is supposed to stop the whole program (at any cost, no > worries about data lost). It does attempt to shut down the > program by calling exit(). Now, all the references I have > consulted (TLPI, APUE 3rd ed. etc.) all claim that when one > of the threads calls exit() the program will be ended. A > look at SUSv4 just mentions in addition that the end of > the program might be delayed if there are outstanding > asynchronuous I/O operations that can't be cancelled > (nothing I guess I'm having). > > This did work with a 3.4 Linux kernel. But after switching > to a 4.4 kernel it suddenly doesn't work reliably anymore. > If it fails one thread seems to run amok, using about 50% > of the CPU time, the other 50% being used by ksoftirqd. The > whole thing can't be stopped in any way (not even with 'kill > -SIGKILL'). I've also tried to replace the exit() call with > a kill(getpid(), SIGKILL) but also with no luck. Attaching > with gdb fails as well (hangs indefinitely). Looks like a > real zombie: dead and very active at the same time:-( > > Does that ring a bell with anyone of you? One of the threads > is rather likely to do a lot of epoll() calls. > > Please keep in mind that I can't simply change the whole > architecture - this is an embedded system already out in > the field, and my role in this is to get a new kernel ver- > sion to work, not upset a more or less working application > (unless I can come up with very convincing arguments;-) Apart from what the others wrote: - Can you use strace or pstack or something to find out what that remaining thread is doing? Even looking in /proc can be useful. - Keep in mind that exit() does things before exiting, e.g. run exit handlers. Also shots in the dark ... 
/Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o .
On Tue, 2016-12-13, Lew Pitcher wrote: > On Monday December 12 2016 18:35, in > comp.unix.programmer, "william@wilbur.25thandClement.com" > <william@wilbur.25thandClement.com> wrote: > >> Jens Thoms Toerring <jt@toerring.de> wrote: >>> Hi, >>> >>> I've to deal with a multi-threaded program that has, as >>> one of its threads a "watchdog thread" that, when it doesn't >>> notice some variable getting set within a certain time, >>> is supposed to stop the whole program (at any cost, no >>> worries about data lost). It does attempt to shut down the >>> program by calling exit(). >> <snip> >>> This did work with a 3.4 Linux kernel. But after switching >>> to a 4.4 kernel it suddenly doesn't work reliably anymore. >>> If it fails one thread seems to run amok, using about 50% >>> of the CPU time, the other 50% being used by ksoftirqd. The >>> whole thing can't be stopped in any way (not even with 'kill >>> -SIGKILL'). I've also tried to replace the exit() call with >>> a kill(getpid(), SIGKILL) but also with no luck. Attaching >>> with gdb fails as well (hangs indefinitely). Looks like a >>> real zombie: dead and very active at the same time:-( >> >> A shot in the dark: is the application using robust mutexes? That's the >> first thing that comes to mind. Robust mutexes require the kernel, when >> destroying a thread, to walk a userspace linked-list data structure. > > Another shot in the dark: > Did the C runtime library (glibc or local equivalent) change? If so, was it > compiled so as to use the exit_group(2) syscall in the exit(3) function? > > According to various Linux kernel docs, since the introduction of NPTL, > exit(2) only terminates the calling thread, leaving all other threads in > the "process" active. To terminate /all/ threads at once, use exit_group(2). > Since glibc v2.3, the exit(3) call has invoked exit_group(2) instead of > exit(2). This also seems to be documented in _exit(2). (Note the underscore.) 
> Perhaps your newer version of the runtime library has reverted back > to calling exit(2). Also, perhaps Jens' team has broken exit() while porting. Since it's embedded I suppose they (or a third party) provide the OS. From your description, this seems easy to get wrong. /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o .
jt@toerring.de (Jens Thoms Toerring) writes: [terminate program via exit run by watchdog thread] > This did work with a 3.4 Linux kernel. But after switching > to a 4.4 kernel it suddenly doesn't work reliably anymore. > If it fails one thread seems to run amok, using about 50% > of the CPU time, the other 50% being used by ksoftirqd. The > whole thing can't be stopped in any way (not even with 'kill > -SIGKILL'). This suggests that the thread is in a D state (uninterruptible sleep) which persists for some reason. Trying to determine what it's doing in the kernel (eg, strace, /proc/<pid>/wchan) might be useful.
On 13.12.16 00.03, Jens Thoms Toerring wrote:
> This did work with a 3.4 Linux kernel. But after switching
> to a 4.4 kernel it suddenly doesn't work reliably anymore.
<snip>

Probably an exit handler does unexpected things. This could be part of the C runtime as well as part of a used library, or even your code. Maybe shutting down your program this way runs into badly tested code paths with some race conditions. Try abort(), which does not invoke as many exit handlers.

> Does that ring a bell with anyone of you? One of the threads
> is rather likely to do a lot of epoll() calls.

Definitely I/O. It should check for the exit condition before invoking another I/O operation. The Linux kernel behaves quite badly when killing processes with outstanding I/O. Requests like that are simply ignored.

Marcel
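Marcel's abort() suggestion, sketched (a hypothetical demo, not the application's code): abort() raises SIGABRT and, unlike exit(), does not run atexit() handlers, so it bypasses whatever an exit handler might be doing. The child below registers a handler that would change its exit path; abort() never reaches it.

```c
/* Demo: abort() skips atexit() handlers and terminates via SIGABRT. */
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

static void handler(void)
{
    _exit(7);                /* would hijack a normal exit() */
}

int signal_from_abort(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        atexit(handler);
        abort();             /* handler is skipped; SIGABRT kills us */
    }
    int status;
    waitpid(pid, &status, 0);
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Had the child called exit(0) instead, the handler would run and the child would terminate with status 7; with abort() the parent instead sees a SIGABRT termination.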
Marcel Mueller <news.5.maazl@spamgourmet.org> writes: >On 13.12.16 00.03, Jens Thoms Toerring wrote: >> This did work with a 3.4 Linux kernel. But after switching >> to a 4.4 kernel it suddenly doesn't work reliably anymore. >> If it fails one thread seems to run amok, using about 50% >> of the CPU time, the other 50% being used by ksoftirqd. The >> whole thing can't be stopped in any way (not even with 'kill >> -SIGKILL'). I've also tried to replace the exit() call with >> a kill(getpid(), SIGKILL) but also with no luck. Attaching >> with gdb fails as well (hangs indefinitely). Looks like a >> real zombie: dead and very active at the same time:-( > >Probably an exit handler does unexpected things. This could be part of >the C runtime as well as part of a used library or even your code. > >Maybe shutting down your program this way runs into badly tested code >paths with some race conditions. > >Try abort() which does not invoke that much exit handlers. > >> Does that ring a bell with anyone of you? One of the threads >> is rather likely to do a lot of epoll() calls. > >Definitely I/O. It should check for the exit condition before invoking >another I/O. The Linux kernel behaves quite bad when killing processes >with outstanding I/O. Request like that are simply ignored. > If SIGKILL doesn't kill the process, you've a kernel bug.
On Tuesday December 13 2016 13:13, in comp.unix.programmer, "Scott Lurndal" <scott@slp53.sl.home> wrote:
> Marcel Mueller <news.5.maazl@spamgourmet.org> writes:
<snip>
> If SIGKILL doesn't kill the process, you've a kernel bug.

Even with a non-buggy kernel, SIGKILL won't terminate a zombie process, nor a process stuck in "uninterruptible sleep" state. It would be helpful to see the state of the hung thread, as reported by ps or some other tool.
-- 
Lew Pitcher
"In Skills, We Trust"
PGP public key available upon request
Lew Pitcher <lew.pitcher@digitalfreehold.ca> writes:
>On Tuesday December 13 2016 13:13, in comp.unix.programmer, "Scott Lurndal"
><scott@slp53.sl.home> wrote:
<snip>
>> If SIGKILL doesn't kill the process, you've a kernel bug.
>
>Even with a non-buggy kernel, SIGKILL won't terminate a zombie process, nor a
>process stuck in "uninterruptible sleep" state.

A zombie no longer holds resources, with the exception of the exit status (say 32 bits) and the pid. It's the parent's responsibility to reap the status.

An operating system that allows an application to enter an "uninterruptible sleep" state is broken. It used to be, in SVR3, that one could end up in an uninterruptible sleep state during close(2) when the file descriptor referenced a character special device for a parallel port (e.g. a printer) and the printer was off-line. Bugs like that were mainly fixed a quarter century ago.
On Tuesday December 13 2016 13:44, in comp.unix.programmer, "Scott Lurndal" <scott@slp53.sl.home> wrote:
> Lew Pitcher <lew.pitcher@digitalfreehold.ca> writes:
<snip>
>>Even with a non-buggy kernel, SIGKILL won't terminate a zombie process, nor
>>a process stuck in "uninterruptible sleep" state.
>
> A zombie no longer holds resources, with the exception of the exit status
> (say 32 bits) and the pid.
>
> It's the parent's responsibility to reap the status.

True. It remains in the process table (and visible through ps(1)) until the parent reaps the status, or permits init(8) to reap the status. Since the process is already dead, it CANNOT be "killed" (terminated and removed from the process table) by SIGKILL.

> An operating system that allows an application to enter an
> "uninterruptible sleep" state is broken.

OK. Thanks for the opinion. However, whether or not the OS is, in your opinion, "broken", "uninterruptible sleep" is still a permitted state. And, because the process cannot be scheduled, it cannot receive /any/ signal, let alone SIGKILL.

> It used to be, in SVR3, that one could end up in an uninterruptible
> sleep state during close(2) when the file descriptor referenced a
> character special device for a parallel port (e.g. a printer) and the
> printer was off-line. Bugs like that were mainly fixed a quarter
> century ago.
-- 
Lew Pitcher
"In Skills, We Trust"
PGP public key available upon request
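Lew's point about zombies is easy to demonstrate (an illustrative demo with an invented function name, not code from the thread's application): the child exits immediately and, until the parent reaps it with waitpid(), /proc shows it in state 'Z'. A SIGKILL sent in the meantime is accepted but has no effect, because there is nothing left to kill.

```c
/* Demo: SIGKILL does nothing to a zombie; only reaping removes it. */
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

char zombie_state(void)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(0);                    /* child dies at once */
    usleep(100000);                  /* let it become a zombie */
    kill(pid, SIGKILL);              /* no effect: it is already dead */
    usleep(100000);

    char path[64], buf[256], st = '?';
    snprintf(path, sizeof path, "/proc/%d/stat", (int) pid);
    FILE *f = fopen(path, "r");
    if (f && fgets(buf, sizeof buf, f)) {
        char *p = strrchr(buf, ')'); /* state letter follows "pid (comm)" */
        if (p && p[2])
            st = p[2];
    }
    if (f)
        fclose(f);
    waitpid(pid, NULL, 0);           /* reaping finally removes the entry */
    return st;
}
```

Note this is the true "zombie" case; Jens's stuck process was something else entirely, since a real zombie consumes no CPU time.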
On 13.12.16 19.13, Scott Lurndal wrote:
>> Definitely I/O. It should check for the exit condition before invoking
>> another I/O. The Linux kernel behaves quite badly when killing processes
>> with outstanding I/O. Requests like that are simply ignored.
>
> If SIGKILL doesn't kill the process, you've a kernel bug.

Well, welcome to the real world. A process hanging in state D is one of the most frequent causes of system reboots. This did not change significantly over the last 15 years, from Debian Woody to recent Raspbian with kernel 4.4. Of course, it is not that often that I have serious trouble - once or twice per year or something like that. AFAIK there is absolutely no recovery from a process blocked in state D. This seems to be a Linux-specific "feature".

Marcel
Hi,

thank you all - I'm quite overwhelmed by the number and quality of responses! So please don't be annoyed if I don't respond to each post in detail.

As usual, I guess I've looked too much at red herrings. It doesn't seem to have been something really related to threads. After a lot more looking at the rather longish output of strace, I started to notice a pattern: one of the threads got interrupted in a call of close(). This often happened a long (relatively speaking) time before the software watchdog tried to stop the program - and that thread never got re-scheduled.

So I switched my attention to the serial driver (that close() call was for a device file for one of the serial ports of the processor) and found a different version of it. And, lo and behold, with that updated driver I haven't seen any of that strange behaviour anymore in about 400 test runs. While that is, of course, no proof that everything is well, it is at least encouraging ;)

Unfortunately, the somewhat restricted tools I have at my disposal don't tell me much about what state a process is in. 'ps' is rather terse in what it reports (no D/S/R etc., i.e. no STAT field at all) compared with what one is used to from a PC. But the process/thread was definitely not sleeping, nor a zombie - it was so active that it used up about 50% of the CPU time, and obviously somehow kept [ksoftirqd] busy as well ;-)

So, from what I can say at the moment, it was a slightly buggy driver that, in a manner I can't tell yet, didn't close the device file as requested and thus kept the program from exiting. At least my belief in TLPI/APUE has been restored, in that it most likely was a situation where exit() would have killed all threads had a buggy driver not intervened ;-)

Thank you all and best regards, Jens
-- 
 \   Jens Thoms Toerring  ___      jt@toerring.de
  \__________________________      http://toerring.de
On Tue, 2016-12-13, Jens Thoms Toerring wrote: > Hi, > > thank you all - I'm quite overwhelmed by the number and > quality of responses! So please don't be annoyed if I don't > respond to each post in detail. > > As usual I guess I've looked too much at "red herrings". > It doesn't seem to have been something really related to > threads. After a lot more of looking at the rather longish > output of strace I started to notice a pattern, i.e. that > one of the threads got interrupted in a call of close(). > This often happend a long (relatively speaking) time be- > fore the software watchdog tried to stop the program - and > that thread never got re-scheduled. > > So I switched my attention to the serial driver (that close() > call was for a device file for one of the serial ports of the > processor) Seems that was the turning point. Nice! > and found a different version of it. And, lo and > behold, with that updated driver I haven't seen any of that > strange behaviour anymore for about 400 test runs. While > that is, of course, no proof that everything is well, it at > least encouraging;) > > Unfortunately, the somewhat restricted tools I have at my > disposal don't tell me too much what state a process is in. > 'ps' is rather terse in what it tells you (no D/S/R etc., i. > e. no STAT field at all) one is used from a PC. One useful trick is to look in the Linux /proc file system. I think that's where ps gets its information anyway, and there's more useful information in there too. The proc(5) man page et cetera may be needed to interpret it. > But the pro- > cess/thread was definitely not sleeping nor a zombie - it was > so active that it used up about 50% of the CPU time, and ob- > viously somehow kept [ksoftirqd] busy as well;-) > So from what I can say at the moment it was a slightly buggy > driver that, in what manner I can't tell yet, didn't close > the device file as requested and thus kept the program from > exiting. 
A guess: the buggy serial driver sometimes couldn't deal with the resource cleanup caused by the file descriptor closing. close() never returned but initiated some work: partly attributed to the process, and partly to the kernel itself. Maybe the work was actual I/O. Probably you'd have triggered the same thing with a 'kill -9' or an abort() as with exit(). In both cases there's a freeing of kernel resources associated with that file descriptor. > At least my believe in TLPI/APUE has been restored > in that it most likely was a situation where an exit() would > have killed all threads if not a buggy driver had intervened;-) > > Thank you all and best regards, Jens /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o .
Marcel Mueller <news.5.maazl@spamgourmet.org> wrote:
> On 13.12.16 19.13, Scott Lurndal wrote:
<snip>
> A process hanging in state D is one of the most frequent causes of system
> reboots.
<snip>
> AFAIK there is absolutely no recovery from a process blocked in state D.
> This seems to be a Linux-specific "feature".

The classic stumbling block is that the block device subsystems in Linux as well as the *BSDs are fundamentally synchronous. This is related historically to why polling I/O on regular (block device) files is defined by POSIX to always immediately return ready. Given the expectations engendered by that history, it was apparently too convenient for implementations to bake synchronous interfaces into their block device and driver models.

NFS implementations on Linux (and, I assume, other Unix systems) were especially notorious in this regard, because the kernel implementations adopted the same synchronous interface model, but for obvious reasons were much more prone to putting processes into a prolonged, uninterruptible state.

AFAIU, making block device I/O asynchronous (and thus interruptible) requires extensive refactoring of the driver model as well as the individual drivers for those operating systems. POSIX AIO on those systems simply uses kernel threads to do the synchronous calls, which only hides the issue. The kernel thread could still block, consuming system resources indefinitely even after the requesting process has long exited.
You get a slightly cleaner user process tree, yes, but requests still linger behind the scenes, and resource accounting can no longer be kept deterministic without some ugly compromises. Given the pedigree of Solaris, AIX, and HP-UX, I'm curious what those systems did. Did they refactor their driver model? Officially commit to the kernel thread hack? Or find some sort of compromise, e.g. a quasi-synchronous interface where updated drivers could bubble up through the call stack an interrupt or timeout? There have been several attempts over the years to systematize the kernel thread hack in Linux. See, e.g., these 2007 articles "Fibrils and asynchronous system calls", https://lwn.net/Articles/219954/ "LCA: A new approach to asynchronous I/O" https://lwn.net/Articles/316806/ and most recently from 2016 "Fixing asynchronous I/O, again" https://lwn.net/Articles/671649/ I like to think they always fail because at the end of the day using slave threads can be easily done in userspace. And interfaces like splice(2), sendfile(2), eventfd(2), etc that can allow the userspace solution to match or even exceed the kernel-space solution are useful in their own right. That reality makes it difficult to accept the maintenance burden of an in-kernel overlay solution that doesn't address the underlying issues. But maybe that's just wishful thinking.
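For reference, the POSIX AIO interface being discussed looks like this in practice (a minimal sketch; the temp-file path and function name are illustrative, and on glibc the read is serviced by a helper thread doing an ordinary synchronous read, which is exactly the "hidden thread" issue described above):

```c
/* Minimal POSIX AIO round trip: queue a read, wait, collect the result.
   Older glibc needs -lrt; since glibc 2.34 the AIO calls live in libc. */
#include <aio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int aio_read_len(void)
{
    char tmpl[] = "/tmp/aiodemo-XXXXXX";   /* illustrative path */
    int fd = mkstemp(tmpl);
    if (fd < 0)
        return -1;
    if (write(fd, "hello", 5) != 5)
        return -1;

    char buf[16];
    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = sizeof buf;
    cb.aio_offset = 0;

    if (aio_read(&cb) != 0)               /* queued to a helper thread */
        return -1;
    const struct aiocb *list[1] = { &cb };
    aio_suspend(list, 1, NULL);           /* block until completion */
    int n = (int) aio_return(&cb);        /* bytes read, or -1 */

    close(fd);
    unlink(tmpl);
    return n;
}
```

If the underlying device were wedged, the helper thread would sit in the same uninterruptible synchronous call a plain read() would: the asynchrony is in the library, not the driver stack.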
On Tue, 13 Dec 2016 16:23:08 -0800 <william@wilbur.25thandClement.com>
wrote:
> The classic stumbling block is that the block device subsystems in
> Linux as well the *BSDs are fundamentally synchronous.

It's not clear to me why they should be anything other than synchronous.
The devices themselves might in some cases support a queued command
interface (e.g. SCSI), but that view of the device is very different from a
linear-store-of-bytes abstraction.

The kernel provides applications with a perfectly good asynchronous
interface: the timeslice. If the application has something better to do
while it's blocked against I/O, it can put that processing on another pid.
In the typical case, the application blocks against needed input, and the
kernel can schedule CPU time for something else.

> I like to think they always fail because at the end of the day using
> slave threads can be easily done in userspace.

Exactly.

--jkl
<william@wilbur.25thandClement.com> writes:
> Given the pedigree of Solaris, AIX, and HP-UX, I'm curious what those
> systems did. Did they refactor their driver model? Officially commit to
> the kernel thread hack? Or find some sort of compromise, e.g. a
> quasi-synchronous interface where updated drivers could bubble up through
> the call stack an interrupt or timeout?

SVR4.2 ES/MP completely redesigned the I/O system to handle asynchronicity
natively (along with eliminating the BFKL[*]). The POSIX asynchronous I/O
APIs were implemented naturally throughout the I/O stack.

Our Chorus microkernel-based port of SVR4.2 ES/MP (called SVR4/MK, or
project Amadeus in Europe) also supported the asynchronous interfaces
internally, and they were heavily used by Oracle for performance.

[*] Big F'ing Kernel Lock
> Well, welcome to real word.
> A process hanging in state D is one of the most often causes of system
> reboots. This did not change significantly over the last 15 years from
> Debian Woody to recent Raspbian with kernel 4.4. Of course, it is not
> that often that I have serious trouble. Once or twice per year or
> something like that.
> AFAIK there is absolutely no recovery from a process blocked in state D.
> This seems to be a Linux specific "feature".

I'm not sure I agree with that. Hanging device drivers (in state "D"),
specifically due to USB devices being disconnected at inconvenient times,
seem to be a bigger problem than just Linux. I've observed it occasionally
on the *BSDs. Usually it's quite obvious that the device wasn't
intentionally disconnected, but that the cable/connector was a little loose
and someone wiggled it.
On 15.12.16 07.35, Gordon Burditt wrote:
>> AFAIK there is absolutely no recovery from a process blocked in state D.
>> This seems to be a Linux specific "feature".
>
> I'm not sure I agree with that. Hanging device drivers (in state
> "D"), specifically due to USB devices being disconnected at
> inconvenient times, seems to be a bigger problem than just Linux.
> I've observed it occasionally on the *BSDs. Usually it's quite
> obvious that the device shouldn't have been intentionally disconnected,
> but that the cable/connector was a little loose and someone wiggled
> it.

Bugs and I/O errors can occur everywhere. Not that nice, but that's life.
The only problem is that the kernel is unable to recover from these errors
without a reboot. This is not contemporary.

Marcel
Marcel Mueller <news.5.maazl@spamgourmet.org> writes:
> On 15.12.16 07.35, Gordon Burditt wrote:
>>> AFAIK there is absolutely no recovery from a process blocked in state D.
>>> This seems to be a Linux specific "feature".
>>
>> I'm not sure I agree with that. Hanging device drivers (in state
>> "D"), specifically due to USB devices being disconnected at
>> inconvenient times, seems to be a bigger problem than just Linux.
>> I've observed it occasionally on the *BSDs. Usually it's quite
>> obvious that the device shouldn't have been intentionally disconnected,
>> but that the cable/connector was a little loose and someone wiggled
>> it.
>
> Bugs and I/O errors can occur everywhere. Not that nice, but that's life.
> The only problem is the kernel is unable to recover from this errors
> without reboot. This is not contemporary.

It is contemporary because it's happening now.

'Uninterruptible sleep' state usually means 'the operation being waited
for is always expected to complete', as it's entirely within the domain of
the local system. Insofar as the state persists when talking to a device,
that's usually a hardware failure. Another possible cause would be a
kernel mutex deadlock.

Interruptible sleeping needs correct support code for every instance of a
sleep. That's a whole load of opportunities for additional bugs, as this
will usually need 'resource allocation unwinding' back up the complete
call stack. It also needs to be handled correctly in all applications.
IMHO, it is very questionable whether this is really a good idea "just in
case there's a kernel bug".

It's entirely unclear what "recovery in case of hardware errors" should
look like. If a mass storage device fails, the result is going to be
"unpleasant" regardless of whether a reboot is required to paper over the
issue for some time.

The idea to use 'D' state for network filesystems is obviously moronic,
and there should be some kind of 'emergency abort' for removable storage
devices, too.
On Fri, 16 Dec 2016 17:38:09 +0000
Rainer Weikusat <rweikusat@talktalk.net> wrote:
> It's entirely unclear how "recovery in case of hardware errors" should
> look like. If a mass storage device fails, the result is going to be
> "unpleasant" regardless of requiring a reboot to paper over the issue
> for some time.

Unless the device is the drive the OS system files are hosted on, or some
other critical main-board component, any hardware failure should be dealt
with gracefully. Period. Hardware failures should be expected, and the OS
should help the admins diagnose the problem, not just give up and die.

> The idea to use 'D' state for network filesystems is obviously moronic
> and there should be some kind of 'emergency abort' for removable storage
> devices, too.

FreeBSD had a nice bug back in the day (maybe still does) whereby if you
mounted a floppy disk as a filesystem and then removed the disk, the
kernel would crash. Despite numerous people, including myself, pointing
this out, they still hadn't fixed it by 6.0, at which point I switched to
Linux for other reasons.

--
Spud
On 16.12.16 18.38, Rainer Weikusat wrote:
>> Bugs and I/O errors can occur everywhere. Not that nice, but that's life.
>> The only problem is the kernel is unable to recover from this errors
>> without reboot. This is not contemporary.
>
> It is contemporary because it's happening now.
>
> 'Uninterruptible sleep' state usually means 'the operation being waited
> for is always expected to complete' as it's entirely within the domain
> of the local system. Insofar the state persists when talking to a
> device, that's usually a hardware failure. Another possible cause would
> be a kernel mutex deadlock.

Even if DMA is involved, it should be possible to cancel the operation.
And if a hardware DMA does not complete within a few minutes, it will
likely never complete. So unloading the driver is just fine in 99.9% of
the cases.

> Interruptible sleeping needs correct support code for every instance of
> a sleep. That's a whole load of opportunities for additional bugs as
> this will usually need 'resource allocation unwinding' back up the
> complete callstack.

Agreed. But I am not talking about a graceful exit. Just cancel all
related threads. Of course, this might leave the driver in an inconsistent
state; not too surprising, since there is a bug. So the next action is to
forcibly unload the driver. Since most drivers reset their device when
loaded (again), it is likely that the hardware could recover from this
error.

> It also needs to be handled correctly in all
> applications. IMHO, is very questionable if this is really a good idea
> "just in case there's a kernel bug".

I do not see any action other than "kill" that could be executed in this
state. So I see no need for any action in userspace.

> It's entirely unclear how "recovery in case of hardware errors" should
> look like. If a mass storage device fails, the result is going to be
> "unpleasant" regardless of requiring a reboot to paper over the issue
> for some time.

If it is the root filesystem or swap, yes; there is no reasonable
recovery. But most of the time state D is not related to the system disk.
More likely it is a WLAN device (amazingly unreliable, this kind of
hardware), or a USB stick, or some other less important device.

> The idea to use 'D' state for network filesystems is obviously moronic
> and there should be some kind of 'emergency abort' for removable storage
> devices, too.

Indeed. NFS is really annoying if the network is not 100% solid.

Marcel
>> The idea to use 'D' state for network filesystems is obviously moronic
>> and there should be some kind of 'emergency abort' for removable storage
>> devices, too.
>
> FreeBSD had a nice bug back in the day (maybe still does) whereby if you
> mounted a floppy disk as a filesystem then removed the disk the kernel
> would crash. Despite numerous people including myself pointing this out
> they still hadn't fixed it by 6.0, at which point I switched to linux
> for other reasons.

I expect that you would have the same problem with *ANY* removable device
carrying a UFS filesystem with soft updates enabled (on FreeBSD 10.1, and
I think on 11.0). I've managed to trigger some kind of panic related to
soft updates by accidental removal of a mounted filesystem (as in
"accidentally yanked the cable out"). Floppies using a FAT-16 filesystem
probably won't have this issue. Neither, it seems, will a UFS filesystem
with soft updates turned off: the data is inconsistent, but the system
doesn't panic. Sometimes the panic was triggered after the program that
wrote the data had already terminated (but not all data had been flushed
to disk).

Soft updates do seem to work well for actually non-removable drives. The
problem of panics doesn't exist when non-removable drives are removed from
the system by a power failure.

I'm not sure about journaling on UFS, but journaling is usually unsuitable
for my use of removable media: a large copy to the drive, followed by the
data being read-only for a long time (maybe months), or else read a few
times (usually by different systems) and then deleted. Journaling
increases the number of writes (possibly wearing out flash drives
earlier), and I don't really care about the integrity of the data
*between* the time the copy starts and everything gets written. I do care
about data integrity after it's unmounted and re-mounted.

No, this wasn't any essential filesystem like /, swap, /usr, or /var. Most
of the time it was /mnt or /mnt2, filesystems used for data transfer or
archive using USB memory sticks or a USB hard drive. I suppose it would
also happen with a USB or normal floppy drive. Nothing is permanently
mounted on /mnt. In case of accidental disconnection, I'd expect the data
in the process of being transferred to be toast, and I really don't care
much about that; I can't trust the copy anyway.