Why is my kernel (Mageia 1's 2.6.38.8-server-10.mga) _NOT_
writing out dirty blocks to disk?
Yesterday, the contents of a number of files were lost when I
crashed the machine in a failed experiment with SCSI rescan to
remove a hot-unplugged eSATA disk. (Rebuilding the software
RAID10s is another story...) Today, recompiling Wine left 860MB
of dirty blocks needing to be written to disk, according to
/proc/meminfo. After the compilation finished, I waited over 90
seconds, with the system essentially completely idle, and the
amount of dirty data did not go down. Then, when I did a 'sync'
(as a non-root user), everything immediately flushed out to disk
in about 10-15 seconds and the amount of dirty data went to
zero. (I had "vmstat 3" and xcpustate running, along with a
periodic manual "grep Dirty /proc/meminfo".)
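The periodic manual "grep Dirty /proc/meminfo" check can be automated. What follows is a minimal sketch, not something from the original post: it assumes a Linux /proc/meminfo whose "Dirty:" and "Writeback:" lines are reported in kB, and the helper names parse_meminfo_kb and watch_dirty are made up for illustration.

```python
import time


def parse_meminfo_kb(text, fields=("Dirty", "Writeback")):
    """Return {field: kB} for the requested fields of /proc/meminfo text."""
    wanted = set(fields)
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            # Lines look like "Dirty:            880640 kB".
            values[name] = int(rest.split()[0])
    return values


def watch_dirty(interval_s=3.0, samples=10, path="/proc/meminfo"):
    """Print the Dirty and Writeback counters every interval_s seconds."""
    for _ in range(samples):
        with open(path) as f:
            v = parse_meminfo_kb(f.read())
        print(time.strftime("%H:%M:%S"),
              "Dirty=%d kB" % v.get("Dirty", 0),
              "Writeback=%d kB" % v.get("Writeback", 0))
        time.sleep(interval_s)
```

Calling watch_dirty(interval_s=3, samples=40) right after a large build gives a timestamped record of whether Dirty falls on its own within two minutes or only after a manual sync.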
There is a value of 3000 in /proc/sys/vm/dirty_expire_centisecs,
which according to this page (and others I have read)
http://www.westnet.com/~gsmith/content/linux-pdflush.htm
means that the kernel's helper processes _should_ get busy and
start to write out any dirty blocks at _MOST_ 30 seconds after
said blocks went dirty. Even approx. 90 seconds after the
compilation job had finished, nothing was getting written. Based
on yesterday's experience, nothing would have been written even
many minutes later.
What gives? Why are my kernel and its helper processes being so
lazy?
Thanks.
--
Robert Riches
spamtrap42@jacob21819.net
(Yes, that is one of my email addresses.)
spamtrap421 (75) | 1/30/2012 12:56:40 AM
Robert Riches wrote:
> Why is my kernel (Mageia 1's 2.6.38.8-server-10.mga) _NOT_
> writing out dirty blocks to disk?
>
I'm no expert, but I thought pages weren't flushed at all until the
buffers were needed, or at least until the system was pretty much idle.
http://www.makelinux.net/books/lkd2/ch15lev1sec4
seems to have some detail.
tnp (2247) | 1/30/2012 9:40:45 AM
On 2012-01-30, Robert Riches <spamtrap42@jacob21819.net> wrote:
> Why is my kernel (Mageia 1's 2.6.38.8-server-10.mga) _NOT_
> writing out dirty blocks to disk?
man sync
--
-----------------------------------------------------------------------------
Roger Blake (Change "invalid" to "com" for email. Google Groups killfiled.)
"Climate policy has almost nothing to do anymore with environmental
protection... the next world climate summit in Cancun is actually
an economy summit during which the distribution of the world's
resources will be negotiated." -- Ottmar Edenhofer, IPCC
-----------------------------------------------------------------------------
rogblake (22) | 1/30/2012 9:24:39 PM
On 2012-01-30, The Natural Philosopher <tnp@invalid.invalid> wrote:
> I'm no expert, but I thought pages weren't flushed at all until the
> buffers were needed, or at least until the system was pretty much idle.
>
> http://www.makelinux.net/books/lkd2/ch15lev1sec4
>
> seems to have some detail.
Yes, that page has info similar to what I had already studied.
Here is what is in /proc/sys/vm/dirty*:
/proc/sys/vm/dirty_background_bytes:0
/proc/sys/vm/dirty_background_ratio:10
/proc/sys/vm/dirty_bytes:0
/proc/sys/vm/dirty_expire_centisecs:3000
/proc/sys/vm/dirty_ratio:20
/proc/sys/vm/dirty_writeback_centisecs:500
According to the descriptions on that page and elsewhere, the
worker threads should wake up every 5 seconds and flush out
anything older than 30 seconds. The system had been idle
(>90% idle CPU with 12 logical CPUs for the worker threads to
use, disks essentially completely idle) for over 90 seconds, and
_nothing_ was getting flushed to disk.
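To make that expectation concrete, here is a rough sketch of the arithmetic implied by the values listed above. It is an illustration only: the function name is made up, and the amount of memory the ratios apply to is passed in as an assumed figure (the kernel works from its own count of free and reclaimable pages, so the real thresholds will differ somewhat).

```python
# The vm.dirty_* values quoted in the post.
SETTINGS = {
    "dirty_background_bytes": 0,
    "dirty_background_ratio": 10,
    "dirty_bytes": 0,
    "dirty_expire_centisecs": 3000,
    "dirty_ratio": 20,
    "dirty_writeback_centisecs": 500,
}


def writeback_expectations(sysctl, dirtyable_kb):
    """Summarise what a set of vm.dirty_* values implies.

    dirtyable_kb is an ASSUMED figure for the memory the ratios apply to.
    """
    expire_s = sysctl["dirty_expire_centisecs"] / 100.0
    wakeup_s = sysctl["dirty_writeback_centisecs"] / 100.0

    def threshold_kb(bytes_key, ratio_key):
        # Only one of each bytes/ratio pair is active; the other reads as 0.
        if sysctl[bytes_key]:
            return sysctl[bytes_key] // 1024
        return dirtyable_kb * sysctl[ratio_key] // 100

    return {
        "expire_s": expire_s,
        "wakeup_s": wakeup_s,
        # A page can expire just after a wakeup, so the worst case is the sum.
        "worst_case_age_s": expire_s + wakeup_s,
        "background_kb": threshold_kb("dirty_background_bytes",
                                      "dirty_background_ratio"),
        "foreground_kb": threshold_kb("dirty_bytes", "dirty_ratio"),
    }
```

For example, writeback_expectations(SETTINGS, 8 * 1024 * 1024) uses a hypothetical 8 GiB of dirtyable memory (the post does not say how much RAM the machine has). Whatever the memory size, the time-based part comes out the same: dirty data becomes eligible for writeback after 30 seconds and should be picked up within about 35, so 90 idle seconds with no writeback is not what these settings describe.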
On other Linux systems, I have seen writeback happen very well
right around 30 seconds after the pages went dirty. Manually
running sync every minute or so should _NOT_ be necessary.
--
Robert Riches
spamtrap42@jacob21819.net
(Yes, that is one of my email addresses.)
spamtrap421 (75) | 1/31/2012 4:39:44 AM
Robert Riches wrote:
> Yes, that page has info similar to what I had already studied.
> Here is what is in /proc/sys/vm/dirty*:
>
> /proc/sys/vm/dirty_background_bytes:0
> /proc/sys/vm/dirty_background_ratio:10
> /proc/sys/vm/dirty_bytes:0
> /proc/sys/vm/dirty_expire_centisecs:3000
> /proc/sys/vm/dirty_ratio:20
> /proc/sys/vm/dirty_writeback_centisecs:500
>
> According to the descriptions on that page and elsewhere, the
> worker threads should wake up every 5 seconds and flush out
> anything older than 30 seconds. The system had been idle
> (>90% idle CPU with 12 logical CPUs for the worker threads to
> use, disks essentially completely idle) for over 90 seconds, and
> _nothing_ was getting flushed to disk.
>
> On other Linux systems, I have seen writeback happen very well
> right around 30 seconds after the pages went dirty. Manually
> running sync every minute or so should _NOT_ be necessary.
>
Maybe the drivers were buffering everything, not the kernel.
I had that problem with a RAID once. You could even sync, but it was no
guarantee the data was on the disks. The RAID subsystem reported 'all
written' and that was that. Except it wasn't. It was in the RAID system
buffers.
tnp (2247) | 1/31/2012 1:47:01 PM
Robert Riches <spamtrap42@jacob21819.net> writes:
> Why is my kernel (Mageia 1's 2.6.38.8-server-10.mga) _NOT_
> writing out dirty blocks to disk?
>
> There is a value of 3000 in /proc/sys/vm/dirty_expire_centisecs,
> which according to this page (and others I have read)
>
> http://www.westnet.com/~gsmith/content/linux-pdflush.htm
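pdflush no longer exists: since 2.6.32 it has been replaced by
per-backing-device flusher threads. dirty_expire_centisecs is still
there, but it is now read by those threads rather than by pdflush.
That doesn't answer your question as such, but it does explain why the
old documentation isn't helping.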
As a starting point for the new logic:
http://kernelnewbies.org/Linux_2_6_32#head-72c3f91947738f1ea52f9ed21a89876730418a61
--
http://www.greenend.org.uk/rjk/
rjk (492) | 1/31/2012 3:47:03 PM
On 2012-01-31, The Natural Philosopher <tnp@invalid.invalid> wrote:
>...
>
> Maybe the drivers were buffering everything, not the kernel.
> I had that problem with a RAID once. You could even sync, but it was no
> guarantee the data was on the disks. The RAID subsystem reported 'all
> written' and that was that. Except it wasn't. It was in the RAID system
> buffers.
The 'Dirty' line in /proc/meminfo said the pages were still in
dirty state. A manual sync did clear them out.
--
Robert Riches
spamtrap42@jacob21819.net
(Yes, that is one of my email addresses.)
spamtrap421 (75) | 2/1/2012 5:51:42 AM
On 2012-01-31, Richard Kettlewell <rjk@greenend.org.uk> wrote:
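> pdflush no longer exists: since 2.6.32 it has been replaced by
> per-backing-device flusher threads. dirty_expire_centisecs is still
> there, but it is now read by those threads rather than by pdflush.
> That doesn't answer your question as such, but it does explain why the
> old documentation isn't helping.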
>
> As a starting point for the new logic:
>
> http://kernelnewbies.org/Linux_2_6_32#head-72c3f91947738f1ea52f9ed21a89876730418a61
Thanks. I hadn't heard solid reporting of the demise of pdflush,
but I had noticed some changes in the names of the kernel worker
or helper processes (but wrote it off as a result of going to my
first SMP machine). I bookmarked that page for study.
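One way to check which writeback helpers a given kernel is actually running is to look at the kernel thread names. The sketch below is illustrative only and rests on an assumption: that this kernel uses the thread names introduced with the 2.6.32 per-backing-device writeback work, "bdi-default" and "flush-MAJOR:MINOR". Later kernels moved writeback onto workqueues, so an empty result there does not by itself mean writeback is broken. The function names are hypothetical.

```python
import os


def is_flusher_name(name):
    """True for thread names assumed to belong to per-device writeback."""
    return name == "bdi-default" or name.startswith("flush-")


def list_flusher_threads(proc="/proc"):
    """Return sorted (pid, name) pairs for flusher-like threads under proc."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc, entry, "comm")) as f:
                name = f.read().strip()
        except OSError:
            continue  # process exited, or "comm" is not available
        if is_flusher_name(name):
            found.append((int(entry), name))
    return sorted(found)
```

On a 2.6.38 system one would expect list_flusher_threads() to show a "flush-" thread for each block device with recent dirty data; the absence of any "pdflush" entries is consistent with the change described above.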
--
Robert Riches
spamtrap42@jacob21819.net
(Yes, that is one of my email addresses.)
spamtrap421 (75) | 2/1/2012 5:53:18 AM