Greetings,
Our server (Sun Netra 1280, Solaris 2.8) always shows more than 20% iowait, which
looks really bad. I am trying to find a way to identify what is causing the high
iowait.
Any ideas would be greatly appreciated. Thanks in advance!
Evan
Posted by music4 on 2/22/2005 7:22:56 AM

music4 <music4@163.net> wrote:
> Greetings,
> Our server (Sun Netra 1280, Solaris 2.8) always has more than 20% iowait. It
> looks really bad.
I/O wait is not always a problem. Why do you think it is bad in this
case?
> I am trying to find a way to identify which cause high iowait.
Whenever your CPU has some idle time, and at least one thread has an
outstanding I/O call, you'll accumulate I/O wait time.
If you have very little CPU needs but a lot of I/O needs (think of a
database serving lots of simple queries), then you'd probably see lots
of iowait time.
--
Darren Dunham ddunham@taos.com
Senior Technical Consultant TAOS http://www.taos.com/
Got some Dr Pepper? San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
Posted by Darren on 2/22/2005 7:58:05 AM

music4 wrote:
> Greetings,
>
> Our server (Sun Netra 1280, Solaris 2.8) always has more than 20% iowait. It
> looks really bad. I am trying to find a way to identify which cause high
> iowait.
>
> Any idea will be greatly appreciated. Thanks in advance!
>
> Evan
>
>
Why is it that folks *always* assume that IOWait is bad?
I wrote a doc on this a bit over a year ago, have a read of
http://sunsolve.sun.com/search/document.do?assetkey=1-9-75659-1
Because of all of the misunderstanding associated with IOwait, it is
defined to be 0 in Solaris 10.
The really important thing to take away from that document is that
IOWait is a subset of idle. You only get IOwait time if there is nothing
else ready to run from the dispatch queues.
alan.
--
Alan Hargreaves - http://blogs.sun.com/tpenta
Kernel/VOSJEC/Performance Engineer
Product Technical Support (APAC)
Sun Microsystems
Posted by Alan on 2/23/2005 1:20:47 AM

Alan Hargreaves - Product Technical Support (APAC) wrote:
> Why is it that folks *always* assume that IOWait is bad?
>
> I wrote a doc on this a bit over a year ago, have a read of
> http://sunsolve.sun.com/search/document.do?assetkey=1-9-75659-1
>
>
> Because of all of the misunderstanding associated with IOwait, it is
> defined to be 0 in Solaris 10.
>
> The really important thing to take away from that document is that
> IOWait is a subset of idle. You only get IOwait time if there is nothing
> else ready to run from the dispatch queues.
>
> alan.
That's the pinnacle of wrong answers. A hard-coded zero, I mean.
Hey, scan rate doesn't mean crap either, even though I read more
articles in this group from folks that say *any* scan rate is bad.
Should I assume Solaris 11 will have that hard-coded to zero too?
A loose cough doesn't mean you have pneumonia. But that doesn't mean
you should ignore it either.
Rich
Posted by Richard on 2/23/2005 3:03:30 AM

Iowait per cpu doesn't really map to any real-world information...
IOwait for threads would be more meaningful, but isn't collected
right now.
If I run NCPU cpu-bound threads on a machine, I will never see
any IOwait regardless of how many thousands of other threads
block waiting for IO... and if I write a program that spawns NCPU
threads on an otherwise idle machine and those threads do nothing
but random reads from a dvd, I'll see 100% iowait on every cpu.
Because this statistic has virtually no meaning on an MP, iowait
per cpu is no longer reported.
- Bart
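A rough way to picture the two scenarios above is a toy model in Python. This is only a sketch under a simplifying assumption taken from the thread: a CPU is counted as "wait" when it has nothing to run and some thread has I/O outstanding. The names `classify_cpus` and `percent_wait` and the 4-CPU machine size are made up for illustration; this is not how the Solaris kernel actually does its accounting.

```python
# Toy model only; NOT the Solaris kernel's accounting code.
# Assumed rule: a CPU sample counts as "wait" only when that CPU has
# nothing to run and at least one thread has outstanding I/O.

def classify_cpus(ncpu, runnable_threads, threads_blocked_on_io):
    """Return one state per CPU for a single sampling tick."""
    states = []
    for cpu in range(ncpu):
        if cpu < runnable_threads:
            states.append("busy")   # would be charged as user or kernel time
        elif threads_blocked_on_io > 0:
            states.append("wait")   # idle, but I/O is outstanding somewhere
        else:
            states.append("idle")
    return states


def percent_wait(states):
    """Percentage of CPUs that were in the "wait" state for this tick."""
    return 100.0 * states.count("wait") / len(states)


NCPU = 4  # hypothetical machine size

# Scenario 1: NCPU CPU-bound threads, plus thousands of threads blocked on I/O.
cpu_bound = classify_cpus(NCPU, runnable_threads=NCPU, threads_blocked_on_io=5000)

# Scenario 2: otherwise idle machine, NCPU threads doing nothing but slow reads.
dvd_readers = classify_cpus(NCPU, runnable_threads=0, threads_blocked_on_io=NCPU)

print(percent_wait(cpu_bound))    # 0.0
print(percent_wait(dvd_readers))  # 100.0
```

In the first scenario 5000 threads are stuck waiting for I/O and the model reports no iowait at all; in the second only four threads are waiting and it reports 100%. The number says more about how busy the CPUs are than about how much I/O is pending.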
Posted by barts on 2/23/2005 4:43:44 AM

Evan,
Have you tried iostat or sar to see which disk is handling the bulk of the I/O
operations? "iostat -P" can even report on each disk partition. From there you
can work out which application is responsible, based on what you know about the
applications on your server.
Use "serv" (service time; in fact it is response time) in their reports as an
indicator of disk I/O performance. If it stays above 30 (ms), consider
spreading the I/O across different disks, using disk mirroring, and so on.
Regards,
Michael
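The 30 ms rule of thumb can be checked mechanically. The following is a minimal Python sketch that scans a captured report for devices whose service time exceeds the threshold. It assumes the extended ("iostat -x") layout, where the column is headed svc_t rather than serv; the sample report, the device names and the function name `slow_devices` are made up for illustration, and a real report from your system may be laid out differently.

```python
THRESHOLD_MS = 30.0  # the rule of thumb from the post above

# Hypothetical captured report, assumed to follow the "iostat -x" layout.
SAMPLE = """\
                 extended device statistics
device       r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
sd0          1.2    3.4   10.0   27.5  0.0  0.1   12.3   0   4
sd1         85.0   40.1  680.2  320.9  1.5  2.7   47.9  12  93
"""


def slow_devices(report, column="svc_t", threshold=THRESHOLD_MS):
    """Return (device, value) pairs whose `column` value exceeds `threshold`."""
    header = None
    slow = []
    for line in report.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The header row is the one naming both "device" and the wanted column.
        if "device" in fields and column in fields:
            header = fields
            continue
        # Skip banner lines and anything that does not match the header width.
        if header is None or len(fields) != len(header):
            continue
        row = dict(zip(header, fields))
        try:
            value = float(row[column])
        except ValueError:
            continue
        if value > threshold:
            slow.append((row["device"], value))
    return slow


print(slow_devices(SAMPLE))  # [('sd1', 47.9)]
```

Looking the column up by its header name, instead of by position, keeps the sketch usable if the report has a different column order, as long as a header row naming the column is present.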
"music4" <music4@163.net> wrote in message
news:cveml7$c3o@netnews.proxy.lucent.com...
> Greetings,
>
> Our server (Sun Netra 1280, Solaris 2.8) always has more than 20% iowait. It
> looks really bad. I am trying to find a way to identify which cause high
> iowait.
>
> Any idea will be greatly appreciated. Thanks in advance!
>
> Evan
>
>
Posted by Michael on 2/23/2005 4:50:21 AM

>
> Why is it that folks *always* assume that IOWait is bad?
>
> I wrote a doc on this a bit over a year ago, have a read of
> http://sunsolve.sun.com/search/document.do?assetkey=1-9-75659-1
>
>
> Because of all of the misunderstanding associated with IOwait, it is
> defined to be 0 in Solaris 10.
>
> The really important thing to take away from that document is that
> IOWait is a subset of idle. You only get IOwait time if there is nothing
> else ready to run from the dispatch queues.
>
> alan.
> --
> Alan Hargreaves - http://blogs.sun.com/tpenta
> Kernel/VOSJEC/Performance Engineer
> Product Technical Support (APAC)
> Sun Microsystems
Alan,
Thanks for the article. Now I understand what iowait means. But when a CPU
is occupied by a thread that is waiting for I/O, can the CPU be used by other
threads?
If not, then although the CPU is idle (doing nothing but waiting), I would count
that wait as busy. High iowait would mean that a lot of CPU time is idle but
cannot be used to run other threads. That is why I feel high iowait is bad, and
why I want to analyze why iowait is high and try to reduce it, so that more
CPU time is available for other threads.
Please correct me if I am wrong.
Evan
Posted by music4 on 2/23/2005 5:34:46 AM

"music4" <music4@163.net> wrote in message
news:cvh4mf$rn0@netnews.proxy.lucent.com...
>
> But when CPU is occupied by a thread that is
> waiting for IO, can CPU be used by other
> threads?
>
If a thread (or process) is waiting for one
or more outstanding i/o's to complete, it is
not using ("occupying" to use your term) any
CPU. It is suspended.
dk
Posted by Dan on 2/23/2005 6:40:15 AM

"music4" <music4@163.net> wrote in message
news:cvh4mf$rn0@netnews.proxy.lucent.com...
>>
>> Why is it that folks *always* assume that IOWait is bad?
>>
>> I wrote a doc on this a bit over a year ago, have a read of
>> http://sunsolve.sun.com/search/document.do?assetkey=1-9-75659-1
>>
>>
>> Because of all of the misunderstanding associated with IOwait, it is
>> defined to be 0 in Solaris 10.
>>
>> The really important thing to take away from that document is that
>> IOWait is a subset of idle. You only get IOwait time if there is nothing
>> else ready to run from the dispatch queues.
>>
>> alan.
>> --
>> Alan Hargreaves - http://blogs.sun.com/tpenta
>> Kernel/VOSJEC/Performance Engineer
>> Product Technical Support (APAC)
>> Sun Microsystems
>
> Alan,
>
> Thanks for the article. Now, I understand what iowait means. But when CPU
> is occupied by a thread that is waiting for IO, can CPU be used by other
> threads?
>
> If not, although CPU is idle (do nothing but wait), I will think wait is
> also busy. High iowait means a lot of CPU time are idle but can not be used
> to process other threads. That's reason why I feel high iowait is bad. And
> therefore I want to analyze why iowait is high, and try to reduce iowait
> to make more CPU time to be available for other threads.
>
> Correct me please.
>
Hello ?!?
Iowait means precisely that the CPU is available.
Is English your mother tongue? If not, you should
focus your efforts towards learning the language
so you can read manuals proficiently ;-)
dk
Posted by Dan on 2/23/2005 6:48:13 AM

"Dan Koren" <dankoren@yahoo.com> wrote in message
news:421c2748$1@news.meer.net...
>
> Hello ?!?
>
> Iowait means precisely that the CPU is available.
>
> Is English your mother tongue? If not, you should
> focus your efforts towards learning the language
> so you can read manuals proficiently ;-)
>
>
> dk
>
>
I am not a native English speaker, and I do need to improve my English. But
have you read Alan's article about how iowait is calculated? My
understanding was based on Alan's article rather than on the word "IOwait".
According to Alan's article, there are four CPU statistics: idle,
kernel, user and wait. When a thread is waiting for I/O, the wait counter is
incremented. But if the CPU can be used by another thread, the kernel or user
counter will also be incremented. Is that true?
Posted by music4 on 2/23/2005 7:22:31 AM

Richard Pettit <richard.pettit@gmail.com> writes:
>That's the pinnacle of wrong answers. A hard-coded zero, I mean.
The placeholder value is left there because so many tools depend
on looking at it. "0" is about as meaningful as the current value.
If you want to know about I/O, use iostat. I/O wait is really
only a measure of the relative time needed to process the data
versus the time needed to get it off disk. A characteristic
of the workload.
Also, when you start a CPU bound job, your I/O wait suddenly
drops to zero; yet the jobs which want the I/O are still
waiting just as much. That doesn't strike me as useful.
>Hey, scan rate doesn't mean crap either, even though I read more
>articles in this group from folks that say *any* scan rate is bad.
>Should I assume Solaris 11 will have that hard-coded to zero too?
No, a scanrate is a meaningful indicator of the system being low
on memory. (Having just seen a 2-way Opteron system stressed with
a scan rate of 5 million, I wouldn't say it's quite meaningful)
Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
Posted by Casper on 2/23/2005 8:45:30 AM

Casper H.S. Dik wrote:
> The placeholder value is left there because so many tools depend
> on looking at it. "0" is about as meaningful as the current value.
Whatever. It's a no-win. Both values are useless.
> If you want to know about I/O, use iostat. I/O wait is really
> only a measure of the relative time needed to process the data
> versus the time needed to get it off disk. A characteristic
> of the workload.
If I'd like to know about I/O, I'll use my own tools that sort
processes by which one is generating the most I/O. Then I'll know what
process is creating the problem. iostat paints with broad strokes and
is used as a next alternative. I would, however, like to see better
per-process I/O metrics.
> Also, when you start a CPU bound job, your I/O wait suddenly
> drops to zero; yet the jobs which want the I/O are still
> waiting just as much. That doesn't strike me as useful.
If you're that CPU bound, you have a bigger issue than the I/O wait and
you should be looking for CPU bound processes and what their problem is
anyway.
> No, a scanrate is a meaningful indicator of the system being low
> on memory. (Having just seen a 2-way Opteron system stressed with
> a scan rate of 5 million, I wouldn't say it's quite meaningful)
>
> Casper
Scan rate is useful in the same way that I/O wait is. In the absence of
better metrics, it must suffice. A per-process average page residency
time would be better. Then, the volatility of the working set of each
process can be examined. Any current measurement for APRT, at the
process or system level, is ad hoc and cannot be taken seriously.
Many think if they see a blip in the sr column of vmstat that they have
a memory shortage. First, the VM system scratching an itch does not
qualify as a shortfall. Second, the CPU power, bus bandwidth and disk
speeds of modern computers allow for a scan rate much higher and for
longer bursts than many will give berth for. A continuous high rate
(and that's a relative measure) is indicative of memory contention. An
extremely high spike over a short period, if it's an aberration, can be
noted, but not acted on. Such high spikes occurring frequently with the
VM system settling back down to quiescence can be disruptive and should
be treated with more physical memory.
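The distinction drawn above between a sustained high rate, an isolated spike and frequent spikes can be sketched in Python. This is only an illustration: the function name `classify_scan_rate` and every threshold in it (what counts as "high", what fraction counts as "continuous", how many bursts count as "frequent") are hypothetical, since the post itself calls a high rate a relative measure. The input is assumed to be a list of sr values, one per sampling interval.

```python
def classify_scan_rate(samples, high=1000, sustained_fraction=0.8,
                       frequent_bursts=3):
    """Classify a series of scan-rate samples. All thresholds are hypothetical."""
    if not samples:
        return "no data"
    flags = [s >= high for s in samples]
    n_high = sum(flags)
    if n_high == 0:
        # Small blips: the VM system "scratching an itch".
        return "quiet"
    if n_high / len(samples) >= sustained_fraction:
        return "sustained: likely memory contention"
    # A burst is a run of consecutive high samples; count where each run starts.
    bursts = sum(1 for i, f in enumerate(flags)
                 if f and (i == 0 or not flags[i - 1]))
    if bursts >= frequent_bursts:
        return "frequent spikes: consider more memory"
    return "isolated spike: note, do not act"


print(classify_scan_rate([0, 12, 0, 3]))
print(classify_scan_rate([0, 0, 5000, 0, 0, 0]))
print(classify_scan_rate([2000] * 10))
print(classify_scan_rate([0, 5000, 0, 0, 6000, 0, 0, 7000, 0, 0]))
```

The thresholds would have to be tuned per machine; the point of the sketch is only that the shape of the series over time matters more than any single reading.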
Posted by richard on 2/23/2005 3:59:27 PM

richard.pettit@gmail.com writes:
>If I'd like to know about I/O, I'll use my own tools that sort
>processes by which one is generating the most I/O. Then I'll know what
>process is creating the problem. iostat paints with broad strokes and
>is used as a next alternative. I would, however, like to see better
>per-process I/O metrics.
So use dtrace; it allows you to do exactly that in S10.
Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
Posted by Casper on 2/23/2005 9:31:31 PM