thread scheduling - cpu bound job while others are under I/O wait.

Hi.

Is is possible of CPU task threads to fail to get CPUs
if other threads are waiting for disk IO?

There's a process that consists of 70 threads (I'm using Linux
pthreads).
60 of them are waiting for disk I/O and
the others are doing some calculation tasks.

Curiously enough, when the CPU's I/O wait goes up (to around 15% or more),
total CPU utilization (that is, user time + system time) drops
below 5%,
even though there are lots of CPU-bound threads.
Consequently, the calculation speed decreases a lot.

duddn
9/2/2010 2:30:40 AM
comp.linux.development.system

Errata

> Is is possible of CPU task threads to fail to get CPUs

-> Is it possible ...
duddn
9/2/2010 7:07:32 AM
On Sep 1, 7:30 pm, duddn <icho...@gmail.com> wrote:

> Is is possible of CPU task threads to fail to get CPUs
> if other threads are waiting for disk IO?

That's not supposed to happen.

> There's a process that consists of 70 threads. (I'm using linux
> pthread.)
> 60 of them are waiting for disk I/O and
> the others are doing some calculation tasks.
>
> Curiously enough, when CPU's I/O wait goes up (around 15% or up),
> total CPU utilization( that is user time + system time) goes down
> under 5%
> even there are lots of CPU bound threads.
> Consequently, the calculation speed decreases a lot.

Most likely, your CPU bound threads aren't.

DS
David
9/2/2010 1:03:48 PM
On Thu, 2 Sep 2010, David Schwartz wrote:

> On Sep 1, 7:30 pm, duddn <icho...@gmail.com> wrote:
>
>> Is is possible of CPU task threads to fail to get CPUs if other threads 
>> are waiting for disk IO?
>
> That's not supposed to happen.
>
>> There's a process that consists of 70 threads. (I'm using linux 
>> pthread.)
>> 60 of them are waiting for disk I/O and the others are doing some
>> calculation tasks.
>>
>> Curiously enough, when CPU's I/O wait goes up (around 15% or up), total 
>> CPU utilization( that is user time + system time) goes down under 5% 
>> even there are lots of CPU bound threads.
>> Consequently, the calculation speed decreases a lot.
>
> Most likely, your CPU bound threads aren't.

... possibly because they are waiting for data coming from those IO-bound
threads, which are in turn waiting on the disk.

To the OP: if there is some queueing pattern from the IO-bound threads to 
the CPU-bound threads, you could record the queue depth each time the 
queue is accessed (pushed or popped).
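
A minimal sketch of that kind of instrumentation, written in Python for brevity rather than with pthreads. The class name `DepthRecordingQueue` and its `depths` histogram are made up for illustration and are not taken from the OP's program. Under real concurrency `qsize()` is only approximate, but a histogram dominated by depth 0 would still suggest that the consumers are starved for input:

```python
import queue
import threading
from collections import Counter


class DepthRecordingQueue:
    """Hypothetical wrapper: histograms the queue depth seen after each put/get."""

    def __init__(self, maxsize=0):
        self._q = queue.Queue(maxsize)
        self._lock = threading.Lock()   # protects the histogram only
        self.depths = Counter()         # depth -> number of times observed

    def _record(self):
        # qsize() is approximate when other threads are active, which is
        # good enough for spotting a queue that is almost always empty.
        with self._lock:
            self.depths[self._q.qsize()] += 1

    def put(self, item):
        self._q.put(item)
        self._record()

    def get(self):
        item = self._q.get()
        self._record()
        return item
```

The IO-side threads would call `put()` and the calculation threads `get()`. If nearly all samples land on depth 0, the calculation threads are mostly blocked waiting for data; if the queue sits near its maximum, the calculation side is the slower one.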

I don't know what the "disk subsystem" beneath the program looks like, and
anyway I'm in no way a storage guy, but naively, 60 threads reading even 
sequentially from distinct "block streams" (fragments, extents, whatever) 
will probably cause some contention for the disk heads.

Have you experimented with /sys/block/$DEV/queue/scheduler? Not 
necessarily to improve the situation, but to see if changing the scheduler 
affects the iowait percentage, and if so (which is probable IMO), whether 
the CPU percentage is also affected (which would confirm, again IMO, that 
your entire process is in fact IO-bound).
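
For reference, a small sketch of how one might inspect that file from a script. The kernel lists the available schedulers on one line and marks the active one with square brackets (for example `noop deadline [cfq]`). The function names and the device name `sda` below are assumptions chosen for illustration:

```python
def parse_scheduler(text):
    """Split the contents of a queue/scheduler file into (active, available)."""
    active = None
    available = []
    for name in text.split():
        # The currently active scheduler is shown in square brackets.
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_scheduler(dev):
    """Read the scheduler line for a block device such as 'sda' (hypothetical name)."""
    with open(f"/sys/block/{dev}/queue/scheduler") as f:
        return parse_scheduler(f.read())
```

Switching schedulers is done by writing one of the listed names into the same file as root. Comparing the iowait and user+system percentages before and after the switch is the experiment suggested above.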

Overlapping CPU with IO is one valid application of threads, and it helps
one find the bottleneck (CPU or IO) for a given "access pattern". To
parrot my favorite trivial formula, if IO excludes CPU, then the wall 
clock time taken is

     real time spent on CPU + real time spent on IO

while if they are perfectly overlapped, it is

     max { real time spent on CPU, real time spent on IO }

(This is why an "elastic buffer" filter between tar -- running on top of a 
cold cache -- and a compressor utility accelerates the entire pipeline.)
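
The two formulas can be tried out on made-up numbers. The sketch below assumes 40 seconds of CPU work and 60 seconds of IO, purely as an illustration and not as a measurement of the OP's program:

```python
def serial_wall_time(cpu_s, io_s):
    """Wall clock time when IO excludes CPU: the two phases simply add up."""
    return cpu_s + io_s


def overlapped_wall_time(cpu_s, io_s):
    """Wall clock time when CPU and IO overlap perfectly: the longer one wins."""
    return max(cpu_s, io_s)


def bottleneck(cpu_s, io_s):
    """Name the resource that determines the perfectly overlapped wall time."""
    return "IO" if io_s > cpu_s else "CPU"


# Hypothetical workload: 40 s of computation, 60 s of disk IO.
cpu_s, io_s = 40.0, 60.0
print(serial_wall_time(cpu_s, io_s), overlapped_wall_time(cpu_s, io_s), bottleneck(cpu_s, io_s))
```

With these numbers, perfect overlap cannot get the run below the 60 seconds of IO, no matter how many calculation threads are added.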

You may have found your bottleneck to be IO.

lacos
Ersek
9/2/2010 6:53:02 PM
On Sep 3, 3:53 am, "Ersek, Laszlo" <la...@caesar.elte.hu> wrote:
> On Thu, 2 Sep 2010, David Schwartz wrote:
> > On Sep 1, 7:30 pm, duddn <icho...@gmail.com> wrote:
>
> >> Is is possible of CPU task threads to fail to get CPUs if other threads
> >> are waiting for disk IO?
>
> > That's not supposed to happen.
>
> >> There's a process that consists of 70 threads. (I'm using linux
> >> pthread.)
> >> 60 of them are waiting for disk I/O and the others are doing some
> >> calculation tasks.
>
> >> Curiously enough, when CPU's I/O wait goes up (around 15% or up), total
> >> CPU utilization( that is user time + system time) goes down under 5%
> >> even there are lots of CPU bound threads.
> >> Consequently, the calculation speed decreases a lot.
>
> > Most likely, your CPU bound threads aren't.
>
> ... possibly because they are waiting for data coming from those IO-bound
> threads, which are in turn waiting on the disk.
>
> To the OP: if there is some queueing pattern from the IO-bound threads to
> the CPU-bound threads, you could record the queue depth each time the
> queue is accessed (pushed or popped).
>
> I don't know how the "disk subsystem" beneath the program looks like, and
> anyway I'm in no way a storage guy, but naively, 60 threads reading even
> sequentially from distinct "block streams" (fragments, extents, whatever)
> will probably cause some contention for the disk heads.
>
> Have you experimented with /sys/block/$DEV/queue/scheduler? Not
> necessarily to improve the situation, but to see if changing the scheduler
> affects the iowait percentage, and if so (which is probable IMO), whether
> the CPU percentage is also affected (which would confirm, again IMO, that
> your entire process is in fact IO-bound).
>
> Overlapping CPU with IO is one valid application of threads, and it helps
> one finding the bottleneck (CPU or IO) for a given "access pattern". To
> parrot my favorite trivial formula, if IO excludes CPU, then the wall
> clock time taken is
>
>      real time spent on CPU + real time spent on IO
>
> while if they are perfectly overlapped, it is
>
>      max { real time spent on CPU, real time spent on IO }
>
> (This is why an "elastic buffer" filter between tar -- running on top of a
> cold cache -- and a compressor utility accelerates the entire pipeline.)
>
> You may have found your bottleneck to be IO.
>
> lacos


Thank you for the good information.
duddn
9/3/2010 8:08:25 AM