From: Troels Arvin on
Hello,

DB2 v. 9.7.1.

During certain special workloads (e.g. parallel REORGs/RUNSTATs), I've
seen the server consuming 450MB/s with 25000IO/s (yes, there is probably
some storage system caching happening here) while all CPU cores were
happily working in an even mix of usermode/wait. And disk benchmark tools
can also bring some very satisfying bandwith and IO/s numbers to the
table.

On the other hand, we also have another scenario: A single rather complex
query with at least one large table scan. "list applications" reports
that the query is Executing (not locked). IO: At most 10MB/s, 500 IO/s;
CPU: two cores in 99.9% wait state, all other cores 100% idle. The tables
which the query reads from have been altered to have LOCKSIZE=TABLE, so I
would think that lock list work is zero.
What's going on in such a situation? What tools/snapshots/... can I use
to gain better insight in such a case?

--
Regards, Troels Arvin
From: Ian on
On 3/4/10 3:52 PM, Troels Arvin wrote:
> Hello,
>
> DB2 v. 9.7.1.
>
> During certain special workloads (e.g. parallel REORGs/RUNSTATs), I've
> seen the server consuming 450MB/s with 25000IO/s (yes, there is probably
> some storage system caching happening here) while all CPU cores were
> happily working in an even mix of usermode/wait. And disk benchmark tools
> can also bring some very satisfying bandwith and IO/s numbers to the
> table.
>
> On the other hand, we also have another scenario: A single rather complex
> query with at least one large table scan. "list applications" reports
> that the query is Executing (not locked). IO: At most 10MB/s, 500 IO/s;
> CPU: two cores in 99.9% wait state, all other cores 100% idle. The tables
> which the query reads from have been altered to have LOCKSIZE=TABLE, so I
> would think that lock list work is zero.
> What's going on in such a situation? What tools/snapshots/... can I use
> to gain better insight in such a case?
>

This sounds like you are probably doing a lot of synchronous I/O,
reading scattered data from all over a table. The REORG and
RUNSTATS can do very efficient asynchronous I/O, batching up
lots of large reads. When you run these two different workloads,
what kind of average read times do you see for your tablespaces?

What does the query plan look like? Is it reading an index and fetching
rows from a (big) table? Perhaps this IXSCAN/FETCH pair
is the inner leg of an NLJOIN ? If this is the case, what is the
clusterfactor for the index?


Ian Bjorhovde