From: Richard on
B.U.M.P
Bring up my post. -Richard
From: Michel Esber on
> 2. If someone went from 2-3 minutes to 10 seconds on a query (I assume this
> is a data warehouse) then they must have used the DB2 defaults previously,
> including the pitifully small default bufferpools size of 1000 pages, or
> pitifully small sort heaps. Just because the DB2 database defaults are
> ridiculously small (most of them in 9.5 have not been changed since OS/2
> Database Manager when the largest PC's had 16 MB of main memory), does not
> mean STMM works well. Any half-competent DBA could change the defaults to
> run quite well without STMM. Unfortunately, not all DBA's are competent, or
> more often than not many managers don't think they even need DBA's.


Mark, I am one of the happy customers that have been using STMM in
production and I totally disagree with you.

I believe that STMM is really useful for companies that do not have a
very skilled team of DB2 DBAs. Most DBAs are not more than skilled DB2
users, but are very far from being experts. This includes me.

If you know DB2 internals and are an expert, then yes you probably
don't need STMM. If you'd rather sit in front of a DB2 instance and
fine tune it analyzing performance metrics for hours and hours, then
it is best to leave it disabled.

Since I don't have enough time and hardcore DB2 DBAs, I use STMM :).

And no, my production instance was not using a default bufferpool.
STMM increased our original bufferpool sizes several times until our
response time, which was very very reasonable for our customers,
became virtually instantaneous.

I am not an IBM lover, really. But STMM is great.

-M
From: Mark A on
"ajstorm(a)ca.ibm.com" <ajstorm(a)gmail.com> wrote in message
news:6054bd1a-1210-4048-83e5-4630addec21b(a)t10g2000yqg.googlegroups.com...
> Mark,
>
> Thanks for the response. I can respond to some of your points here,
> but others we should probably discuss in more detail through email as
> they don't seem generally applicable to a broad audience.
>
> If you're willing, I'd be interested in learning more about these
> problems. You can send me these details via email.

Thanks for the offer, but after many months of desperation trying to get
STMM to work on a critical database server with multiple instances and
databases, we gave up and just hard-coded the memory values. It was not very
difficult. I don't think there is anything to discuss at this point.

Maybe you could contact someone in support and just go over every PMR opened
on STMM if you want more information on the problems customers have
encountered.

> You make a valid point, but it's only partially correct. I agree that
> DB2's default configuration will not give you optimal performance.
> That being said, there are two things that you're not considering.
> First of all, DB2 defaults have changed substantially over the past 10
> years. For example, it's now the case that the DB2 Configuration
> Advisor will run automatically as part of database creation. After
> the Configuration Advisor completes, it will have set 36 of the
> database configuration parameters (including the size of the default
> buffer pool) based on the machine specifications. The result is a
> "default" configuration that is tailored to the environment in which
> the database will be run.
>
> The second thing that you may not be considering is that even
> competent DBAs do not always have the time to optimally configure each
> and every database created at their shop. I know many DBAs that will
> devote hours and hours to optimally configure some of their databases
> and yet, for their test environments, they're happy to let STMM do the
> heavy lifting. That being said, some of these same DBAs have enabled
> STMM on their most critical databases and found that its tuning
> outperformed their hand tuned configuration.

There were very few default parameters that were changed 10 years ago, and
none of the important ones. DB2 Version 8.2 (which was used by many until
about two years ago) did not have STMM and had the following defaults:

LOCKLIST 100 (400 KB)
(Linux/UNIX) IBMDEFAULTBP bufferpool 1000 (4 MB)
(Windows) IBMDEFAULTBP bufferpool 250 (1 MB)
LOGBUFSZ 8 (32 KB)
etc.

These are the same exact values as OS/2 Database Manager circa 1990, when
the largest PC's had 16 MB of memory.
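
For anyone checking the arithmetic, here is a minimal sketch that converts the page counts listed above into sizes. It assumes every value is counted in 4 KB pages, which is what the figures in parentheses imply; the helper name pages_to_kb is only illustrative.

```python
# Convert the quoted defaults from 4 KB pages to sizes.
# Assumption: all four values are counted in 4 KB pages.
PAGE_SIZE_KB = 4

defaults_in_pages = {
    "LOCKLIST": 100,
    "IBMDEFAULTBP (Linux/UNIX)": 1000,
    "IBMDEFAULTBP (Windows)": 250,
    "LOGBUFSZ": 8,
}


def pages_to_kb(pages, page_size_kb=PAGE_SIZE_KB):
    """Return the size in KB of a setting expressed in pages."""
    return pages * page_size_kb


for name, pages in defaults_in_pages.items():
    kb = pages_to_kb(pages)
    print(f"{name}: {pages} pages = {kb} KB ({kb / 1024:.2f} MB)")
```

The "4 MB" and "1 MB" figures are rounded: 1000 pages is 4000 KB and 250 pages is 1000 KB.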

STMM was added only after 8.2 (introduced in 9.1 but changed in 9.5), and
auto-configure was not made the default until 9.5.

If the DB2 documentation had been written so people could understand typical
values that should be used (in the Reference Guides, not some other manual)
based on the types of applications, then STMM would not have been necessary. The
auto-configure is nice, but that was not invoked automatically until 9.5 and
is not very easy to use properly IMO (using properly would force the
answering of the key questions about the intended database).

The problem of STMM is threefold:

1. When multiple bufferpools are desired to accommodate different priorities
for different tables, then STMM cannot know that, as it treats all SQL (and
the tables they go after) the same, giving equal weight to all of them. If
STMM is used for bufferpools (-2), I am not even sure if there is any
benefit to having multiple bufferpools.

2. STMM can cause severe database server problems, such as when STMM gives
up memory (as it frequently does when not needed at a particular moment) but
cannot get the memory back when it tries to get the memory back a few
seconds later. This obviously does not always happen, but when it does, it
can be catastrophic for an OLTP system with high transaction rates. When we
opened PMR's on this problem, IBM was not able to resolve it (9.5.4).

3. STMM has had problems in the past with being able to manage a large number of
databases with multiple instances, especially under Linux. IBM support flat
out told us that automatic instance memory would not work under Linux if
more than one instance existed since DB2 could not coordinate the multiple
instance memory (and the databases within those instances).

As I pointed out, some of these may have been fixed in 9.5.5 or 9.7.x, but they
were very serious problems. But even before these newer releases were
available, we were getting the same story from some people in IBM who claimed
everything was fine with STMM, while we were told by other IBM'ers (correctly)
that there were still some problems with STMM that occurred in certain situations.
Even though improvements have been made in STMM in the most recent fixpacks
and in 9.7, I am skeptical about the claim now (from people who have not
admitted the past problems) that everything is now working perfectly.

> I'd be interested to hear more about these situations, perhaps over
> email.
>
> I'd be interested in hearing more about the problems you've hit over
> email. While there have been some issues with multiple instances that
> we've fixed, they only affected a handful of customers.

I don't have time to do that. We already opened several PMR's and engaged
Lab Services to assist, but nothing worked. So we just hard-coded the values
(which was quite simple for any competent DBA to do).
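
As a rough illustration of what "hard-coding the values" can look like, here is a minimal sketch that only builds the DB2 CLP command strings. The database name, the bufferpool name and every number are hypothetical placeholders, not the settings used on the servers discussed here, and the parameter list is my understanding of what STMM tunes rather than something stated in this thread.

```python
# Build (not execute) DB2 CLP commands that pin memory settings to fixed values.
# All names and numbers below are hypothetical placeholders for illustration.

def hardcode_commands(db_name, db_cfg, bufferpools):
    """Return command strings that turn the tuner off and fix each value."""
    cmds = [f"UPDATE DB CFG FOR {db_name} USING SELF_TUNING_MEM OFF"]
    for param, value in db_cfg.items():
        # A numeric value replaces AUTOMATIC, so the setting no longer floats.
        cmds.append(f"UPDATE DB CFG FOR {db_name} USING {param} {value}")
    for bp_name, pages in bufferpools.items():
        # ALTER BUFFERPOOL is SQL and needs a connection to the database.
        cmds.append(f"ALTER BUFFERPOOL {bp_name} SIZE {pages}")
    return cmds


example_cfg = {          # hypothetical values, in 4 KB pages unless noted
    "LOCKLIST": 8192,
    "MAXLOCKS": 20,      # percent of the lock list
    "PCKCACHESZ": 16384,
    "SHEAPTHRES_SHR": 50000,
    "SORTHEAP": 2048,
}
example_bufferpools = {"IBMDEFAULTBP": 250000}

for cmd in hardcode_commands("PRODDB", example_cfg, example_bufferpools):
    print(cmd)
```

The right numbers depend on the workload; the sketch only shows that the list of settings is short.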

As to how many people have been affected by the problems, I don't think you
have a good gauge on the real number. In our case we use DB2 Linux for
mission critical databases running moderate to high transaction rates, and
we cannot afford to have any problems. A lot of customers don't have such
mission critical systems, or don't use DB2 LUW for them. For example, IBM
doesn't even have any mission critical systems (where they could go out of
business if a database was down for 4 hours).

Surveys conducted by independent consultants have shown that at least 25% of
customers have had at least some problems with STMM. Those running a single
instance and single database have probably had the fewest problems. AIX or
Windows has probably had far fewer problems than Linux.

For example, a poll conducted on 2010-03-12 by DB2Night (file
20100312DB2Night14.wmv on www.DB2NightShow.com) revealed the following:

Are you using STMM in Production?

Yes, and with good results                                17%
Yes, and uncertain of the measurable results              22%
No, we tried it but suffered with adverse consequences    28%
No, we are still on 8.2 or earlier                         6%
No, we are not ready to turn it on yet                    28%
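
To put the poll numbers in perspective, here is a small sketch that works out the share of adverse results among respondents who actually tried STMM. Treating the first three answers as "tried it" is my own grouping, not something the poll states.

```python
# Poll percentages as listed above (they add up to 101 because of rounding).
poll = {
    "yes_good_results": 17,
    "yes_uncertain_results": 22,
    "no_tried_adverse": 28,
    "no_still_on_8_2": 6,
    "no_not_ready": 28,
}

total = sum(poll.values())

# Assumption: the first three answers are the respondents who tried STMM.
tried = (poll["yes_good_results"]
         + poll["yes_uncertain_results"]
         + poll["no_tried_adverse"])

adverse_share = poll["no_tried_adverse"] / tried
good_share = poll["yes_good_results"] / tried

print(f"Tried STMM: {tried}% of all respondents")
print(f"Adverse consequences among those who tried it: {adverse_share:.0%}")
print(f"Good results among those who tried it: {good_share:.0%}")
```

By this grouping, roughly four in ten of those who tried STMM reported adverse consequences, and about a quarter reported good results.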

BTW, IBM'ers are frequent presenters on the DB2NightShow sessions.

> I think you may not completely understand how STMM works with multiple
> buffer pools. STMM works to optimize the configuration of multiple
> buffer pools not by trying to increase their hit rates, but instead by
> determining a configuration that will lead to the minimum possible
> amount of time spent retrieving pages from disk. With this model, it
> is valuable to treat all of the buffer pools the "same" since each of
> them is caching pages in an attempt to prevent disk reads/writes. If
> all you care about is overall database performance, the tuning that
> STMM provides for multiple bufferpools is extremely effective.

On the contrary, I do understand. As a DBA I don't necessarily want to treat
access to all tables and indexes with the same priority, especially with a
very large database where the data is many times the size of server memory.

"Overall" database performance treats every table and every SQL the same,
which is often not optimum IMO.
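
The difference between the two positions can be shown with a toy model. This is NOT how STMM works internally; it is a deliberately simple sketch in which each bufferpool has a hot set of pages and a read rate, and every page added saves the same share of disk reads until the hot set fits. The pool names, numbers and weights are all hypothetical.

```python
# Toy model only: compares "treat all reads the same" with DBA priorities.
# It is not STMM's algorithm; all names and numbers are hypothetical.

def allocate(pools, total_pages, weights=None):
    """Give pages to the pools with the highest weighted benefit per page."""
    if weights is None:
        weights = {name: 1.0 for name in pools}  # every read counts equally

    def benefit_per_page(name):
        pool = pools[name]
        return weights[name] * pool["reads_per_sec"] / pool["hot_pages"]

    result = {name: 0 for name in pools}
    remaining = total_pages
    for name in sorted(pools, key=benefit_per_page, reverse=True):
        give = min(remaining, pools[name]["hot_pages"])
        result[name] = give
        remaining -= give
    return result


pools = {
    "orders": {"reads_per_sec": 500, "hot_pages": 100_000},
    "reporting": {"reads_per_sec": 2000, "hot_pages": 200_000},
}

print(allocate(pools, 200_000))                                 # equal weight
print(allocate(pools, 200_000, {"orders": 10, "reporting": 1}))  # prioritized
```

With equal weights the busier reporting pool takes all the memory, which minimizes total disk reads. Once the orders pool is given a higher priority, it is served first, even though total disk reads go up.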

Granted, there are many DBA's who don't understand how to set up
bufferpools, but if IBM had provided some documentation and guidance on
this, it could be tuned manually in a matter of seconds.

> That is correct. STMM is not recommended in the Balanced Warehouse
> because in DPF environments, STMM must be used only on partitions that
> have similar memory requirements. That being said, I know of several
> customers who are happily running STMM in DPF after exercising the
> necessary precautions. You can read more about the precautions here:
>
> http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.admin.perf.doc/doc/c0023815.html

If you read your own doc carefully, it says that STMM can be used for DPF if
all partitions have the same characteristics as follows:

- All database partitions are on identical hardware, and there is an even
distribution of multiple logical database partitions to multiple physical
database partitions
- There is a perfect or near-perfect distribution of data
- Workloads are distributed evenly across database partitions, meaning that
no database partition has higher memory requirements for one or more heaps
than any of the others

For the Balanced Warehouse offerings from IBM, all of the above requirements
are true (since IBM has complete control over them). The reason why the
consultants who configured the Balanced Warehouse don't use STMM is because
they have had problems with it, and not because it does not meet the
requirements stated above.

If IBM solves all the problems with STMM, then that may change, but the
reason for not using it in 9.5 Balanced Warehouse was because of the many
problems they encountered. BTW, the IBM Balanced Warehouse config also says
that auto-configure must be turned off, since it creates havoc. I think this
recommendation to shut it off also applied to single partition databases in
9.7.0 (fixpack 0), but I don't recall.

> That is unfortunate because that's not the official IBM position.

IBM'ers who make a living by giving consulting advice to customers for
hundreds of dollars per hour (and even more than that for classes) and who
actually go to customer sites to implement things, cannot worry about
"the official IBM position" of a bunch of marketing people who are trying to
cram stuff down our throats. The customer comes first, and keeping the
customer systems up and running is more important than your marketing goals.

This is surely the most troublesome comment I have heard from IBM in a long
time, because it shows that IBM is not listening to customers or their own
consultants while trying to force things into the marketplace before they
are fully tested.

> I would strongly disagree with your argument. While memory may be
> cheap and plentiful in your shop, most of our customers are
> consolidating servers to the point where many databases are all
> fighting for the same small amount of memory. It is in these
> environments where STMM can be the most effective at managing the
> needs of the databases, especially if their peak workload requirements
> are at different times of the day. In general, I think you're greatly
> oversimplifying the configuration dilemma faced by a DBA in the
> absence of tools like STMM.

With the exception of bufferpools, the things that STMM controls use an
insignificant amount of memory in 99% of the cases. I have either 32 GB or
64 GB of memory on all my database servers and with the exception of
bufferpools, the other things that STMM controls don't amount to anything
even close to 1 GB (even with multiple databases). If IBM had set more
realistic defaults for these, or better yet just documented how to set them
for realistic scenarios likely to be encountered, then there would be very
little wasted memory and no need for STMM.

> Mark, I've been personally involved in almost all of the STMM APARs to
> date.

There have been STMM APARs? I thought you clearly implied it has been
working quite well? In fact there have been many PMRs and many APARs and
many changes made to STMM without APARs, to fix the problems.

You are implying that the problems are all fixed now. I am sure there have
been improvements over time up to and including 9.7.2, but since you are not
exactly candid about the past problems with STMM, how can I trust you
when you say it works fine now? I can't risk my company on something
that only takes a few minutes to hardcode (about 5 parameters). Also, I
cannot migrate all my databases to 9.7 at this time due to the amount of
regression testing that would be needed on the application side.

If IBM had documented how to set the 5 parameters controlled by STMM (don't
recall the exact number) for various types of database scenarios, STMM would
not have been necessary. The one possible exception is bufferpools, but if
DBA's just allocate 50% of the server memory to bufferpools (the total of
all bufferpools for all databases on the server) then they wouldn't need
STMM to be constantly trying to adjust for them. You would be surprised how
many people try and use the default of 1000 4K pages per database and then
complain about performance.
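
The 50% rule of thumb above is easy to turn into numbers. Here is a minimal sketch, assuming 4 KB pages and the 32 GB and 64 GB servers mentioned earlier; the function name is only illustrative, and the result is the total to split across all bufferpools of all databases on the server.

```python
# Rule-of-thumb sizing: about 50% of server memory for all bufferpools combined.
# Assumption: 4 KB pages; the 50% fraction is the rule of thumb from the post.

def bufferpool_pages(server_memory_gb, fraction=0.5, page_size_kb=4):
    """Return the total number of bufferpool pages for the whole server."""
    memory_kb = server_memory_gb * 1024 * 1024
    return int(memory_kb * fraction) // page_size_kb


DEFAULT_PAGES = 1000  # the old IBMDEFAULTBP default on Linux/UNIX

for gb in (32, 64):
    pages = bufferpool_pages(gb)
    print(f"{gb} GB server: {pages} pages in total, "
          f"about {pages // DEFAULT_PAGES} times the 1000-page default")
```

On a 32 GB server this comes to a little over 4 million pages, which shows how far the 1000-page default is from a realistic setting.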

> There are a great many DBAs (one of which has already posted to this
> thread) who are quite happy with STMM. I think it's a gross mis-
> statement of the facts to say that STMM was designed to sell DB2 to
> executives as opposed to helping out DBAs.

If IBM really cared about their current customers, they would not be
releasing code that has so many bugs. All of these changes are to sell DB2
to new customers who think DB2 is too complex. In some ways it is too
complex, but in reality if IBM manuals provided "how-to" documentation for
setting up the memory values (other than the default, min, and max values)
for various types of common database scenarios, STMM would not be
needed.

> Again, I welcome feedback about STMM and am willing to help you
> through any issues you may be having. Please follow-up via email.
>
> Thanks,
> Adam

I appreciate your offer, but I am extremely busy. I don't need your help
since I solved the problems by hard-coding the STMM memory settings. If you
need my help to debug DB2, then you would have to pay my company for my
time, which I doubt you are willing to do.

One other point. I am not against automating the memory configurations for
DB2. Many of them are/were ridiculously complex. The use of automatic memory
was a great improvement, but STMM is a different story and I cannot risk my
company on it based on the problems we have already encountered versus a
payback that is questionable assuming a competent DBA is available (and no, it
doesn't take months to tune it, just minutes).


From: Mark A on
"Jean-Marc Blaise" <jmblaise(a)hotmail.com> wrote in message
news:4c167c5e$0$24821$426a74cc(a)news.free.fr...
>I have many customers using STMM in France in DB2 9.5 (from FP3 to FP5) and
>we have no problem with it.
> We have only deactivated the tuning of LOCKLIST, because of 1 application,
> that's all.
>
> Regards,
>
> JM

Well, if the application using that one database where you had a problem with
LOCKLIST was a mission critical application, the STMM problems encountered
could have bankrupted your customer. I need a database that stays up, and/or
doesn't hang, all the time, not just most of the time.

Also, I can't play Russian Roulette trying to figure out when it works and
when it doesn't work.


From: Mark A on
"Michel Esber" <michel(a)automatos.com> wrote in message
news:dcdd14e2-cdbc-47a4-8fd4-0c1c688f69cf(a)i28g2000yqa.googlegroups.com...
> Mark, I am one of the happy customers that have been using STMM in
> production and I totally disagree with you.
>
> I believe that STMM is really useful for companies that do not have a
> very skilled team of DB2 DBAs. Most DBAs are not more than skilled DB2
> users, but are very far from being experts. This includes me.
>
> If you know DB2 internals and are an expert, then yes you probably
> don't need STMM. If you'd rather sit in front of a DB2 instance and
> fine tune it analyzing performance metrics for hours and hours, then
> it is best to leave it disabled.
>
> Since I don't have enough time and hardcore DB2 DBAs, I use STMM :).
>
> And no, my production instance was not using a default bufferpool.
> STMM increased our original bufferpool sizes several times until our
> response time, which was very very reasonable for our customers,
> became virtually instantaneous.
>
> I am not an IBM lover, really. But STMM is great.
>
> -M

If you don't have a skilled team of DBA's then you must not be using DB2 LUW
for any mission critical applications that your company absolutely depends
on. I can understand that because most companies, even IBM, don't have
ANY mission critical applications where the company would be out of business
if the database was hung or down for a day. Unfortunately, that is not the
case for my company.

If STMM increased your bufferpool size to noticeably improve performance,
then why didn't you just increase it yourself? DB2 bufferpools should be
about 50% of the server memory, unless your total database size is smaller.
Maybe if IBM had documented this, you wouldn't need STMM.

As far as your comment that "Most DBAs are not more than skilled DB2 users,"
maybe your company just needs some different DBA's. That sounds like a
management problem, not a database or DBA problem. There are plenty of
competent DB2 DBA's around, and also competent managers who know how to hire
competent DBA's (and who know the value of having a competent DBA).