From: Robert Haas on
On Fri, Jun 25, 2010 at 3:52 PM, Kevin Grittner
<Kevin.Grittner(a)wicourts.gov> wrote:
> Heck, I think an even *more* trivial admission control policy which
> limits the number of active database transactions released to
> execution might solve a lot of problems.

That wouldn't have any benefit over what you can already do with a
connection pooler, though, I think. In fact, it would probably be
strictly worse, since enlarging the number of backends slows the
system down even if they aren't actually doing anything much.

> Of course, what you
> propose is more useful, although I'd be inclined to think that we'd
> want an admission control layer which could be configured to support
> both of these and much more. Done correctly, it could almost
> completely eliminate the downward slope after you hit the "knee" in
> many performance graphs.

And world peace!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From: "Kevin Grittner" on
Robert Haas <robertmhaas(a)gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner(a)wicourts.gov> wrote:

>> Heck, I think an even *more* trivial admission control policy
>> which limits the number of active database transactions released
>> to execution might solve a lot of problems.
>
> That wouldn't have any benefit over what you can already do with a
> connection pooler, though, I think. In fact, it would probably be
> strictly worse, since enlarging the number of backends slows the
> system down even if they aren't actually doing anything much.

Agreed -- *if* the connection pool is on another computer.
Otherwise it'll probably consume about the same resources on the same
machine as what I'm proposing, only with more setup and
configuration involved. Until we build a connection pooler into the
base product, though, you know what an uphill battle it is to get
people to control contention that way. We can usually get someone to
tweak a GUC when they report a problem, and maybe the tuning tools
could start to take core count and effective spindle count into
consideration and suggest a good setting for this, if we had it.

With the right connection pooler built in to core, though, this one
could go right out the window, and they could tune at that layer
instead. [thinks] Actually, the transaction count limit doesn't
need the planner to run first, so it could be considered part of the
first-tier admission control. Essentially, it *would be* a simple
but effective built-in connection pooler.
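
Just to make the shape of that concrete, here is a toy standalone
sketch of the gate I have in mind -- not a patch, and every name in
it is invented: a counting semaphore caps how many transactions may
execute at once, and the rest simply queue until a slot frees up.

    /* Hypothetical sketch: cap concurrently executing transactions. */
    #include <semaphore.h>

    #define MAX_ACTIVE_XACTS 8      /* would be a GUC in a real patch */

    static sem_t xact_slots;

    void admission_init(void)
    {
        sem_init(&xact_slots, 0, MAX_ACTIVE_XACTS);
    }

    void xact_admission_begin(void)
    {
        sem_wait(&xact_slots);      /* block until a slot is free */
        /* ... the transaction actually starts executing here ... */
    }

    void xact_admission_end(void)
    {
        /* ... after commit or abort ... */
        sem_post(&xact_slots);      /* wake one queued transaction */
    }

The point is just that the check needs nothing from the planner --
it can sit in front of everything else, which is why it amounts to a
built-in pooler.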

I still think an execution admission controller would be worthwhile,
but the policy I proposed doesn't belong there; it might be the
subject of a pretty simple patch which might solve a lot of
performance problems. Gotta sleep on that....

-Kevin

From: Martijn van Oosterhout on
On Fri, Jun 25, 2010 at 03:15:59PM -0400, Robert Haas wrote:
> A
> refinement might be to try to consider an inferior plan that uses less
> memory when the system is tight on memory, rather than waiting. But
> you'd have to be careful about that, because waiting might be better
> (it's worth waiting 15 s if it means the execution time will decrease
> by > 15 s).

I think you could go a long way by doing something much simpler. We
already generate multiple plans and compare costs, why not just include
memory usage as a cost? If you start doing accounting for memory across
the cluster you can assign a "cost" to memory. When there are only a
few processes running it's cheap and you get plans like now. But as the
total memory usage increases you increase the "cost" of memory and
there will be increased pressure to produce lower memory usage plans.

I think this is better than just cutting plans out at a certain
threshold, since plans that *need* memory to work efficiently would
still be able to get it.
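
Roughly what I mean, as a toy standalone program rather than planner
code (the scaling curve and every number here are invented):

    #include <stdio.h>

    static double cluster_mem_used  = 0;    /* bytes committed cluster-wide */
    static double cluster_mem_limit = 4e9;  /* total budget, say 4 GB */

    /* Price per byte rises as the cluster nears its memory budget. */
    static double memory_cost_per_byte(void)
    {
        double pressure = cluster_mem_used / cluster_mem_limit;  /* 0..1 */
        return 1e-7 * (1.0 + 10.0 * pressure * pressure);
    }

    static double plan_total_cost(double run_cost, double mem_bytes)
    {
        return run_cost + mem_bytes * memory_cost_per_byte();
    }

    int main(void)
    {
        /* hash join: fast, wants 512 MB; merge join: slower, 16 MB */
        double hash_mem = 512e6, merge_mem = 16e6;

        cluster_mem_used = 0.2e9;    /* quiet system */
        printf("quiet:  hash=%.0f merge=%.0f\n",
               plan_total_cost(1000, hash_mem),
               plan_total_cost(1400, merge_mem));

        cluster_mem_used = 3.8e9;    /* loaded system */
        printf("loaded: hash=%.0f merge=%.0f\n",
               plan_total_cost(1000, hash_mem),
               plan_total_cost(1400, merge_mem));
        return 0;
    }

On the quiet system the hash join's memory is nearly free and it
wins; near the budget the same 512 MB costs enough that the
low-memory merge join wins instead. Same planner, no hard threshold.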

(It doesn't help in situations where you can't accurately predict
memory usage, like hash tables.)

Have a nice day,
--
Martijn van Oosterhout <kleptog(a)svana.org> http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
> - Charles de Gaulle
From: Robert Haas on
On Sat, Jun 26, 2010 at 11:03 AM, Martijn van Oosterhout
<kleptog(a)svana.org> wrote:
> On Fri, Jun 25, 2010 at 03:15:59PM -0400, Robert Haas wrote:
>> A
>> refinement might be to try to consider an inferior plan that uses less
>> memory when the system is tight on memory, rather than waiting. But
>> you'd have to be careful about that, because waiting might be better
>> (it's worth waiting 15 s if it means the execution time will decrease
>> by > 15 s).
>
> I think you could go a long way by doing something much simpler. We
> already generate multiple plans and compare costs, why not just include
> memory usage as a cost? If you start doing accounting for memory across
> the cluster you can assign a "cost" to memory. When there are only a
> few processes running it's cheap and you get plans like now. But as the
> total memory usage increases you increase the "cost" of memory and
> there will be increased pressure to produce lower memory usage plans.
>
> I think this is better than just cutting plans out at a certain
> threshold since it would allow plans that *need* memory to work
> efficiently will still be able to.

That's an interesting idea.

> (It doesn't help in situations where you can't accurately predict
> memory usage, like hash tables.)

Not sure what you mean by this part. We already predict how much
memory a hash table will use.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From: "Ross J. Reedstrom" on
On Sat, Jun 26, 2010 at 01:19:57PM -0400, Robert Haas wrote:
>
> I'm not sure. What does seem clear is that it's fundamentally at odds
> with the "admission control" approach Kevin is advocating. When you
> start to run short on a resource (perhaps memory), you have to decide
> between (a) waiting for memory to become available and (b) switching
> to a more memory-efficient plan. The danger of (b) is that using less
> memory probably means using more of some other resource, like CPU or
> disk, and now you've just switched around which resource you're
> overloading - but on the other hand, if the difference in CPU/disk is
> small and the memory savings is large, maybe it makes sense. Perhaps
> in the end we'll find we need both capabilities.
>
> I can't help feeling like some good instrumentation would be helpful
> in answering some of these questions, although I don't know where to
> put it.

One issue with this is that no matter how expensive you make a query,
it will still run -- it just may take a very long time (assuming the
cost is a reasonable estimate).

This is also an implied suggestion for a dynamically self-modifying cost
param, since the memory cost isn't absolute, but rather relative to free
memory. In addition, as Robert points out, the tradeoff between
resources is dynamic, as well.

Hmm, I'm suddenly struck by the idea of having a max_cost parameter,
that refuses to run (or delays?) queries that have "too high" a cost.
That might have some interactive-SQL uses, as well: catch the cases
where you forgot a join condition and got an unintended cartesian
explosion, etc.
Could also be a belt-and-suspenders last defense for DB admins who
aren't sure the client software completely stops the users from doing
something stupid.

Clearly, default to current behavior, -1 (infinity).
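
Hypothetically, the whole thing could be little more than one check
after planning (names invented, and whether to refuse or delay is
the open question):

    #include <stdio.h>

    static double max_cost = -1;    /* GUC-style knob; -1 = no limit */

    /* Returns 0 if the plan may run, nonzero if it is refused. */
    static int check_plan_cost(double estimated_cost)
    {
        if (max_cost >= 0 && estimated_cost > max_cost)
        {
            fprintf(stderr,
                    "ERROR: plan cost %.0f exceeds max_cost %.0f\n",
                    estimated_cost, max_cost);
            return 1;           /* the forgotten-join-condition case */
        }
        return 0;               /* run (or, in a delay variant, queue) */
    }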

Ross
--
Ross Reedstrom, Ph.D. reedstrm(a)rice.edu
Systems Engineer & Admin, Research Scientist phone: 713-348-6166
The Connexions Project http://cnx.org fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE
