From: Tom Lane on 28 Jun 2010 12:48

"Ross J. Reedstrom" <reedstrm(a)rice.edu> writes:
> Hmm, I'm suddenly struck by the idea of having a max_cost parameter,
> that refuses to run (or delays?) queries that have "too high" a cost.

That's been suggested before, and shot down on the grounds that the
planner's cost estimates are not trustworthy enough to rely on for
purposes of outright-failing a query.  If you didn't want random
unexpected failures, you'd have to set the limit so much higher than
your regular queries cost that it'd be pretty much useless.

Maybe it'd be all right if it were just used to delay launching the
query a bit, but I'm not entirely sure I see the point of that.

			regards, tom lane
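
For illustration, a minimal client-side sketch of the "delay rather
than fail" variant, built on nothing more than parsing plain EXPLAIN
output; the threshold, delay, and function name are hypothetical,
and nothing like this exists in PostgreSQL itself:

import re
import time

# Hypothetical knobs -- planner costs are dimensionless estimates.
MAX_COST = 1000000.0      # "too high" threshold for this sketch
DELAY_SECONDS = 5.0       # hold the query back instead of failing

COST_RE = re.compile(r"cost=\d+\.\d+\.\.(\d+\.\d+)")

def run_with_cost_gate(conn, query, params=None):
    """Check the top-level plan cost via EXPLAIN and delay the query
    if the estimate exceeds MAX_COST (soft policy, never a hard error)."""
    with conn.cursor() as cur:
        cur.execute("EXPLAIN " + query, params)
        top_line = cur.fetchone()[0]   # first row is the top plan node
        total_cost = float(COST_RE.search(top_line).group(1))

    if total_cost > MAX_COST:
        # Estimates are too unreliable for outright rejection,
        # so just delay launching the query a bit.
        time.sleep(DELAY_SECONDS)

    with conn.cursor() as cur:
        cur.execute(query, params)
        return cur.fetchall()

The regex only reads the top plan node, whose total cost covers the
whole query; a hard-failure policy would raise an error instead of
sleeping.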
From: "Kevin Grittner" on 28 Jun 2010 14:06 Josh Berkus <josh(a)agliodbs.com> wrote: > We can go back to Kevin's originally proposed simple feature: > just allowing the DBA to limit the number of concurrently > executing queries by role and overall. Well, that's more sophisticated than what I proposed, but it's an interesting twist on it. > This would consist of two parameters, > max_concurrent_statements and max_statement_wait; the second would > say how long the connection would wait for a query slot before > giving up and returning an error. The timeout is also an embellishment to what I proposed, but another interesting idea. > even this solution has two thorny issues to be solved: > > a) the "waiting query storm" issue mentioned above I fleshed out the idea a bit more on the thread titled "Built-in connection pool", since this would effectively function in a very similar way to a connection pool. If you look at that proposal, at most one transaction would be released to execution when one transaction completed. I'm not seeing anything resembling a "storm" in that, so you must be envisioning something rather different. Care to clarify? > b) pending queries are sitting on idle connections, which could > easily block higher-priority queries, so managing max_connections > per role would become much more complex. That is a good point. The biggest difference between the functionality of the proposal on the other thread and the connection pool built in to our application framework is that the latter has a prioritized FIFO queue, with ten levels of priority. A small query which is run as you tab between controls in a GUI window runs at a much higher priority than a query which fills a list with a large number of rows. This involves both connections reserved for higher priorities and having higher priority transactions "jump ahead" of lower priority transactions. This helps the user perception of performance in the fat-client GUI applications. I suppose if we had limits by role, we could approach this level of functionality within PostgreSQL. On the other hand, our web apps run everything at the same priority, so there would effectively be *no* performance difference between what I proposed we build in to PostgreSQL and what our shop currently puts in front of PostgreSQL for a connection pool. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Jesper Krogh on 28 Jun 2010 14:32

On 2010-06-25 22:44, Robert Haas wrote:
> On Fri, Jun 25, 2010 at 3:52 PM, Kevin Grittner
> <Kevin.Grittner(a)wicourts.gov> wrote:
>
>> Heck, I think an even *more* trivial admission control policy which
>> limits the number of active database transactions released to
>> execution might solve a lot of problems.
>>
> That wouldn't have any benefit over what you can already do with a
> connection pooler, though, I think.  In fact, it would probably be
> strictly worse, since enlarging the number of backends slows the
> system down even if they aren't actually doing anything much.
>

Sorry if I'm asking silly questions, but how do transactions and
connection poolers interact?

Say you have 100 clients all doing "fairly inactive" database work
in transactions lasting a couple of minutes at the same time.  If I
understand connection poolers, they don't help much in those
situations, where an "accounting" system for "limited resources"
across all backends definitely would help.  (Yes, it's a real-world
application here, whether it is clever or not...)

In a fully web environment where all transactions last 0.1s, a
pooler might make perfect sense (when traffic goes up).

-- 
Jesper
From: "Kevin Grittner" on 28 Jun 2010 15:24 Jesper Krogh <jesper(a)krogh.cc> wrote: > Sorry if I'm asking silly questions, but how does transactions and > connection pooler's interact? That depends a great deal on the pooler and its configuration, as well as your client architecture. Our shop gathers up the information needed for our database transaction and submits it to a server application which has all the logic needed to use that data to apply the transaction. We determined long ago that it is a Very Bad Idea for us to have an open database transaction which is waiting for a user to do something before it can proceed. > Say if you have 100 clients all doing "fairly inactive" database > work in transactions lasting a couple of minutes at the same time. I'm not sure what you mean by that. If you mean that part of a database transaction hits the database, and then it takes a while for the rest of the statements for the transaction to be generated (based on network latency or waits for user input), then it is hard to see how you can effectively use a connection pool. I have yet to see an environment where I think that's a good approach, but I haven't seen everything yet. :-) On the other hand, if the issue is that 100 transactions are fired off at the same time and it takes two minutes for them all to finish, unless you have *a lot* of CPUs and spindles, that's not efficient use of your resources. A connection pool might indeed help with that; you might start getting transactions back in one second, and have them all done in a minute and a half if you used a properly configured connection pool. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Jesper Krogh on 28 Jun 2010 16:06
On 2010-06-28 21:24, Kevin Grittner wrote:
> Jesper Krogh <jesper(a)krogh.cc> wrote:
>
>> Sorry if I'm asking silly questions, but how do transactions and
>> connection poolers interact?
>
> That depends a great deal on the pooler and its configuration, as
> well as your client architecture.  Our shop gathers up the
> information needed for our database transaction and submits it to a
> server application which has all the logic needed to use that data
> to apply the transaction.  We determined long ago that it is a Very
> Bad Idea for us to have an open database transaction which is
> waiting for a user to do something before it can proceed.

The situation is more:

1) Grab a bunch of data (using pg_try_advisory_lock() to lock out
   other processes from grabbing the same).
2) Process the data (in external software).
3) Push results back into the database, including a flag telling
   that the data has been processed.
4) Release advisory locks.

Step 2 takes somewhere between a couple of seconds and a couple of
minutes, depending on the task to be done.  It might not be
"optimal", but it is extremely robust and simple to wrap steps 1 to
4 within a BEGIN / COMMIT block.  On the application side it is
really nice not having to deal with "partly processed" data in the
database, which I can avoid just by keeping the transaction open.

From my POV, a connection pooler doesn't buy anything, and I cannot
stop all processes from executing at the same time, although that
"most likely" will not happen.  There is no "wait for user"
involved.  And that means somewhere over 100 backends, although they
are "mostly" idle seen from a database perspective.

I have not hit any issues with work_mem being too high, but I'm
absolutely sure that I could flood the system if they all happened
to be working at the same time.

-- 
Jesper
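
A minimal sketch of that four-step pattern, assuming psycopg2 as the
client driver and a hypothetical work_items table;
pg_try_advisory_lock() and pg_advisory_unlock() are the real
PostgreSQL functions, everything else here is illustrative:

import psycopg2

BATCH_LOCK_CLASS = 42   # hypothetical advisory-lock key space for this job

def process_batch(dsn, batch_id, process):
    """Steps 1-4 above: grab rows under an advisory lock, process them
    externally, write results back, release the lock -- one transaction."""
    conn = psycopg2.connect(dsn)
    try:
        with conn:                      # BEGIN ... COMMIT (ROLLBACK on error)
            with conn.cursor() as cur:
                # 1) Lock out other workers from grabbing the same batch.
                cur.execute("SELECT pg_try_advisory_lock(%s, %s)",
                            (BATCH_LOCK_CLASS, batch_id))
                if not cur.fetchone()[0]:
                    return False        # someone else has this batch

                cur.execute("SELECT id, payload FROM work_items"
                            " WHERE batch_id = %s AND NOT processed",
                            (batch_id,))
                rows = cur.fetchall()

                # 2) External processing; may take seconds to minutes,
                #    so the transaction (and backend) stays open throughout.
                results = process(rows)

                # 3) Push results back and flag the rows as processed.
                for item_id, result in results:
                    cur.execute("UPDATE work_items"
                                " SET result = %s, processed = true"
                                " WHERE id = %s", (result, item_id))

                # 4) Advisory locks are session-level, so release explicitly
                #    rather than relying on COMMIT.
                cur.execute("SELECT pg_advisory_unlock(%s, %s)",
                            (BATCH_LOCK_CLASS, batch_id))
        return True
    finally:
        conn.close()

The open transaction is what keeps partly processed data invisible
to other readers, at the price of holding a mostly idle backend for
the duration of step 2.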