From: Robert Haas on 25 Jun 2010 15:15

On Fri, Jun 25, 2010 at 1:33 PM, Kevin Grittner
<Kevin.Grittner(a)wicourts.gov> wrote:
> Recent discussions involving the possible benefits of a connection
> pool for certain users have reminded me of a brief discussion at The
> Royal Oak last month, where I said I would post a reference to a
> concept which might alleviate the need for external connection
> pools.  For those interested, check out section 2.4 of this
> document:
>
> Joseph M. Hellerstein, Michael Stonebraker and James Hamilton. 2007.
> Architecture of a Database System. Foundations and Trends(R) in
> Databases Vol. 1, No. 2 (2007) 141-259.
>
> http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf

I think a good admission control system for memory would be huge for
us.  There are innumerable threads on pgsql-performance where we tell
people to set work_mem to a tiny value (like 4MB or 16MB) because any
higher value risks driving the machine into swap in the event that
they get an unusually large number of connections or those
connections issue queries that require an unusual number of hashes or
sorts.  There are also innumerable postings from people complaining
that external sorts are way slower than in-memory sorts, and of
course a hash join frequently mops the floor with a nested loop with
inner index-scan.

A really trivial admission control system might let you set a
system-wide limit on work_mem.  As we build a plan, we could estimate
the total amount of work_mem it will require by examining all the
sort, hash, and hash aggregate nodes it contains.  In shared memory,
we keep a total of this value across all back-ends.  Just before
beginning to execute a plan that uses >0 work_mem, we bump this value
up by the value for the current plan, unless that would make us
exceed the system-wide limit.  In that case, we sleep, and the next
person to finish executing (and decrease the value in shared memory)
will wake us up to try again.  (Ideally, we'd want to make
maintenance_work_mem part of this accounting process also; otherwise,
a system that was humming along nicely might suddenly start thrashing
when a maintenance operation kicks off.)  I suppose this would take a
good deal of performance testing to see how well it actually works.
A refinement might be to try to consider an inferior plan that uses
less memory when the system is tight on memory, rather than waiting.
But you'd have to be careful about that, because waiting might be
better (it's worth waiting 15 s if it means the execution time will
decrease by > 15 s).

The idea of doling out queries to engine processes is an interesting
one, but seems very different from our current query execution model.
I can't even begin to speculate as to whether there's anything
interesting we could do in that area without reading some literature
on the topic - got any pointers?  But even if we can't or don't want
to do that, we already know that limiting the number of backends and
round-robining queries among them performs MUCH better than setting
max_connections to a large value, and not just because of memory
exhaustion.  Our answer to that problem is "use a connection pooler",
but that's not a very good solution even aside from the possible
administrative nuisance, because it only solves half the problem.
In the general case, the question is not whether we can currently
afford to allocate 0 or 1 backends to a given query, but whether we
can afford to allocate 0, 1, or >1; furthermore, if we can afford to
allocate >1 backend, we'd ideally like to reuse an existing backend
rather than starting a new one.  I don't think anyone's going to be
too happy with a parallel query implementation with a dependency on
an external connection pooler.

One of the difficulties in reusing an existing backend for a new
query, or in maintaining a pool of backends that could be used as
workers for parallel queries, is that our current system does not
require us to have, nor do we have, a way of resetting a backend to a
completely pristine state.  DISCARD ALL is probably pretty close,
because AIUI that's what connection poolers are using today, and if
it didn't work reasonably well, we'd have gotten complaints.  But
DISCARD ALL will not let you rebind to a different database, for
example, which would be a pretty useful thing to do in a pooling
environment, so that you needn't maintain separate pools for each
database, and it doesn't let you restart the authentication cycle
(with no backsies) either.

Of course, addressing these problems wouldn't by itself give us a
built-in connection pooler or parallel query execution, and there's
some danger of destabilizing the code, but possibly it would be a
good way to get off the ground.  Not sure, though.  Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
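A minimal sketch of the shared-memory work_mem accounting proposed
above might look like the following.  Everything here is illustrative
and hypothetical: the struct, function, and field names are invented
for this sketch, and a real implementation inside PostgreSQL would
live in the shared memory segment and use the backend's own locking
and sleep/wakeup facilities; the pthread primitives below are used
only to keep the example self-contained.

#include <pthread.h>
#include <stdint.h>

/*
 * Illustrative sketch of the proposed work_mem admission check.  A real
 * implementation would use PostgreSQL shared memory, LWLocks, and latches
 * rather than pthread primitives; all names here are hypothetical.
 */
typedef struct AdmissionState
{
    pthread_mutex_t lock;
    pthread_cond_t  released;       /* signaled when memory is given back */
    int64_t         reserved_kb;    /* work_mem reserved by all running plans */
    int64_t         limit_kb;       /* system-wide cap (a hypothetical GUC) */
} AdmissionState;

/* Block until the plan's estimated work_mem fits under the global limit. */
static void
admission_reserve(AdmissionState *st, int64_t plan_kb)
{
    pthread_mutex_lock(&st->lock);
    while (st->reserved_kb + plan_kb > st->limit_kb)
        pthread_cond_wait(&st->released, &st->lock);    /* sleep until freed */
    st->reserved_kb += plan_kb;     /* admitted: account for this plan */
    pthread_mutex_unlock(&st->lock);
}

/* Give the memory back when the plan finishes (or errors out). */
static void
admission_release(AdmissionState *st, int64_t plan_kb)
{
    pthread_mutex_lock(&st->lock);
    st->reserved_kb -= plan_kb;
    pthread_mutex_unlock(&st->lock);
    pthread_cond_signal(&st->released);     /* wake one waiting backend */
}

Note that the critical section is just one comparison and one update
of a single counter, which is the point Robert makes later in the
thread: except on a system that is already saturated, backends should
rarely block here.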
From: "Kevin Grittner" on 25 Jun 2010 15:52 Robert Haas <robertmhaas(a)gmail.com> wrote: > Kevin Grittner <Kevin.Grittner(a)wicourts.gov> wrote: >> check out section 2.4 of this > A really trivial admission control system might let you set a > system-wide limit on work_mem. Heck, I think an even *more* trivial admission control policy which limits the number of active database transactions released to execution might solve a lot of problems. Of course, what you propose is more useful, although I'd be inclined to think that we'd want an admission control layer which could be configured so support both of these and much more. Done correctly, it could almost completely eliminate the downward slope after you hit the "knee" in many performance graphs. > A refinement might be to try to consider an inferior plan that > uses less memory when the system is tight on memory, rather than > waiting. I wouldn't try messing with that until we have the basics down. ;-) It is within the scope of what an execution admission controller is intended to be able to do, though. > The idea of doling out queries to engine processes in an > interesting one, but seems very different than our current query > execution model. That wasn't in section 2.4 itself -- you must have read the whole chapter. I think any discussion of that should spin off a separate thread -- the techniques are really orthogonal. And frankly, that's more ambitious a topic than *I'm* inclined to want to get into at the moment. An "execution admission controller" that starts simple but leaves room for growth seems within the realm of possibility. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Josh Berkus on 25 Jun 2010 16:10

On 6/25/10 12:15 PM, Robert Haas wrote:
> I think a good admission control system for memory would be huge for
> us.  There are innumerable threads on pgsql-performance where we tell
> people to set work_mem to a tiny value (like 4MB or 16MB) because any
> higher value risks driving the machine into swap in the event that
> they get an unusually large number of connections or those connections
> issue queries that require an unusual number of hashes or sorts.

Greenplum did this several years ago with the Bizgres project; it had
a resource control manager which was made available for PostgreSQL
core.  However, it would have required a large and unpredictable
amount of work to make it compatible with OLTP workloads.

The problem with centralized resource control is the need for
centralized locking on requests for resources.  That forces
transactions to be serialized in order to make sure resources are not
double-allocated.  This isn't much of a problem in a DW application,
but in a web app with thousands of queries per second it's deadly.
Performance engineering for PostgreSQL over the last 7 years has been
partly about eliminating centralized locking; we don't want to add
new locking.

That means that a realistic admissions control mechanism would need
to operate based on projections and estimates and "best effort"
policies.  Not only is this mathematically more complex, it's an open
question whether it puts us ahead of where we are now vis-a-vis
underallocation of memory.  Realistically, a lot of tuning and
testing would be required before such a tool was actually an
improvement.

Or, to put it another way: the "poor man's admission control" is a
waste of time because it doesn't actually help performance.  We're
basically facing doing the hard version, or not bothering.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
From: "Kevin Grittner" on 25 Jun 2010 16:34 Josh Berkus <josh(a)agliodbs.com> wrote: > Greenplum did this several years ago with the Bizgres project > However, it [was not] compatible with OLTP workloads. > the "poor man's admission control" is a waste of time because it > doesn't actually help performance. We're basically facing doing > the hard version, or not bothering. I think it's premature to assume that without any evidence. I'm sure it's possible to create a policy which does more harm than good for any particular workload; there's no denying that could happen, but things such as limiting open transactions (as just one example) might be accomplished at very low cost. Since I have seen dramatic performance improvements from restricting this through a connection pool, I'm inclined to believe there could be benefit from such a simple policy as this. The total work memory policy Robert proposed sounds likely to more than pay for itself by allowing larger work_mem settings without risking cache flushing or swapping. One thing that seems clear to me is that the admission policy should be configurable, so that it can be tuned base on workload. That would also be consistent with a "start simple and expand the capabilities" approach. C'mon, don't be such a buzz-kill. Why should Greenplum have all the fun? ;-) -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Robert Haas on 25 Jun 2010 16:42

On Fri, Jun 25, 2010 at 4:10 PM, Josh Berkus <josh(a)agliodbs.com> wrote:
> On 6/25/10 12:15 PM, Robert Haas wrote:
>> I think a good admission control system for memory would be huge for
>> us.  There are innumerable threads on pgsql-performance where we tell
>> people to set work_mem to a tiny value (like 4MB or 16MB) because any
>> higher value risks driving the machine into swap in the event that
>> they get an unusually large number of connections or those connections
>> issue queries that require an unusual number of hashes or sorts.
>
> Greenplum did this several years ago with the Bizgres project; it had a
> resource control manager which was made available for PostgreSQL core.
> However, it would have required a large and unpredictable amount of work
> to make it compatible with OLTP workloads.
>
> The problem with centralized resource control is the need for
> centralized locking on requests for resources.  That forces transactions
> to be serialized in order to make sure resources are not
> double-allocated.  This isn't much of a problem in a DW application, but
> in a web app with thousands of queries per second it's deadly.
> Performance engineering for PostgreSQL over the last 7 years has been
> partly about eliminating centralized locking; we don't want to add new
> locking.

I haven't seen the Greenplum code - how did it actually work?  The
mechanism I just proposed would (except in the case of an overloaded
system) only require holding a lock for long enough to test and
update a single integer in shared memory, which doesn't seem like it
would cause a serious serialization problem.  I might be missing
something, or it might suck for lots of other reasons, but if we
already know that then let's try to be more specific about what the
problems are.

> That means that a realistic admissions control mechanism would need to
> operate based on projections and estimates and "best effort" policies.
> Not only is this mathematically more complex, it's an open question
> whether it puts us ahead of where we are now vis-a-vis underallocation
> of memory.  Realistically, a lot of tuning and testing would be required
> before such a tool was actually an improvement.

Before today, that's the only approach I'd ever considered, but this
article made me rethink that.  If you have a stream of queries that
can be run quickly with 1GB of memory and much more slowly with any
lesser amount, the only sensible thing to do is wait until there's a
GB of memory available for you to grab.  What projection, estimate,
or "best effort" policy would arrive at even approximately the same
result?

> Or, to put it another way: the "poor man's admission control" is a waste
> of time because it doesn't actually help performance.  We're basically
> facing doing the hard version, or not bothering.

I think it's an oversimplification to group all approaches as "easy"
and "hard", and even more of an oversimplification to say that all of
the easy ones suck.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
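To attach purely illustrative numbers to Robert's point about waiting
(these are made up for the sake of the arithmetic, not measurements):
suppose a query completes in 5 seconds when its hash join gets the
full 1 GB of work_mem it wants, but takes 60 seconds when forced to
spill under a small work_mem.  Waiting anywhere up to 55 seconds for
that 1 GB to be released still beats starting the degraded plan
immediately -- which is the outcome an estimate-only, never-wait
policy cannot reproduce.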