Prev: Obtain a University Degree based on your professional experience for a more satisfactory life. admonishing akenes aglycone
Next: [HACKERS] review patch: Distinguish between unique indexes and unique constraints
From: "Joshua D. Drake" on 29 Jul 2010 13:12 On Thu, 2010-07-29 at 19:08 +0200, Vincenzo Romano wrote: > Hi all. > I'm wondering about PGSQL scalability. > In particular I have two main topics in my mind: > > 1. What'd be the behavior of the query planner in the case I have > a single huge table with hundreds or thousands of partial indexes > (just differing by the WHERE clause). > This is an idea of mine to make index-partitioning instead of > table-partitioning. Well the planner is not going to care about the partial indexes that don't match the where clause but what you are suggesting is going to make writes and maintenance extremely expensive. It will also increase planning time as the optimizer at a minimum has to discard the use of those indexes. > > 2. What'd be the behavior of the query planner in the case I have > hundreds or thousands of child tables, possibly in a multilevel hierarchy > (let's say, partitioning by year, month and company). Again, test it. Generally speaking the number of child tables directly correlates to planning time. Most experience that 60-100 tables is really the highest you can go. It all depends on actual implementation and business requirements however. Sincerely, Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: "Joshua D. Drake" on 29 Jul 2010 13:39 On Thu, 2010-07-29 at 19:34 +0200, Vincenzo Romano wrote: > I expect that a more complex schema will imply higher workloads > on the query planner. What I don't know is how the increase in the > workload will happen: linearly, sublinearly, polinomially or what? > > Significant testing would require a prototype implementation with > an almost complete feed of data from the current solution. > But I'm at the feasibility study stage and have not enough resources > for that. > > Thanks anyway for the insights, Joshua. > Does the 60-100 tables limit applies to a single level > of inheritance? Or is it more general? I do not currently have experience (except that it is possible) with multi-level inheritance and postgresql. > -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Greg Stark on 30 Jul 2010 06:49 On Fri, Jul 30, 2010 at 11:24 AM, Vincenzo Romano <vincenzo.romano(a)notorand.it> wrote: > At a first glance it seems that for inheritance some bottleneck is > hindering a full exploit for table partitioning. There have been lengthy discussions of how to implement partitioning to fix these precise problems, yes. > Is there anyone who knows whether those algorithms are linear or not? They're linear in both cases. But they happen at plan time rather than query execution time. So if your application prepares all its queries and then uses them many times it would not slow down query execution but would slow down the query planning time. In some applications this is much better but in others unpredictable run-times is as bad as long run-times. Also in the case of having many partial indexes it would slow down inserts and updates as well, though to a lesser degree, and that would happen at execution time. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Vincenzo Romano on 30 Jul 2010 07:40 2010/7/30 Greg Stark <gsstark(a)mit.edu>: > On Fri, Jul 30, 2010 at 11:24 AM, Vincenzo Romano > <vincenzo.romano(a)notorand.it> wrote: >> At a first glance it seems that for inheritance some bottleneck is >> hindering a full exploit for table partitioning. > > There have been lengthy discussions of how to implement partitioning > to fix these precise problems, yes. Any reference? >> Is there anyone who knows whether those algorithms are linear or not? > > They're linear in both cases. But they happen at plan time rather than > query execution time. So if your application prepares all its queries > and then uses them many times it would not slow down query execution > but would slow down the query planning time. In some applications this > is much better but in others unpredictable run-times is as bad as long > run-times. Hmmm ... maybe I'm missing the inner meaning of your remarks, Greg. By using PREPARE I run the query planned sooner and I should use the plan with the later execution. You can bet that some of the PREPAREd query variables will pertain to either the child table's CHECK contraints (for table partitions) or to the partial index's WHERE condition (for index partitioning). It's exactly this point (execution time) where the "linearity" will kill the query over a largely partitioned table. Is this what you meant? :-) > Also in the case of having many partial indexes it would slow down > inserts and updates as well, though to a lesser degree, and that would > happen at execution time. This makes fully sense to me. -- Vincenzo Romano at NotOrAnd Information Technologies Software Hardware Networking Training Support Security -- NON QVIETIS MARIBVS NAVTA PERITVS -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Vincenzo Romano on 30 Jul 2010 15:50
2010/7/30 Josh Berkus <josh(a)agliodbs.com>: > >> Is there anyone who knows whether those algorithms are linear or not? > > Read the code? It's really very accessible, and there's lots and lots > of comments. While the person who wrote the code is around, isn't it > better to see the real implementation? If the programmer(s) who wrote that part is around, a simple hint would suffice. Even an hint to where look into the code would be very appreciated: the query planner is not as simple as the "ls" command (which is not that simple any more, though). It looks like I need to go the hard way ... Starting from postgresql-8.4.4/src/backend/optimizer -- Vincenzo Romano at NotOrAnd Information Technologies Software Hardware Networking Training Support Security -- cel +393398083886 fix +390823454163 fax +3902700506964 gtalk. vincenzo.romano(a)notorand.it skype. notorand.it -- NON QVIETIS MARIBVS NAVTA PERITVS -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |