From: Robert Haas on 3 Dec 2009 11:00 On Sat, Nov 28, 2009 at 9:54 PM, David Rowley <dgrowley(a)gmail.com> wrote: > Robert Haas Wrote: >> Hmm. I'm not able to reliably detect a performance difference between >> unpatched CVS HEAD (er... git master branch) and same with spcoptions- >> v2.patch applied. I figured that if there were going to be an impact, >> it would be most likely to manifest itself in a query that touches lots >> and lots of tables but does very little actual work. So I used the >> attached script to create 200 empty tables, 100 in the default >> tablespace and 100 in tablespace "dork" (also known as, why I am >> working on this at 11 PM on Thanksgiving). Then I did: >> >> SELECT * FROM a1, a2, a3, ..., a100; > > (I've not read the patch, but I've just read the thread) > If you're just benchmarking the planner times to see if the extra lookups > are affecting the planning times, would it not be better to benchmark > EXPLAIN SELECT * FROM a1, a2, a3, ..., a100; ? > Otherwise any small changes might be drowned out in the execution time. > Scanning 100 relations even if they are empty could account for quite a bit > of that time, right? Possibly, but even if I can measure a difference doing it that way, it's not clear that it matters. It's fairly certain that there will be a performance degradation if we measure carefully enough, but if that difference is imperceptible in real-world scanerios, then it's not worth worrying about. Still, I probably will test it just to see. ....Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Robert Haas on 17 Dec 2009 20:57 On Thu, Dec 3, 2009 at 11:00 AM, Robert Haas <robertmhaas(a)gmail.com> wrote: > On Sat, Nov 28, 2009 at 9:54 PM, David Rowley <dgrowley(a)gmail.com> wrote: >> Robert Haas Wrote: >>> Hmm. I'm not able to reliably detect a performance difference between >>> unpatched CVS HEAD (er... git master branch) and same with spcoptions- >>> v2.patch applied. I figured that if there were going to be an impact, >>> it would be most likely to manifest itself in a query that touches lots >>> and lots of tables but does very little actual work. So I used the >>> attached script to create 200 empty tables, 100 in the default >>> tablespace and 100 in tablespace "dork" (also known as, why I am >>> working on this at 11 PM on Thanksgiving). Then I did: >>> >>> SELECT * FROM a1, a2, a3, ..., a100; >> >> (I've not read the patch, but I've just read the thread) >> If you're just benchmarking the planner times to see if the extra lookups >> are affecting the planning times, would it not be better to benchmark >> EXPLAIN SELECT * FROM a1, a2, a3, ..., a100; ? >> Otherwise any small changes might be drowned out in the execution time. >> Scanning 100 relations even if they are empty could account for quite a bit >> of that time, right? > > Possibly, but even if I can measure a difference doing it that way, > it's not clear that it matters. It's fairly certain that there will > be a performance degradation if we measure carefully enough, but if > that difference is imperceptible in real-world scanerios, then it's > not worth worrying about. Still, I probably will test it just to see. I did some fairly careful benchmarking of EXPLAIN SELECT * FROM a1, a2, a3, ..., a100. I explained this query 100 times via DBD::Pg and used time to measure how long the script took to run. I ran the script three times. And the result is... the unpatched version came out 1.7% SLOWER than the patched version. This seems difficult to take seriously, since it can't possibly be faster to do a syscache lookup and parse an array than it is to fetch a constant from a known memory address, but that's what I got. At any rate, it seems pretty clear that it's not hurting much. ....Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Robert Haas on 17 Dec 2009 21:15 On Thu, Nov 26, 2009 at 4:25 PM, Robert Haas <robertmhaas(a)gmail.com> wrote: > On Sat, Nov 14, 2009 at 4:58 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote: >> I don't think there's even a >> solid consensus right now on which GUCs people would want to set at the >> tablespace level. > > This seems like an important point that we need to nail down. The > original motivation for this patch was based on seq_page_cost and > random_page_cost, to cover the case where, for example, one tablespace > is on an SSD and another tablespace is on a RAID array. > > Greg Stark proposed adding effective_io_concurrency, and that makes > plenty of sense to me, but I'm sort of disinclined to attempt to > implement that as part of this patch because I have no familiarity > with that part of the code and no hardware that I can use to test > either the current behavior or the modified behavior. Since I'm > recoding this to use the reloptions mechanism, a patch to add support > for that should be pretty easy to write as a follow-on patch once this > goes in. > > Any other suggestions? Going once... going twice... since no one has suggested anything or spoken against the proposal above, I'm just going to implement seq_page_cost and random_page_cost for now. > Current version of patch is attached. I've revised it to use the > reloptions stuff, but I don't think it's committable as-is because it > currently thinks that extracting options from a pg_tablespace tuple is > a cheap operation, which was true in the non-reloptions-based > implementation but is less true now. At least, some benchmarking > needs to be done to figure out whether and to what extent this is an > issue. Per the email that I just sent a few minutes ago, there doesn't appear to be a performance impact in doing this even in a relatively stupid way - every call that requires seq_page_cost and/or random_page_cost results in a syscache lookup and then uses the relcache machinery to parse the returned array. That leaves the question of what the most elegant design is here. Tom suggested upthread that we should tag every RelOptInfo - and, presumably, IndexOptInfo, though it wasn't discussed - with this information. I don't however much like the idea of adding identically named members in both places. Should the number of options expand in the future, this will become silly very quickly. One option is to define a struct with seq_page_cost and random_page_cost that is then included in RelOptInfo and IndexOptInfo. It would seem to make sense to make the struct, rather than a pointer to the struct, the member, because it makes the copyfuncs/equalfuncs stuff easier to handle, and there's not really any benefit in incurring more palloc overhead. However, I'm sort of inclined to go ahead and invent a mini-cache for tablespaces. It avoids the (apparently insignificant) overhead of reparsing the array multiple times, but it also avoids bloating RelOptInfo and IndexOptInfo with more members than really necessary. It seems like a good idea to add one member to those structures anyway, for reltablespace, but copying all the values into every one we create just seems silly. Admittedly there are only two values right now, but again we may want to add more someday, and caching at the tablespace level just seems like the right way to do it. Thoughts? ....Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Jaime Casanova on 3 Jan 2010 18:56 On Mon, Dec 28, 2009 at 2:52 AM, Robert Haas <robertmhaas(a)gmail.com> wrote: > > Hearing no thoughts, I have implemented as per the above.  PFA the > latest version.  Any reviews, comments, feedback, etc. much > appreciated. > btw, you need to change STATRELATT, for STATRELATTINH, in syscache.c -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL AsesorÃa y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Alvaro Herrera on 3 Jan 2010 23:19
> --- 49,63 ---- > * ---------------- > */ > > ! #define Natts_pg_tablespace 6 Should be 5? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |