From: Andrew Dunstan on 10 Apr 2010 09:15

Heikki Linnakangas wrote:
> 1. Keep the materialized view up-to-date when the base tables change.
> This can be further divided into many steps; you can begin by supporting
> automatic updates only on very simple views with e.g. a single table and
> a WHERE clause. Then extend that to support joins, aggregates,
> subqueries etc. Keeping it really limited, you could even require the
> user to write the required triggers himself.

That last bit doesn't strike me as much of an advance. Isn't the whole point of this to automate it? Creating greedy materialized views is usually not terribly difficult now, but you do have to write the triggers.

The other thing that could be interesting about this would be some scheme for lazy refresh that didn't involve re-extracting the whole data set.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
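[The hand-written-trigger approach Andrew describes can be sketched in a few lines. This is a toy illustration using SQLite through Python's sqlite3 module rather than PostgreSQL, and all table, column, and trigger names are invented for the example; in PostgreSQL you would write the equivalent triggers in PL/pgSQL.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);

-- The "materialized view": per-customer totals kept in an ordinary table.
CREATE TABLE order_totals (customer TEXT PRIMARY KEY, total REAL);

-- Eager maintenance: triggers fold each base-table change into the summary.
CREATE TRIGGER orders_ins AFTER INSERT ON orders BEGIN
    INSERT OR IGNORE INTO order_totals (customer, total) VALUES (NEW.customer, 0);
    UPDATE order_totals SET total = total + NEW.amount
        WHERE customer = NEW.customer;
END;

CREATE TRIGGER orders_del AFTER DELETE ON orders BEGIN
    UPDATE order_totals SET total = total - OLD.amount
        WHERE customer = OLD.customer;
END;
""")

conn.execute("INSERT INTO orders (customer, amount) VALUES ('a', 10.0)")
conn.execute("INSERT INTO orders (customer, amount) VALUES ('a', 5.0)")
total = conn.execute(
    "SELECT total FROM order_totals WHERE customer = 'a'").fetchone()[0]
print(total)  # 15.0
```

The point of the thread, of course, is that users shouldn't have to write this maintenance logic by hand for every view.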
From: Robert Haas on 10 Apr 2010 09:18

2010/4/10 Andrew Dunstan <andrew(a)dunslane.net>:
> Heikki Linnakangas wrote:
>> 1. Keep the materialized view up-to-date when the base tables change.
>> This can be further divided into many steps; you can begin by supporting
>> automatic updates only on very simple views with e.g. a single table and
>> a WHERE clause. Then extend that to support joins, aggregates,
>> subqueries etc. Keeping it really limited, you could even require the
>> user to write the required triggers himself.
>
> That last bit doesn't strike me as much of an advance. Isn't the whole point
> of this to automate it? Creating greedy materialized views is usually not
> terribly difficult now, but you do have to write the triggers.

Yeah, I agree.

> The other thing that could be interesting about this would be some scheme
> for lazy refresh that didn't involve re-extracting the whole data set.

One way to do this would be to infer a primary key for the result set based on the input query. But I think we don't really have the infrastructure to do this right now, so not really a project for a beginner.

...Robert
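[Robert's inferred-primary-key idea amounts to diffing a freshly computed result set against the stored one by key, and touching only the rows that changed rather than re-extracting everything. A minimal sketch of that diff step, with plain dictionaries standing in for the view's stored and recomputed contents; all names here are invented for illustration:]

```python
def incremental_refresh(stored, fresh):
    """Diff two {primary_key: row} mappings and return the minimal
    change set needed to bring `stored` in line with `fresh`."""
    inserts = {k: v for k, v in fresh.items() if k not in stored}
    deletes = [k for k in stored if k not in fresh]
    updates = {k: v for k, v in fresh.items()
               if k in stored and stored[k] != v}
    return inserts, updates, deletes

# Current view contents vs. a newly recomputed result set.
stored = {1: ("alice", 10), 2: ("bob", 20), 3: ("carol", 30)}
fresh  = {1: ("alice", 10), 2: ("bob", 25), 4: ("dave", 40)}

inserts, updates, deletes = incremental_refresh(stored, fresh)
print(inserts)  # {4: ('dave', 40)}
print(updates)  # {2: ('bob', 25)}
print(deletes)  # [3]
```

The hard part Robert flags is not the diff itself but proving that a given view query yields a usable key in the first place.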
From: "Kevin Grittner" on 10 Apr 2010 13:13

Greg Smith wrote:
> And work on MERGE support is itself blocked behind the fact that
> PostgreSQL doesn't have a good way to lock access to a key value
> that doesn't exist yet--what other databases call key range
> locking.

The bulk of the serializable implementation WIP is work to implement just this sort of locking. There are already a couple of possible spin-off uses on the horizon, based on the ability of these locks to survive their initiating transactions and detect conflicting writes. Both spinoffs involve somehow flagging a transaction as one for which the locks should be kept until further notice, and issuing a notification when a conflicting write occurs. That seems consistent with the needs of materialized views, too.

It probably won't be solid in time to be useful for GSoC, but if someone's looking to map out a plan for materialized views, I thought this information might be germane.

-Kevin
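[The key-range bookkeeping Kevin describes can be pictured with a toy model: a transaction registers the key ranges its view definition read, the locks outlive the transaction, and a later write inside a registered range produces a "this view is now stale" notification. This is only an illustrative sketch, not PostgreSQL code; the class and names are invented.]

```python
class RangeLockTable:
    """Toy key-range lock table. Owners register half-open key ranges
    they have read; writes that land inside a registered range report
    a conflict with those owners (modelling the staleness notification)."""

    def __init__(self):
        self.locks = []  # (owner, low, high) half-open ranges, kept
                         # until explicitly released, i.e. they survive
                         # the transaction that took them

    def lock_range(self, owner, low, high):
        self.locks.append((owner, low, high))

    def notify_write(self, key):
        """Return owners whose registered range covers the written key."""
        return [owner for owner, low, high in self.locks if low <= key < high]

table = RangeLockTable()
table.lock_range("matview_refresh_1", 100, 200)  # view read keys 100..199

hit = table.notify_write(150)   # ['matview_refresh_1'] -> view is stale
miss = table.notify_write(250)  # [] -> no conflict, view still fresh
print(hit, miss)
```

Note that a range lock covers keys that do not exist yet, which is exactly what single-row locks cannot express.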
From: Greg Smith on 10 Apr 2010 23:03

Heikki Linnakangas wrote:
> Your proposal basically describes doing 1, in a limited fashion where
> the view is not updated automatically, but only when the DBA runs a
> command to refresh it. I'm not sure if that's useful enough on its own;
> writing "CREATE MATERIALIZED VIEW ... SELECT ..." doesn't seem any
> easier than just writing "CREATE TABLE AS ...". But if you can do
> something about 2, or even a very limited part of 1, keeping the view
> up-to-date automatically, it becomes much more useful.

You've hit upon the core issue here. You can build materialized views right now using "CREATE TABLE AS". You can even update them by creating a new table the same way, with a new name, and doing the LOCK/RENAME/DROP shuffle--what people used to do for rewriting bloated tables before there was CLUSTER. The first step in the proposal here is essentially syntax to give an easier UI for that. That's an interesting step, but recognize that it doesn't actually provide anything you can't do already.

If you then note that doing any sort of incremental update to the view is a hard problem, and that a lot of the useful cases for materialized views involve tables where it's impractical to recreate the whole thing anyway, you'll inevitably find yourself deeply lost in the minutiae of how to handle the updates. It's really the core problem in building what people expect from a materialized view implementation in a serious database. Chipping away at the other pieces around it doesn't move the feature that far forward; even if you get every single one of them except incremental updates finished, everything else combined is still not that much work in comparison to the issues around updates. There certainly are a fair number of subproblems you can break out of here.
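[The rebuild-and-swap shuffle Greg mentions can be sketched end to end. For a self-contained illustration this uses SQLite through Python's sqlite3 module; in PostgreSQL the same dance would use LOCK TABLE, ALTER TABLE ... RENAME, and DROP TABLE inside one transaction. All object names are invented.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE base (k INTEGER, v INTEGER)")
conn.executemany("INSERT INTO base VALUES (?, ?)", [(1, 10), (1, 20), (2, 30)])

# Initial "materialized view": plain CREATE TABLE ... AS SELECT.
conn.execute("CREATE TABLE mv AS SELECT k, SUM(v) AS total FROM base GROUP BY k")

conn.execute("INSERT INTO base VALUES (2, 5)")  # base changes; mv is now stale

# Full refresh by rebuild-and-swap: build the new contents under a
# temporary name, then swap it into place in a single transaction.
with conn:
    conn.execute(
        "CREATE TABLE mv_new AS SELECT k, SUM(v) AS total FROM base GROUP BY k")
    conn.execute("ALTER TABLE mv RENAME TO mv_old")
    conn.execute("ALTER TABLE mv_new RENAME TO mv")
    conn.execute("DROP TABLE mv_old")

rows = conn.execute("SELECT k, total FROM mv ORDER BY k").fetchall()
print(rows)  # [(1, 30), (2, 35)]
```

This is exactly the "full re-extraction" cost the thread wants to avoid: the entire view is recomputed even though only one base row changed.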
I just think it's important to recognize that the path that leads to a useful GSoC project and the one that gives a production-quality materialized view implementation may not have that much in common, and to manage expectations on both sides accordingly. If Pavel thinks he's going to end up being able to say "I added materialized views to PostgreSQL" at the end of the summer, that's going to end in disappointment. And if people think this project plan will lead to being able to claim PostgreSQL now has this feature, that's also not going to go well. If the scope is "add initial grammar and rewriting moving toward a future materialized view feature", with the underlying implementation noted as a stub prototype, that might work out OK. This is why I likened it to the work on "Syntax for partitioning", which has a similarly focused subgoal structure.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com  www.2ndQuadrant.us
From: Greg Smith on 10 Apr 2010 23:40
Robert Haas wrote:
> It's not obvious to me that a brief full-table lock wouldn't be
> acceptable for an initial implementation. Obviously it wouldn't be
> suitable for every use case, but since we're talking about manually
> refreshed views that was bound to be true anyway.

There already is an initial implementation of sorts. There are a couple of ways you can build these right now, so any new development has to look like it will end with good odds of being an improvement over what's already available before it's worth putting development resources into it.

As a rough idea of what people want these for in the field, based on what I've seen requests for, imagine that someone has a 1TB table they're materializing a view on in order to get at least a 10:1, and hopefully close to a 100:1, speedup on viewing summary data. Now, picture what happens if you have someone doing a sequential scan on the MV, which is still quite big, the updater process lines up to grab an exclusive lock when it's done, and then a second user wanting to read a single row quickly comes along behind it. Given a full-table lock implementation, that scenario is unlikely to play out with the second user getting a fast response. They'll likely sit in a lock queue for some potentially long period of time instead, waiting for the active seq scan to finish and then for the update to happen. You have to build it that way, or a steady stream of people reading could block out updates forever.

To be frank, that makes for a materialized view implementation of little value over what you can currently do, as far as I'm concerned. It might be interesting as a prototype, but that's not necessarily going to look like what's needed to do this for real at all. I'm not a big fan of dumping work into projects when you can see exactly how it's going to fail before you even get started.
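[The queueing behaviour described above can be made concrete with a toy FIFO lock queue: shared locks are mutually compatible, an exclusive request conflicts with everything, and grants are strictly first-come-first-served so readers can't starve the writer. This is a heavy simplification of a real lock manager, and all names are invented.]

```python
class FifoLockQueue:
    """Toy table-lock queue. Shared locks coexist; exclusive locks
    conflict with all others; waiters are granted strictly in FIFO
    order, so a stream of readers cannot starve a queued writer."""

    def __init__(self):
        self.holders = []  # (name, mode) currently granted
        self.waiters = []  # (name, mode) queued in arrival order

    def request(self, name, mode):
        compatible = (mode == "shared"
                      and all(m == "shared" for _, m in self.holders))
        if compatible and not self.waiters:  # FIFO: never jump the queue
            self.holders.append((name, mode))
            return "granted"
        self.waiters.append((name, mode))
        return "queued"

q = FifoLockQueue()
r1 = q.request("seq_scan", "shared")        # long scan reading the MV
r2 = q.request("refresher", "exclusive")    # refresh queued behind the scan
r3 = q.request("fast_lookup", "shared")     # single-row read queued too!
print(r1, r2, r3)
```

The third request is the problem case: a lookup that should take milliseconds sits behind the exclusive refresh, which itself sits behind the long scan.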
As I see it, if you know where it's going to fall down, you don't need to build a prototype as an exercise to show you how to build it--you should work on that part first instead.