From: Boxuan Zhai on
---------- Forwarded message ----------
From: Boxuan Zhai <bxzhai2010(a)gmail.com>
Date: 2010/7/17
Subject: Re: [HACKERS] gSoC - ADD MERGE COMMAND - code patch submission
To: Simon Riggs <simon(a)2ndquadrant.com>




2010/7/17 Simon Riggs <simon(a)2ndquadrant.com>

On Fri, 2010-07-16 at 08:26 +0800, Boxuan Zhai wrote:
> > The merge actions are transformed into lower level queries. I create a
> > Query node for each of them and append them to a newly created List
> > field mergeActQry. The action queries have different command types and
> > specific target lists and qual lists, according to their declaration by
> > the user. But they all share the same range table. This is because we
> > don't need the action queries to be planned later. The joining
> > strategy is decided by the top query. We are only interested in their
> > specific action qualifications. In other words, these action queries
> > are only containers for their target lists and qualifications.
> >
> > 2. When the query is ready, it will be sent to the rewriter. In this part,
> > we can call RewriteQuery() to handle the action queries. The UPDATE
> > action will trigger rules on UPDATE, and so on. Two points need to be
> > noted: 1. the actions of the same type should not be rewritten
> > repeatedly. If there are two UPDATE actions in a merge command, we
> > should not trigger the ON UPDATE rules twice. 2. if an action type is
> > fully replaced by rules, we should remove all actions of this type
> > from the action list.
> > The rewriter will also do some processing on the target list of each action.
>
> IMHO it is a bad thing that we are attempting to execute each action
> statement as a query. That means we need to execute an inner SQL
> statement for each row returned by the top level query.
>
> That design makes MERGE similar in performance to an upsert PL/pgsql
> function, which will perform terribly on large numbers of rows.
>
Dear Simon,

Thanks for your feedback. I may not have presented my idea clearly.
In my design, the merge actions are not executed as separate queries. Only
the top level query (that is, a query like "<source_table> LEFT JOIN
<target_table> ON <matching_qual>") will be planned and executed. For each
tuple returned by this plan, we will choose the proper action for it and do the
corresponding modification. The tables will only be scanned and joined
once. A merge action will not do a full run of the table join and then modify
the table, as a standard UPDATE/DELETE/INSERT query would. (Is this what you are
worried about?)
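
To make the single-pass idea concrete, here is a rough sketch in Python rather than in the backend's C. It is only an illustration, not code from the patch: MergeAction, left_join, choose_action and merge are hypothetical names, rows are plain dicts, and the rule "the first action whose condition holds is applied" is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MergeAction:
    # Hypothetical container: not the patch's actual data structure.
    kind: str                                     # "UPDATE", "DELETE" or "INSERT"
    matched: bool                                 # WHEN MATCHED vs WHEN NOT MATCHED
    qual: Callable[[dict, Optional[dict]], bool]  # additional qualification


def left_join(source, target, key):
    """Yield (source_row, target_row_or_None); each table is read once."""
    index = {row[key]: row for row in target}     # single scan of the target
    for src in source:                            # single scan of the source
        yield src, index.get(src[key])


def choose_action(actions, src, tgt):
    """Return the first action whose match state and qual fit this tuple."""
    for action in actions:
        if action.matched == (tgt is not None) and action.qual(src, tgt):
            return action
    return None


def merge(source, target, key, actions):
    """Run the join once and apply at most one action per joined tuple."""
    applied = []
    for src, tgt in left_join(source, target, key):
        action = choose_action(actions, src, tgt)
        if action is None:
            continue                              # no action qualifies: skip
        if action.kind == "UPDATE":
            tgt.update(src)
        elif action.kind == "DELETE":
            target.remove(tgt)
        elif action.kind == "INSERT":
            target.append(dict(src))
        applied.append((src[key], action.kind))
    return applied
```

The point of the sketch is that no action ever scans or joins the tables itself. The join runs once, and each action contributes only a condition and a modification for the current tuple.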

In fact, for one action, we only need the following information: 1. the action type
(UPDATE, DELETE or INSERT), 2. the target list, and 3. the additional
qualifications. A Query node is a perfect container for this information.
That's why I transform them into Query nodes. But all through the analyzer,
rewriter, planner and executor, I just call the related functions to process
the expressions in their target lists and qual lists. The range table and
join tree are determined only by the top level query; they will not be affected
by the merge actions.
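
As a rough picture of the "container" role, the following Python sketch models an action as a record holding just those three pieces of information plus a reference to the top query's range table. The class and field names (ActionQuery, MergeQuery, merge_actions, add_action) are invented for illustration and only loosely mirror the mergeActQry list; expressions are shown as plain strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ActionQuery:
    # Hypothetical stand-in for a Query node used purely as a container.
    command_type: str        # "UPDATE", "DELETE" or "INSERT"
    target_list: List[str]   # target list expressions (strings here)
    quals: List[str]         # additional qualifications
    range_table: list        # shared with the top query, never planned alone


@dataclass
class MergeQuery:
    # Hypothetical top level query: the only part that gets planned.
    range_table: list
    join_tree: str
    merge_actions: List[ActionQuery] = field(default_factory=list)

    def add_action(self, command_type, target_list, quals):
        """Attach an action that reuses this query's range table."""
        action = ActionQuery(command_type, target_list, quals, self.range_table)
        self.merge_actions.append(action)
        return action
```

Because every action holds a reference to the same range table object instead of a copy, nothing an action carries can change the join tree chosen for the top level query.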




> This was exactly the point where I stopped implementation previously:
> attempting to make MERGE work with rules is enough to prevent a tighter
> in-executor implementation of the action list.
>
I am sorry, but I did not quite catch your meaning here.
As I understand it, if there is a rule on the target table, the rewriter
will add a new query to the execution queue (or replace the original
query). I think the rule queries will not affect the processing within the
original query, because they are totally separate queries which will be run
before or after the original query. Are you suggesting that we should not allow
rules on the MERGE command?



> [To Boxuan, on a personal note, you seem to be coping quite well with
> the code and the process; congratulations and keep going.]
>
>
Thank you. Your encouragement is very important to me.


> --
>
> Simon Riggs www.2ndQuadrant.com <http://www.2ndquadrant.com/>
> PostgreSQL Development, 24x7 Support, Training and Services
>
>