From: Tom Lane on
Simon Riggs <simon(a)2ndQuadrant.com> writes:
> [ v2 patch ]

BTW, while I'm looking at this, why bother with the separate
KnownAssignedXidsValid[] array? Wouldn't it be cheaper
(certainly so in storage, probably so in access/update times)
to have just the KnownAssignedXids[] array and store
InvalidTransactionId in unused entries?

regards, tom lane

From: Simon Riggs on
On Sun, 2010-04-25 at 11:50 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
> > On Sat, Apr 24, 2010 at 5:17 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> >>> Both Heikki and I objected to that patch.
> >>
> >> Please explain your objection, based upon the patch and my explanations.
>
> > Well, we objected to the locking.
>
> Not only is the locking overly complex, but it's outright wrong;
> or at least the comments are. KnownAssignedXidsAdd in particular
> has a comment that is 100% misleading, and an actual behavior that
> I find extremely likely to provoke bugs (eg, consider caller calling
it twice under the illusion it still has only shared lock).

I will reword that comment.

> I'm not
> even convinced that it works correctly, since it will happily write
> into KnownAssignedXids[] while holding only shared ProcArrayLock.
> What happens when two processes are doing that concurrently?

Heikki and I discussed this upthread. Only the Startup process ever
writes into KnownAssignedXids[], so the situation you describe does not happen.
We currently rely on the fact that we have only a single writer in other
places in Hot Standby, so this is not a restriction we are imposing on
the system just to support this change.
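
For illustration only, here is a minimal standalone sketch of how that
single-writer assumption might be made explicit at the top of the add
routine. The names (am_startup_process, known_assigned_xids_add_sketch)
are placeholders rather than the patch's code, and the agreed Assert
mentioned later in this mail may well look different.

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag: in the real server this would test the process role. */
static bool am_startup_process = true;

static void
known_assigned_xids_add_sketch(uint32_t xid)
{
    /*
     * Concurrent writers cannot occur by design: WAL is replayed by a
     * single process, so the write itself needs no writer-vs-writer lock.
     */
    assert(am_startup_process);

    (void) xid;                 /* actual insertion elided */
}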

Both you and Heikki have previously spoken against having recovery run
in parallel, so the likelihood of having multiple writers in the future
seems low.

If we were to allow multiple callers of the routine in future, the use
of spinlocks to protect the head of the queue means we can safely
reserve slots and so KnownAssignedXidsAdd can easily be made to accept
multiple writers in the future. I think it's possible to future-proof
that now without significant change or performance drop, so I'll do
that.
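
As a rough sketch of that slot-reservation idea, in standalone C with a
pthread spinlock standing in for the patch's spinlock, and with made-up
names and sizes (an illustration of the technique, not the patch's code):

#include <pthread.h>
#include <stdint.h>

#define MAX_SLOTS 1024

static uint32_t           xids[MAX_SLOTS];
static int                head;            /* index of next free slot */
static pthread_spinlock_t head_lck;

static void
init_known_assigned(void)
{
    pthread_spin_init(&head_lck, PTHREAD_PROCESS_PRIVATE);
}

/* Claim a slot at the head; the spinlock makes this safe for any number
 * of concurrent writers, since head only ever moves forward here. */
static int
reserve_slot(void)
{
    int slot;

    pthread_spin_lock(&head_lck);
    slot = head++;
    pthread_spin_unlock(&head_lck);
    return slot;
}

static void
add_xid(uint32_t xid)
{
    int slot = reserve_slot();

    /*
     * The reserved slot is privately owned by this writer, so it can be
     * filled in without further locking.  (A full implementation would
     * also need to coordinate with readers that scan up to head, and
     * handle the array filling up.)
     */
    xids[slot] = xid;
}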

> This needs a redesign before it can be considered committable. I don't
> really care whether it makes things faster; it's too complicated and too
> poorly documented to be maintainable.

It is sufficiently complex to achieve its objective, which is keeping
track of xids during recovery, using the same locking as we do during
normal running. This is the third re-write of the code, developed over
more than 20 months and specifically modularised by me to make future
changes drop-in replacements. Initially we had an insert-sorted array,
then a hash table and now this current proposal. Recently, Heikki and I
independently came up with almost exactly the same design, with the
algorithmic characteristics we need for performance. The call points,
frequency and workload of calls are documented and well understood by
multiple people.

The only remaining point of discussion has been the code that takes
ProcArrayLock in shared mode. The difference being discussed here is
whether we access the head and tail of the array with shared lock plus
spinlock, or just exclusive lock. The patch would be almost identical with
that change, but would not meet its locking objectives. So making the
locking stricter isn't going to make it more maintainable by a
measurable amount.
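
To make the alternative concrete, here is a standalone sketch of the two
legal ways to update the head index under that rule, with pthread locks
standing in for ProcArrayLock and the known_assigned_xids spinlock
(simplified and assumed; not the patch's code):

#include <pthread.h>

static pthread_rwlock_t   proc_array_lock = PTHREAD_RWLOCK_INITIALIZER;
static pthread_spinlock_t head_tail_lck;   /* pthread_spin_init() before use */
static int head;                           /* tail follows the same rule */

/* Path 1: an exclusive holder may update head directly. */
static void
update_head_exclusive(int new_head)
{
    pthread_rwlock_wrlock(&proc_array_lock);
    head = new_head;
    pthread_rwlock_unlock(&proc_array_lock);
}

/* Path 2: a shared holder must additionally take the spinlock. */
static void
update_head_shared(int new_head)
{
    pthread_rwlock_rdlock(&proc_array_lock);
    pthread_spin_lock(&head_tail_lck);
    head = new_head;
    pthread_spin_unlock(&head_tail_lck);
    pthread_rwlock_unlock(&proc_array_lock);
}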

There are more than 60 lines of header comment explaining in detail how
this works, with a full algorithmic analysis. The remaining code is
commented to project standards, with easily more than 100 lines of
comments.

I will recheck every comment and see if other comments can be usefully
added, and further review the code to see if there's anything that can
be done to improve it further. I was going to do this anyway, since I
will be adding an agreed Assert() into the patch.

--
Simon Riggs www.2ndQuadrant.com


From: Tom Lane on
Simon Riggs <simon(a)2ndQuadrant.com> writes:
> On Sun, 2010-04-25 at 11:50 -0400, Tom Lane wrote:
>> This needs a redesign before it can be considered committable. I don't
>> really care whether it makes things faster; it's too complicated and too
>> poorly documented to be maintainable.

> There are more than 60 lines of header comment explaining in detail how
> this works, with a full algorithmic analysis. The remaining code is
> commented to project standards, with easily more than 100 lines of
> comments.

If the comments were correct, I wouldn't be complaining. They're
misleading or outright wrong on many points. In particular, I don't
think you actually understand the weak-memory-ordering issue, because
the comments about that are entirely wrong. I don't think they are
adequate on other points either --- for example, I now understand why my
complaint of a few minutes ago about KnownAssignedXidsValid is wrong,
but the comments certainly didn't help me on that.

regards, tom lane

From: Simon Riggs on
On Sun, 2010-04-25 at 12:43 -0400, Tom Lane wrote:
> Simon Riggs <simon(a)2ndQuadrant.com> writes:
> > [ v2 patch ]
>
> BTW, while I'm looking at this, why bother with the separate
> KnownAssignedXidsValid[] array? Wouldn't it be cheaper
> (certainly so in storage, probably so in access/update times)
> to have just the KnownAssignedXids[] array and store
> InvalidTransactionId in unused entries?

Well, that was exactly my first design.

Heikki came up with the additional array and I think it is an inspired
suggestion, because it allows the search() to use a simple binary
search. I attempted to write the binary-search-with-holes and it was
going to be very ugly code, so I went for another algorithm which was
also quite cool, but not as cool as Heikki's idea - which I was going to
give credit for.
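
A minimal standalone sketch of that design, using placeholder names and
plain integer comparison rather than wraparound-aware xid comparison (so
it is an illustration of the idea, not the patch's actual code):

#include <stdbool.h>
#include <stdint.h>

#define MAX_SLOTS 1024

static uint32_t xids[MAX_SLOTS];     /* kept sorted over [tail, head) */
static bool     valid[MAX_SLOTS];
static int      tail, head;

/* Binary search over the live region; returns the slot or -1. */
static int
find_slot(uint32_t xid)
{
    int lo = tail, hi = head;

    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (xids[mid] == xid)
            return mid;
        else if (xids[mid] < xid)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}

static bool
xid_is_known_assigned(uint32_t xid)
{
    int slot = find_slot(xid);

    return slot >= 0 && valid[slot];
}

/* Removal just clears the flag, leaving the array binary-searchable. */
static void
remove_known_assigned_xid(uint32_t xid)
{
    int slot = find_slot(xid);

    if (slot >= 0)
        valid[slot] = false;
}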

The array adds 520*max_connections bytes of shared memory, but seems a
good tradeoff for the additional performance it allows. I guess we could
use a bit array, but that's slower to access. The bit array would be
9*max_connections bytes of shared memory, so a 40kB saving.
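
For comparison, a sketch of the bit-array variant mentioned above, with
illustrative names and sizes: one bit per slot instead of one flag byte,
at the price of a shift and mask on every test.

#include <stdbool.h>
#include <stdint.h>

#define MAX_SLOTS 1024

static uint8_t valid_bits[(MAX_SLOTS + 7) / 8];

static inline bool
slot_is_valid(int slot)
{
    return (valid_bits[slot / 8] >> (slot % 8)) & 1;
}

static inline void
set_slot_valid(int slot, bool v)
{
    if (v)
        valid_bits[slot / 8] |= (uint8_t) (1 << (slot % 8));
    else
        valid_bits[slot / 8] &= (uint8_t) ~(1 << (slot % 8));
}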

--
Simon Riggs www.2ndQuadrant.com


From: Simon Riggs on
On Sun, 2010-04-25 at 12:51 -0400, Tom Lane wrote:
> Simon Riggs <simon(a)2ndQuadrant.com> writes:
> > On Sun, 2010-04-25 at 11:50 -0400, Tom Lane wrote:
> >> This needs a redesign before it can be considered committable. I don't
> >> really care whether it makes things faster; it's too complicated and too
> >> poorly documented to be maintainable.
>
> > There are more than 60 lines of header comment explaining in detail how
> > this works, with a full algorithmic analysis. The remaining code is
> > commented to project standards, with easily more than 100 lines of
> > comments.
>
> If the comments were correct, I wouldn't be complaining. They're
> misleading or outright wrong on many points. In particular, I don't
> think you actually understand the weak-memory-ordering issue, because
> the comments about that are entirely wrong.

The comment says: "on CPUs with weak-memory ordering we can't reliably
move pointers atomically, so the rule is that updates of head and tail
of the array require ProcArrayLock in exclusive mode or (shared mode
and known_assigned_xids_lck spinlock)".

I will reword this so that it is clear I'm talking about the head and
tail of the array, not about pointers in general.

--
Simon Riggs www.2ndQuadrant.com

