Hot Standby b-tree delete records review [PgSql]

From: Heikki Linnakangas on 22 Apr 2010 04:28

Simon Riggs wrote:
> On Thu, 2010-04-22 at 10:24 +0300, Heikki Linnakangas wrote:
>> btree_redo:
>>> /*
>>> * Note that if all heap tuples were LP_DEAD then we will be
>>> * returning InvalidTransactionId here. This seems very unlikely
>>> * in practice.
>>> */
>> If none of the removed heap tuples were present anymore, we currently
>> return InvalidTransactionId, which kills/waits out all read-only
>> queries. But if none of the tuples were present anymore, the read-only
>> queries wouldn't have seen them anyway, so ISTM that we should treat
>> InvalidTransactionId return value as "we don't need to kill anyone".
>
> That's not the point. The tuples were not themselves the sole focus,

Yes, they were. We're replaying a b-tree deletion record, which removes
pointers to some heap tuples, making them unreachable to any read-only
queries. If any of them still need to be visible to read-only queries,
we have a conflict. But if all of the heap tuples are gone already,
removing the index pointers to them can't change the situation for any
query. If any of them should've been visible to a query, the damage was
done already by whoever pruned the heap tuples leaving just the
tombstone LP_DEAD item pointers (in the heap) behind.
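
To make the case under discussion concrete, here is a minimal sketch of how a latestRemovedXid could come out as InvalidTransactionId when every referenced heap item is already an LP_DEAD stub. It is a simplified model, not the actual btree_redo code: HeapItemModel, xid_follows and latest_removed_xid are hypothetical names, and only a single deleter xid per item is considered. The TransactionId typedef and the value 0 for InvalidTransactionId mirror PostgreSQL's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical, simplified view of one heap item named by the record. */
typedef struct
{
    bool          lp_dead; /* only the LP_DEAD stub is left in the heap */
    TransactionId xmax;    /* deleting xid; meaningful only if !lp_dead */
} HeapItemModel;

/* Wraparound-aware "a is newer than b" for normal xids (assumption:
 * no special xids are passed in). */
static bool
xid_follows(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) > 0;
}

/*
 * Return the newest deleting xid among the items that still carry tuple
 * data. Items that are already LP_DEAD contribute nothing, so if all of
 * them are LP_DEAD the result is InvalidTransactionId.
 */
static TransactionId
latest_removed_xid(const HeapItemModel *items, size_t n)
{
    TransactionId latest = InvalidTransactionId;

    for (size_t i = 0; i < n; i++)
    {
        if (items[i].lp_dead)
            continue;       /* tuple already pruned, no xid to read */
        if (latest == InvalidTransactionId ||
            xid_follows(items[i].xmax, latest))
            latest = items[i].xmax;
    }
    return latest;
}
```

In this model, an array containing only LP_DEAD items yields InvalidTransactionId, which is exactly the return value whose interpretation is being debated here.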

Or do we use the latestRemovedXid value for something else as well?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Simon Riggs on 22 Apr 2010 04:41

On Thu, 2010-04-22 at 11:28 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2010-04-22 at 10:24 +0300, Heikki Linnakangas wrote:
> >> btree_redo:
> >>> /*
> >>> * Note that if all heap tuples were LP_DEAD then we will be
> >>> * returning InvalidTransactionId here. This seems very unlikely
> >>> * in practice.
> >>> */
> >> If none of the removed heap tuples were present anymore, we currently
> >> return InvalidTransactionId, which kills/waits out all read-only
> >> queries. But if none of the tuples were present anymore, the read-only
> >> queries wouldn't have seen them anyway, so ISTM that we should treat
> >> InvalidTransactionId return value as "we don't need to kill anyone".
> >
> > That's not the point. The tuples were not themselves the sole focus,
>
> Yes, they were. We're replaying a b-tree deletion record, which removes
> pointers to some heap tuples, making them unreachable to any read-only
> queries. If any of them still need to be visible to read-only queries,
> we have a conflict. But if all of the heap tuples are gone already,
> removing the index pointers to them can't change the situation for any
> query. If any of them should've been visible to a query, the damage was
> done already by whoever pruned the heap tuples leaving just the
> tombstone LP_DEAD item pointers (in the heap) behind.

You're missing my point. Those tuples are indicators of what may lie
elsewhere in the database, completely unreferenced by this WAL record.
Just because these referenced tuples are gone doesn't imply that all
tuple versions written by the as yet-unknown-xids are also gone. We
can't infer anything about the whole database just from one small group
of records.

--
Simon Riggs www.2ndQuadrant.com

From: Heikki Linnakangas on 22 Apr 2010 04:56

Simon Riggs wrote:
> On Thu, 2010-04-22 at 11:28 +0300, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> On Thu, 2010-04-22 at 10:24 +0300, Heikki Linnakangas wrote:
>>>> btree_redo:
>>>>> /*
>>>>> * Note that if all heap tuples were LP_DEAD then we will be
>>>>> * returning InvalidTransactionId here. This seems very unlikely
>>>>> * in practice.
>>>>> */
>>>> If none of the removed heap tuples were present anymore, we currently
>>>> return InvalidTransactionId, which kills/waits out all read-only
>>>> queries. But if none of the tuples were present anymore, the read-only
>>>> queries wouldn't have seen them anyway, so ISTM that we should treat
>>>> InvalidTransactionId return value as "we don't need to kill anyone".
>>> That's not the point. The tuples were not themselves the sole focus,
>> Yes, they were. We're replaying a b-tree deletion record, which removes
>> pointers to some heap tuples, making them unreachable to any read-only
>> queries. If any of them still need to be visible to read-only queries,
>> we have a conflict. But if all of the heap tuples are gone already,
>> removing the index pointers to them can't change the situation for any
>> query. If any of them should've been visible to a query, the damage was
>> done already by whoever pruned the heap tuples leaving just the
>> tombstone LP_DEAD item pointers (in the heap) behind.
>
> You're missing my point. Those tuples are indicators of what may lie
> elsewhere in the database, completely unreferenced by this WAL record.
> Just because these referenced tuples are gone doesn't imply that all
> tuple versions written by the as yet-unknown-xids are also gone. We
> can't infer anything about the whole database just from one small group
> of records.

Have you got an example of that?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From: Simon Riggs on 22 Apr 2010 05:00

On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote:

> >>>> If none of the removed heap tuples were present anymore, we currently
> >>>> return InvalidTransactionId, which kills/waits out all read-only
> >>>> queries. But if none of the tuples were present anymore, the read-only
> >>>> queries wouldn't have seen them anyway, so ISTM that we should treat
> >>>> InvalidTransactionId return value as "we don't need to kill anyone".
> >>> That's not the point. The tuples were not themselves the sole focus,
> >> Yes, they were. We're replaying a b-tree deletion record, which removes
> >> pointers to some heap tuples, making them unreachable to any read-only
> >> queries. If any of them still need to be visible to read-only queries,
> >> we have a conflict. But if all of the heap tuples are gone already,
> >> removing the index pointers to them can't change the situation for any
> >> query. If any of them should've been visible to a query, the damage was
> >> done already by whoever pruned the heap tuples leaving just the
> >> tombstone LP_DEAD item pointers (in the heap) behind.
> >
> > You're missing my point. Those tuples are indicators of what may lie
> > elsewhere in the database, completely unreferenced by this WAL record.
> > Just because these referenced tuples are gone doesn't imply that all
> > tuple versions written by the as yet-unknown-xids are also gone. We
> > can't infer anything about the whole database just from one small group
> > of records.
>
> Have you got an example of that?

I don't need one, I have suggested the safe route. In order to infer
anything, and thereby further optimise things, we would need proof that
no cases can exist, which I don't have. Perhaps we can add "yet", not
sure about that either.

--
Simon Riggs www.2ndQuadrant.com

From: Heikki Linnakangas on 22 Apr 2010 05:18

Simon Riggs wrote:
> On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote:
>
>>>>>> If none of the removed heap tuples were present anymore, we currently
>>>>>> return InvalidTransactionId, which kills/waits out all read-only
>>>>>> queries. But if none of the tuples were present anymore, the read-only
>>>>>> queries wouldn't have seen them anyway, so ISTM that we should treat
>>>>>> InvalidTransactionId return value as "we don't need to kill anyone".
>>>>> That's not the point. The tuples were not themselves the sole focus,
>>>> Yes, they were. We're replaying a b-tree deletion record, which removes
>>>> pointers to some heap tuples, making them unreachable to any read-only
>>>> queries. If any of them still need to be visible to read-only queries,
>>>> we have a conflict. But if all of the heap tuples are gone already,
>>>> removing the index pointers to them can't change the situation for any
>>>> query. If any of them should've been visible to a query, the damage was
>>>> done already by whoever pruned the heap tuples leaving just the
>>>> tombstone LP_DEAD item pointers (in the heap) behind.
>>> You're missing my point. Those tuples are indicators of what may lie
>>> elsewhere in the database, completely unreferenced by this WAL record.
>>> Just because these referenced tuples are gone doesn't imply that all
>>> tuple versions written by the as yet-unknown-xids are also gone. We
>>> can't infer anything about the whole database just from one small group
>>> of records.
>> Have you got an example of that?
>
> I don't need one, I have suggested the safe route. In order to infer
> anything, and thereby further optimise things, we would need proof that
> no cases can exist, which I don't have. Perhaps we can add "yet", not
> sure about that either.

It's good to be safe rather than sorry, but I'd still like to know,
because I'm quite surprised by that, and it got me worried that I don't
understand how hot standby works as well as I thought I did. I thought
the point of stopping replay/killing queries at a b-tree deletion record
is precisely that it makes some heap tuples invisible to running
read-only queries. If it doesn't make any tuples invisible, why do any
queries need to be killed? And why was it OK for them to be running just
before replaying the b-tree deletion record?
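
The two positions in this thread can be put side by side as a small sketch. It only restates what was said above: the "current" function follows the description that an InvalidTransactionId return kills or waits out all read-only queries, and the "proposed" function follows the suggestion to treat it as "we don't need to kill anyone". ConflictScope and both function names are hypothetical and do not exist in the PostgreSQL source.

```c
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical classification of which standby queries conflict. */
typedef enum
{
    CONFLICT_NONE,  /* no query needs to be cancelled or waited for */
    CONFLICT_SOME,  /* only queries that could still see xids up to
                     * latestRemovedXid */
    CONFLICT_ALL    /* every running read-only query */
} ConflictScope;

/* Behaviour as described in the thread: an invalid xid is the
 * conservative case and conflicts with everyone. */
static ConflictScope
conflict_scope_current(TransactionId latestRemovedXid)
{
    if (latestRemovedXid == InvalidTransactionId)
        return CONFLICT_ALL;
    return CONFLICT_SOME;
}

/* Proposed reading: an invalid xid means the heap tuples were already
 * gone, so removing the index pointers changes nothing for any query. */
static ConflictScope
conflict_scope_proposed(TransactionId latestRemovedXid)
{
    if (latestRemovedXid == InvalidTransactionId)
        return CONFLICT_NONE;
    return CONFLICT_SOME;
}
```

The two functions differ only in the InvalidTransactionId branch; for any valid xid both fall through to the same xid-based conflict check.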

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
