From: KaiGai Kohei on
(2010/01/28 18:21), Takahiro Itagaki wrote:
>
> KaiGai Kohei<kaigai(a)ak.jp.nec.com> wrote:
>
>> The attached patch uses one TOC entry for each blob objects.
>
> When I'm testing the new patch, I found "ALTER LARGE OBJECT" command
> returns "ALTER LARGEOBJECT" tag. Should it be "ALTER LARGE(space)OBJECT"
> instead? As I remember, we had decided not to use LARGEOBJECT
> (without a space) in user-visible messages, right?

Sorry, I left for fix this tag when I was pointed out LARGEOBJECT should
be LARGE(space)OBJECT.

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(a)ak.jp.nec.com>
From: Takahiro Itagaki on

KaiGai Kohei <kaigai(a)ak.jp.nec.com> wrote:

> > When I'm testing the new patch, I found "ALTER LARGE OBJECT" command
> > returns "ALTER LARGEOBJECT" tag. Should it be "ALTER LARGE(space)OBJECT"
> > instead?
>
> Sorry, I left for fix this tag when I was pointed out LARGEOBJECT should
> be LARGE(space)OBJECT.

Committed. Thanks.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Takahiro Itagaki on

KaiGai Kohei <kaigai(a)ak.jp.nec.com> wrote:

> The attached patch uses one TOC entry for each blob objects.

This patch does not only fix the existing bugs, but also refactor
the dump format of large objects in pg_dump. The new format are
more similar to the format of tables:

Section <Tables> <New LO> <Old LO>
-----------------------------------------------------
Schema "TABLE" "BLOB ITEM" "BLOBS"
Data "TABLE DATA" "BLOB DATA" "BLOBS"
Comments "COMMENT" "COMMENT" "BLOB COMMENTS"

We will allocate BlobInfo in memory for each large object. It might
consume much more memory compared with former versions if we have
many large objects, but we discussed it is an acceptable change.

As far as I read, the patch is almost ready to commit
except the following issue about backward compatibility:

> * "BLOB DATA"
> This section is same as existing "BLOBS" section, except for _LoadBlobs()
> does not create a new large object before opening it with INV_WRITE, and
> lo_truncate() will be used instead of lo_unlink() when --clean is given.

> The legacy sections ("BLOBS" and "BLOB COMMENTS") are available to read
> for compatibility, but newer pg_dump never create these sections.

I wonder we need to support older versions in pg_restore. You add a check
"PQserverVersion >= 80500" in CleanupBlobIfExists(), but out documentation
says we cannot use pg_restore 9.0 for 8.4 or older servers:

http://developer.postgresql.org/pgdocs/postgres/app-pgdump.html
| it is not guaranteed that pg_dump's output can be loaded into
| a server of an older major version

Can we remove such path and raise an error instead?
Also, even if we support the older servers in the routine,
the new bytea format will be another problem anyway.


> One remained issue is the way to decide whether blobs to be dumped, or not.
> Right now, --schema-only eliminate all the blob dumps.
> However, I think it should follow the manner in any other object classes.
>
> -a, --data-only ... only "BLOB DATA" sections, not "BLOB ITEM"
> -s, --schema-only ... only "BLOB ITEM" sections, not "BLOB DATA"
> -b, --blobs ... both of "BLOB ITEM" and "BLOB DATA" independently
> from --data-only and --schema-only?

I cannot image situations that require data-only dumps -- for example,
restoring database has a filled pg_largeobject_metadata and an empty
or broken pg_largeobject -- but it seems to be unnatural.

I'd prefer to keep the existing behavior:
* default or data-only : dump all attributes and data of blobs
* schema-only : don't dump any blobs
and have independent options to control blob dumps:
* -b, --blobs : dump all blobs even if schema-only
* -B, --no-blobs : [NEW] don't dump any blobs even if default or data-only

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: KaiGai Kohei on
(2010/02/01 14:19), Takahiro Itagaki wrote:
> As far as I read, the patch is almost ready to commit
> except the following issue about backward compatibility:
>
>> * "BLOB DATA"
>> This section is same as existing "BLOBS" section, except for _LoadBlobs()
>> does not create a new large object before opening it with INV_WRITE, and
>> lo_truncate() will be used instead of lo_unlink() when --clean is given.
>
>> The legacy sections ("BLOBS" and "BLOB COMMENTS") are available to read
>> for compatibility, but newer pg_dump never create these sections.
>
> I wonder we need to support older versions in pg_restore. You add a check
> "PQserverVersion>= 80500" in CleanupBlobIfExists(), but out documentation
> says we cannot use pg_restore 9.0 for 8.4 or older servers:
>
> http://developer.postgresql.org/pgdocs/postgres/app-pgdump.html
> | it is not guaranteed that pg_dump's output can be loaded into
> | a server of an older major version
>
> Can we remove such path and raise an error instead?
> Also, even if we support the older servers in the routine,
> the new bytea format will be another problem anyway.

OK, I'll fix it.

>> One remained issue is the way to decide whether blobs to be dumped, or not.
>> Right now, --schema-only eliminate all the blob dumps.
>> However, I think it should follow the manner in any other object classes.
>>
>> -a, --data-only ... only "BLOB DATA" sections, not "BLOB ITEM"
>> -s, --schema-only ... only "BLOB ITEM" sections, not "BLOB DATA"
>> -b, --blobs ... both of "BLOB ITEM" and "BLOB DATA" independently
>> from --data-only and --schema-only?
>
> I cannot image situations that require data-only dumps -- for example,
> restoring database has a filled pg_largeobject_metadata and an empty
> or broken pg_largeobject -- but it seems to be unnatural.

Indeed, it might not be a sane situation.

However, we can assume the situation that user wants to backup a database
into two separated files (one for schema definition; one for data contents).
Just after restoring schema definitions, all the large obejcts are created
as empty blobs. Then, we can restore their data contents.

I wonder if the behavior is easily understandable for end users.
The "BLOB ITEM" section contains properties of a certain large obejct,
such as identifier (loid), comment, ownership and access privileges.
These are categorized to schema definitions in other object classes,
but we still need special treatment for blobs.

The --schema-only with large objects might be unnatural, but the
--data-only with properties of large objects are also unnatural.
Which behavior is more unnatural?

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(a)ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Takahiro Itagaki on

KaiGai Kohei <kaigai(a)ak.jp.nec.com> wrote:

> > Can we remove such path and raise an error instead?
> > Also, even if we support the older servers in the routine,
> > the new bytea format will be another problem anyway.
>
> OK, I'll fix it.

I think we might need to discuss about explicit version checks in pg_restore.
It is not related with large objects, but with pg_restore's capability.
We've not supported to restore a dump to older servers, but we don't have any
version checks, right? Should we do the checks at the beginning of restoring?
If we do so, LO patch could be more simplified.


> The --schema-only with large objects might be unnatural, but the
> --data-only with properties of large objects are also unnatural.
> Which behavior is more unnatural?

I think large object metadata is a kind of row-based access controls.
How do we dump and restore ACLs per rows when we support them for
normal tables? We should treat LO metadata as same as row-based ACLs.
In my opinion, I'd like to treat them as part of data (not of schema).

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers