From: Andrew Dunstan on


Hannu Krosing wrote:

[plperl can return data that is not valid in the server encoding and it
is not caught]

> This results in a table that contains an invalid UTF sequence and
> consequently does not pass dump/load.
>
> What would be the best place to fix this?
>
> Should there be checks in all text types?
> (probably too expensive)
>

The plperl code has no type-specific checks, and in any case limiting it
to "text" types would defeat third party and contrib types of which it
knows nothing (think citext). We should check all strings returned by
plperl.

> Or should pl/perl check its return values for compliance with
> server_encoding ?
>

I think the plperl glue code should check returned strings using
pg_verifymbstr().

> Or should PostgreSQL itself check that PLs return what they promise to
> return?
>
>


There is no central place for it to check. The pl glue code is the right
place, I think.
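
To illustrate the idea of checking every returned string at the PL boundary, regardless of its declared type, here is a minimal sketch in Python. It is not the plperl patch and not how pg_verifymbstr() is implemented; the function name is_valid_in_encoding and the default encoding are assumptions made for the example.

```python
def is_valid_in_encoding(raw: bytes, server_encoding: str = "utf-8") -> bool:
    """Return True if raw is a valid byte sequence in server_encoding.

    Hypothetical stand-in for the kind of check the glue code would
    apply to every string a PL function hands back, whatever its type.
    """
    try:
        # Strict decoding fails on any byte sequence the encoding forbids.
        raw.decode(server_encoding, errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Values as they might come back from a PL function (made-up data).
returned_values = [b"caf\xc3\xa9", b"caf\xe9", b"plain ascii"]
results = [is_valid_in_encoding(v) for v in returned_values]
```

The point of the sketch is only that the check sits in one place, on the way out of the language handler, so that types such as citext are covered without the handler knowing anything about them.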

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Andrew Dunstan on


Andrew Dunstan wrote:
>
> I think the plperl glue code should check returned strings using
> pg_verifymbstr().
>
>

Please test this patch. I think we'd probably want to trap the encoding
error and issue a customised error message, but this plugs all the holes
I can see with the possible exception of values inserted via SPI calls.
I'll check that out.
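
As a rough illustration of trapping the encoding error and issuing a customised message, here is a small Python sketch. It is not taken from the patch; ReturnEncodingError, checked_return and the message wording are all hypothetical names chosen for the example.

```python
class ReturnEncodingError(ValueError):
    """Hypothetical error for a PL return value that fails the encoding check."""


def checked_return(function_name: str, raw: bytes,
                   server_encoding: str = "utf-8") -> bytes:
    """Return raw unchanged if it is valid in server_encoding.

    Otherwise trap the low-level decoding error and raise one whose
    message names the offending function and bytes.
    """
    try:
        raw.decode(server_encoding, errors="strict")
    except UnicodeDecodeError as exc:
        bad = raw[exc.start:exc.end].hex()
        raise ReturnEncodingError(
            f"function {function_name!r} returned bytes 0x{bad} at offset "
            f"{exc.start}, which are not valid in server encoding "
            f"{server_encoding}"
        ) from None
    return raw
```

A caller would wrap each value on its way out, for example checked_return("my_func", value), so the user sees which function produced the bad data rather than a generic encoding failure.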

cheers

andrew

From: Andrew Dunstan on


Andrew Dunstan wrote:
>
>
> Andrew Dunstan wrote:
>>
>> I think the plperl glue code should check returned strings using
>> pg_verifymbstr().
>>
>>
>
> Please test this patch. I think we'd probably want to trap the
> encoding error and issue a customised error message, but this plugs
> all the holes I can see with the possible exception of values inserted
> via SPI calls. I'll check that out.
>
>

I think the attached patch plugs the direct SPI holes as well.

One thing that I am pondering is: how does SPI handle things if the
client encoding and server encoding are not the same? Won't the strings
it passes the parser be interpreted in the client encoding? If so, that
doesn't seem right at all, since these strings come from a server side
call and not from the client at all. It looks to me like the call to
pg_parse_query() in spi.c should possibly be surrounded by code to
temporarily set the client encoding to the server encoding and then
restore it afterwards.

cheers

andrew


From: Tom Lane on
Andrew Dunstan <andrew(a)dunslane.net> writes:
> One thing that I am pondering is: how does SPI handle things if the
> client encoding and server encoding are not the same?

What? client_encoding is not used anywhere within the backend.
Everything should be server_encoding.

regards, tom lane


From: Andrew Dunstan on


Tom Lane wrote:
> Andrew Dunstan <andrew(a)dunslane.net> writes:
>
>> One thing that I am pondering is: how does SPI handle things if the
>> client encoding and server encoding are not the same?
>>
>
> What? client_encoding is not used anywhere within the backend.
> Everything should be server_encoding.
>
>
>

Oh, for some reason I thought the translation was done in the scanner.
Sorry for the noise.

cheers

andrew
