From: Andrew Dunstan on


I wrote:
>
> I think the attached patch plugs the direct SPI holes as well.

There are two issues with this patch. First, how far, if at all, should it
be backpatched? All the way, or to 8.3, where we tightened the encoding
rules, or not at all?

Second, it produces errors like this:

andrew=# select 'a' || invalid_utf_seq() || 'b';
ERROR: invalid byte sequence for encoding "UTF8": 0xd0
HINT: This error can also happen if the byte sequence does not
match the encoding expected by the server, which is controlled by
"client_encoding".
CONTEXT: PL/Perl function "invalid_utf_seq"
andrew=#


That hint seems rather misleading. I'm not sure what we can do about it
though. If we set the noError param on pg_verifymbstr() we would miss
the error message that actually identified the bad data, so that doesn't
seem like a good plan.
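
For reference, the byte reported in the error above, 0xd0, is the lead byte of a two-byte UTF-8 sequence, so on its own (or followed by a non-continuation byte) it is invalid. Below is a minimal sketch of that structural check. It is not PostgreSQL's verifier: the helper name first_bad_utf8_byte is hypothetical, and the check deliberately ignores overlong forms and surrogate ranges.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns the offset of the first byte that starts a structurally
 * invalid UTF-8 sequence, or -1 if every sequence has a plausible
 * lead byte followed by the right number of continuation bytes.
 * Overlong encodings and surrogates are NOT rejected here.
 */
static long
first_bad_utf8_byte(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL || len == 0);

    while (i < len)
    {
        unsigned char c = s[i];
        size_t need;            /* continuation bytes required after c */

        if (c < 0x80)
            need = 0;           /* plain ASCII */
        else if ((c & 0xE0) == 0xC0)
            need = 1;           /* 110xxxxx: 0xd0 falls in this class */
        else if ((c & 0xF0) == 0xE0)
            need = 2;           /* 1110xxxx */
        else if ((c & 0xF8) == 0xF0)
            need = 3;           /* 11110xxx */
        else
            return (long) i;    /* stray continuation or invalid lead byte */

        if (need > len - i - 1)
            return (long) i;    /* sequence truncated by end of string */

        for (size_t k = 1; k <= need; k++)
        {
            if ((s[i + k] & 0xC0) != 0x80)
                return (long) i;    /* expected a 10xxxxxx byte */
        }

        i += need + 1;
    }

    return -1;
}
```

With this sketch, a string consisting of the single byte 0xd0 is reported as bad at offset 0, whereas 0xd0 followed by 0x90 is accepted.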

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: "David E. Wheeler" on
On Jan 3, 2010, at 11:54 AM, Andrew Dunstan wrote:

> There are two issues with this patch. First, how far if at all should it be backpatched? All the way, or 8.3, where we tightened the encoding rules, or not at all?

8.3 seems reasonable.

> Second, it produces errors like this:
>
> andrew=# select 'a' || invalid_utf_seq() || 'b';
> ERROR: invalid byte sequence for encoding "UTF8": 0xd0
> HINT: This error can also happen if the byte sequence does not
> match the encoding expected by the server, which is controlled by
> "client_encoding".
> CONTEXT: PL/Perl function "invalid_utf_seq"
> andrew=#
>
>
> That hint seems rather misleading. I'm not sure what we can do about it though. If we set the noError param on pg_verifymbstr() we would miss the error message that actually identified the bad data, so that doesn't seem like a good plan.

I'm sure I'm just revealing my ignorance here, but how is the hint misleading?

Best,

David



From: Andrew Dunstan on


David E. Wheeler wrote:
>> Second, it produces errors like this:
>>
>> andrew=# select 'a' || invalid_utf_seq() || 'b';
>> ERROR: invalid byte sequence for encoding "UTF8": 0xd0
>> HINT: This error can also happen if the byte sequence does not
>> match the encoding expected by the server, which is controlled by
>> "client_encoding".
>> CONTEXT: PL/Perl function "invalid_utf_seq"
>> andrew=#
>>
>>
>> That hint seems rather misleading. I'm not sure what we can do about it though. If we set the noError param on pg_verifymbstr() we would miss the error message that actually identified the bad data, so that doesn't seem like a good plan.
>>
>
> I'm sure I'm just revealing my ignorance here, but how is the hint misleading?
>
>
>

The string that causes the trouble does not come from the client and has
nothing to do with client_encoding.

cheers

andrew



From: Tom Lane on
Andrew Dunstan <andrew(a)dunslane.net> writes:
> andrew=# select 'a' || invalid_utf_seq() || 'b';
> ERROR: invalid byte sequence for encoding "UTF8": 0xd0
> HINT: This error can also happen if the byte sequence does not
> match the encoding expected by the server, which is controlled by
> "client_encoding".
> CONTEXT: PL/Perl function "invalid_utf_seq"

> That hint seems rather misleading. I'm not sure what we can do about it
> though. If we set the noError param on pg_verifymbstr() we would miss
> the error message that actually identified the bad data, so that doesn't
> seem like a good plan.

Yeah, we want the detailed error info. The problem is that the hint is
targeted to the case where we are checking data coming from the client.
We could add another parameter to pg_verifymbstr to indicate the
context, perhaps. I'm not sure how to do it exactly --- just a bool
that suppresses the hint, or do we want to make a provision for some
other hint or detail message?
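
To make the two options concrete, here is a rough standalone sketch, not a patch. Every name in it (VerifySource, verify_hint_for, format_encoding_error) is hypothetical, and it does not use the real pg_verifymbstr or ereport machinery. The idea is that the caller states where the data came from, the error line with the offending byte is always produced, and the hint is chosen (or suppressed) per source. A plain bool would correspond to an enum with just these two values; the enum leaves room for other hints later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: where the string being verified came from. */
typedef enum VerifySource
{
    VERIFY_SRC_CLIENT,          /* data received from the client */
    VERIFY_SRC_INTERNAL         /* data produced inside the server,
                                 * e.g. returned by a PL function */
} VerifySource;

/* Return the hint text for a given source, or NULL for "no hint". */
static const char *
verify_hint_for(VerifySource src)
{
    switch (src)
    {
        case VERIFY_SRC_CLIENT:
            return "This error can also happen if the byte sequence does "
                   "not match the encoding expected by the server, which "
                   "is controlled by \"client_encoding\".";
        case VERIFY_SRC_INTERNAL:
            return NULL;        /* client_encoding is irrelevant here */
    }
    return NULL;
}

/*
 * Format the report into buf.  The error line identifying the bad byte
 * is always emitted; the hint is added only when one applies.
 * Returns the value of snprintf().
 */
static int
format_encoding_error(char *buf, size_t buflen, const char *encoding,
                      unsigned char bad_byte, VerifySource src)
{
    const char *hint = verify_hint_for(src);

    assert(buf != NULL && encoding != NULL);

    if (hint != NULL)
        return snprintf(buf, buflen,
                        "ERROR: invalid byte sequence for encoding \"%s\": 0x%02x\n"
                        "HINT: %s\n",
                        encoding, bad_byte, hint);

    return snprintf(buf, buflen,
                    "ERROR: invalid byte sequence for encoding \"%s\": 0x%02x\n",
                    encoding, bad_byte);
}
```

Under these assumptions, a PL/Perl call site would pass VERIFY_SRC_INTERNAL and keep the detailed error while dropping the client_encoding hint.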

regards, tom lane


From: Tom Lane on
Andrew Dunstan <andrew(a)dunslane.net> writes:
> There are two issues with this patch. First, how far if at all should it
> be backpatched? All the way, or 8.3, where we tightened the encoding
> rules, or not at all?

Forgot to mention --- I'm not in favor of backpatching. First because
tightening encoding verification has been a process over multiple
releases; it's not a bug fix in the normal sense of the word, and might
break things that people had been doing without trouble. Second because
I think we'll have to change pg_verifymbstr's API, and that's not
something to back-patch if we can avoid it.

regards, tom lane
