From: Tom Lane on
Andrew Dunstan <andrew(a)dunslane.net> writes:
> Roger Leigh wrote:
>> Here we just force the locale to C. This does have the disadvantage
>> that --no-locale is made redundant, and any tests which are dependent
>> upon locale (if any?) will be run in the C locale.

> That is not a solution.

Right. I think you may have missed the point of what Peter was saying:
it's okay to force the locale to C on the *client* side of the tests.
The trick is to not change the environment that the temp server is
started with. This will take some redesign inside pg_regress, I think.
Probably initialize_environment needs to be split into two functions,
one that's run before firing off the temp server and one that runs
after.
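
A minimal sketch of the split described above, written in Python for brevity rather than in pg_regress's own C. The names `server_environment`, `client_environment`, and `LOCALE_VARS` are hypothetical and are not pg_regress's actual API. The only point illustrated is that the temp server keeps the caller's locale settings while client programs get the C locale.

```python
import os

# Locale-related variables that can influence client-side output formatting.
LOCALE_VARS = ("LANG", "LANGUAGE", "LC_ALL", "LC_COLLATE", "LC_CTYPE",
               "LC_MESSAGES", "LC_MONETARY", "LC_NUMERIC", "LC_TIME")


def server_environment(base=None):
    """Environment for the temp server: a copy of the caller's settings, unchanged."""
    return dict(os.environ if base is None else base)


def client_environment(base=None):
    """Environment for client programs: the caller's settings with the locale forced to C."""
    env = dict(os.environ if base is None else base)
    for name in LOCALE_VARS:
        env.pop(name, None)   # drop any inherited locale setting
    env["LC_ALL"] = "C"       # force the C locale on the client side only
    return env
```

Each dictionary could then be passed as the `env` argument of `subprocess.Popen` when starting the server and the client programs respectively.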

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Roger Leigh on
On Tue, Sep 29, 2009 at 04:32:49PM -0400, Tom Lane wrote:
> Roger Leigh <rleigh(a)codelibre.net> writes:
> >> C locale means POSIX behavior and nothing but.
>
> > Indeed it does. However, making LC_CTYPE be UTF-8 rather than
> > ASCII is both possible and still strictly conforming to the
> > letter of the standard. There would be some collation and
> > other restrictions ("digit" and other character classes would
> > be constrained to the ASCII characters compared with other UTF-8
> > locales). However, any existing programs using ASCII would continue
> > to function without any changes to their behaviour. The only
> > observable change will be that nl_langinfo(CODESET) will return
> > UTF-8, and it will be valid for programs to use UTF-8 encoded
> > text in formatted print functions, etc.
>
> I really, really don't believe that that meets either the letter or
> the spirit of the C standard, at least not if you are intending to
> silently substitute LC_CTYPE=UTF8 when the program has specified
> C/POSIX locale. (If this is just a matter of what the default
> LANG environment is, of course you can do anything.)

We have spent some time reading the relevant standards documents
(C, POSIX, SUSv2, SUSv3) and haven't found anything yet that would
preclude this. While they all specify minimal requirements for
what the C locale character set must provide (and POSIX/SUS are the
most strict, specifying ASCII outright for each 0-127 codepoint),
these are the minimal requirements for the locale, and
implementation-specific extensions to ASCII are allowed, which would
therefore permit UTF-8. Note that LC_CTYPE=C is not required to
specify ASCII in any standard (though POSIX/SUS require that it
must contain ASCII as a subset of the whole set).
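
One way to see what a given system's C locale reports is to query the codeset directly. The sketch below uses Python's `locale` module, which wraps the C library's `setlocale` and `nl_langinfo` on POSIX systems (it is not available on Windows). The helper name `c_locale_codeset` is hypothetical. The result is implementation-specific, so no particular output is assumed here.

```python
import locale


def c_locale_codeset():
    """Return the codeset name the C library reports when LC_CTYPE is "C"."""
    saved = locale.setlocale(locale.LC_CTYPE)      # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, "C")
        return locale.nl_langinfo(locale.CODESET)
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)   # restore the previous setting


print(c_locale_codeset())
```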

The language in SUSv2 explicitly states that this is
allowed. I've also seen documentation indicating that some UNIX systems,
such as HP-UX, already offer a UTF-8 C locale as an option.


Regards,
Roger

--
.''`. Roger Leigh
: :' : Debian GNU/Linux http://people.debian.org/~rleigh/
`. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/
`- GPG Public Key: 0x25BFB848 Please GPG sign your mail.


From: Andrew Dunstan on


Andrew Dunstan wrote:
>
>
> Roger Leigh wrote:
>> Here we just force the locale to C. This does have the disadvantage
>> that --no-locale is made redundant, and any tests which are dependent
>> upon locale (if any?) will be run in the C locale.
>
> That is not a solution. Not that long ago we went to some lengths
> to provide for buildfarm testing in various locales. We're not going
> to undo that.

Thinking about this some more, ISTM a much better way of approaching it
would be to provide a flag for psql to turn off the fancy formatting,
and have pg_regress use that flag.
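
A rough sketch of the idea, assuming a table formatter with a switch between Unicode box-drawing borders and plain ASCII ones. The function `render_table` and its `fancy` parameter are hypothetical illustrations and do not reflect psql's actual code or options. A test harness would call it with the fancy output turned off so that expected output is the same in every locale.

```python
def render_table(header, rows, fancy=True):
    """Format a table; `fancy` selects Unicode box-drawing borders, else ASCII."""
    if fancy:
        vert, horiz, cross = "\u2502", "\u2500", "\u253c"
    else:
        vert, horiz, cross = "|", "-", "+"
    cells = [header] + rows
    # Column width is the longest cell in that column, header included.
    widths = [max(len(str(r[i])) for r in cells) for i in range(len(header))]

    def line(row):
        padded = (str(c).ljust(w) for c, w in zip(row, widths))
        return (" " + vert + " ").join(padded).rstrip()

    rule = (horiz + cross + horiz).join(horiz * w for w in widths)
    return "\n".join([line(header), rule] + [line(r) for r in rows])


print(render_table(["id", "name"], [[1, "a"], [22, "bc"]], fancy=False))
```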

cheers

andrew


From: Tom Lane on
Roger Leigh <rleigh(a)codelibre.net> writes:
> The language in SUSv2 explicitly states that this is
> allowed. I've also seen documentation indicating that some UNIX systems,
> such as HP-UX, already offer a UTF-8 C locale as an option.

I don't argue with the concept of a C.UTF8 locale --- in fact I think
it sounds pretty useful. What I think is 100% broken is trying to make
C locale work that way. C locale is supposed to be the traditional
locale-ignorant, characters-are-bytes behavior.
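
To illustrate the distinction (this example is not from the thread): under a characters-are-bytes view, a multibyte UTF-8 sequence counts as several units, whereas a UTF-8-aware view counts it as one character. The variable names are arbitrary.

```python
text = "na\u00efve"                 # 5 characters; U+00EF is non-ASCII
encoded = text.encode("utf-8")      # U+00EF becomes the two bytes 0xC3 0xAF

byte_count = len(encoded)                   # the bytes-only view
char_count = len(encoded.decode("utf-8"))   # the UTF-8-aware view

print(byte_count, char_count)
```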

regards, tom lane


From: Tom Lane on
Andrew Dunstan <andrew(a)dunslane.net> writes:
> Thinking about this some more, ISTM a much better way of approaching it
> would be to provide a flag for psql to turn off the fancy formatting,
> and have pg_regress use that flag.

Yeah, that's not a bad idea. There are likely to be other client
programs that won't want this behavioral change either.

regards, tom lane
