From: Heikki Linnakangas on 27 Apr 2010 07:06

Takahiro Itagaki wrote:
> I heard pg_get_encoding_from_locale() failed in kor locale.
>
>     WARNING:  could not determine encoding for locale "kor": codeset is "CP949"
>
> I found the following description in the web:
>     CP949 is EUC-KR, extended with UHC (Unified Hangul Code).
>     http://www.opensource.apple.com/source/libiconv/libiconv-13.2/libiconv/lib/cp949.h
>
> but we define CP51949 for EUC-KR in chklocale.c.
>     {PG_EUC_KR, "CP51949"},  /* or 20949 ? */
>
> Which is the compatible codeset with our PG_EUC_KR encoding?
> 949, 51949, or 20949?

A bit of googling suggests that 51949 is indeed the Windows codepage
that's equivalent with EUC-KR.

> Should we add (or replace) CP949 for EUC-KR?

No. CP949 is not plain EUC-KR, but EUC-KR with some extensions (UHC). At
least on CVS HEAD, we recognize CP949 as an alias for the PostgreSQL
PG_UHC encoding. There's a significant difference between the two,
because PG_EUC_KR is supported as a server-encoding while PG_UHC is not.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Takahiro Itagaki on 27 Apr 2010 07:23

Heikki Linnakangas <heikki.linnakangas(a)enterprisedb.com> wrote:

> > Should we add (or replace) CP949 for EUC-KR?
>
> No. CP949 is not plain EUC-KR, but EUC-KR with some extensions (UHC). At
> least on CVS HEAD, we recognize CP949 as an alias for the PostgreSQL
> PG_UHC encoding.

That's it! We should have added an additional alias to chklocale, too.

Index: src/port/chklocale.c
===================================================================
--- src/port/chklocale.c	(HEAD)
+++ src/port/chklocale.c	(fixed)
@@ -172,6 +172,7 @@
 	{PG_GBK, "CP936"},
 
 	{PG_UHC, "UHC"},
+	{PG_UHC, "CP949"},
 
 	{PG_JOHAB, "JOHAB"},
 	{PG_JOHAB, "CP1361"},

Except UHC, we don't have any codepage aliases for the encodings below.
I assume we don't need to add CPxxx because Windows does not have
corresponding codepages for them, right?

	{PG_LATIN6, "ISO-8859-10"},
	{PG_LATIN7, "ISO-8859-13"},
	{PG_LATIN8, "ISO-8859-14"},
	{PG_LATIN10, "ISO-8859-16"},
	{PG_SHIFT_JIS_2004, "SJIS_2004"},

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

From: Heikki Linnakangas on 27 Apr 2010 15:40

Takahiro Itagaki wrote:
> That's it! We should have added an additional alias to chklocale, too.
>
> Index: src/port/chklocale.c
> ===================================================================
> --- src/port/chklocale.c	(HEAD)
> +++ src/port/chklocale.c	(fixed)
> @@ -172,6 +172,7 @@
>  	{PG_GBK, "CP936"},
>
>  	{PG_UHC, "UHC"},
> +	{PG_UHC, "CP949"},
>
>  	{PG_JOHAB, "JOHAB"},
>  	{PG_JOHAB, "CP1361"},

Yeah, seems correct.

> Except UHC, we don't have any codepage aliases for the encodings below.
> I assume we don't need to add CPxxx because Windows does not have
> corresponding codepages for them, right?
>
> 	{PG_LATIN6, "ISO-8859-10"},
> 	{PG_LATIN7, "ISO-8859-13"},
> 	{PG_LATIN8, "ISO-8859-14"},
> 	{PG_LATIN10, "ISO-8859-16"},
> 	{PG_SHIFT_JIS_2004, "SJIS_2004"},

Yeah, I guess so. I can't find Windows codepages for these either, by
google.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
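
[Editor's sketch] To make the effect of the proposed alias concrete, here is a minimal, self-contained C sketch of a codeset-to-encoding lookup table. It is not the actual chklocale.c code: the `demo_*` names, the enum, and the case-insensitive comparison are assumptions made for illustration, and only table entries quoted in this thread are included. It shows why "CP949" is reported as undeterminable until an entry for it exists, while "CP51949" resolves to EUC-KR.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PostgreSQL encoding identifiers. */
typedef enum
{
	DEMO_ENC_UNKNOWN = -1,
	DEMO_EUC_KR,
	DEMO_GBK,
	DEMO_UHC,
	DEMO_JOHAB
} demo_enc;

/* One row per codeset name; several names may map to one encoding. */
struct demo_encoding_match
{
	demo_enc	enc;
	const char *codeset;
};

static const struct demo_encoding_match demo_match_list[] = {
	{DEMO_EUC_KR, "CP51949"},
	{DEMO_GBK, "CP936"},
	{DEMO_UHC, "UHC"},
	{DEMO_UHC, "CP949"},		/* the alias proposed in this thread */
	{DEMO_JOHAB, "JOHAB"},
	{DEMO_JOHAB, "CP1361"},
	{DEMO_ENC_UNKNOWN, NULL}	/* end-of-table sentinel */
};

/* Case-insensitive string equality (assumed matching rule for this demo). */
static int
demo_equal_nocase(const char *a, const char *b)
{
	while (*a && *b)
	{
		if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
			return 0;
		a++;
		b++;
	}
	return *a == *b;			/* equal only if both strings ended */
}

/* Linear scan of the table; returns DEMO_ENC_UNKNOWN when nothing matches. */
static demo_enc
demo_encoding_from_codeset(const char *codeset)
{
	int			i;

	for (i = 0; demo_match_list[i].codeset != NULL; i++)
	{
		if (demo_equal_nocase(demo_match_list[i].codeset, codeset))
			return demo_match_list[i].enc;
	}
	return DEMO_ENC_UNKNOWN;
}
```

With the `{DEMO_UHC, "CP949"}` row present, `demo_encoding_from_codeset("CP949")` yields `DEMO_UHC`; removing that row makes the same call fall through to `DEMO_ENC_UNKNOWN`, which corresponds to the situation behind the WARNING quoted at the start of the thread.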

From: "Ioseph Kim" on 4 May 2010 01:09

Hi, I'm Korean.

CP51949 is indeed EUC-KR, so the code as defined is correct.

But in Korea, EUC-KR is not sufficient to represent all Korean characters.
In recent years, many people in Korea have been using CP949.
The MS Windows codepage is also CP949.

----- Original Message -----
From: "Takahiro Itagaki" <itagaki.takahiro(a)oss.ntt.co.jp>
To: <pgsql-hackers(a)postgresql.org>
Sent: Tuesday, April 27, 2010 7:27 PM
Subject: [HACKERS] CP949 for EUC-KR?

> I heard pg_get_encoding_from_locale() failed in kor locale.
>
>     WARNING:  could not determine encoding for locale "kor": codeset is "CP949"
>
> I found the following description in the web:
>     CP949 is EUC-KR, extended with UHC (Unified Hangul Code).
>     http://www.opensource.apple.com/source/libiconv/libiconv-13.2/libiconv/lib/cp949.h
>
> but we define CP51949 for EUC-KR in chklocale.c.
>     {PG_EUC_KR, "CP51949"},  /* or 20949 ? */
>
> Which is the compatible codeset with our PG_EUC_KR encoding?
> 949, 51949, or 20949? Should we add (or replace) CP949 for EUC-KR?
>
> Regards,
> ---
> Takahiro Itagaki
> NTT Open Source Software Center

From: Takahiro Itagaki on 5 May 2010 22:14

"Ioseph Kim" <pgsql-kr(a)postgresql.kr> wrote:

> CP51949 is EUC-KR correct.
> > {PG_EUC_KR, "CP51949"},  /* or 20949 ? */

Thank you for the information. I removed "or 20949 ?" from the line.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center