Prev: Oracle Spatial SDO Aggregate Union fails with ORA-29913 ODCIAGGREGATEMERGE
Next: java.sql.SQLException: Data size bigger than max size for this type: 3991
From: Álvaro G. Vicario on 23 Jul 2008 08:29 I'm setting up Oracle libraries in a Windows Server box so two PHP web sites can access a remote Oracle 10g database. I've downloaded the Instant Client *.zip package and everything works fine so far, except for some charset issues (Spanish letters do not show properly). Setting NLS_LANG environmental variable seems to be the clue but I don't know which value I should use. Given that data is in Spanish I have three likely candidates: SPANISH_SPAIN.WE8MSWIN1252 (the Windows codepage) SPANISH_SPAIN.WE8ISO8859P1 (ISO-Latin-1, used in first site) SPANISH_SPAIN.WE8ISO8859P15 (ISO-Latin-9, used in second site) From the "Oracle Database Globalization Support Guide" I understand that NLS_LANG makes Oracle perform a charset conversion. Since both sites use different charsets and I can convert in my PHP app when necessary, I think it'd be a good idea that Oracle does not make any conversion. But I don't know how to find out the charset that database tables are using natively. How could I find out? Does all this make any sense? -- -- http://alvaro.es - �lvaro G. Vicario - Burgos, Spain -- Mi sitio sobre programaci�n web: http://bits.demogracia.com -- Mi web de humor al ba�o Mar�a: http://www.demogracia.com --
From: sybrandb on 23 Jul 2008 11:11 On Wed, 23 Jul 2008 14:29:20 +0200, "�lvaro G. Vicario" <alvaroNOSPAMTHANKS(a)demogracia.com> wrote: >I'm setting up Oracle libraries in a Windows Server box so two PHP web >sites can access a remote Oracle 10g database. I've downloaded the >Instant Client *.zip package and everything works fine so far, except >for some charset issues (Spanish letters do not show properly). Setting >NLS_LANG environmental variable seems to be the clue but I don't know >which value I should use. Given that data is in Spanish I have three >likely candidates: > >SPANISH_SPAIN.WE8MSWIN1252 (the Windows codepage) >SPANISH_SPAIN.WE8ISO8859P1 (ISO-Latin-1, used in first site) >SPANISH_SPAIN.WE8ISO8859P15 (ISO-Latin-9, used in second site) > > From the "Oracle Database Globalization Support Guide" I understand >that NLS_LANG makes Oracle perform a charset conversion. Since both >sites use different charsets and I can convert in my PHP app when >necessary, I think it'd be a good idea that Oracle does not make any >conversion. But I don't know how to find out the charset that database >tables are using natively. How could I find out? Does all this make any >sense? > > >-- >-- http://alvaro.es - �lvaro G. Vicario - Burgos, Spain >-- Mi sitio sobre programaci�n web: http://bits.demogracia.com >-- Mi web de humor al ba�o Mar�a: http://www.demogracia.com You could find out by querying NLS_DATABASE_PARAMETERS. However, for Windows the default is MSWIN1252. WE8ISO8859P1 does not contain the euro sign. -- Sybrand Bakker Senior Oracle DBA
From: Álvaro G. Vicario on 24 Jul 2008 02:41 sybrandb(a)hccnet.nl escribi�: >> SPANISH_SPAIN.WE8MSWIN1252 (the Windows codepage) >> SPANISH_SPAIN.WE8ISO8859P1 (ISO-Latin-1, used in first site) >> SPANISH_SPAIN.WE8ISO8859P15 (ISO-Latin-9, used in second site) >> >> From the "Oracle Database Globalization Support Guide" I understand >> that NLS_LANG makes Oracle perform a charset conversion. Since both >> sites use different charsets and I can convert in my PHP app when >> necessary, I think it'd be a good idea that Oracle does not make any >> conversion. But I don't know how to find out the charset that database >> tables are using natively. How could I find out? Does all this make any >> sense? > > You could find out by querying NLS_DATABASE_PARAMETERS. A query to that table shows that NLS_CHARACTERSET is WE8ISO8859P1 so I'll work on that base. > However, for Windows the default is MSWIN1252. > WE8ISO8859P1 does not contain the euro sign. You are right (I've seen apps that use the currency symbol instead). Luckily it's not really an issue in web sites since you can always use the € entity--I just need to know what charset the data is actually using. Thank you very much. -- -- http://alvaro.es - �lvaro G. Vicario - Burgos, Spain -- Mi sitio sobre programaci�n web: http://bits.demogracia.com -- Mi web de humor al ba�o Mar�a: http://www.demogracia.com --
From: Laurenz Albe on 24 Jul 2008 05:14 "�lvaro G. Vicario" <alvaroNOSPAMTHANKS(a)demogracia.com> wrote: >>> SPANISH_SPAIN.WE8MSWIN1252 (the Windows codepage) >>> SPANISH_SPAIN.WE8ISO8859P1 (ISO-Latin-1, used in first site) >>> SPANISH_SPAIN.WE8ISO8859P15 (ISO-Latin-9, used in second site) >>> >>> From the "Oracle Database Globalization Support Guide" I understand >>> that NLS_LANG makes Oracle perform a charset conversion. Since both >>> sites use different charsets and I can convert in my PHP app when >>> necessary, I think it'd be a good idea that Oracle does not make any >>> conversion. But I don't know how to find out the charset that database >>> tables are using natively. How could I find out? Does all this make any >>> sense? >> >> You could find out by querying NLS_DATABASE_PARAMETERS. > > A query to that table shows that NLS_CHARACTERSET is WE8ISO8859P1 so > I'll work on that base. > >> However, for Windows the default is MSWIN1252. >> WE8ISO8859P1 does not contain the euro sign. > > You are right (I've seen apps that use the currency symbol instead). > Luckily it's not really an issue in web sites since you can always use > the € entity--I just need to know what charset the data is actually > using. If you really plan to use WE8ISO8859P1 on the Windows client, please write your application so that it makes dead sure that only LATIN-1 characters are entered into the database. Oracle has a special "feature" in that it will not check your input for validity if client and server character sets are the same. This means that your client application can enter arbitrary junk into the database that will be happily stored as is. Let me give you an example. If the PHP application decides to store an Euro sign in the database and it happens to run in the Windows codepage, it will try to store a byte 0x80. Oracle will happily store that byte, even though this byte does not make any sense in LATIN-1. You will not notice that until you try to access the database with a different client character set, e.g. from Java. Then Oracle will return garbage instead of the Euro signs, and there is no way to fix that. But even if you plan to use a different character set on the Windows client (which I personally would do, namely WE8MSWIN1252), you will not get error messages if invalid characters are stored. Oracle will check the characters, but if it detects - say - a Euro sign, it will not throw an error but clandestinely convert the character to a question mark. The best solution in my opinion would be to recreate the database using the character set AL32UTF8 and on the client either use WE8MSWIN1252 or AL32UTF8 (if your PHP code speaks UTF-8). Yours, Laurenz Albe
From: Álvaro G. Vicario on 24 Jul 2008 08:10
Laurenz Albe escribi�: > If you really plan to use WE8ISO8859P1 on the Windows client, please write > your application so that it makes dead sure that only LATIN-1 characters > are entered into the database. > > Oracle has a special "feature" in that it will not check your input for > validity if client and server character sets are the same. > This means that your client application can enter arbitrary junk into the > database that will be happily stored as is. Oh my! That's scary... > But even if you plan to use a different character set on the Windows client > (which I personally would do, namely WE8MSWIN1252), you will not get error > messages if invalid characters are stored. Oracle will check the characters, > but if it detects - say - a Euro sign, it will not throw an error but > clandestinely convert the character to a question mark. > > > The best solution in my opinion would be to recreate the database using the > character set AL32UTF8 and on the client either use WE8MSWIN1252 or > AL32UTF8 (if your PHP code speaks UTF-8). Well, the database server is quite out of my reach. The machine belongs to a customer, the main database has been growing for years and there're many other apps working on top of it. I couldn't even get any information about charsets from their DB admin. Is there any way to get an unmodified output from a certain table field that I know contains (or it's supposed to contain) an Euro symbol so I can check with a binary editor what numeric code it's actually using? -- -- http://alvaro.es - �lvaro G. Vicario - Burgos, Spain -- Mi sitio sobre programaci�n web: http://bits.demogracia.com -- Mi web de humor al ba�o Mar�a: http://www.demogracia.com -- |