From: Walt on 27 May 2010 14:49 Hi, We're running Oracle 10g on Windows using ISO 8859 as the character set. We're exploring the idea of converting to unicode (UTF8) along with the upcoming upgrade to v11. I haven't found a good reference for how to best accomplish the conversion and what pitfalls to watch out for. Any suggestions? One thing I expect to be a problem is that we have about 200 columns that are defined as Varchar2(4000). My understanding is that this limit is 4000 *bytes* so some of our data may not "fit" if it contains enough characters with ascii values above 128. Anyone been through this before who'd like to offer sage advice? Thanks. //Walt
From: John Hurley on 27 May 2010 16:40 Walt: > We're running Oracle 10g on Windows using ISO 8859 as the character set. > We're exploring the idea of converting to unicode (UTF8) along with the > upcoming upgrade to v11. > > I haven't found a good reference for how to best accomplish the > conversion and what pitfalls to watch out for. Any suggestions? > > One thing I expect to be a problem is that we have about 200 columns > that are defined as Varchar2(4000). My understanding is that this limit > is 4000 *bytes* so some of our data may not "fit" if it contains enough > characters with ascii values above 128. > > Anyone been through this before who'd like to offer sage advice? Hard to believe you have not turned up relevant advice. This kind of question has been asked and answered a whole bunch of times on the internet. Yes even on cdos a bunch of times. I have not tried searching cdos with the google groups interface for a while ... it used to work pretty well. The Oracle documentation is pretty good here. Have you looked at it? Other places to check are the Oracle forums for the database area or asktom ...
From: Arne Ortlinghaus on 28 May 2010 04:57 Hi Walt, yes, every field with too much characters (more than 4000 bytes UTF8 code) can not be converted without data loss. It could be more secure to add new Unicode columns if you have already database with data in it and then add conversion programs. Arne Ortlinghaus ACS Data Systems "Walt" <walt_askier(a)SHOESyahoo.com> schrieb im Newsbeitrag news:G8zLn.326559$Up1.229692(a)en-nntp-09.dc1.easynews.com... > Hi, > > We're running Oracle 10g on Windows using ISO 8859 as the character set. > We're exploring the idea of converting to unicode (UTF8) along with the > upcoming upgrade to v11. > > I haven't found a good reference for how to best accomplish the conversion > and what pitfalls to watch out for. Any suggestions? > > One thing I expect to be a problem is that we have about 200 columns that > are defined as Varchar2(4000). My understanding is that this limit is > 4000 *bytes* so some of our data may not "fit" if it contains enough > characters with ascii values above 128. > > Anyone been through this before who'd like to offer sage advice? > > Thanks. > > //Walt
From: Lothar =?utf-8?Q?Armbr=C3=BCster?= on 28 May 2010 10:38 "Arne Ortlinghaus" <Arne.Ortlinghaus(a)acs.it> writes: > Hi Walt, > > yes, every field with too much characters (more than 4000 bytes UTF8 code) can > not be converted without data loss. It could be more secure to add new Unicode > columns if you have already database with data in it and then add conversion > programs. > > Arne Ortlinghaus > ACS Data Systems > Its not only 4000 bytes that impose a limit. I recently tried to import an export file from a database using ISO 8859 into a database uning UTF-8. I had some columns of type VARCHAR2(3) which meant VARCHAR2(3 BYTE). The columns in the new database were also VARCHAR2(3 BYTES) and some data contained german umlauts which are stored in two bytes in UTF-8. So these rows were not imported. IIRC there is a syntax like VARCHAR2(3 CHAR) when defining columns. That should allow 3 UTF-8 characters to be stored in the column. I think it's a good idea to checks this and possibly convert columns to allow a maximum number of characters rather than bytes before doing export/import involving multibyte charactersets. > "Walt" <walt_askier(a)SHOESyahoo.com> schrieb im Newsbeitrag > news:G8zLn.326559$Up1.229692(a)en-nntp-09.dc1.easynews.com... >> Hi, >> >> We're running Oracle 10g on Windows using ISO 8859 as the character >> set. We're exploring the idea of converting to unicode (UTF8) along with the >> upcoming upgrade to v11. >> >> I haven't found a good reference for how to best accomplish the conversion >> and what pitfalls to watch out for. Any suggestions? >> >> One thing I expect to be a problem is that we have about 200 columns that >> are defined as Varchar2(4000). My understanding is that this limit is 4000 >> *bytes* so some of our data may not "fit" if it contains enough characters >> with ascii values above 128. >> >> Anyone been through this before who'd like to offer sage advice? >> >> Thanks. >> >> //Walt > HTH, Lothar -- Lothar Armbrüster | lothar.armbruester(a)t-online.de Hauptstr. 26 | 65346 Eltville |
From: Serge Rielau on 29 May 2010 09:17 VARCHAR2(4000) just means that whoever defined the DB schema was too lazy to investigate the domain. If run a few MAX(LENGTH(..) that will giv eyou a pretty good idea as to what the domain really is. In all my engagements I have yet to come across a situation where there actually was a requirement for 4000 bytes. Chances are the vast majority of these are monsters under the bed. So grab a flashlight. :-) Cheers Serge -- Serge Rielau SQL Architect DB2 for LUW IBM Toronto Lab
|
Next
|
Last
Pages: 1 2 Prev: Top 10 Technical requirements for In-Memory Reporting Next: Understanding this query |