From: Richard Schulman on 8 Sep 2006 18:01

Sorry to be back at the goodly well so soon, but...

...when I execute the following -- variable mean_eng_txt being
utf-16LE and its datatype nvarchar2(79) in Oracle:

cursor.execute("""INSERT INTO mean (mean_id,mean_eng_txt)
VALUES (:id,:mean)""",id=id,mean=mean)

I not surprisingly get this error message:

"cx_Oracle.NotSupportedError: Variable_TypeByValue(): unhandled data
type unicode"

But when I try putting a codecs.BOM_UTF16_LE in various plausible
places, I just end up generating different errors.

Recommendations, please?

TIA,
Richard Schulman
(Remove xx for email reply)
From: Diez B. Roggisch on 9 Sep 2006 05:38

Richard Schulman schrieb:
> Sorry to be back at the goodly well so soon, but...
>
> ...when I execute the following -- variable mean_eng_txt being
> utf-16LE and its datatype nvarchar2(79) in Oracle:
>
> cursor.execute("""INSERT INTO mean (mean_id,mean_eng_txt)
> VALUES (:id,:mean)""",id=id,mean=mean)
>
> I not surprisingly get this error message:
>
> "cx_Oracle.NotSupportedError: Variable_TypeByValue(): unhandled data
> type unicode"
>
> But when I try putting a codecs.BOM_UTF16_LE in various plausible
> places, I just end up generating different errors.

Show us the alleged plausible places, and the different errors.
Otherwise it's crystal ball time again.

Diez
From: Richard Schulman on 9 Sep 2006 23:17

>> cursor.execute("""INSERT INTO mean (mean_id,mean_eng_txt)
>> VALUES (:id,:mean)""",id=id,mean=mean)
>> ...
>> "cx_Oracle.NotSupportedError: Variable_TypeByValue(): unhandled data
>> type unicode"
>>
>> But when I try putting a codecs.BOM_UTF16_LE in various plausible
>> places, I just end up generating different errors.

Diez:
> Show us the alleged plausible places, and the different errors.
> Otherwise it's crystal ball time again.

More usefully, let's just try to fix the code above. Here's the error
message I get:

NotSupportedError: Variable_TypeByValue(): unhandled data type unicode

Traceback (innermost last):
  File "c:\pythonapps\LoadMeanToOra.py", line 1, in ?
    # LoadMeanToOra reads a UTF-16LE input file one record at a time
  File "c:\pythonapps\LoadMeanToOra.py", line 23, in ?
    cursor.execute("""INSERT INTO mean (mean_id,mean_eng_txt)

What I can't figure out is whether cx_Oracle is saying it can't handle
Unicode for an Oracle nvarchar2 data type or whether it can handle the
input but that it needs to be in a specific format that I'm not
supplying.

- Richard Schulman
From: Diez B. Roggisch on 10 Sep 2006 05:42

Richard Schulman schrieb:
>>> cursor.execute("""INSERT INTO mean (mean_id,mean_eng_txt)
>>> VALUES (:id,:mean)""",id=id,mean=mean)
>>> ...
>>> "cx_Oracle.NotSupportedError: Variable_TypeByValue(): unhandled data
>>> type unicode"
>>>
>>> But when I try putting a codecs.BOM_UTF16_LE in various plausible
>>> places, I just end up generating different errors.
>
> Diez:
>> Show us the alleged plausible places, and the different errors.
>> Otherwise it's crystal ball time again.
>
> More usefully, let's just try to fix the code above. Here's the error
> message I get:
>
> NotSupportedError: Variable_TypeByValue(): unhandled data type unicode
>
> Traceback (innermost last):
>   File "c:\pythonapps\LoadMeanToOra.py", line 1, in ?
>     # LoadMeanToOra reads a UTF-16LE input file one record at a time
>   File "c:\pythonapps\LoadMeanToOra.py", line 23, in ?
>     cursor.execute("""INSERT INTO mean (mean_id,mean_eng_txt)
>
> What I can't figure out is whether cx_Oracle is saying it can't handle
> Unicode for an Oracle nvarchar2 data type or whether it can handle the
> input but that it needs to be in a specific format that I'm not
> supplying.

What does print repr(mean) give you?

It _looks_ to me (I don't have an Oracle available right now) as if it
is a unicode object. You have to consider that an abstract string
representation, which means it has to be encoded in some way before
being sent over the wire. There might exist DB-API bindings that can
deal with unicode objects, by applying a default encoding or by somehow
figuring out what encoding the DB expects. But I don't see any
references to unicode in PEP 249, so I presume you can't rely on that
- which seems to be the case here.

The Oracle NLS is a sometimes tricky beast: when it sets the encoding,
it tries to be clever and assigns an existing connection some encoding
based on the user's/machine's locale. Which can yield unexpected
results, such as "Dusseldorf" instead of "Düsseldorf" when querying a
German city list with an English locale.

So - you have to figure out what encoding your db-connection expects.
You can do so by issuing some queries against the session tables, I
believe - I don't have my Oracle resources at home, but googling will
bring you there; the important Oracle term is NLS. Then you need to
encode the unicode string before passing it - something like this:

mean = mean.encode("latin1")

That should help.

Diez
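A minimal sketch of the two steps Diez describes - asking Oracle which
character sets it uses, then encoding the unicode object before binding
it - assuming Python 2.x, cx_Oracle, and the mean table from this
thread; the connect arguments, the sample value, and the "latin1"
choice are illustrative assumptions, not a confirmed recipe:

import cx_Oracle

connection = cx_Oracle.connect("username", "password")
cursor = connection.cursor()

# Ask Oracle which character sets the database uses (the NLS settings
# Diez mentions); NVARCHAR2 columns use the NCHAR character set.
cursor.execute("""SELECT parameter, value
                  FROM nls_database_parameters
                  WHERE parameter IN ('NLS_CHARACTERSET',
                                      'NLS_NCHAR_CHARACTERSET')""")
for parameter, value in cursor.fetchall():
    print parameter, value

# Encode the unicode object into a byte string the driver will accept,
# using whatever encoding the query above reports (assumed here).
mean = u'sadness, lament; sympathize with, pity'
cursor.execute("""INSERT INTO mean (mean_id, mean_eng_txt)
                  VALUES (:id, :mean)""",
               id=3, mean=mean.encode("latin1"))
connection.commit()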
From: John Machin on 10 Sep 2006 18:27
Richard Schulman wrote:
> On Sun, 10 Sep 2006 11:42:26 +0200, "Diez B. Roggisch"
> <deets(a)nospam.web.de> wrote:
>
>> What does print repr(mean) give you?
>
> That is a useful suggestion.
>
> For context, I reproduce the source code:
>
> in_file = codecs.open("c:\\pythonapps\\mean.my",encoding="utf_16_LE")
> connection = cx_Oracle.connect("username", "password")
> cursor = connection.cursor()
> for row in in_file:
>     id = row[0]
>     mean = row[1]
>     print "Value of row is ", repr(row) #debug line
>     print "Value of the variable 'id' is ", repr(id) #debug line
>     print "Value of the variable 'mean' is ", repr(mean) #debug line
>     cursor.execute("""INSERT INTO mean (mean_id,mean_eng_txt)
>     VALUES (:id,:mean)""",id=id,mean=mean)
>
> Here is the result from the print repr() statements:
>
> Value of row is u"\ufeff(3,'sadness, lament; sympathize with,
> pity')\r\n"
> Value of the variable 'id' is u'\ufeff'
> Value of the variable 'mean' is u'('
>
> Clearly, the values loaded into the 'id' and 'mean' variables are not
> satisfactory but are picking up the BOM.

Well of course they're "unsatisfactory", and this is absolutely nothing
to do with Oracle and cx_Oracle. row is a string of characters. row[0]
is the BOM. Read my lips (from a previous thread):

"""
Use utf_16 -- it will strip off the BOM for you.
"""

and again:

"""
| >>> codecs.open('guff.utf16le', 'r', encoding='utf_16').read()
| u'abc\n\rdef\n\rghi' ######### Look, Mom, no BOM!
"""

row[1] is the first ***character*** of what looks suspiciously like the
Python representation of a tuple:

"""(3,'sadness, lament; sympathize with, pity')"""

Who wrote that like that??? If it is at all under your control, do it
like this:
Encode each Unicode text field in UTF-8.
Write the file as a CSV file using Python's csv module.
Read the CSV file using the same module.
Decode the text fields from UTF-8.

You need to parse the incoming line into column values (the csv module
does this for you) and then convert each column value from
string/Unicode to a Python type that is compatible with the Oracle type
for that column. My guess (not having used cx_Oracle) is that the error
is happening because the column "id" has a numeric type and you are
trying to jam a Unicode string into it. IOW, nothing to do with the
"mean" column (yet!).

BTW, I've managed to decode that "eng" means English not engineering,
and "mean" means meaning, i.e. not average and not stingy. Holy
obfuscation, Batman!

HTH,
John
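A minimal sketch of the pipeline John suggests - UTF-8-encoded fields
written to and read back from a CSV file with Python's csv module, with
the id converted to int before binding - assuming Python 2.x and
cx_Oracle; the file name, connect arguments, and the final encode step
are illustrative, and whether the driver accepts the decoded unicode
object directly or needs it re-encoded depends on the NLS setup
discussed earlier in the thread:

import csv
import cx_Oracle

# Writing side: encode each Unicode text field to a UTF-8 byte string,
# because the 2.x csv module works on byte strings, not unicode.
rows = [(3, u'sadness, lament; sympathize with, pity')]
out = open("c:\\pythonapps\\mean.csv", "wb")
writer = csv.writer(out)
for mean_id, mean in rows:
    writer.writerow([mean_id, mean.encode("utf-8")])
out.close()

# Reading side: let csv split the columns, decode the text field from
# UTF-8, and convert the id so it matches the numeric column type.
in_file = open("c:\\pythonapps\\mean.csv", "rb")
connection = cx_Oracle.connect("username", "password")
cursor = connection.cursor()
for mean_id, mean in csv.reader(in_file):
    mean = mean.decode("utf-8")
    cursor.execute("""INSERT INTO mean (mean_id, mean_eng_txt)
                      VALUES (:id, :mean)""",
                   id=int(mean_id),
                   mean=mean.encode("utf-8"))  # or the DB's charset
connection.commit()
in_file.close()

If the input has to stay UTF-16, John's earlier point still applies:
opening it with codecs.open(..., encoding="utf_16") strips the BOM that
showed up as u'\ufeff' in the repr() output above.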