From: dirknbr on 23 Jul 2010 06:14

I am having some problems with unicode from json.

This is the error I get:

UnicodeEncodeError: 'ascii' codec can't encode character u'\x93' in
position 61: ordinal not in range(128)

I have kind of developed this, but obviously it's not nice. Any better
ideas?

    try:
        text=texts[i]
        text=text.encode('latin-1')
        text=text.encode('utf-8')
    except:
        text=' '

Dirk

From: Steven D'Aprano on 23 Jul 2010 06:42

On Fri, 23 Jul 2010 03:14:11 -0700, dirknbr wrote:

> I am having some problems with unicode from json.
>
> This is the error I get
>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\x93' in
> position 61: ordinal not in range(128)
>
> I have kind of developed this but obviously it's not nice, any better
> ideas?
>
> try:
>     text=texts[i]
>     text=text.encode('latin-1')
>     text=text.encode('utf-8')
> except:
>     text=' '

Don't write bare excepts; always catch the error you want and nothing
else.

As you've written it, the result of encoding with latin-1 is thrown away,
even if it succeeds.

    text = texts[i]  # Don't hide errors here.
    try:
        text = text.encode('latin-1')
    except UnicodeEncodeError:
        try:
            text = text.encode('utf-8')
        except UnicodeEncodeError:
            text = ' '
    do_something_with(text)

Another thing you might consider is setting the error handler:

    text = text.encode('utf-8', errors='ignore')

Other error handlers are 'strict' (the default), 'replace' and
'xmlcharrefreplace'.

--
Steven

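To make the difference between the error handlers concrete, here is a minimal sketch. It assumes Python 3, where str is Unicode text and encode() returns bytes; the sample string is made up for illustration and only borrows the u'\x93' character from the error message. ASCII is used as the target codec because that is the one that fails in the traceback.

```python
# Sketch only: assumes Python 3. The sample text is hypothetical.
sample = u"smart \x93quote"

# UTF-8 can represent this character, so no error handler is needed.
as_utf8 = sample.encode("utf-8")

# ASCII cannot, so the error handler decides what happens.
ignored = sample.encode("ascii", "ignore")              # drops the character
replaced = sample.encode("ascii", "replace")            # substitutes '?'
xml_refs = sample.encode("ascii", "xmlcharrefreplace")  # writes &#147;

try:
    sample.encode("ascii", "strict")  # 'strict' is the default
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(as_utf8, ignored, replaced, xml_refs, strict_failed)
```

Note that 'ignore' and 'replace' lose information, so they are probably best kept for cases where the output really must be ASCII.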
From: Chris Rebert on 23 Jul 2010 06:45

On Fri, Jul 23, 2010 at 3:14 AM, dirknbr <dirknbr(a)gmail.com> wrote:
> I am having some problems with unicode from json.
>
> This is the error I get
>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\x93' in
> position 61: ordinal not in range(128)

Please include the full Traceback and the actual code that's causing
the error! We aren't mind readers.

This error basically indicates that you're incorrectly mixing byte
strings and Unicode strings somewhere.

Cheers,
Chris
--
http://blog.rebertia.com

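In Python 2, mixing the two string types makes the interpreter convert between them with the ASCII codec behind the scenes, which is where the 'ascii' in the message comes from. The sketch below reproduces the same failure explicitly so the exception can be inspected; it assumes Python 3 and uses a made-up sample string.

```python
# Sketch only: the sample string is hypothetical, not the poster's data.
sample = u"tweet with a stray \x93 character"

try:
    # Explicit version of the conversion Python 2 attempts implicitly.
    sample.encode("ascii")
except UnicodeEncodeError as exc:
    message = str(exc)
    position = exc.start                       # index of the first bad character
    bad_char = exc.object[exc.start:exc.end]   # the character(s) that failed
    print(message)
    print(position, repr(bad_char))
```

The start, end and object attributes point at the exact character that could not be encoded, which is often quicker than reading the message by eye.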
From: dirknbr on 23 Jul 2010 06:56

To give a bit of context: I am using twython, which is a wrapper for
the JSON API.

    search=twitter.searchTwitter(s,rpp=100,page=str(it),result_type='recent',lang='en')
    for u in search[u'results']:
        ids.append(u[u'id'])
        texts.append(u[u'text'])

This is where texts comes from.

When I then want to write texts to a file I get the unicode error.

Dirk

From: Thomas Jollans on 23 Jul 2010 12:27

On 07/23/2010 12:56 PM, dirknbr wrote:
> To give a bit of context. I am using twython which is a wrapper for
> the JSON API
>
>
> search=twitter.searchTwitter(s,rpp=100,page=str(it),result_type='recent',lang='en')
> for u in search[u'results']:
>     ids.append(u[u'id'])
>     texts.append(u[u'text'])
>
> This is where texts comes from.
>
> When I then want to write texts to a file I get the unicode error.

So your data is unicode? Good.

Well, files are just streams of bytes, so to write unicode data to one
you have to encode it. Since Python can't know which encoding you want
to use (utf-8, by the way, if you ask me), you have to do it manually.

Something like:

    outfile.write(text.encode('utf-8'))
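
A fuller sketch of that advice, under a few assumptions: the texts list below is invented sample data standing in for the tweets, the file names are hypothetical, and the code targets Python 3 (io.open also exists in Python 2.6 and later). It shows both encoding by hand into a binary file and letting the file object do the encoding.

```python
import io
import os
import tempfile

# Hypothetical sample data standing in for the tweet texts.
texts = [u"plain ascii", u"curly \u201cquotes\u201d", u"stray \x93 char"]

tmpdir = tempfile.mkdtemp()
manual_path = os.path.join(tmpdir, "manual.txt")  # hypothetical file names
auto_path = os.path.join(tmpdir, "auto.txt")

# Option 1: encode each string yourself and write bytes.
with open(manual_path, "wb") as outfile:
    for text in texts:
        outfile.write(text.encode("utf-8") + b"\n")

# Option 2: open the file with an encoding and write Unicode directly.
with io.open(auto_path, "w", encoding="utf-8") as outfile:
    for text in texts:
        outfile.write(text + u"\n")

def read_lines(path):
    """Read a UTF-8 text file back into a list of Unicode strings."""
    with io.open(path, "r", encoding="utf-8") as infile:
        return [line.rstrip(u"\n") for line in infile]

manual_lines = read_lines(manual_path)
auto_lines = read_lines(auto_path)
print(manual_lines == texts, auto_lines == texts)
```

Option 2 keeps the encoding decision in one place, which tends to be less error-prone than calling encode() at every write.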