From: Antoine Pitrou on 24 Apr 2010 20:16

Hello,

> I have to read the contents of a binary file (a PNG file exactly), and
> dump it into an RTF file.
>
> The RTF-file has been opened with codecs.open in utf-8 mode.

You should use the built-in open() function. codecs.open() is outdated in
Python 3.

> As I expected, the utf-8 decoder chokes on some combinations of bits;
> how can I tell python to dump the bytes as they are, without
> interpreting them?

Well, the one thing you have to be careful about is to flush text buffers
before writing binary data. But, for example:

>>> f = open("TEST", "w", encoding='utf8')
>>> f.write("héhé")
4
>>> f.flush()
>>> f.buffer.write(b"\xff\x00")
2
>>> f.close()

gives you:

$ hexdump -C TEST
00000000  68 c3 a9 68 c3 a9 ff 00                           |h..h....|

(utf-8 encoded text and then two raw bytes which are invalid utf-8)

Another possibility is to open the file in binary mode and do the
encoding yourself when writing text. This might actually be a better
solution, since I'm not sure RTF uses utf-8 by default.

Regards

Antoine.

From: Stefan Behnel on 25 Apr 2010 01:44

Antoine Pitrou, 25.04.2010 02:16:
> Another possibility is to open the file in binary mode and do the
> encoding yourself when writing text. This might actually be a better
> solution, since I'm not sure RTF uses utf-8 by default.

That's a lot cleaner as it doesn't use two interfaces to write to the same
file, and doesn't rely on any specific coordination between those two
interfaces.

Stefan
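
For reference, the binary-mode approach discussed above might look roughly like the sketch below. The helper name write_mixed, the temporary file location, and the choice of utf-8 as the text encoding are assumptions made for illustration; none of them come from the original posts, and the encoding an actual RTF file needs may well differ.

```python
import os
import tempfile


def write_mixed(path, text, raw, encoding="utf-8"):
    """Write encoded text followed by raw bytes through a single binary handle.

    Hypothetical helper: the name and the default encoding are illustrative only.
    """
    with open(path, "wb") as f:
        # Encode the text explicitly; there is no text layer to flush.
        f.write(text.encode(encoding))
        # Raw bytes go through the same interface, untouched.
        f.write(raw)


# Same content as the interactive session earlier in the thread.
path = os.path.join(tempfile.mkdtemp(), "TEST")
write_mixed(path, "héhé", b"\xff\x00")

with open(path, "rb") as f:
    data = f.read()

print(data.hex())  # 68c3a968c3a9ff00
```

Since everything goes through one binary file object, there is no text buffer to flush before the raw bytes are written, which is the coordination issue mentioned above.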