From: Lothar Kimmeringer on 12 Jan 2010 08:14
RedGrittyBrick wrote:
> The output file contains a pound
> sign encoded as code-point 0xa3 which is correct for UTF-8 and for
> ISO-8859-1 Latin1.
It surely isn't correct for UTF-8. You have missed the preceding 0xc2, or there is something wrong with your test.
Regards, Lothar
--
Lothar Kimmeringer E-Mail: spamfang(a)kimmeringer.de
PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)
Always remember: The answer is forty-two, there can only be wrong questions!
From: bugbear on 12 Jan 2010 09:53
Lothar Kimmeringer wrote:
> RedGrittyBrick wrote:
>
>> The output file contains a pound
>> sign encoded as code-point 0xa3 which is correct for UTF-8 and for
>> ISO-8859-1 Latin1.
>
> It surely isn't correct for UTF-8. You have missed the preceding
> 0xc2, or there is something wrong with your test.
Depends whether you're talking about an encoding or a code point.
BugBear
From: Lothar Kimmeringer on 12 Jan 2010 10:54
bugbear wrote:
> Lothar Kimmeringer wrote:
>> RedGrittyBrick wrote:
>>
>>> The output file contains a pound
>>> sign encoded as code-point 0xa3 which is correct for UTF-8 and for
>>> ISO-8859-1 Latin1.
>>
>> It surely isn't correct for UTF-8. You have missed the preceding
>> 0xc2, or there is something wrong with your test.
>
> Depends whether you're talking about an encoding or a code point.
The thread is about encoding, and "RedGrittyBrick" says "encoded as", which led me to the assumption that his posting is about encoding as well.
Regards, Lothar
--
Lothar Kimmeringer E-Mail: spamfang(a)kimmeringer.de
PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)
Always remember: The answer is forty-two, there can only be wrong questions!
From: RedGrittyBrick on 12 Jan 2010 12:47
Lothar Kimmeringer wrote:
> RedGrittyBrick wrote:
>
>> The output file contains a pound
>> sign encoded as code-point 0xa3 which is correct for UTF-8 and for
>> ISO-8859-1 Latin1.
>
> It surely isn't correct for UTF-8. You have missed the preceding
> 0xc2, or there is something wrong with your test.
>
I was using gvim to inspect the output file; it showed the £ correctly when I told it the file was UTF-8 encoded. I used gvim's ga command to show the code point of the character under the cursor. I forgot about the multibyte encoding details (for which I should have used the g8 command, which shows the UTF-8 bytes).
Thanks for the correction.
--
RGB
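The code-point versus encoded-bytes distinction discussed above can be checked directly in Java. The following is only a minimal sketch, not code from any poster: the class name `PoundEncoding` and the helper `hex` are made up for illustration, and `StandardCharsets` assumes Java 7 or later (on older JVMs, `Charset.forName("UTF-8")` would be the equivalent).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows the code point of the pound sign
// and the bytes it becomes under two different encodings.
class PoundEncoding {
    // Format bytes as space-separated lowercase hex, e.g. "c2 a3".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The \u escape avoids depending on the source file's own encoding.
        String pound = "\u00a3";
        // The code point is the same regardless of encoding.
        System.out.println("code point: U+" + String.format("%04X", pound.codePointAt(0)));
        // UTF-8 needs two bytes for U+00A3; ISO-8859-1 needs one.
        System.out.println("UTF-8:      " + hex(pound.getBytes(StandardCharsets.UTF_8)));
        System.out.println("ISO-8859-1: " + hex(pound.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

Run as written, this prints `U+00A3` for the code point, `c2 a3` for UTF-8, and `a3` for ISO-8859-1, which matches what `ga` and `g8` report in gvim.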
From: rossum on 12 Jan 2010 13:40
On Tue, 12 Jan 2010 01:25:50 -0800 (PST), loial <jldunn2000(a)googlemail.com> wrote:
>I am reading and writing files which contain the U.K. pound sign £
>
>But it is not being written correctly to the output file, even though
>I am specifying UTF-8
[snip code]
One alternative is to use "GBP" instead, at least for output. How much control do you have over the format of the input files?
rossum
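Since the original code was snipped, the actual cause is not known. One common cause of this symptom (an assumption here, not something the thread confirms) is that only the writer is given a charset while the reader falls back to the platform default. A minimal sketch that names the charset on both sides might look like the following; the class `Recode`, the method `copy`, and its parameters are hypothetical names, and try-with-resources assumes Java 7 or later.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;

// Hypothetical helper: copies a text file, decoding with one charset
// and encoding with another, never relying on the platform default.
class Recode {
    static void copy(File in, Charset inCharset, File out, Charset outCharset)
            throws IOException {
        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(new FileInputStream(in), inCharset));
             BufferedWriter writer = new BufferedWriter(
                 new OutputStreamWriter(new FileOutputStream(out), outCharset))) {
            char[] buffer = new char[4096];
            int count;
            // Copy decoded characters; the writer re-encodes them.
            while ((count = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, count);
            }
        }
    }
}
```

For example, `Recode.copy(inFile, StandardCharsets.ISO_8859_1, outFile, StandardCharsets.UTF_8)` would turn a Latin-1 `a3` byte into the UTF-8 sequence `c2 a3`. Which input charset is correct depends on how the input files were produced, which is why the question about control over the input format matters.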