From: Tony Johansson on 31 Mar 2010 04:32

Hi!

Here I encode the Spanish character "ñ" to UTF-8, which is encoded as two bytes with the values 195 and 177. That much is understandable. As we know, a char is a Unicode character stored as a signed 16-bit integer.

Now to my question: when I run this program in the debugger and hover over the ch variable, which is of type char, it shows 241. A char is Unicode (UTF-16), and this character takes two bytes when UTF-8 is used, so how can the debugger show 241 when I hover over ch?

    using System;
    using System.Text;

    class Program
    {
        static void Main(string[] args)
        {
            UTF8Encoding utf8 = new UTF8Encoding();
            string chars = "ñ";
            char ch = 'ñ';   // the debugger shows 241 for this variable
            byte[] byteArray = new byte[utf8.GetByteCount(chars)];
            byteArray = utf8.GetBytes(chars);   // two bytes: 195, 177
            Console.WriteLine(utf8.GetString(byteArray));
        }
    }

//Tony
From: Mihai N. on 31 Mar 2010 05:20
> Here I encode the Spanish character "ñ" to UTF-8, which is encoded as two
> bytes with the values 195 and 177. That much is understandable.
> As we know, a char is a Unicode character stored as a signed 16-bit integer.
> Now to my question: when I run this program in the debugger and hover
> over the ch variable, which is of type char,
> it shows 241.
> A char is Unicode (UTF-16), and this character takes two bytes
> when UTF-8 is used, so how can the debugger show 241 when I hover over ch?

The code point of ñ is U+00F1.
This is 0xF1 (241 decimal) in UTF-16 or UTF-32, and C3 B1 (195 177 decimal) in UTF-8.

You can have some fun starting with the table here:
http://en.wikipedia.org/wiki/UTF-8#Description

195 177 decimal = C3 B1 hex = 11000011 10110001 binary

Now take the binary and compare it to the UTF-8 pattern:

11000011 10110001
110yyyxx 10xxxxxx (second line in the table)

So you extract the useful bits (the ones marked yyyxxxxxxxx above) and get 00011 110001.
Together that is 00011110001, or, split into groups of 4, 000.1111.0001.
That is exactly F1 (241).

--
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
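
The bit extraction described above can be checked with a short program. This is only a minimal sketch: the class name Utf8Walkthrough and the helper DecodeTwoByteSequence are hypothetical names made up for illustration, and the helper handles only the two-byte pattern 110yyyxx 10xxxxxx. It also prints the value of the char directly. In C#, char is an unsigned 16-bit UTF-16 code unit, so for ñ it holds the code point 0x00F1 itself, not the UTF-8 bytes.

```csharp
using System;
using System.Text;

static class Utf8Walkthrough
{
    // Hypothetical helper: decodes one two-byte UTF-8 sequence
    // (pattern 110yyyxx 10xxxxxx) into its code point.
    public static int DecodeTwoByteSequence(byte lead, byte continuation)
    {
        // Reject bytes that do not match the two-byte pattern.
        if ((lead & 0xE0) != 0xC0 || (continuation & 0xC0) != 0x80)
            throw new ArgumentException("Not a two-byte UTF-8 sequence.");

        int high = lead & 0x1F;         // low 5 bits of the lead byte
        int low = continuation & 0x3F;  // low 6 bits of the continuation byte
        return (high << 6) | low;       // 11 payload bits joined together
    }

    static void Main()
    {
        // Written as an escape so the result does not depend on the
        // encoding of the source file.
        char ch = '\u00F1';
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(ch.ToString());

        Console.WriteLine((int)ch);                                  // 241
        Console.WriteLine("{0} {1}", utf8Bytes[0], utf8Bytes[1]);    // 195 177
        Console.WriteLine(DecodeTwoByteSequence(utf8Bytes[0], utf8Bytes[1])); // 241
    }
}
```

The first and last lines printed are the same number because UTF-8 and UTF-16 are two encodings of the same code point. The debugger shows the UTF-16 code unit held in the char, while GetBytes produces the UTF-8 form.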