From: Harlan Messinger on 31 Mar 2010 07:55

Tony Johansson wrote:
> Hi!
>
> Here I encode the Spanish character "ñ" to UTF-8, which produces two
> bytes with the values 195 and 177, which is understandable.
> As we know, a char is a Unicode (UTF-16) value, an unsigned 16-bit integer.
> Now to my question: when I run this program, use the debugger, and hover
> over this ch variable, which is of type char, it shows 241.
> Since a char is Unicode (UTF-16), and the character takes two bytes
> when UTF-8 is used, how can the debugger show 241 when I hover over this ch
> variable?

Since the character is represented in memory as UTF-16, why would the debugger show you what it would be in UTF-8? The UTF-16 representation of every Unicode character with a value less than 65536 is simply the 16-bit integer representation of that value. That isn't the case in UTF-8.
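A minimal C# sketch of the point being made (the variable name ch is taken from the question; the rest is illustrative, not the original poster's program): the char holds the UTF-16 code unit 0x00F1 (241), and the two bytes 195 and 177 only appear when the character is explicitly encoded to UTF-8.

using System;
using System.Text;

class NTildeDemo
{
    static void Main()
    {
        char ch = 'ñ';   // U+00F1

        // In memory a char is one UTF-16 code unit, so its numeric value
        // is the code point itself for anything below 65536: here 241.
        Console.WriteLine((int)ch);                        // prints 241

        // The two-byte form exists only in the UTF-8 encoding of the text.
        byte[] utf8 = Encoding.UTF8.GetBytes(ch.ToString());
        Console.WriteLine(string.Join(" ", utf8));         // prints 195 177
    }
}

That is why the debugger shows 241: it is displaying the char's actual in-memory (UTF-16) value, not what the character would become after a UTF-8 conversion.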