From: Arne Vajhøj on 25 Mar 2010 19:55

On 25-03-2010 12:58, Chris Dunaway wrote:
> On Mar 25, 10:05 am, "Tony Johansson" <johansson.anders...(a)telia.com>
> wrote:
>> This Unicode UTF-8 can use up to 24 bit for encoding. UTF-8 support almost
>> all languages so what is the reason
>> to use another Unicode then this UTF-8.
>
> http://www.joelonsoftware.com/articles/Unicode.html

<quote>
Back in the semi-olden days, when Unix was being invented and K&R were
writing The C Programming Language, everything was very simple. EBCDIC
was on its way out.
</quote>

I was not tempted to read any further.

Arne

From: Mihai N. on 26 Mar 2010 00:08

> Most applications don't need to support almost all languages.
>
> If you're creating an application for use in Japan by Japanese people,
> for example, then you might prefer to use Shift-JIS, which uses two
> bytes per character, instead of UTF-8, which uses three bytes per
> Japanese character.
>
> If you're building an app for use in the United States by English
> speakers, and much of the input is likely to come from ASCII sources, or
> much of your output is intended for use with software that understands
> only ISO-8859-1, then it may have no use for UTF-8.

Sorry, but this is pretty bad advice. It was probably OK 10 years ago, but not now.

First, the extra bytes here and there don't amount to much. To one team that brought up this argument, I showed that the .jpg they used for the splash screen was bigger than all their strings together.

Second, all system APIs are Unicode UTF-16. So if you use Shift-JIS or ASCII, you will waste time on conversions back and forth (happening in the belly of the OS). The same goes for keystrokes: the system gets Unicode input and has to convert it to a code page for a legacy application. Of course, if your application is running on Windows 95/98, then you are better off without Unicode. The point about UTF-16 APIs also holds for Mac OS X and Qt.

Third, this is a C# newsgroup, so I would assume the question refers to that. So all strings are Unicode (UTF-16). Use any other code page, and you will have to convert.

-- 
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email

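For anyone who wants to check the byte counts argued over in this post, here is a minimal Python sketch. The sample strings and the helper name `encoded_sizes` are assumptions made up for illustration, not anything from the thread. It only measures encoded length; it says nothing about conversion cost inside the OS.

```python
# Sample strings are assumptions chosen for illustration.
JAPANESE_SAMPLE = "\u65e5\u672c\u8a9e"  # the three kanji of "Nihongo" (Japanese language)
ASCII_SAMPLE = "hello"


def encoded_sizes(text):
    """Return the number of bytes the text occupies in each encoding."""
    return {
        "shift_jis": len(text.encode("shift_jis")),
        "utf-8": len(text.encode("utf-8")),
        # The "-le" variant fixes the byte order, so no BOM is prepended.
        "utf-16": len(text.encode("utf-16-le")),
    }


for label, text in (("Japanese", JAPANESE_SAMPLE), ("ASCII", ASCII_SAMPLE)):
    for name, size in encoded_sizes(text).items():
        print(f"{label:8} {name:10} {size} bytes")
```

For these two samples the kanji string comes out at 6 bytes in Shift-JIS, 9 in UTF-8 and 6 in UTF-16, while the ASCII string is 5, 5 and 10 bytes respectively. That matches both the "three bytes per Japanese character" claim in the quoted text and the reply that the difference is a few bytes per string.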
From: JeffWhitledge on 26 Mar 2010 10:32

On Mar 25, 10:05 am, "Tony Johansson" <johansson.anders...(a)telia.com>
wrote:
> Hi!
>
> This Unicode UTF-8 can use up to 24 bit for encoding. UTF-8 support almost
> all languages so what is the reason
> to use another Unicode then this UTF-8.
>
> //Tony

UTF-8 supports the complete Unicode character set, so it is a fine choice for many applications. It can be used for nearly all of the world's written languages, and it is a compact representation for Latin-based texts (like English), which are very common.

Except for interfacing with legacy applications, there is no good reason to use a non-Unicode character set.

However, there are good reasons for using a Unicode character encoding other than UTF-8.

Many platforms use UTF-16 internally (Windows NT, XP, Vista, 7; the .NET Framework; C#), so by sticking with that you can avoid conversions.

Many languages (especially Asian languages) have a more compact representation in UTF-16 than in UTF-8.

UTF-16 will be simpler to process for many texts, since the characters in the Basic Multilingual Plane (plane 0, which encodes the vast majority of the characters used by living languages) are always represented by exactly 2 bytes in UTF-16. (Characters in the higher planes are represented by 4 bytes in UTF-16, but these characters are far less common.)

For these reasons, UTF-16 can also be an excellent choice of encoding scheme.

There are few applications where UTF-32 is the best choice, and probably all of them are for internal processing only. I can't imagine a scenario in which UTF-32 would be the best choice for storing or transmitting text.

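The BMP-versus-higher-plane point above can be checked with a short Python sketch. The sample characters and the helper names `bytes_per_encoding` and `in_bmp` are assumptions picked for illustration; nothing here comes from the .NET APIs mentioned in the post.

```python
def bytes_per_encoding(ch):
    """Return (utf8, utf16, utf32) byte counts for a single character."""
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),  # explicit byte order, so no BOM is added
        len(ch.encode("utf-32-le")),
    )


def in_bmp(ch):
    """True if the code point lies in plane 0 (U+0000 through U+FFFF)."""
    return ord(ch) <= 0xFFFF


# Illustrative samples: a Latin letter, an accented Latin letter,
# a kanji, and a plane-1 character (musical symbol G clef).
SAMPLES = ["A", "\u00e9", "\u8a9e", "\U0001d11e"]

for ch in SAMPLES:
    utf8, utf16, utf32 = bytes_per_encoding(ch)
    print(f"U+{ord(ch):04X} BMP={in_bmp(ch)} "
          f"utf-8={utf8} utf-16={utf16} utf-32={utf32}")
```

With these samples, the plain Latin letter is smallest in UTF-8 (1 byte), the kanji is smallest in UTF-16 (2 bytes against 3 in UTF-8), and the plane-1 character takes 4 bytes in all three encodings. UTF-32 is a fixed 4 bytes throughout, which is why it tends to be attractive only for internal processing.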
From: Chris Dunaway on 26 Mar 2010 17:51

On Mar 25, 6:55 pm, Arne Vajhøj <a...(a)vajhoej.dk> wrote:
>
> I was not tempted to read any further.
>
> Arne

Ummm... why?

From: Arne Vajhøj on 26 Mar 2010 19:47

On 26-03-2010 17:51, Chris Dunaway wrote:
> On Mar 25, 6:55 pm, Arne Vajhøj <a...(a)vajhoej.dk> wrote:
>> I was not tempted to read any further.
>
> Ummm... why?

Because that paragraph revealed a lack of knowledge so big that I considered it a sure waste of time to read any further.

Arne