From: Jeff Johnson on 19 Mar 2010 09:16

"Arne Vajhøj" <arne(a)vajhoej.dk> wrote in message
news:4ba2d0c2$0$279$14726298(a)news.sunsite.dk...

> I would not consider Unicode an encoding.

Uh, why? An encoding is simply a means of associating a set of bytes with
the characters they represent. That's what Unicode does.
From: Harlan Messinger on 19 Mar 2010 11:36

Jeff Johnson wrote:
> "Arne Vajhøj" <arne(a)vajhoej.dk> wrote in message
> news:4ba2d0c2$0$279$14726298(a)news.sunsite.dk...
>
>> I would not consider Unicode an encoding.
>
> Uh, why? An encoding is simply a means of associating a set of bytes with
> the characters they represent. That's what Unicode does.

It isn't an encoding in the binary sense because it only assigns
characters to numbers; it doesn't specify a representation. It doesn't
specify, for example, whether "A" should be represented as 41 or 0041 or
00000041 (or something else), or whether an em-dash would be 2014 or
002014 or 00002014 (or something else).
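[A minimal C# sketch of that point, not from the original thread: the em-dash is the single code point U+2014, but the bytes that represent it depend entirely on which encoding is applied. The class name is arbitrary; the Encoding and BitConverter members are standard .NET.]

    using System;
    using System.Text;

    class CodePointVersusBytes
    {
        static void Main()
        {
            // One Unicode code point: U+2014 (em-dash).
            string emDash = "\u2014";

            // The same code point, three different byte sequences:
            Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(emDash)));
            // E2-80-94 (three bytes in UTF-8)
            Console.WriteLine(BitConverter.ToString(Encoding.Unicode.GetBytes(emDash)));
            // 14-20 (two bytes in UTF-16LE)
            Console.WriteLine(BitConverter.ToString(Encoding.UTF32.GetBytes(emDash)));
            // 14-20-00-00 (four bytes in UTF-32LE)
        }
    }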
From: Peter Duniho on 19 Mar 2010 11:54

Jeff Johnson wrote:
> "Arne Vajhøj" <arne(a)vajhoej.dk> wrote in message
> news:4ba2d0c2$0$279$14726298(a)news.sunsite.dk...
>
>> I would not consider Unicode an encoding.
>
> Uh, why? An encoding is simply a means of associating a set of bytes with
> the characters they represent. That's what Unicode does.

I believe Arne's point is that "Unicode" by itself does not describe a way
to encode characters as bytes. There are specific encodings within Unicode
(as part of the standard): UTF-8, UTF-16, and UTF-32. But Unicode by
itself describes a collection of valid characters, not how they are
encoded as bytes.

Pete
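[To make that distinction concrete, a small illustrative C# sketch, not from the thread: a character outside the Basic Multilingual Plane, such as U+1D11E (MUSICAL SYMBOL G CLEF), is one valid Unicode character, yet the standard encodings represent it with different numbers of code units.]

    using System;
    using System.Text;

    class EncodingsWithinUnicode
    {
        static void Main()
        {
            // U+1D11E lies outside the Basic Multilingual Plane,
            // so each encoding handles it differently.
            string clef = char.ConvertFromUtf32(0x1D11E);

            Console.WriteLine(Encoding.UTF8.GetBytes(clef).Length);  // 4 bytes
            Console.WriteLine(clef.Length);                          // 2 UTF-16 code units (a surrogate pair)
            Console.WriteLine(Encoding.UTF32.GetBytes(clef).Length); // 4 bytes (one 32-bit code unit)
        }
    }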
From: Jeff Johnson on 19 Mar 2010 14:25

"Peter Duniho" <no.peted.spam(a)no.nwlink.spam.com> wrote in message
news:uIZGlx3xKHA.5364(a)TK2MSFTNGP05.phx.gbl...

>>> I would not consider Unicode an encoding.
>>
>> Uh, why? An encoding is simply a means of associating a set of bytes with
>> the characters they represent. That's what Unicode does.
>
> I believe Arne's point is that "Unicode" by itself does not describe a way
> to encode characters as bytes. There are specific encodings within
> Unicode (as part of the standard): UTF-8, UTF-16, and UTF-32. But Unicode
> by itself describes a collection of valid characters, not how they are
> encoded as bytes.

Ah. I just go with the convention that "Unicode" by itself, at least in
the .NET world, means UTF-16LE.
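[That convention is visible in the class library itself; a quick sketch added here for illustration, using only standard Encoding members:]

    using System;
    using System.Text;

    class DotNetUnicodeConvention
    {
        static void Main()
        {
            // In .NET, Encoding.Unicode is UTF-16 little-endian:
            // the low-order byte of each 16-bit code unit comes first.
            byte[] littleEndian = Encoding.Unicode.GetBytes("A");
            Console.WriteLine(BitConverter.ToString(littleEndian)); // 41-00

            // The big-endian variant is a separate property.
            byte[] bigEndian = Encoding.BigEndianUnicode.GetBytes("A");
            Console.WriteLine(BitConverter.ToString(bigEndian));    // 00-41
        }
    }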
From: Arne Vajhøj on 19 Mar 2010 21:10

On 19-03-2010 09:16, Jeff Johnson wrote:
> "Arne Vajhøj" <arne(a)vajhoej.dk> wrote in message
> news:4ba2d0c2$0$279$14726298(a)news.sunsite.dk...
>> I would not consider Unicode an encoding.
>
> Uh, why? An encoding is simply a means of associating a set of bytes with
> the characters they represent. That's what Unicode does.

No. Unicode is a mapping between the various symbols and a number.
Encoding is the mapping between the number and one or more bytes.

Arne
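[A short C# sketch of that two-step view, added for illustration: char.ConvertToUtf32 yields the number (the code point) that Unicode assigns to a symbol, and an Encoding then turns that number into bytes.]

    using System;
    using System.Text;

    class TwoStepMapping
    {
        static void Main()
        {
            string s = "A";

            // Step 1 -- Unicode: symbol -> number (the code point).
            int codePoint = char.ConvertToUtf32(s, 0);
            Console.WriteLine("U+{0:X4}", codePoint);               // U+0041

            // Step 2 -- encoding: number -> one or more bytes.
            Console.WriteLine(Encoding.UTF8.GetBytes(s).Length);    // 1 byte
            Console.WriteLine(Encoding.Unicode.GetBytes(s).Length); // 2 bytes
            Console.WriteLine(Encoding.UTF32.GetBytes(s).Length);   // 4 bytes
        }
    }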