From: Nisse Engström on 29 May 2010 01:15

On Fri, 28 May 2010 16:52:09 -0400, tedd wrote:

> At 8:52 PM +0200 5/28/10, Nisse Engström wrote:
>>On Fri, 28 May 2010 11:13:35 -0400, tedd wrote:
>>
>>> As is my understanding, UTF-8 will accommodate all the languages
>>> (glyphs) of the world and then some. It will be a while before we
>>> need UTF-16 or UTF-32 but those are just larger super-sets.

Again:

>>The theoretical limits are:
>>
>>   UTF-8  [0 - 7fffffff]
>>   UTF-16 [0 - 10ffff]
>>   UTF-32 [0 - ffffffff]

In what way are UTF-16 and -32 super-sets of UTF-8?

>>Also, there are many, many, *many* more glyphs than
>>characters (code points) in the world. As an example,
>>www.fonts.com lists 165,125 fonts. Every one has a
>>*different* glyph for the character "A"...

> As you say, UTF-8 has a range of 0 to 7FFFFFFF

No, I said that's the theoretical range. It is restricted
to [0-10ffff] according to current specifications.

> If you spend some time looking at the numerous char sets that Unicode
> offers you will see that just about every symbol known to man has
> been cataloged

Yes. (Except those that are missing).

> every language in the world and glyph known to man has been
> included -- a truly massive project.

No. There are no glyphs in Unicode. This is spelled out for
you in chapter 2, figure 2-2. "Characters versus Glyphs".

/Nisse
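[Editorial sketch, not part of the original post.] The point about ranges can be checked with a few lines of Python: under current specifications all three encoding forms cover the same code space, [0 - 10ffff], and differ only in how many bytes they spend per code point. The helper name `encoded_lengths` and the sample code points are assumptions chosen for illustration.

```python
# Upper bound of the Unicode code space under current specifications.
MAX_CODE_POINT = 0x10FFFF


def encoded_lengths(code_point):
    """Return the byte length of one code point in UTF-8, UTF-16 and UTF-32."""
    ch = chr(code_point)
    return {
        "utf-8": len(ch.encode("utf-8")),
        # The explicit "-be" variants avoid counting a byte-order mark.
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


if __name__ == "__main__":
    # The same code points are representable in every encoding form.
    for cp in (0x41, 0x262F, 0x9E21, MAX_CODE_POINT):
        print(f"U+{cp:04X}", encoded_lengths(cp))

    # Python's chr() refuses anything beyond the specified limit.
    try:
        chr(MAX_CODE_POINT + 1)
    except ValueError:
        print("U+110000 is outside the code space")
```

Since every code point that one form can encode is also encodable in the other two, none of them is a super-set of another.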
From: tedd on 29 May 2010 10:16

At 7:15 AM +0200 5/29/10, Nisse Engström wrote:
>
>No. There are no glyphs in Unicode. This is spelled out for
>you in chapter 2, figure 2-2. "Characters versus Glyphs".

*blink* *blink* *blink*

I read it, but that's not addressing the issue here -- that's something different.

You are not understanding the difference between characters, fonts, glyphs, and code points. Here are some definitions taken directly from a Unicode Standard that might help:

-- quote

Character. The smallest component of written language that has semantic value; refers to the abstract meaning and/or shape, rather than a specific shape (see also glyph), though in code tables some form of visual representation is essential for the reader's understanding.

Font. A collection of glyphs used for the visual depiction of character data. A font is often associated with a set of parameters (for example, size, posture, weight, and serifness), which, when set to particular values, generate a collection of imagable glyphs.

Glyph. (1) An abstract form that represents one or more glyph images. (2) A synonym for "glyph image". In displaying Unicode character data, one or more glyphs may be selected to depict a particular character. These glyphs are selected by a rendering engine during composition and layout processing.

-- unquote

As such, you cannot claim "There are no glyphs in Unicode" for that is silly.

Code points are simply unique numbers assigned to specific characters in an approved char set. To better understand which character is represented, a representative Glyph is used -- what else would we use, a chicken?

I may have been liberal in my use of the term "Glyph" in my previous brief email, but "Glyph" in Unicode has a special meaning. The Glyph 'A' is 'A' regardless of whether it is a Helvetica or Times, bold or italic, 12pt or 24pt glyph. Likewise the Yin-Yang symbol is a Glyph that has a single code point regardless of whether it is a red and black or green and blue glyph.

But the point is -- there is a unique code point (0041 HEX) for the Latin 'A' Glyph and one unique code point (262F HEX) for the Miscellaneous Symbols Yin-Yang Glyph -- WITH -- a representative Glyph in the Unicode table defining each code point!

So, when I say that just about every Glyph in the world has been provided a code point I am basically and technically correct -- excepting of course those glyphs that are not considered appropriate for inclusion or are variation glyphs of the representative Glyph that is already included -- understand?

After all is said and done, what is Unicode all about? It is assigning a universal and unique code point system to Glyphs that are considered to be appropriate representative members of abstract written forms of communication. But of course those are Glyphs, for what else could they be?

Cheers,

tedd

--
-------
http://sperling.com http://ancientstones.com http://earthstones.com
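[Editorial sketch, not part of the original post.] The two code points mentioned above can be looked up with Python's standard `unicodedata` module. The lookup returns a number and a character name and carries no font, weight, size, or colour, which is one way to see where the character/glyph line is drawn. The helper name `describe` is an assumption made for this example.

```python
import unicodedata


def describe(ch):
    """Return the U+XXXX label and the Unicode character name for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


if __name__ == "__main__":
    # Latin capital A and the Yin-Yang symbol from Miscellaneous Symbols.
    for ch in ("A", "\u262F"):
        label, name = describe(ch)
        print(label, name)
```

Which shape actually appears on screen for either code point is decided later, by whichever font the rendering engine selects.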
From: Nisse Engström on 29 May 2010 16:20

On Sat, 29 May 2010 10:16:39 -0400, tedd wrote:

> At 7:15 AM +0200 5/29/10, Nisse Engström wrote:
>>
>>No. There are no glyphs in Unicode. This is spelled out for
>>you in chapter 2, figure 2-2. "Characters versus Glyphs".

> Code points are simply unique numbers assigned to specific characters
> in an approved char set. To better understand which character is
> represented a representative Glyph is used -- what else would we use,

Right. I should have phrased that differently.

> a chicken?

U+9e21 ? U+540D ?

/Nisse
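[Editorial sketch, not part of the original post.] For readers who want to see which characters the two U+ labels refer to, the notation is just a hexadecimal code point and can be converted directly. The function name `from_u_notation` is hypothetical and exists only for this example.

```python
import unicodedata


def from_u_notation(text):
    """Turn a label such as 'U+9e21' into the character it identifies."""
    if not text.upper().startswith("U+"):
        raise ValueError(f"not in U+XXXX form: {text!r}")
    # Everything after the 'U+' prefix is the code point in hexadecimal.
    return chr(int(text[2:], 16))


if __name__ == "__main__":
    for label in ("U+9e21", "U+540D"):
        ch = from_u_notation(label)
        # Print the character, its formal name, and its UTF-8 bytes in hex.
        print(label, ch, unicodedata.name(ch), ch.encode("utf-8").hex())
```

Whether the printed characters display properly depends on the terminal having a font that covers them, which is again a glyph question and not a code point question.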
From: tedd on 30 May 2010 10:20

At 10:20 PM +0200 5/29/10, Nisse Engström wrote:
>On Sat, 29 May 2010 10:16:39 -0400, tedd wrote:
>
>> At 7:15 AM +0200 5/29/10, Nisse Engström wrote:
>>>
>>>No. There are no glyphs in Unicode. This is spelled out for
>>>you in chapter 2, figure 2-2. "Characters versus Glyphs".
>
>> Code points are simply unique numbers assigned to specific characters
>> in an approved char set. To better understand which character is
>> represented a representative Glyph is used -- what else would we use,
>
>Right. I should have phrased that differently.
>
>> a chicken?
>
>U+9e21 ? U+540D ?

LOL

I forgot that the word chicken appears in several other languages as a single character. Interesting to note that in the Chinese Dictionary, the character "U+9e21" Chicken (ji) is interchangeable with prostitution.

Cheers,

tedd

--
-------
http://sperling.com http://ancientstones.com http://earthstones.com
From: "Angus Mann" on 31 May 2010 19:42
Dear Sir/Madam

Please unsubscribe Angus Mann angusmann(a)pobox.com from your database. My husband passed away 6 May 2010.

Thank you
Sonya Mann

----- Original Message -----
From: "tedd" <tedd.sperling(a)gmail.com>
To: <php-general(a)lists.php.net>
Sent: Monday, May 31, 2010 12:20 AM
Subject: Re: [PHP] Convert UTF-8 to PHP defines