From: Richard Quadling on 28 May 2010 03:53

On 28 May 2010 04:47, Guus Ellenkamp <Ellenkamp_Guus(a)hotmail.com> wrote:
> And I need(ed) this stuff especially for non-ASCII characters like Chinese,
> Arabic and stuff :)
>
> "Ashley Sheridan" <ash(a)ashleysheridan.co.uk> wrote in message
> news:1274976794.2202.274.camel(a)localhost...
> On Thu, 2010-05-27 at 12:08 -0400, Adam Richardson wrote:
>
>> On Thu, May 27, 2010 at 9:45 AM, Guus Ellenkamp
>> <Ellenkamp_Guus(a)hotmail.com> wrote:
>>
>> > Thanks, but are you sure of that? I did some research a while ago and
>> > found that officially PHP files should be ASCII and not have any
>> > specific character encoding. I believe it will work anyhow (did not
>> > try this one), but would like to stick with the standards.
>> >
>> > "Ashley Sheridan" <ash(a)ashleysheridan.co.uk> wrote in message
>> > news:1274883714.2202.228.camel(a)localhost...
>> > > On Wed, 2010-05-26 at 22:20 +0800, Guus Ellenkamp wrote:
>> > >
>> > >> We use PHP defines for defining text in different languages. As far
>> > >> as I know PHP files are supposed to be ASCII, not UTF-8 or something
>> > >> like that. What I want to make is a conversion program that would
>> > >> convert a given UTF-8 file with the format
>> > >>
>> > >> definetext1=this is a text in random UTF-8, probably arabic or similar text
>> > >> definetext2=this is another text in random UTF-8, probably arabic or similar text
>> > >>
>> > >> into a file with the following defines
>> > >>
>> > >> define('definetext1',chr(<t_value>).chr(<h_value>).chr(<i_value>)...chr(<x_value>).chr(<t_value>));
>> > >> define('definetext2',chr(<t_value>).chr(<h_value>).chr(<i_value>)...chr(<x_value>).chr(<t_value>));
>> > >>
>> > >> Not sure if I'm using the correct chr/ord function, but I hope the
>> > >> above is clear enough to make clear what I'm looking for. Basically
>> > >> the output file should be ASCII and not contain any UTF-8.
>> > >>
>> > >> Any advice? The html_special_chars did not seem to work for
>> > >> Vietnamese text I tried to convert, so something seems to go wrong
>> > >> with just reading an array of strings and converting the strings
>> > >> and putting them in defines.
>> > >>
>> > >
>> > > PHP files can contain UTF-8, and in fact that is the preference of
>> > > most developers I know of.
>> > >
>> > > Thanks,
>> > > Ash
>> > > http://www.ashleysheridan.co.uk
>> > >
>> >
>> > --
>> > PHP General Mailing List (http://www.php.net/)
>> > To unsubscribe, visit: http://www.php.net/unsub.php
>> >
>>
>> Because the lower range of UTF-8 matches the ASCII character set
>> (intentionally by design), you'll be able to use UTF-8 for PHP files
>> without problem (i.e., ASCII 7-bit chars have the same encoding in UTF-8.)
>> http://www.cl.cam.ac.uk/~mgk25/unicode.html
>>
>> However, if you were to use any of the multibyte characters of UTF-8 in a
>> PHP file, you could run into some trouble. I use UTF-8 for most of my PHP
>> files, but I've been sticking to the ASCII subset exclusively.
>>
>> Adam
>>
>
> I don't use the higher range of characters often, but I do sometimes use
> them for things like the graphical glyphs (½??, etc). I know I could do
> those with regular text and the Wingdings font, but that's not available
> on every computer, and breaks the semantic meaning behind the glyphs.
>
> Thanks,
> Ash
> http://www.ashleysheridan.co.uk
>
> --
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
>

Do you mean ...

<?php
echo '早晨好';
?>

If you cut and paste that into your editor, make sure that the font you
are using is a Unicode font that includes these characters. Otherwise you
will see the font's unknown symbol glyph rather than the correct ones.

If your font doesn't have the symbols, it doesn't affect the code. The
editor is only displaying the code. It doesn't alter the code.

Richard.

--
-----
Richard Quadling
"Standing on the shoulders of some very clever giants!"
EE : http://www.experts-exchange.com/M_248814.html
EE4Free : http://www.experts-exchange.com/becomeAnExpert.jsp
Zend Certified Engineer : http://zend.com/zce.php?c=ZEND002498&r=213474731
ZOPA : http://uk.zopa.com/member/RQuadling
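For the conversion asked about at the start of this thread (turning "name=text" lines into ASCII-only define() statements), a minimal sketch might look like the following. The function name asciiDefine is hypothetical, and the sketch assumes each input line contains an "=" and that the name is a plain identifier. It emits one chr() call per byte of the UTF-8 text; as the replies point out, PHP would also accept the UTF-8 bytes directly in the source file, so this is only needed if the file really must stay ASCII.

```php
<?php
// Hypothetical helper (not from the thread): build an ASCII-only define()
// statement from a "name=text" line whose text part is UTF-8.
// Every byte of the text becomes one chr(<decimal byte value>) call.
function asciiDefine(string $line): string
{
    $line = rtrim($line, "\r\n");
    if (strpos($line, '=') === false) {
        throw new InvalidArgumentException('Expected a line of the form name=text');
    }

    // Split on the first "=" only, so the text itself may contain "=".
    [$name, $text] = explode('=', $line, 2);

    if ($text === '') {
        return "define('" . $name . "', '');";
    }

    $parts = [];
    foreach (str_split($text) as $byte) {   // str_split() walks the string byte by byte
        $parts[] = 'chr(' . ord($byte) . ')';
    }

    return "define('" . $name . "', " . implode('.', $parts) . ");";
}

// "h" followed by U+00E9, which is the two bytes 0xC3 0xA9 in UTF-8.
echo asciiDefine("definetext1=h\u{00E9}"), "\n";
// define('definetext1', chr(104).chr(195).chr(169));
```

The generated constant holds exactly the same bytes as the original UTF-8 string, so the page that outputs it still has to declare UTF-8 as its character set.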
From: tedd on 28 May 2010 11:13

Bob wrote:

>>The real question is whether unicode is even relevant now that the UTF
>>series is available.

Ashley answered:

>Bob, UTF is unicode (Unicode Transformation Format)

Yes, Ashley is correct. UTF-8 is Unicode, as is UTF-16 and UTF-32,
which all use a different number of bytes for each code point. Both
UTF-8 and UTF-16 are variable length whereas UTF-32 is a fixed length
of four bytes per code point.

As is my understanding, UTF-8 will accommodate all the languages
(glyphs) of the world and then some. It will be a while before we
need UTF-16 or UTF-32 but those are just larger super-sets.

In any event, I always use UTF-8 in all my encoding.

Cheers,

tedd

--
-------
http://sperling.com http://ancientstones.com http://earthstones.com
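The variable-length versus fixed-length point can be checked with a short script. This is only a sketch: it assumes the mbstring extension is loaded, the sample characters are arbitrary picks, and encodedLengths is a hypothetical helper name.

```php
<?php
// Sketch (assumes ext-mbstring): count the bytes one character occupies
// in UTF-8, UTF-16 and UTF-32. The input is expected to be UTF-8.
function encodedLengths(string $utf8Char): array
{
    return [
        'UTF-8'  => strlen($utf8Char),
        'UTF-16' => strlen(mb_convert_encoding($utf8Char, 'UTF-16BE', 'UTF-8')),
        'UTF-32' => strlen(mb_convert_encoding($utf8Char, 'UTF-32BE', 'UTF-8')),
    ];
}

// Arbitrary sample characters, written as escapes so this file stays ASCII.
$samples = [
    'U+0041'  => "\u{0041}",   // Latin capital letter A
    'U+00E9'  => "\u{00E9}",   // e with acute accent
    'U+20AC'  => "\u{20AC}",   // euro sign
    'U+1F600' => "\u{1F600}",  // a character outside the Basic Multilingual Plane
];

foreach ($samples as $label => $char) {
    $len = encodedLengths($char);
    printf("%-8s UTF-8: %d  UTF-16: %d  UTF-32: %d\n",
        $label, $len['UTF-8'], $len['UTF-16'], $len['UTF-32']);
}
// U+0041   UTF-8: 1  UTF-16: 2  UTF-32: 4
// U+00E9   UTF-8: 2  UTF-16: 2  UTF-32: 4
// U+20AC   UTF-8: 3  UTF-16: 2  UTF-32: 4
// U+1F600  UTF-8: 4  UTF-16: 4  UTF-32: 4
```

UTF-8 uses one to four bytes here, UTF-16 two or four, and UTF-32 always four.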
From: tedd on 28 May 2010 11:39

At 8:33 PM +0100 5/27/10, Ashley Sheridan wrote:
>Tedd, does that URL actually go anywhere, as I got nothing when I
>tried visiting it, both the actual URL and the punycode version.

Ash:

Try it again (it worked for me).

In any event, the link was supposed to be redirected to this site:

http://xn--fci.com

If you run Safari, then the url will be shown as a check-mark.

My most popular IDNS site is square-root dot com (option v):

http://xn--19g.com

The story about that site is on the web page -- you may read if
interested. The site receives over 150 unique Mac visitors per day
and that number keeps climbing -- I don't know why. For example, one
day I had over 800 visitors from Spain -- why???

Obviously, I'm trying to sell the domain (for 6 figures), but have
had no takers. I can always get back into Macintosh software
development and use the site to sell my own apps -- that's an option
I ponder whenever my clients don't call me for a week. Who knows what
may happen.

Cheers,

tedd

PS: I have over a dozen IDNS domains including the Pharmaceutical
Icon, Yin-Yang Symbol, Sigma, Delta, and DOT dot com (option 8).

--
-------
http://sperling.com http://ancientstones.com http://earthstones.com
From: Nisse Engström on 28 May 2010 14:52

On Fri, 28 May 2010 11:13:35 -0400, tedd wrote:

> Bob wrote:
>
>>>The real question is whether unicode is even relevant now that the UTF
>>>series is available.
>
> Ashley answered:
>
>>Bob, UTF is unicode (Unicode Transformation Format)

Or more precisely, UTF-{8,16,32} are different ways to
serialize Unicode code points into sequences of octets
that make it possible to store and transmit Unicode data.

> Yes, Ashley is correct. UTF-8 is Unicode, as is UTF-16 and UTF-32,
> which all use a different number of bytes for each code point. Both
> UTF-8 and UTF-16 are variable length whereas UTF-32 is a fixed length
> of four bytes per code point.
>
> As is my understanding, UTF-8 will accommodate all the languages
> (glyphs) of the world and then some. It will be a while before we
> need UTF-16 or UTF-32 but those are just larger super-sets.

*blink*

They are all capable of representing the full Unicode
range, which is restricted to U+0000 - U+10ffff.

The theoretical limits are:

  UTF-8  [0 - 7fffffff]
  UTF-16 [0 - 10ffff]
  UTF-32 [0 - ffffffff]

Also, there are many, many, *many* more glyphs than
characters (code points) in the world. As an example,
www.fonts.com lists 165,125 fonts. Every one has a
*different* glyph for the character "A"...

/Nisse
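To make "serializing code points into octets" concrete, here is a sketch of a hand-written UTF-8 encoder for a single code point. The function name utf8Encode is hypothetical, and the sketch covers only the current Unicode range U+0000 - U+10ffff (one to four octets), not the longer sequences behind the 7fffffff theoretical limit.

```php
<?php
// Sketch (hypothetical helper): serialize one Unicode code point into
// UTF-8 octets using the standard bit layout for 1 to 4 octets.
function utf8Encode(int $cp): string
{
    // Reject values outside U+0000..U+10FFFF and the surrogate range,
    // which has no UTF-8 representation.
    if ($cp < 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        throw new InvalidArgumentException('Not a Unicode scalar value');
    }
    if ($cp < 0x80) {        // 0xxxxxxx
        return chr($cp);
    }
    if ($cp < 0x800) {       // 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {     // 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

echo bin2hex(utf8Encode(0x41)), "\n";      // 41
echo bin2hex(utf8Encode(0x20AC)), "\n";    // e282ac
echo bin2hex(utf8Encode(0x10FFFF)), "\n";  // f48fbfbf
```

Even the highest code point, U+10ffff, fits in four octets, which is why all three encodings cover the same range in practice.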
From: tedd on 28 May 2010 16:52

At 8:52 PM +0200 5/28/10, Nisse Engström wrote:
>On Fri, 28 May 2010 11:13:35 -0400, tedd wrote:
>
>> As is my understanding, UTF-8 will accommodate all the languages
>> (glyphs) of the world and then some. It will be a while before we
>> need UTF-16 or UTF-32 but those are just larger super-sets.
>
>*blink*
>
>They are all capable of representing the full Unicode
>range, which is restricted to U+0000 - U+10ffff.
>
>The theoretical limits are:
>
>  UTF-8  [0 - 7fffffff]
>  UTF-16 [0 - 10ffff]
>  UTF-32 [0 - ffffffff]
>
>Also, there are many, many, *many* more glyphs than
>characters (code points) in the world. As an example,
>www.fonts.com lists 165,125 fonts. Every one has a
>*different* glyph for the character "A"...
>
>/Nisse

*blink* *blink*

As you say, UTF-8 has a range of 0 to 7FFFFFFF

Forgive me, but isn't that 2,147,483,647 (DEC) code points?

Please note that 165,125 * 48 (upper/lower case) is only 7,926,000
code points -- IF -- each letter of each font was to have its own
code point, which is not the case for Unicode.

Code points are assigned to specific char sets that belong to
specific language sets, such as English being assigned to the code
point range that is common with ASCII. From that, we can have as many
fonts as your software can handle. However, ASCII 65 DEC (41 HEX) or
code point 65 (41 HEX) is still tied to the letter "A" regardless of
whether it is Helvetica or Times. So, don't confuse code points with
fonts.

If you spend some time looking at the numerous char sets that Unicode
offers you will see that just about every symbol known to man has
been cataloged -- even Klingon was considered. From Dingbats to
Architectural symbols, from simplified Chinese to traditional
Chinese, from Greek to Cherokee, from skull/cross-bones to yin/yang
symbol, every language in the world and glyph known to man has been
included -- a truly massive project.

IMO, it will be a while before we use up the whole range that Unicode
code points provide.

Cheers,

tedd

--
-------
http://sperling.com http://ancientstones.com http://earthstones.com
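The numbers in this exchange are easy to check from PHP. The sketch below assumes the mbstring extension (mb_ord() and mb_chr() exist since PHP 7.2); the helper names are hypothetical.

```php
<?php
// Sketch (assumes ext-mbstring, PHP 7.2+). Helper names are illustrative.

// Code point of a single UTF-8 character, independent of any font.
function codePointOf(string $char): int
{
    return mb_ord($char, 'UTF-8');
}

// Number of values in an inclusive hexadecimal range starting at 0.
function rangeSize(string $hexMax): int
{
    return hexdec($hexMax) + 1;
}

echo codePointOf('A'), "\n";          // 65 (41 HEX), whichever font draws the glyph
echo mb_chr(0x41, 'UTF-8'), "\n";     // A
echo hexdec('7FFFFFFF'), "\n";        // 2147483647, the theoretical UTF-8 maximum
echo rangeSize('10FFFF'), "\n";       // 1114112 values in U+0000 - U+10ffff
```

The last figure is the size of the range Unicode is actually restricted to, which is far smaller than the theoretical UTF-8 limit quoted above.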