From: John B. Matthews on 12 Jan 2010 13:49
In article <1vz3p9kbj0tqd$.dlg(a)kimmeringer.de>, Lothar Kimmeringer <news200709(a)kimmeringer.de> wrote:

> bugbear wrote:
>
> > Lothar Kimmeringer wrote:
> >> RedGrittyBrick wrote:
> >>
> >>> The output file contains a pound
> >>> sign encoded as code-point 0xa3 which is correct for UTF-8 and for
> >>> ISO-8859-1 Latin1.
> >>
> >> It surely isn't correct for UTF-8. You have missed the preceding
> >> 0xc2 or there is something wrong with your test.
> >
> > Depends whether you're talking about an encoding or a code point.
>
> The thread is about encoding and "RedGrittyBrick" says "encoded
> as", leading me to the assumption that his posting is as well.

I couldn't see anything wrong with RGB's statement, but I frequently stumble over the terminology. This is especially true in column A of the Latin-1 Supplement [1], where the code-point and code-value seem to coincide. I've broken it down to make sure I understand [2]. I'd welcome any corrections or clarifications.

Glyph: £ (pound sign)
Unicode code-point, escape: \u00a3
UCS-4/UTF-32 code-value, hex: 0xa3
UTF-8 encoding, no BOM, hex octets: c2a3

Given the UTF-8 octet sequence for the UCS-4 range 0000 0080-0000 07FF,

110xxxxx 10xxxxxx
-------- --------
   c2       a3    =
11000010 10100011
      10   100011 = 10100011 = a3

Mac users may like the desktop Calculator's "Programmer View", which conveniently displays ASCII or Unicode glyphs.

[1]<http://www.unicode.org/charts/PDF/U0080.pdf>
[2]<http://www.ietf.org/rfc/rfc2279.txt>

--
John B. Matthews
trashgod at gmail dot com
<http://sites.google.com/site/drjohnbmatthews>
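The breakdown above can be checked in a few lines of Java. This is only a minimal sketch, assuming Java 7 or later (for StandardCharsets); the class name PoundSignUtf8 and its two helper methods are made up for illustration. It encodes U+00A3 as UTF-8, then strips the 110 and 10 marker bits by hand to get the code point back.

```java
import java.nio.charset.StandardCharsets;

final class PoundSignUtf8 {

    /** Encodes text as UTF-8 and returns the octets as lowercase hex. */
    static String utf8Hex(String text) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /**
     * Recovers the code point from a two-octet sequence 110xxxxx 10xxxxxx.
     * Sketch only: it checks the marker bits but does not reject overlong forms.
     */
    static int decodeTwoOctets(int first, int second) {
        if ((first & 0xe0) != 0xc0 || (second & 0xc0) != 0x80) {
            throw new IllegalArgumentException("not a two-octet UTF-8 sequence");
        }
        int high = first & 0x1f;  // five payload bits from 110xxxxx
        int low = second & 0x3f;  // six payload bits from 10xxxxxx
        return (high << 6) | low;
    }

    public static void main(String[] args) {
        // UTF-8 octets of the pound sign, as hex
        System.out.println(utf8Hex("\u00a3"));
        // Code point recovered from those two octets, as hex
        System.out.println(Integer.toHexString(decodeTwoOctets(0xc2, 0xa3)));
        // In ISO-8859-1 the same character is a single octet
        byte[] latin1 = "\u00a3".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(latin1.length + " octet: " + Integer.toHexString(latin1[0] & 0xff));
    }
}
```

The octet a3 therefore appears in both encodings, but in UTF-8 it is only the second half of the pair c2 a3, which is the distinction the quoted posts are arguing about.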
From: Roedy Green on 12 Jan 2010 17:38
On Tue, 12 Jan 2010 01:25:50 -0800 (PST), loial <jldunn2000(a)googlemail.com> wrote, quoted or indirectly quoted someone who said :

>I am reading and writing a file which contains the U.K. pound sign £
>
>But it is not being written correctly to the output file, even though
>I am specifying UTF-8

See http://mindprod.com/jgloss/sscce.html
If you post a complete program, it is easier for people to help you. They don't then have to write a wrapper around your fragment to run your code.

For code to write/read a pound sign see http://mindprod.com/applet/fileio.html

For the pound sign use '\u00a3'. If you use the plain character, it may get scrambled if your source file is not UTF-8 too.

Recall that the console is likely not UTF-8, so displaying a result there will likely get scrambled.

See http://mindprod.com/jgloss/encoding.html

--
Roedy Green Canadian Mind Products
http://mindprod.com
There is no end to what can be accomplished if you don't care who gets the credit.
~ Art Rennison
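The advice above might look like this in practice. It is a minimal sketch, not the code from the linked page, and it assumes Java 7 or later (java.nio.file and StandardCharsets); the class name PoundRoundTrip and its methods are hypothetical. It writes the pound sign using the escape, names UTF-8 explicitly on both the write and the read, and checks the result by comparing strings and counting bytes rather than by printing the character to the console.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class PoundRoundTrip {

    /** Writes text to a file as UTF-8, never relying on the platform default encoding. */
    static void write(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    /** Reads the whole file back, decoding it as UTF-8. */
    static String read(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("pound", ".txt");
        try {
            // The escape keeps the source file pure ASCII, so the compiler's
            // source encoding cannot scramble the character.
            String original = "Price: " + '\u00a3' + "32.00";
            write(file, original);

            String copy = read(file);
            System.out.println("round trip ok: " + original.equals(copy));
            // One more byte than there are chars, because the pound sign takes two octets.
            System.out.println("chars: " + original.length() + ", bytes: " + Files.size(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If the reader and the writer name different encodings, or either one falls back to the platform default, the comparison is the first thing to fail, which makes it a more reliable check than looking at console output.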
From: Martin Gregorie on 12 Jan 2010 18:49
On Tue, 12 Jan 2010 18:40:43 +0000, rossum wrote:

> On Tue, 12 Jan 2010 01:25:50 -0800 (PST), loial
> <jldunn2000(a)googlemail.com> wrote:
>
>>I am reading and writing a file which contains the U.K. pound sign £
>>
>>But it is not being written correctly to the output file, even though I
>>am specifying UTF-8
>
> [snip code]
>
> One alternative is to use "GBP" instead, at least for output. How much
> control do you have over the format of the input files?
>
In a multicurrency financial program I'd expect to see the ISO currency codes used rather than currency symbols for both input and output. Many systems will accept further abbreviations too, e.g. "GBP 32.00" or "USD 1.5B", and I wouldn't expect drop-down currency lists to be used either, since entering a single field like those shown is faster than using a mouse to select from a currency list and then typing the amount.

--
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |
From: Roedy Green on 13 Jan 2010 02:17
On Tue, 12 Jan 2010 23:49:34 +0000 (UTC), Martin Gregorie <martin(a)address-in-sig.invalid> wrote, quoted or indirectly quoted someone who said :

>In a multicurrency financial program I'd expect to see the ISO currency
>codes used rather than currency symbols for both input and output.

This is the CSV file CurrCon uses to determine the currency symbol:

#currency abbr, decimals, currency symbol, currency name
AED, 2, \u00a4, Utd. Arab Emir. Dirham
AFA, 2, \u00a4, Afghanistan Afghani
ALL, 2, \u00a4, Albanian Lek
ANG, 2, \u00a4, NL Antillian Guilders
AON, 2, \u00a4, Angolan New Kwanza
ARS, 2, \u20b1, Argentine Pesos
AUD, 2, $, Australian Dollars
AWG, 2, \u00a4, Aruban Florins
BBD, 2, $, Barbados Dollars
BDT, 2, \u00a4, Bangladeshi Taka
BGL, 2, \u00a4, Bulgarian Lev
BHD, 2, \u00a4, Bahraini Dinars
BIF, 0, \u20a3, Burundi Francs
BMD, 2, $, Bermudian Dollars
BND, 2, $, Brunei Dollars
BOB, 2, \u00a4, Bolivian Boliviano
BRL, 2, \u20a2, Brazilian Real
BSD, 2, $, Bahamanian Dollars
BTN, 2, \u00a4, Bhutan Ngultrum
BWP, 2, \u00a4, Botswana Pula
BZD, 2, $, Belize Dollars
CAD, 2, $, Canadian Dollars
CHF, 2, \u20a3, Swiss Francs
CLP, 0, \u20b1, Chilean Pesos
CNY, 2, \u00a4, Chinese Yuan Renminbi
COP, 2, \u20b1, Colombian Pesos
CRC, 2, \u20a1, Costa Rican Colon
CSK, 2, \u00a4, Czech Koruna
CUP, 2, \u20b1, Cuban Pesos
CVE, 2, \u00a4, Cape Verde Escudos
CYP, 2, \u00a3, Cyprus Pound
DJF, 0, \u20a3, Djibouti Francs
DKK, 2, \u00a4, Danish Krone
DOP, 2, \u20b1, Dominican R. Pesos
DZD, 2, \u00a4, Algerian Dinars
ECS, 0, \u00a4, Ecuador Sucre
EEK, 2, \u00a4, Estonian Kroon
EGP, 2, \u00a3, Egyptian Pounds
ETB, 2, \u00a4, Ethiopian Birr
EUR, 2, \u20ac, Euros
FJD, 2, $, Fiji Dollars
FKP, 2, \u00a3, Falkland Islands Pounds
GBP, 2, \u00a3, British Pounds
GHC, 2, \u20b5, Ghanaian Cedi
GIP, 2, \u00a3, Gibraltar Pounds
GMD, 2, \u00a4, Gambian Dalasi
GNF, 0, \u20a3, Guinea Francs
GTQ, 2, \u00a4, Guatemalan Quetzal
GYD, 2, $, Guyanese Dollars
HKD, 2, $, Hong Kong Dollars
HNL, 2, \u00a4, Honduran Lempira
HRK, 2, \u00a4, Croatian Kuna
HTG, 2, \u00a4, Haitian Gourde
HUF, 2, \u00a4, Hungarian Forint
IDR, 2, \u00a4, Indonesian Rupiah
ILS, 2, \u20aa, Israeli New Shekels
INR, 2, \u20a8, Indian Rupee
IRR, 2, \ufdfc, Iranian Rial
ISK, 2, \u00a4, Iceland Krona
JMD, 2, $, Jamaican Dollars
JOD, 2, \u00a4, Jordanian Dinars
JPY, 0, \u00a5, Japanese Yen
KES, 2, \u00a4, Kenyan Shillings
KHR, 2, \u17db, Cambodian Riel
KMF, 0, \u20a3, Comoros Francs
KPW, 2, \u20a9, North Korean Won
KRW, 0, \u20a9, South-Korean Won
KWD, 2, \u00a4, Kuwaiti Dinar
KYD, 2, $, Cayman Islands Dollars
KZT, 2, \u00a4, Kazakhstan Tenge
LAK, 2, \u20ad, Lao Kip
LBP, 2, \u00a3, Lebanese Pounds
LKR, 2, \u00a4, Sri Lanka Rupees
LRD, 2, $, Liberian Dollars
LSL, 2, \u00a4, Lesotho Loti
LTL, 2, \u00a4, Lithuanian Litas
LVL, 2, \u00a4, Latvian Lats
LYD, 2, \u00a4, Libyan Dinar
MAD, 2, \u00a4, Moroccan Dirham
MGF, 0, \u20a3, Malagasy Francs
MMK, 2, \u00a4, Myanmar Kyat
MNT, 2, \u20ae, Mongolian Tugrik
MOP, 2, \u00a4, Macau Pataca
MRO, 2, \u00a4, Mauritanian Ouguiya
MTL, 2, \u20a4, Maltese Lira
MUR, 2, \u00a4, Mauritius Rupee
MVR, 2, \u00a4, Maldive Rufiyaa
MWK, 2, \u00a4, Malawi Kwacha
MXP, 2, \u20b1, Mexican Pesos
MYR, 2, \u00a4, Malaysian Ringgit
MZM, 2, \u00a4, Mozambique Metical
NAD, 2, $, Namibia Dollars
NGN, 2, \u20a6, Nigerian Naira
NIO, 2, \u00a4, Nicaraguan Cordoba Oro
NOK, 2, \u00a4, Norwegian Kroner
NPR, 2, \u00a4, Nepalese Rupees
NZD, 2, $, New Zealand Dollars
OMR, 2, \ufdfc, Omani Rial
PAB, 2, \u00a4, Panamanian Balboa
PEN, 2, \u00a4, Peruvian Nuevo Sol
PGK, 2, \u00a4, Papua New Guinea Kina
PHP, 2, \u20b1, Philippine Pesos
PKR, 2, \u00a4, Pakistan Rupee
PLZ, 2, \u00a4, Polish Zloty
PYG, 0, \u20b2, Paraguay Guarani
QAR, 2, \ufdfc, Qatari Rial
ROL, 2, \u00a4, Romanian Leu
RSD, 0, \u00a4, Serbian dinar
RUB, 2, \u00a4, Russian Roubles
SAR, 2, \u00a4, Saudi Riyal
SBD, 2, $, Solomon Islands Dollars
SCR, 2, \u00a4, Seychelles Rupees
SDD, 2, \u00a4, Sudanese Dinars
SEK, 2, \u00a4, Swedish Krona
SGD, 2, $, Singapore Dollars
SHP, 2, \u00a3, St. Helena Pounds
SIT, 2, \u00a4, Slovenian Tolar
SLL, 2, \u00a4, Sierra Leone Leone
SOS, 2, \u00a4, Somali Shillings
SRG, 2, \u00a4, Suriname Guilder
STD, 2, \u00a4, Sao Tome/Principe Dobra
SVC, 2, \u20a1, El Salvador Colon
SYP, 2, \u00a3, Syrian Pounds
SZL, 2, \u00a4, Swaziland Lilangeni
THB, 2, \u0e3f, Thai Baht
TND, 2, \u00a4, Tunisian Dinars
TOP, 2, \u00a4, Tonga Pa'anga
TRL, 0, \u20a4, Turkish Lira
TTD, 2, $, Trinidad/Tobago Dollars
TWD, 2, $, Taiwan Dollars
TZS, 2, \u00a4, Tanzanian Shillings
UAH, 2, \u20b4, Ukraine Hryvnia
UGS, 2, \u00a4, Uganda Shillings
USD, 2, $, US Dollars
UYP, 2, \u20b1, Uruguayan Pesos
VEB, 2, \u00a4, Venezuelan Bolivar
VND, 2, \u20ab, Vietnamese Dong
VUV, 0, \u00a4, Vanuatu Vatu
WST, 2, \u00a4, Samoan Tala
XAF, 0, \u20a3, CFA Franc BEAC
XCD, 2, $, East Caribbean Dollars
XOF, 0, \u20a3, CFA Franc BCEAO
XAG, 2, \u0020, Silver (oz.)
XAU, 3, \u0020, gold (oz.)
XPT, 3, \u0020, platinum (oz.)
YER, 2, \ufdfc, Yemeni Rial
YUN, 2, \u00a4, Yugoslav Dinars
ZAR, 2, \u00a4, South African Rand
ZMK, 2, \u00a4, Zambian Kwacha
ZWD, 2, $, Zimbabwe Dollars

--
Roedy Green Canadian Mind Products
http://mindprod.com
There is no end to what can be accomplished if you don't care who gets the credit.
~ Art Rennison
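A table in this format is easy to load. The following is a rough sketch of one way to do it, not CurrCon's actual parser, which is not shown in the thread. It assumes one record per line, four comma-separated fields, '#' starting a comment line, and a symbol field that is either a literal character or a six-character backslash-u escape. The names CurrencyTable, Row, decodeSymbol, parseLine and parse are all hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CurrencyTable {

    /** One record: ISO code, decimal places, symbol, descriptive name. */
    static final class Row {
        final String code;
        final int decimals;
        final String symbol;
        final String name;

        Row(String code, int decimals, String symbol, String name) {
            this.code = code;
            this.decimals = decimals;
            this.symbol = symbol;
            this.name = name;
        }
    }

    /** Turns a six-character backslash-u escape into its character; any other text is returned as is. */
    static String decodeSymbol(String field) {
        if (field.length() == 6 && field.startsWith("\\u")) {
            return String.valueOf((char) Integer.parseInt(field.substring(2), 16));
        }
        return field;
    }

    /** Parses one data line; the limit of 4 keeps any commas inside the name field. */
    static Row parseLine(String line) {
        String[] parts = line.split(",", 4);
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 fields: " + line);
        }
        return new Row(parts[0].trim(), Integer.parseInt(parts[1].trim()),
                decodeSymbol(parts[2].trim()), parts[3].trim());
    }

    /** Builds a lookup keyed by ISO code, skipping blank lines and '#' comment lines. */
    static Map<String, Row> parse(Iterable<String> lines) {
        Map<String, Row> table = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty() || line.startsWith("#")) {
                continue;
            }
            Row row = parseLine(line);
            table.put(row.code, row);
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "#currency abbr, decimals, currency symbol, currency name",
                "GBP, 2, \\u00a3, British Pounds",
                "JPY, 0, \\u00a5, Japanese Yen",
                "USD, 2, $, US Dollars");
        Row gbp = parse(sample).get("GBP");
        // Print the symbol as a hex code point, since the console may not show it correctly.
        System.out.println(gbp.code + " " + gbp.decimals + " "
                + Integer.toHexString(gbp.symbol.charAt(0)) + " " + gbp.name);
    }
}
```

Keeping the symbols as escapes in the data file has the same benefit as using them in source code: the file stays pure ASCII, so it survives being opened or saved in the wrong encoding.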
From: Martin Gregorie on 13 Jan 2010 08:08
On Tue, 12 Jan 2010 23:17:33 -0800, Roedy Green wrote:

> On Tue, 12 Jan 2010 23:49:34 +0000 (UTC), Martin Gregorie
> <martin(a)address-in-sig.invalid> wrote, quoted or indirectly quoted
> someone who said :
>
>>In a multicurrency financial program I'd expect to see the ISO currency
>>codes used rather than currency symbols for both input and output.
>
> This is the CSV file CurrCon uses to determine the currency symbol:
>
> #currency abbr, decimals, currency symbol, currency name
>
Yes, that's the info you'd need, but it's faster for the user if it's used to validate the input string after entry, rather than to produce a long, scrollable selection list. You need the decimal place info both for validation and to expand abbreviations like 1.5M correctly.

I'm intrigued to see that there are no longer any currencies with three decimal places. Some years back a few Middle Eastern currencies used them.

--
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |
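Validating a single entry field along these lines could be sketched as follows. This is an illustration under stated assumptions rather than any real system's rules: the entry is taken to be an ISO code, whitespace, then an amount; M is assumed to mean 10^6 and B to mean 10^9; and the DECIMALS map is a three-entry stand-in for the full table. The class name AmountEntry is hypothetical.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class AmountEntry {

    /** Decimal places per ISO code; a tiny sample standing in for the full table. */
    static final Map<String, Integer> DECIMALS = new HashMap<>();
    static {
        DECIMALS.put("GBP", 2);
        DECIMALS.put("USD", 2);
        DECIMALS.put("JPY", 0);
    }

    /**
     * Parses entries such as "GBP 32.00" or "USD 1.5B" into an exact amount
     * carrying the currency's number of decimal places.
     */
    static BigDecimal parse(String entry) {
        String[] parts = entry.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected '<code> <amount>': " + entry);
        }
        Integer decimals = DECIMALS.get(parts[0].toUpperCase(Locale.ROOT));
        if (decimals == null) {
            throw new IllegalArgumentException("unknown currency: " + parts[0]);
        }

        // Strip an optional magnitude suffix (assumed meanings: M = 10^6, B = 10^9).
        String digits = parts[1].toUpperCase(Locale.ROOT);
        int power = 0;
        char last = digits.charAt(digits.length() - 1);
        if (last == 'M') {
            power = 6;
        } else if (last == 'B') {
            power = 9;
        }
        if (power > 0) {
            digits = digits.substring(0, digits.length() - 1);
        }

        BigDecimal amount = new BigDecimal(digits).scaleByPowerOfTen(power);
        // setScale without a rounding mode throws ArithmeticException when the
        // entry has more decimal places than the currency allows.
        return amount.setScale(decimals);
    }

    public static void main(String[] args) {
        System.out.println(parse("GBP 32.00").toPlainString());
        System.out.println(parse("USD 1.5B").toPlainString());
        try {
            parse("JPY 1.5");
        } catch (ArithmeticException e) {
            System.out.println("rejected: JPY has no decimal places");
        }
    }
}
```

The expansion is applied before the decimal check so that an entry like "JPY 1.5M" is accepted: it is a whole number of yen once expanded, while a bare "JPY 1.5" is not.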