From: John B. Matthews on
In article <1vz3p9kbj0tqd$.dlg(a)kimmeringer.de>,
Lothar Kimmeringer <news200709(a)kimmeringer.de> wrote:

> bugbear wrote:
>
> > Lothar Kimmeringer wrote:
> >> RedGrittyBrick wrote:
> >>
> >>> The output file contains a pound
> >>> sign encoded as code-point 0xa3 which is correct for UTF-8 and for
> >>> ISO-8859-1 Latin1.
> >>
> >> It surely isn't correct for UTF-8. You have missed the preceding
> >> 0xc2 or there is something wrong with your test.
> >
> > Depends whether you're talking about an encoding or a code point.
>
> The thread is about encoding and "RedGrittyBrick" says "encoded
> as", leading me to the assumption that his posting is as well.

I couldn't see anything wrong with RGB's statement, but I frequently
stumble over the terminology. This is especially true in column A of the
Latin-1 Supplement [1], where the code-point and code-value seem to
coincide. I've broken it down to make sure I understand [2]. I'd welcome
any corrections or clarifications.

Glyph: £ (pound sign)
Unicode code-point, escape: \u00a3
UCS-4/UTF-32 code-value, hex: 0xa3
UTF-8 encoding, no BOM, hex octets: c2a3

Given the UTF-8 octet sequence for the UCS-4 range 0000 0080-0000 07FF,

          110xxxxx 10xxxxxx
          -------- --------
  c2 a3 = 11000010 10100011
                10   100011 = 10100011 = a3
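
As a cross-check of the breakdown above, here is a minimal Java sketch. The class and method names are made up for illustration, and it assumes Java 7 or later for StandardCharsets. It prints the octets Java produces for the pound sign in UTF-8 and in ISO-8859-1, then undoes the two-octet UTF-8 form by hand.

```java
import java.nio.charset.StandardCharsets;

class PoundEncoding {
    /** Formats bytes as lowercase hex, two digits per byte. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Decodes a two-octet UTF-8 sequence of the form 110xxxxx 10xxxxxx. */
    static int decodeTwoOctets(int first, int second) {
        // Keep the low 5 bits of the first octet and the low 6 of the second.
        return ((first & 0x1f) << 6) | (second & 0x3f);
    }

    public static void main(String[] args) {
        String pound = "\u00a3";
        System.out.println(hex(pound.getBytes(StandardCharsets.UTF_8)));      // c2a3
        System.out.println(hex(pound.getBytes(StandardCharsets.ISO_8859_1))); // a3
        System.out.println(Integer.toHexString(decodeTwoOctets(0xc2, 0xa3))); // a3
    }
}
```

So a lone 0xa3 octet is the ISO-8859-1 encoding, while 0xa3 as a number is the code point in either case.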

Mac users may like the desktop Calculator's "Programmer View", which
conveniently displays ASCII or Unicode glyphs.

[1]<http://www.unicode.org/charts/PDF/U0080.pdf>
[2]<http://www.ietf.org/rfc/rfc2279.txt>

--
John B. Matthews
trashgod at gmail dot com
<http://sites.google.com/site/drjohnbmatthews>
From: Roedy Green on
On Tue, 12 Jan 2010 01:25:50 -0800 (PST), loial
<jldunn2000(a)googlemail.com> wrote, quoted or indirectly quoted someone
who said :

>I am reading and writing a file which contains the U.K. pound sign £
>
>But it is not being written correctly to the output file, even though
>I am specifying UTF-8

See http://mindprod.com/jgloss/sscce.html

If you post a complete program, it is easier for people to help you.
They don't then have to write a sandwich to run your code.

For code to write/read a pound sign see
http://mindprod.com/applet/fileio.html

For the pound sign use '\u00a3'. If you type the character literally, it
may get scrambled if the compiler does not read your source file in the
encoding it was saved in.

Recall that the console is likely not UTF-8, so a result displayed
there will likely be scrambled.
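
A minimal sketch of that advice, not the code from the links above. The class name and temp-file name are invented, and it assumes Java 7 or later. The point is to name the charset explicitly on both the writer and the reader, and to check the result by comparing strings and counting bytes rather than by looking at the console.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class PoundFileRoundTrip {
    /** Writes text to the file as UTF-8, ignoring the platform default. */
    static void write(File file, String text) throws IOException {
        try (Writer out = new OutputStreamWriter(
                new FileOutputStream(file), StandardCharsets.UTF_8)) {
            out.write(text);
        }
    }

    /** Reads the whole file back, decoding it as UTF-8. */
    static String read(File file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader in = new BufferedReader(new InputStreamReader(
                new FileInputStream(file), StandardCharsets.UTF_8))) {
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("pound", ".txt");
        file.deleteOnExit();
        String text = "\u00a3 32.00"; // escape form survives any source encoding
        write(file, text);
        System.out.println(read(file).equals(text)); // true
        System.out.println(file.length());           // 8: c2 a3 plus six ASCII bytes
    }
}
```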

see http://mindprod.com/jgloss/encoding.html
--
Roedy Green Canadian Mind Products
http://mindprod.com
There is no end to what can be accomplished if you don't care who gets the credit.
~ Art Rennison
From: Martin Gregorie on
On Tue, 12 Jan 2010 18:40:43 +0000, rossum wrote:

> On Tue, 12 Jan 2010 01:25:50 -0800 (PST), loial
> <jldunn2000(a)googlemail.com> wrote:
>
>>I am reading and writing a file which contains the U.K. pound sign £
>>
>>But it is not being written correctly to the output file, even though I
>>am specifying UTF-8
>
> [snip code]
>
> One alternative is to use "GBP" instead, at least for output. How much
> control do you have over the format of the input files?
>
In a multicurrency financial program I'd expect to see the ISO currency
codes used rather than currency symbols for both input and output.

Many systems will accept further abbreviations too, e.g. "GBP 32.00" or
"USD 1.5B", and I wouldn't expect drop-down currency lists to be used
either, since entering a single field like those shown is faster than
using a mouse to select from a currency list and then typing the amount.
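
To make the single-field idea concrete, here is a rough Java sketch of parsing entries like the two examples above. The class name, the regex and the choice of K/M/B as the accepted suffixes are my assumptions, not taken from any particular system.

```java
import java.math.BigDecimal;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MoneyInput {
    // Three-letter code, whitespace, a plain decimal number, optional suffix.
    private static final Pattern FORMAT =
            Pattern.compile("([A-Z]{3})\\s+(\\d+(?:\\.\\d+)?)([KMB]?)");

    final String code;
    final BigDecimal amount;

    private MoneyInput(String code, BigDecimal amount) {
        this.code = code;
        this.amount = amount;
    }

    /** Parses entries such as "GBP 32.00" or "USD 1.5B". */
    static MoneyInput parse(String input) {
        Matcher m = FORMAT.matcher(input.trim().toUpperCase(Locale.ROOT));
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a currency amount: " + input);
        }
        BigDecimal amount = new BigDecimal(m.group(2));
        switch (m.group(3)) { // empty string when no suffix was typed
            case "K": amount = amount.scaleByPowerOfTen(3); break;
            case "M": amount = amount.scaleByPowerOfTen(6); break;
            case "B": amount = amount.scaleByPowerOfTen(9); break;
            default: break;
        }
        return new MoneyInput(m.group(1), amount);
    }

    public static void main(String[] args) {
        MoneyInput a = parse("GBP 32.00");
        MoneyInput b = parse("USD 1.5B");
        System.out.println(a.code + " " + a.amount.toPlainString()); // GBP 32.00
        System.out.println(b.code + " " + b.amount.toPlainString()); // USD 1500000000
    }
}
```

Checking the code against the list of known ISO codes is left out here; the sketch only covers splitting the field and expanding the suffix.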


--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |
From: Roedy Green on
On Tue, 12 Jan 2010 23:49:34 +0000 (UTC), Martin Gregorie
<martin(a)address-in-sig.invalid> wrote, quoted or indirectly quoted
someone who said :

>In a multicurrency financial program I'd expect to see the ISO currency
>codes used rather than currency symbols for both input and output.

This is the CSV file that CurrCon uses to determine the currency symbol:

#currency abbr, decimals, currency symbol, currency name
AED, 2, \u00a4, Utd. Arab Emir. Dirham
AFA, 2, \u00a4, Afghanistan Afghani
ALL, 2, \u00a4, Albanian Lek
ANG, 2, \u00a4, NL Antillian Guilders
AON, 2, \u00a4, Angolan New Kwanza
ARS, 2, \u20b1, Argentine Pesos
AUD, 2, $, Australian Dollars
AWG, 2, \u00a4, Aruban Florins
BBD, 2, $, Barbados Dollars
BDT, 2, \u00a4, Bangladeshi Taka
BGL, 2, \u00a4, Bulgarian Lev
BHD, 2, \u00a4, Bahraini Dinars
BIF, 0, \u20a3, Burundi Francs
BMD, 2, $, Bermudian Dollars
BND, 2, $, Brunei Dollars
BOB, 2, \u00a4, Bolivian Boliviano
BRL, 2, \u20a2, Brazilian Real
BSD, 2, $, Bahamanian Dollars
BTN, 2, \u00a4, Bhutan Ngultrum
BWP, 2, \u00a4, Botswana Pula
BZD, 2, $, Belize Dollars
CAD, 2, $, Canadian Dollars
CHF, 2, \u20a3, Swiss Francs
CLP, 0, \u20b1, Chilean Pesos
CNY, 2, \u00a4, Chinese Yuan Renminbi
COP, 2, \u20b1, Colombian Pesos
CRC, 2, \u20a1, Costa Rican Colon
CSK, 2, \u00a4, Czech Koruna
CUP, 2, \u20b1, Cuban Pesos
CVE, 2, \u00a4, Cape Verde Escudos
CYP, 2, \u00a3, Cyprus Pound
DJF, 0, \u20a3, Djibouti Francs
DKK, 2, \u00a4, Danish Krone
DOP, 2, \u20b1, Dominican R. Pesos
DZD, 2, \u00a4, Algerian Dinars
ECS, 0, \u00a4, Ecuador Sucre
EEK, 2, \u00a4, Estonian Kroon
EGP, 2, \u00a3, Egyptian Pounds
ETB, 2, \u00a4, Ethiopian Birr
EUR, 2, \u20ac, Euros
FJD, 2, $, Fiji Dollars
FKP, 2, \u00a3, Falkland Islands Pounds
GBP, 2, \u00a3, British Pounds
GHC, 2, \u20b5, Ghanaian Cedi
GIP, 2, \u00a3, Gibraltar Pounds
GMD, 2, \u00a4, Gambian Dalasi
GNF, 0, \u20a3, Guinea Francs
GTQ, 2, \u00a4, Guatemalan Quetzal
GYD, 2, $, Guyanese Dollars
HKD, 2, $, Hong Kong Dollars
HNL, 2, \u00a4, Honduran Lempira
HRK, 2, \u00a4, Croatian Kuna
HTG, 2, \u00a4, Haitian Gourde
HUF, 2, \u00a4, Hungarian Forint
IDR, 2, \u00a4, Indonesian Rupiah
ILS, 2, \u20aa, Israeli New Shekels
INR, 2, \u20a8, Indian Rupee
IRR, 2, \ufdfc, Iranian Rial
ISK, 2, \u00a4, Iceland Krona
JMD, 2, $, Jamaican Dollars
JOD, 2, \u00a4, Jordanian Dinars
JPY, 0, \u00a5, Japanese Yen
KES, 2, \u00a4, Kenyan Shillings
KHR, 2, \u17db, Cambodian Riel
KMF, 0, \u20a3, Comoros Francs
KPW, 2, \u20a9, North Korean Won
KRW, 0, \u20a9, South-Korean Won
KWD, 2, \u00a4, Kuwaiti Dinar
KYD, 2, $, Cayman Islands Dollars
KZT, 2, \u00a4, Kazakhstan Tenge
LAK, 2, \u20ad, Lao Kip
LBP, 2, \u00a3, Lebanese Pounds
LKR, 2, \u00a4, Sri Lanka Rupees
LRD, 2, $, Liberian Dollars
LSL, 2, \u00a4, Lesotho Loti
LTL, 2, \u00a4, Lithuanian Litas
LVL, 2, \u00a4, Latvian Lats
LYD, 2, \u00a4, Libyan Dinar
MAD, 2, \u00a4, Moroccan Dirham
MGF, 0, \u20a3, Malagasy Francs
MMK, 2, \u00a4, Myanmar Kyat
MNT, 2, \u20ae, Mongolian Tugrik
MOP, 2, \u00a4, Macau Pataca
MRO, 2, \u00a4, Mauritanian Ouguiya
MTL, 2, \u20a4, Maltese Lira
MUR, 2, \u00a4, Mauritius Rupee
MVR, 2, \u00a4, Maldive Rufiyaa
MWK, 2, \u00a4, Malawi Kwacha
MXP, 2, \u20b1, Mexican Pesos
MYR, 2, \u00a4, Malaysian Ringgit
MZM, 2, \u00a4, Mozambique Metical
NAD, 2, $, Namibia Dollars
NGN, 2, \u20a6, Nigerian Naira
NIO, 2, \u00a4, Nicaraguan Cordoba Oro
NOK, 2, \u00a4, Norwegian Kroner
NPR, 2, \u00a4, Nepalese Rupees
NZD, 2, $, New Zealand Dollars
OMR, 2, \ufdfc, Omani Rial
PAB, 2, \u00a4, Panamanian Balboa
PEN, 2, \u00a4, Peruvian Nuevo Sol
PGK, 2, \u00a4, Papua New Guinea Kina
PHP, 2, \u20b1, Philippine Pesos
PKR, 2, \u00a4, Pakistan Rupee
PLZ, 2, \u00a4, Polish Zloty
PYG, 0, \u20b2, Paraguay Guarani
QAR, 2, \ufdfc, Qatari Rial
ROL, 2, \u00a4, Romanian Leu
RSD, 0, \u00a4, Serbian dinar
RUB, 2, \u00a4, Russian Roubles
SAR, 2, \u00a4, Saudi Riyal
SBD, 2, $, Solomon Islands Dollars
SCR, 2, \u00a4, Seychelles Rupees
SDD, 2, \u00a4, Sudanese Dinars
SEK, 2, \u00a4, Swedish Krona
SGD, 2, $, Singapore Dollars
SHP, 2, \u00a3, St. Helena Pounds
SIT, 2, \u00a4, Slovenian Tolar
SLL, 2, \u00a4, Sierra Leone Leone
SOS, 2, \u00a4, Somali Shillings
SRG, 2, \u00a4, Suriname Guilder
STD, 2, \u00a4, Sao Tome/Principe Dobra
SVC, 2, \u20a1, El Salvador Colon
SYP, 2, \u00a3, Syrian Pounds
SZL, 2, \u00a4, Swaziland Lilangeni
THB, 2, \u0e3f, Thai Baht
TND, 2, \u00a4, Tunisian Dinars
TOP, 2, \u00a4, Tonga Pa'anga
TRL, 0, \u20a4, Turkish Lira
TTD, 2, $, Trinidad/Tobago Dollars
TWD, 2, $, Taiwan Dollars
TZS, 2, \u00a4, Tanzanian Shillings
UAH, 2, \u20b4, Ukraine Hryvnia
UGS, 2, \u00a4, Uganda Shillings
USD, 2, $, US Dollars
UYP, 2, \u20b1, Uruguayan Pesos
VEB, 2, \u00a4, Venezuelan Bolivar
VND, 2, \u20ab, Vietnamese Dong
VUV, 0, \u00a4, Vanuatu Vatu
WST, 2, \u00a4, Samoan Tala
XAF, 0, \u20a3, CFA Franc BEAC
XCD, 2, $, East Caribbean Dollars
XOF, 0, \u20a3, CFA Franc BCEAO
XAG, 2, \u0020, Silver (oz.)
XAU, 3, \u0020, Gold (oz.)
XPT, 3, \u0020, Platinum (oz.)
YER, 2, \ufdfc, Yemeni Rial
YUN, 2, \u00a4, Yugoslav Dinars
ZAR, 2, \u00a4, South African Rand
ZMK, 2, \u00a4, Zambian Kwacha
ZWD, 2, $, Zimbabwe Dollars
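
For anyone wanting to read a file in this layout, here is a small Java sketch of parsing one row. It is not CurrCon's actual loader. The class and field names are invented, and it assumes the symbol column holds either a literal character or a backslash-u escape with four hex digits, as in the rows above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CurrencyRow {
    // Matches a literal backslash, the letter u, and four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    final String code;
    final int decimals;
    final String symbol;
    final String name;

    private CurrencyRow(String code, int decimals, String symbol, String name) {
        this.code = code;
        this.decimals = decimals;
        this.symbol = symbol;
        this.name = name;
    }

    /** Replaces each textual escape in the field with the character it names. */
    static String unescape(String field) {
        Matcher m = ESCAPE.matcher(field);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against characters such as a dollar sign.
            m.appendReplacement(sb, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    /** Parses one data row: abbreviation, decimals, symbol, name. */
    static CurrencyRow parse(String line) {
        String[] f = line.split(",", 4); // limit 4 keeps any commas in the name
        if (f.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields: " + line);
        }
        return new CurrencyRow(f[0].trim(), Integer.parseInt(f[1].trim()),
                unescape(f[2].trim()), f[3].trim());
    }

    public static void main(String[] args) {
        CurrencyRow gbp = parse("GBP, 2, \\u00a3, British Pounds");
        System.out.println(gbp.code + " " + gbp.decimals + " " + gbp.name);
        System.out.println(Integer.toHexString(gbp.symbol.charAt(0))); // a3
    }
}
```

Lines starting with # would need to be skipped by the caller before parse is applied.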

--
Roedy Green Canadian Mind Products
http://mindprod.com
There is no end to what can be accomplished if you don't care who gets the credit.
~ Art Rennison
From: Martin Gregorie on
On Tue, 12 Jan 2010 23:17:33 -0800, Roedy Green wrote:

> On Tue, 12 Jan 2010 23:49:34 +0000 (UTC), Martin Gregorie
> <martin(a)address-in-sig.invalid> wrote, quoted or indirectly quoted
> someone who said :
>
>>In a multicurrency financial program I'd expect to see the ISO currency
>>codes used rather than currency symbols for both input and output.
>
> This is CSV file the CurrCon uses to determine the currency symbol:
>
> #currency abbr, decimals, currency symbol, currency name
>
Yes, that's the info you'd need, but it's faster for the user if it's used
to validate the input string after entry, rather than to produce a long,
scrollable selection list. You need the decimal place info for both
validation and to expand abbreviations like 1.5M correctly. I'm intrigued
to see that there are no longer any currencies with three decimal places.
Some years back a few middle eastern currencies used them.
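
A sketch of how the decimals column could drive that validation, in Java. The class and method names are hypothetical. The idea is to bring an already expanded amount to the currency's number of decimal places and reject it if that would need rounding.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DecimalCheck {
    /**
     * Returns the amount with exactly the given number of decimal places,
     * or throws if the amount carries more precision than the currency allows.
     */
    static BigDecimal normalise(BigDecimal amount, int decimals) {
        try {
            // UNNECESSARY makes setScale throw instead of silently rounding.
            return amount.setScale(decimals, RoundingMode.UNNECESSARY);
        } catch (ArithmeticException e) {
            throw new IllegalArgumentException(
                    "Too many decimal places: " + amount.toPlainString());
        }
    }

    public static void main(String[] args) {
        // "1.5M" entered for a two-decimal currency, after suffix expansion.
        BigDecimal expanded = new BigDecimal("1.5").scaleByPowerOfTen(6);
        System.out.println(normalise(expanded, 2).toPlainString()); // 1500000.00

        try {
            normalise(new BigDecimal("100.5"), 0); // zero-decimal currency
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Too many decimal places: 100.5
        }
    }
}
```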


--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |