From: Helmut Meukel on 25 May 2010 16:47

"Dee Earley" <dee.earley(a)icode.co.uk> wrote in message
news:ewwI6BA$KHA.4652(a)TK2MSFTNGP06.phx.gbl...
> On 23/05/2010 10:35, Jeff Caton wrote:
>> I have to make an import function for a different program whose string
>> encoding/decoding I don't really understand yet.
>> Some parts of the encoded string are ASCII, but some are not.
>> For example the German character "Ö" is encoded by 5C (I looked it up in
>> a hex editor), which would be 92. I don't have any idea which encoding
>> they could have used to get the value 92 for this character.
>> Any ideas?
>
> 92 is decimal, 5C is the hex value of 92.
> Both of these however are the \ character.
>
> --
> Dee Earley (dee.earley(a)icode.co.uk)
> i-Catcher Development Team

Dee, didn't you read my answer?

It's 7-bit ISO-646. The original ASCII set was coded with 7 bits plus a parity bit and had no room for national European characters. IIRC, up to 10 positions in the ASCII code (Hex 5B to 5F and Hex 7B to 7F) were therefore reserved for national characters.

The German version of the 7-bit code used only 8 of these 10 positions (ISO-646-DE). It was 5B -> Ä, 5C -> Ö, 5D -> Ü, and 7B to 7D for the lower case ä, ö, ü. I can't remember which Hex values were used for ß and §.

But that left us without [, \, ], {, |, }, and two other signs. To overcome this, most IT people ignored the ISO set and used plain ASCII, substituting Ä with Ae, ä with ae and so on.

In cases where it mattered, you had to use a printer with at least 2 built-in character sets: same font, but one ASCII, the other German ISO. Into the text to print you had to insert non-printable control codes to switch between the two sets, so you could print [ and Ä in the same text. Usually the control codes SO (shift-out) = Hex 0E and SI (shift-in) = Hex 0F were used for switching between the sets.

E.g. Hex 5D is ] in ISO-646, Ü in ISO-646-DE, Å in ISO-646-DK.
Then came IBM with its PC and its extended ASCII (8-bit, no parity) character set and finally the code pages (437, 850, ...), while the non-IBM mainframes and minicomputers still used 7-bit character sets. IBM mainframes used EBCDIC.

Helmut.
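For the import function asked about at the top of the thread, the ISO-646-DE mapping can be sketched as a small lookup table. This is only a minimal sketch, not the original program's decoder: the function names are made up for illustration, the positions 0x40 (§) and 0x7E (ß) are taken from the DIN 66003 table rather than from the post, and the direction of the SO/SI switch (SO selects the German set, SI returns to ASCII) is an assumption that has to be checked against real data.

```python
# National positions of ISO-646-DE (DIN 66003), keyed by 7-bit code.
# 0x5B-0x5D and 0x7B-0x7D are the values given in the post; 0x40 and
# 0x7E come from the DIN 66003 table and are worth double-checking.
ISO646_DE = {
    0x40: "§",
    0x5B: "Ä", 0x5C: "Ö", 0x5D: "Ü",
    0x7B: "ä", 0x7C: "ö", 0x7D: "ü",
    0x7E: "ß",
}

SO, SI = 0x0E, 0x0F  # shift-out / shift-in control codes


def decode_iso646_de(data: bytes) -> str:
    """Decode 7-bit data in which every national position is German."""
    # Bytes without a table entry are passed through as their code point.
    return "".join(ISO646_DE.get(b, chr(b)) for b in data)


def decode_with_shifts(data: bytes) -> str:
    """Decode data that switches character sets with SO/SI.

    Assumption: SO selects the German set, SI returns to plain ASCII.
    The control codes themselves are dropped from the output.
    """
    out = []
    german = False
    for b in data:
        if b == SO:
            german = True
        elif b == SI:
            german = False
        elif german:
            out.append(ISO646_DE.get(b, chr(b)))
        else:
            out.append(chr(b))
    return "".join(out)


if __name__ == "__main__":
    print(decode_iso646_de(bytes.fromhex("4B5C4C4E")))  # 4B 5C 4C 4E
    print(decode_with_shifts(b"[\x0e[\x0f]"))
```

If the file only uses the German set for some fields, as the original question suggests, the plain `decode_iso646_de` would have to be applied to those fields only; otherwise genuine brackets and backslashes would be turned into umlauts.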
From: Dee Earley on 26 May 2010 06:19

On 25/05/2010 21:47, Helmut Meukel wrote:
> "Dee Earley" <dee.earley(a)icode.co.uk> wrote in message
> news:ewwI6BA$KHA.4652(a)TK2MSFTNGP06.phx.gbl...
>> On 23/05/2010 10:35, Jeff Caton wrote:
>>> I have to make an import function for a different program whose string
>>> encoding/decoding I don't really understand yet.
>>> Some parts of the encoded string are ASCII, but some are not.
>>> For example the German character "Ö" is encoded by 5C (I looked it up in
>>> a hex editor), which would be 92. I don't have any idea which encoding
>>> they could have used to get the value 92 for this character.
>>> Any ideas?
>>
>> 92 is decimal, 5C is the hex value of 92.
>> Both of these however are the \ character.
>
> Dee, didn't you read my answer?

Not until after my reply, but I was more making the point that 92 and 5C are the same value in a different base, which no one else had mentioned.
Rereading the question, it seems I completely missed the point... :p

--
Dee Earley (dee.earley(a)icode.co.uk)
i-Catcher Development Team

iCode Systems

(Replies direct to my email address will be ignored. Please reply to the group.)