String decoding [Visual Basic]

From: Helmut Meukel on 25 May 2010 16:47

"Dee Earley" <dee.earley(a)icode.co.uk> wrote in message
news:ewwI6BA$KHA.4652(a)TK2MSFTNGP06.phx.gbl...
> On 23/05/2010 10:35, Jeff Caton wrote:
>> I have to make an import function for a different program whose string
>> encoding/ decoding I don't really understand yet.
>> Some parts of the encoded string is ASCII, but some are not.
>> For example the German character "Ö" is encoded by 5C (I looked it up in
>> a hex editor), which would be 92. I don't have any idea which encoding
>> they could have used to get the value 92 for this character.
>> Any ideas?
>
> 92 is decimal, 5C is the hex value of 92.
> Both of these however are the \ character.
>
> --
> Dee Earley (dee.earley(a)icode.co.uk)
> i-Catcher Development Team

Dee, didn't you read my answer?
It's 7-bit ISO-646.
The original ASCII set was coded with 7 bits + parity bit and
had no room for national European characters.
IIRC, up to 10 positions in the ASCII code -
Hex 40, 5B to 5E, 60 and 7B to 7E - were therefore reserved
for national characters. The German version of the 7-bit code
used only 8 of these 10 positions (ISO-646-DE).
It was 5B -> Ä, 5C -> Ö, 5D -> Ü, 7B to 7D for the lower
case ä, ö, ü. I can't remember which Hex values were used
for ß and §.
But that left us without [, \, ], {, |, }, and two other signs.
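
For the import function, the mapping described here can be sketched as a small lookup table. This is only an illustration, written in Python rather than VB, and the helper name decode_iso646_de is made up. The two positions not recalled in the post are filled in from DIN 66003 as an assumption (Hex 40 for § and Hex 7E for ß); check them against real data from the other program before relying on them.

```python
# Sketch only: decode bytes written in the German 7-bit set (ISO-646-DE).
# 0x40 and 0x7E are assumptions taken from DIN 66003, not from the post.
ISO646_DE = {
    0x40: "§",
    0x5B: "Ä", 0x5C: "Ö", 0x5D: "Ü",
    0x7B: "ä", 0x7C: "ö", 0x7D: "ü",
    0x7E: "ß",
}


def decode_iso646_de(data: bytes) -> str:
    """Translate ISO-646-DE bytes to text; unlisted 7-bit codes stay ASCII."""
    out = []
    for b in data:
        if b > 0x7F:
            # The 7-bit set has no codes above 0x7F, so this is not ISO-646.
            raise ValueError(f"not a 7-bit code: 0x{b:02X}")
        out.append(ISO646_DE.get(b, chr(b)))
    return "".join(out)


# 0x5C (decimal 92) is "\" in ASCII but "Ö" in the German set.
example = decode_iso646_de(b"\\sterreich")  # "Österreich"
```

The same table read the other way round would give the encoder. A file that mixes this set with plain ASCII cannot be decoded by the table alone, because the byte 5C is ambiguous without knowing which set is active.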

To overcome this most IT people ignored the ISO set and
used plain ASCII, substituting Ä with Ae, ä with ae and so on.
In cases where it mattered, you had to use a printer with at
least 2 built-in character sets - same font, but one ASCII, the
other German ISO. Into the text to print you had to insert
non-printable control codes to switch between the two sets.
So you could print [ and Ä in the same text. Usually the control
codes SO (shift-out) = Hex 0E and SI (shift-in) = Hex 0F were
used for switching between the sets.
E.g. Hex 5D is ] in ISO-646, Ü in ISO-646-DE, Å in
ISO-646-DK.
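
If the data to import uses such switching codes, a decoder has to track which set is active. A minimal sketch in Python, assuming that SO selects the national set and SI returns to plain ASCII (the direction depends on the device, so treat that as a guess); the name decode_with_shifts and the table are made up for illustration, and Hex 40 / Hex 7E are assumed from DIN 66003:

```python
SO, SI = 0x0E, 0x0F  # shift-out / shift-in control codes

# German national positions (0x40 and 0x7E are assumptions from DIN 66003).
NATIONAL_DE = {
    0x40: "§",
    0x5B: "Ä", 0x5C: "Ö", 0x5D: "Ü",
    0x7B: "ä", 0x7C: "ö", 0x7D: "ü",
    0x7E: "ß",
}


def decode_with_shifts(data: bytes, national=NATIONAL_DE) -> str:
    """Decode 7-bit text that switches sets with SO/SI.

    Assumption: SO turns the national set on, SI switches back to ASCII.
    The control codes themselves are dropped from the output.
    """
    out = []
    use_national = False
    for b in data:
        if b == SO:
            use_national = True
        elif b == SI:
            use_national = False
        elif use_national:
            out.append(national.get(b, chr(b)))
        else:
            out.append(chr(b))
    return "".join(out)


# "[" stays a bracket in ASCII mode and becomes "Ä" between SO and SI.
mixed = decode_with_shifts(b"a[1] = \x0e[\x0frger")  # "a[1] = Ärger"
```

Passing a different table as the second argument (for example one for ISO-646-DK) would handle another national set with the same loop.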

Then came IBM with its PC and its extended ASCII (=8bit,
no parity) character set and finally the code pages (437, 850, ...)
while the non-IBM mainframes and minicomputers still used 7-bit
character sets. IBM mainframes used EBCDIC.

Helmut.

From: Dee Earley on 26 May 2010 06:19

On 25/05/2010 21:47, Helmut Meukel wrote:
> "Dee Earley" <dee.earley(a)icode.co.uk> wrote in message
> news:ewwI6BA$KHA.4652(a)TK2MSFTNGP06.phx.gbl...
>> On 23/05/2010 10:35, Jeff Caton wrote:
>>> I have to make an import function for a different program whose string
>>> encoding/ decoding I don't really understand yet.
>>> Some parts of the encoded string is ASCII, but some are not.
>>> For example the German character "Ö" is encoded by 5C (I looked it up in
>>> a hex editor), which would be 92. I don't have any idea which encoding
>>> they could have used to get the value 92 for this character.
>>> Any ideas?
>>
>> 92 is decimal, 5C is the hex value of 92.
>> Both of these however are the \ character.
>
> Dee, didn't you read my answer?

Not until after my reply, but I was more making the point that 92
and 5C are the same value in a different base, which no one else had
mentioned.

Rereading the question, it seems I completely missed the point... :p

--
Dee Earley (dee.earley(a)icode.co.uk)
i-Catcher Development Team

iCode Systems

(Replies direct to my email address will be ignored.
Please reply to the group.)
