From: Simon on 26 Mar 2010 03:15

Thanks for all the replies.

In the end I looked at the way Notepad++ reads files. As Mihai N. mentioned, they open the file in 'rb' mode and then call MultiByteToWideChar( .... ). Because the file is read as raw bytes, they have various functions to check the file format (UTF-8, UTF-16, ASCII and so forth).

Simon
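For readers following along, here is a minimal sketch of the "read the file as bytes" step in portable C++. The helper name `ReadFileBytes` is made up for illustration and is not taken from Notepad++. Opening the stream with `std::ios::binary` is the iostream counterpart of the 'rb' mode mentioned above.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper (not from Notepad++): reads a whole file as raw bytes,
// with no newline translation and no decoding.
// Returns false if the file cannot be opened.
bool ReadFileBytes(const std::string& path, std::string& out)
{
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) return false;
    // Copy every byte from the stream buffer into the string.
    out.assign(std::istreambuf_iterator<char>(in),
               std::istreambuf_iterator<char>());
    return true;
}
```

The resulting buffer can then be inspected for a BOM or handed to a converter such as MultiByteToWideChar.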
From: Simon on 26 Mar 2010 03:18

> if you are on a Japanese system
>     probably UTF-8 or Shift-JIS (cp932)
> else
>     probably UTF-8

Thanks for the replies.

How do I know if I am on a Japanese system? And even if I know (using the locale and so forth), how can I test whether the file is UTF-8 or Shift-JIS (cp932)?

If it is UTF-8 I can now read it properly using MultiByteToWideChar.

But how can I take bytes that were read as Shift-JIS (cp932) and convert them to wide chars accordingly?

> So load the file as bytes, then use MultiByteToWideChar.

Many thanks

Simon
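One common heuristic for the "UTF-8 or Shift-JIS?" question, offered here only as a sketch, is to check whether the bytes are well-formed UTF-8 and fall back to cp932 when they are not. The function name `IsValidUtf8` is an assumption made for this example. The check is not foolproof: pure ASCII passes it, and a short Shift-JIS string can occasionally happen to be valid UTF-8 as well.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `bytes` is well-formed UTF-8.
// Rejects stray continuation bytes, truncated sequences, overlong forms,
// UTF-16 surrogates, and code points above U+10FFFF.
bool IsValidUtf8(const std::string& bytes)
{
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;             // stray continuation or invalid lead byte
        if (n - i < len) return false; // sequence runs past the end of the data
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false; // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        // Smallest code point that legitimately needs `len` bytes.
        static const unsigned long kMin[5] = {0, 0, 0x80, 0x800, 0x10000};
        if (len > 1 && cp < kMin[len]) return false;     // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate range
        if (cp > 0x10FFFF) return false;                 // beyond Unicode
        i += len;
    }
    return true;
}
```

On Windows, a similar strict check can be had by calling MultiByteToWideChar with CP_UTF8 and the MB_ERR_INVALID_CHARS flag and treating a failure as "not UTF-8".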
From: Giovanni Dicanio on 26 Mar 2010 08:25

"Simon" <bad(a)example.com> wrote in message news:OnxYESLzKHA.5040(a)TK2MSFTNGP02.phx.gbl...

> But how can I convert 'read' Shift-JIS (cp932) and convert to wide char
> accordingly?

You can use MultiByteToWideChar with the proper code page identifier (932 for Shift-JIS):

http://msdn.microsoft.com/en-us/library/dd317756(VS.85).aspx

Giovanni
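A rough sketch of that call, using the usual two-pass pattern (first ask for the required length, then convert). The wrapper name `BytesToWide` is hypothetical. MultiByteToWideChar exists only on Windows, so the sketch is guarded with `_WIN32` and simply reports failure for non-empty input elsewhere.

```cpp
#include <cstddef>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical wrapper: converts bytes in the given Windows code page
// (932 for Shift-JIS, 65001 / CP_UTF8 for UTF-8) to a wide string.
// Returns false if the conversion fails or the platform is not Windows.
// Assumes the input is smaller than 2 GB, since the API takes int lengths.
bool BytesToWide(const std::string& bytes, unsigned int codePage,
                 std::wstring& out)
{
    out.clear();
    if (bytes.empty()) return true;  // nothing to convert
#ifdef _WIN32
    const int srcLen = static_cast<int>(bytes.size());
    // First call: ask how many wchar_t units the result needs.
    const int wideLen = MultiByteToWideChar(codePage, MB_ERR_INVALID_CHARS,
                                            bytes.data(), srcLen, nullptr, 0);
    if (wideLen <= 0) return false;  // invalid bytes or unsupported code page
    out.resize(static_cast<std::size_t>(wideLen));
    // Second call: convert into the buffer we just sized.
    const int written = MultiByteToWideChar(codePage, MB_ERR_INVALID_CHARS,
                                            bytes.data(), srcLen,
                                            &out[0], wideLen);
    if (written != wideLen) { out.clear(); return false; }
    return true;
#else
    (void)codePage;
    return false;  // MultiByteToWideChar is a Windows-only API
#endif
}
```

A call such as `BytesToWide(fileBytes, 932, text)` would then cover the Shift-JIS case, and passing 65001 would cover UTF-8. Any BOM at the start of the buffer should be skipped before converting.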
From: Tom Serface on 26 Mar 2010 12:27

If your file is UTF-8 or Unicode and you are reading into Unicode for the in-memory string, it shouldn't matter what kind of system you are on, since the code page would no longer be an issue. To test the file type you should check the Byte Order Mark (BOM), which is the first two or three bytes in the file:

#define UTF8_BOM "\xef\xbb\xbf" // UTF-8 file "byte order mark" which goes at start of file
#define UTF8_BOM_SIZE 3
#define UTF16_LE_BOM "\xff\xfe" // Unicode "byte order mark" which goes at start of file
#define UTF16_BOM_SIZE 2

Tom

"Simon" <bad(a)example.com> wrote in message news:OnxYESLzKHA.5040(a)TK2MSFTNGP02.phx.gbl...

>> if you are on a Japanese system
>>     probably UTF-8 or Shift-JIS (cp932)
>> else
>>     probably UTF-8
>
> Thanks for the replies,
>
> How do I know if I am on a Japanese system???
> and even if I know, (using the local and so forth), how can I test if it
> is UTF-8 or Shift-JIS (cp932)?
>
> if it is UTF-8 I can, (now), read it properly, (using
> MultiByteToWideChar).
>
> But how can I convert 'read' Shift-JIS (cp932) and convert to wide char
> accordingly?
>
>> So load the file as bytes, then use MultiByteToWideChar.
>
> Many thanks
>
> Simon
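As an illustration of the BOM check described above, here is a small sketch in portable C++. The `BomKind` enum and `DetectBom` function are names invented for this example, and the UTF-16 big-endian case (FE FF) is an addition beyond the two BOMs listed in the post. Many files carry no BOM at all, so a result of `None` means "unknown", not "ANSI".

```cpp
#include <cstddef>
#include <string>

// Encodings this sketch can recognise from a byte order mark (BOM).
enum class BomKind { None, Utf8, Utf16LE, Utf16BE };

// Hypothetical helper: inspects the first bytes of a buffer that was read
// in binary mode. Note that a UTF-32 LE BOM (FF FE 00 00) also starts with
// FF FE and would be reported as Utf16LE by this simple check.
BomKind DetectBom(const std::string& bytes)
{
    auto startsWith = [&bytes](const char* bom, std::size_t n) {
        return bytes.size() >= n && bytes.compare(0, n, bom, n) == 0;
    };
    if (startsWith("\xEF\xBB\xBF", 3)) return BomKind::Utf8;
    if (startsWith("\xFF\xFE", 2))     return BomKind::Utf16LE;
    if (startsWith("\xFE\xFF", 2))     return BomKind::Utf16BE;
    return BomKind::None;
}
```

When a BOM is found, its bytes should be skipped before the remaining data is converted.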