From: dududuil on 28 Apr 2010 08:56

My application reads from a file and puts the text in a CString. On a JP (Japanese) machine, the file might contain Unicode characters. Although my application isn't compiled with _UNICODE, CString still supports Unicode characters, and I can do the simple task of calling Replace on a certain string. After this Replace, I print the string back to the file and notice that some characters have been replaced with other characters. This is my problem.
From: Jochen Kalmbach [MVP] on 28 Apr 2010 09:04

Hi dududuil!

> Although my application isn't compiled with _UNICODE, CString still supports
> Unicode characters,

No. It does not! It only supports them if you compile with _UNICODE! In the current setting it is compiled with ANSI or MBCS... so it does not know anything about Unicode.

> and I can do the simple task of calling Replace on a certain string.

No, you can't. It will replace the "character" on a per-"byte" basis, which has the effect you see (it corrupts the string). This is because CString does not know about the encoding of your string and assumes ASCII if you have not set the thread locale! If you set the thread locale, it will correctly replace the characters!

--
Greetings
Jochen

My blog about Win32 and .NET
http://blog.kalmbachnet.de/
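To make the byte-wise corruption concrete, here is a minimal sketch in portable C++ that uses std::string rather than CString. It assumes the text is encoded in code page 932 (Shift-JIS), where the second byte of a two-byte character can fall in the ASCII range: the katakana ソ, for example, is encoded as 0x83 0x5C, and 0x5C is also the backslash. The names replaceBytewise and replaceCp932Aware are hypothetical helpers for illustration, not MFC functions, and the lead-byte test hard-codes the CP932 ranges instead of asking the system.

```cpp
#include <cstddef>
#include <string>

// Assumption: the data is code page 932 (Shift-JIS). In that encoding the
// first byte of a two-byte character lies in 0x81-0x9F or 0xE0-0xFC.
inline bool isCp932LeadByte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Naive version: treats every byte as one character, so it also rewrites
// trail bytes that happen to equal the ASCII character being replaced.
std::string replaceBytewise(const std::string& s, char from, char to) {
    std::string out = s;
    for (char& c : out) {
        if (c == from) c = to;
    }
    return out;
}

// Lead-byte-aware version: a lead byte and the byte after it are copied
// through untouched; only single-byte characters are compared with `from`.
std::string replaceCp932Aware(const std::string& s, char from, char to) {
    std::string out;
    out.reserve(s.size());
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        if (isCp932LeadByte(b) && i + 1 < s.size()) {
            out += s[i];    // lead byte
            out += s[++i];  // trail byte, never interpreted as ASCII
        } else {
            out += (s[i] == from) ? to : s[i];
        }
    }
    return out;
}
```

On the bytes 83 5C 61 5C 62 (ソ followed by a, a backslash and b), replacing the backslash with a slash byte-wise also rewrites the second half of ソ, while the lead-byte-aware version changes only the real backslash.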
From: Tom Serface on 28 Apr 2010 09:43

Ah... then perhaps the problem is that the string you are modifying really is a Unicode string (which CString can hold even in a non-Unicode build). Are you reading the string from a file that might actually be Unicode or UTF-8 or some other encoding? If so, you may still have to use CStringW and still use the L'\r' method for the character.

Tom

"dududuil" <dududuil(a)discussions.microsoft.com> wrote in message news:46371EC6-C173-4870-B38D-FDBFFD270CE8(a)microsoft.com...
> My Application is compiled without _UNICODE - so _T() is ignored, and L'\r'
> will shrink back to '\r' when calling the Remove
From: Ulrich Eckhardt on 28 Apr 2010 10:03

Jochen Kalmbach [MVP] wrote:
> Maybe you are using UTF8 in the string but the CRT/MFC locale is "C". So
> the UTF8-Multibyte characters will also be treated as "normal" chars and
> therefore it will remove any '\r' (0x0d) in the multibyte character.

The nice thing about UTF-8 is that no ASCII byte will ever have a different meaning than the one it has for ASCII. All bytes of a multibyte character have their bit 7 set, so they are outside the ASCII range.

For that reason I think we can rule out UTF-8 as the cause; with UTF-8 it should just work. ;)

Uli

--
C++ FAQ: http://parashift.com/c++-faq-lite

Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
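The UTF-8 property Uli describes can be checked with a few lines of portable C++. This is only a sketch: removeByte and allHighBitSet are hypothetical helper names, and std::string stands in for a narrow CString. The sample bytes E6 97 A5 E6 9C AC are the UTF-8 encoding of 日本.

```cpp
#include <algorithm>
#include <string>

// Remove every occurrence of one byte value, with no knowledge of the
// encoding (roughly what a narrow-character Remove does).
std::string removeByte(std::string s, char ch) {
    s.erase(std::remove(s.begin(), s.end(), ch), s.end());
    return s;
}

// True when every byte has bit 7 set, i.e. none of them can be mistaken
// for an ASCII character such as '\r' (0x0D).
bool allHighBitSet(const std::string& bytes) {
    return std::all_of(bytes.begin(), bytes.end(), [](char c) {
        return (static_cast<unsigned char>(c) & 0x80) != 0;
    });
}
```

Because every byte of the multibyte sequences has bit 7 set, stripping '\r' from UTF-8 text byte by byte removes only real carriage returns and leaves the Japanese characters intact.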
From: Tom Serface on 28 Apr 2010 13:45

I wish that CString had better built-in handling for in-memory UTF-8. As it is, I typically only use UTF-8 for files and convert to Unicode for memory use. It takes more memory, but makes it much easier to interact with other SDKs.

Tom

"Ulrich Eckhardt" <eckhardt(a)satorlaser.com> wrote in message news:arrla7-5mf.ln1(a)satorlaser.homedns.org...
> Jochen Kalmbach [MVP] wrote:
>> Maybe you are using UTF8 in the string but the CRT/MFC locale is "C". So
>> the UTF8-Multibyte characters will also be treated as "normal" chars and
>> therefore it will remove any '\r' (0x0d) in the multibyte character.
>
> The nice thing about UTF-8 is that no ASCII byte will ever have a different
> meaning than the one it has for ASCII. All bytes of a multibyte character
> have their bit 7 set, so they are outside the ASCII range.
>
> For that reason I think we can rule out UTF-8, otherwise it should work. ;)
>
> Uli
>
> --
> C++ FAQ: http://parashift.com/c++-faq-lite
>
> Sator Laser GmbH
> Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
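Tom's approach of keeping UTF-8 in files and UTF-16 in memory comes down to one conversion step after reading. On Windows that step would normally be a call to MultiByteToWideChar with CP_UTF8, with the result stored in a CStringW. The sketch below is a portable stand-in so the logic can be followed without the Win32 API. utf8ToUtf16 is a hypothetical helper name, and the decoder is deliberately minimal: it rejects truncated or malformed sequences but does not check for overlong encodings or encoded surrogates.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode UTF-8 bytes (as read from a file) into UTF-16 code units.
std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (b0 < 0x80)               { cp = b0;        extra = 0; }
        else if ((b0 >> 5) == 0x06)  { cp = b0 & 0x1F; extra = 1; }
        else if ((b0 >> 4) == 0x0E)  { cp = b0 & 0x0F; extra = 2; }
        else if ((b0 >> 3) == 0x1E)  { cp = b0 & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");
        if (cp < 0x10000) {
            out += static_cast<char16_t>(cp);
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}
```

Once the text is UTF-16, each Japanese character in the Basic Multilingual Plane is a single code unit, so searching for or removing L'\r' can no longer hit part of another character.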