From: Yoavo on 25 Jan 2010 05:03

Hi,

We are trying to read an XML file as text file (in "C").
The problem is when the text is unicode - we get garbage.
(for ASCII file it works OK).

Here is our "C" code:
----------------------------

    FILE *fp;
    TCHAR buff[1000];

    fp = _tfopen(_T("C:\\IniTest\\test.xml"), _T("r, ccs=UNICODE"));
    if (fp != NULL)
    {
        _fgetts(buff, 1000, fp);
        fclose(fp);
    }

Thanks,
Yoav
From: r_z_aret on 25 Jan 2010 11:36

On Mon, 25 Jan 2010 12:03:06 +0200, "Yoavo" <yoav(a)cimatron.co.il> wrote:

>Hi,
>
>We are trying to read an XML file as text file (in "C").
>The problem is when the text is unicode - we get garbage.
>(for ASCII file it works OK).
>
>Here is our "C" code:
>----------------------------
>
>    FILE *fp;
>    TCHAR buff[1000];
>
>    fp = _tfopen(_T("C:\\IniTest\\test.xml"), _T("r, ccs=UNICODE"));
>    if (fp != NULL)
>    {
>        _fgetts(buff, 1000, fp);
>        fclose(fp);
>    }
>

If UNICODE is defined when you compile, then TCHAR will be WCHAR (UNICODE), and your program as written will read UNICODE files, but not ASCII files.

If UNICODE is not defined when you compile, then TCHAR (and thus buff) will be char (ASCII), and your program as written will read ASCII files, but not UNICODE files.

You have two choices:

1) Explicitly declare a char array that you use for ASCII files _and_ a WCHAR array that you use for UNICODE files.
2) Declare only one, and then use MultiByteToWideChar or WideCharToMultiByte to translate as needed.

Determining whether a file is ASCII or UNICODE may be tricky. If the file is UNICODE and follows convention, it will start with a BOM (byte order mark).

If you don't understand UNICODE, TCHAR, etc., I _strongly_ recommend taking time to learn. Otherwise you will waste a lot of your time tracking down strange symptoms. You can start by using google (http://groups.google.com/advanced_search?q=&) to look up byte order mark in this newsgroup. I just did, and got 14 hits, at least some of which look useful.

>
>Thanks,
>
>Yoav

-----------------------------------------
To reply to me, remove the underscores (_) from my email address (and please indicate which newsgroup and message).

Robert E. Zaret, eMVP
PenFact, Inc.
20 Park Plaza, Suite 400
Boston, MA 02116
www.penfact.com

Useful reading (be sure to read its disclaimer first):
http://catb.org/~esr/faqs/smart-questions.html
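
As a rough illustration of the BOM check mentioned in the reply, here is a minimal sketch in portable C. The names bom_encoding, detect_bom and detect_file_bom are made up for this example and do not come from the thread or from any library. It assumes the file follows the BOM convention; a file with no BOM may still be UTF-8 or UTF-16, so ENC_NO_BOM only means "no mark found". It also ignores UTF-32, whose little-endian BOM begins with the same two bytes as UTF-16 LE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Encodings that can be recognized from a leading byte order mark. */
typedef enum {
    ENC_NO_BOM,   /* no BOM found: ASCII/ANSI, or Unicode saved without a BOM */
    ENC_UTF8,     /* bytes EF BB BF */
    ENC_UTF16_LE, /* bytes FF FE */
    ENC_UTF16_BE  /* bytes FE FF */
} bom_encoding;

/* Hypothetical helper: classify a buffer holding the first bytes of a file.
   Simplification: UTF-32 BOMs are not distinguished from UTF-16. */
bom_encoding detect_bom(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return ENC_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return ENC_UTF16_LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return ENC_UTF16_BE;
    return ENC_NO_BOM;
}

/* Hypothetical helper: open the file in binary mode so no translation
   happens, then inspect its first bytes. Returns ENC_NO_BOM if the file
   cannot be opened or is shorter than a BOM. */
bom_encoding detect_file_bom(const char *path)
{
    unsigned char head[3];
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return ENC_NO_BOM;
    n = fread(head, 1, sizeof head, fp);
    fclose(fp);
    return detect_bom(head, n);
}
```

With the result in hand, the program could pick the char buffer or the WCHAR buffer from choice 1 before reading any text, for example by calling detect_file_bom("C:\\IniTest\\test.xml") first and reopening the file in the matching mode.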