From: Rick Rothstein on 12 Feb 2010 17:59

Assuming any characters above ASCII 255 in a text string makes the text non-English, then does something like this work (note that is a space character after the exclamation point)?

    If StringValue Like "*[! -" & Chr$(255) & "]*" Then
        ' Non-English characters present
    Else
        ' All text are English characters
    End If

--
Rick (MVP - Excel)

"Jim Mack" <jmack(a)mdxi.nospam.com> wrote in message
news:uYM5FLDrKHA.728(a)TK2MSFTNGP04.phx.gbl...
> Jeff Johnson wrote:
>> "Phil Hunt" <aaa(a)aaa.com> wrote in message
>> news:Oc$RZFCrKHA.1796(a)TK2MSFTNGP02.phx.gbl...
>>
>>> Thanks. I basically have to examine the bit patterns to determine.
>>> I understand the ASCII, it is the Unicode I have some trouble
>>> with. I know it is 16 bits instead of 8. But in the VB/debug window,
>>> I have never been able to see a 16-bit character, maybe it does
>>> not display on the screen. Do you know what I am talking about?
>>> For the character 'A', how can I see the full 16-bit pattern in
>>> VB?
>>
>> I believe you can use the AscW() function to find this. If you get
>> a value back > 255, I'd say you can safely assume it's a
>> non-English character.
>
> Not even close. To prove that, in the Immediate Window:
>
>     For Idx = 128 to 160: ? Idx, AscW(Chr(Idx)) :Next
>
> --
> Jim Mack
> Twisted tees at http://www.cafepress.com/2050inc
> "We sew confusion"
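For readers who have not used Like before, here is a minimal sketch of Rick's test wrapped in a function. The names HasCharOutsideRange and DemoHasCharOutsideRange are made up for illustration. ChrW$(255) is swapped in for Chr$(255) so that the upper bound does not depend on the ANSI code page; that substitution is an assumption of this sketch, not part of Rick's post. The sketch also assumes the default Option Compare Binary. Note that Tab, CR and LF all sort below the space character, so a string containing them is reported as well.

```vb
Option Explicit

' Hypothetical helper (not from the thread): True when StringValue holds
' at least one character outside the range space .. character 255.
Public Function HasCharOutsideRange(ByVal StringValue As String) As Boolean
    ' "*"        matches any run of characters, including none
    ' "[!a-b]"   matches one character that is NOT in the range a..b
    ' ChrW$(255) is the upper bound (assumption: used instead of Chr$(255))
    HasCharOutsideRange = StringValue Like "*[! -" & ChrW$(255) & "]*"
End Function

Public Sub DemoHasCharOutsideRange()
    Debug.Print HasCharOutsideRange("Plain text")            ' False
    Debug.Print HasCharOutsideRange("Line 1" & vbTab & "x")  ' True: Tab sorts below space
End Sub
```

Whether a True result really means "non-English" is exactly the question the rest of the thread argues about; the function only reports that something falls outside the range.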
From: Jim Mack on 12 Feb 2010 18:04

Phil Hunt wrote:
> Ok, I'll make it > 128

Maybe you missed the point. There are quite a few "English characters" for which AscW() will return results > 128, or 255, etc. The same is true of any ANSI character set under Windows, both SBCS and MBCS.

--
Jim Mack
Twisted tees at http://www.cafepress.com/2050inc
"We sew confusion"
From: Phil Hunt on 12 Feb 2010 18:24

Rick, I think your code would work. I just have to look up 'Like'; I have never used it, but it seems handy.

Jim, I think I missed your point. I used to memorize EBCDIC codes. Since I moved over to the PC, bit patterns have been a thing of the past for me, until now. Thanks, I'll take a closer look Monday.
From: Helmut Meukel on 12 Feb 2010 19:04

Phil,

just run charmap.exe. It shows the hex code of the selected character. I looked at the Win2000 and the Vista versions; in the Vista version you can select more DOS code pages (extended ASCII). Both show you Unicode and a variety of Windows code pages (ANSI).

AFAIK, when looking at a non-Unicode text file, you have to _know_ what code page was used to create it. IBM and Micro$oft introduced code pages with DOS 4.0 but forgot to define anything to distinguish between text coded with different code pages. The same is true for ANSI.

In Unicode texts with western characters the high byte is usually Hex00. Hex00FE is the lowercase Icelandic character thorn. The Dutch ij is usually written as 2 characters, but in Unicode you can use a single character, Hex0133. The trademark sign TM is Hex2122, the per mille sign (%o) is Hex2030, the Peseta sign Pts is Hex20A7, c/o is Hex2105, the Danish/Norwegian A/S (Aktieselskab) is Hex214D.

HTH

Helmut.

"Phil Hunt" <aaa(a)aaa.com> wrote in message
news:Oc$RZFCrKHA.1796(a)TK2MSFTNGP02.phx.gbl...
> Thanks. I basically have to examine the bit patterns to determine.
> I understand the ASCII, it is the Unicode I have some trouble with. I know it
> is 16 bits instead of 8. But in the VB/debug window, I have never been able to
> see a 16-bit character, maybe it does not display on the screen. Do you know
> what I am talking about?
> For the character 'A', how can I see the full 16-bit pattern in VB?
>
> "Helmut Meukel" <NoSpam(a)NoProvider.de> wrote in message
> news:uW6t4oBrKHA.4636(a)TK2MSFTNGP06.phx.gbl...
>>
>> "Phil Hunt" <aaa(a)aaa.com> wrote in message
>> news:OthUFOBrKHA.4220(a)TK2MSFTNGP05.phx.gbl...
>>> Ok. Forget French for a moment. How can I tell if the string contains
>>> an "Eastern Asia" character?
>>>
>>>
>>> "Jeff Johnson" <i.get(a)enough.spam> wrote in message
>>> news:eO%236mFBrKHA.5940(a)TK2MSFTNGP02.phx.gbl...
>>>> "Phil Hunt" <aaa(a)aaa.com> wrote in message
>>>> news:unthJABrKHA.1796(a)TK2MSFTNGP02.phx.gbl...
>>>>
>>>>> What is the best way to determine if a string contains a "non-English"
>>>>> character?
>>>>
>>>> That's not an easy question to answer. Consider the word "resumé". It's an
>>>> English word (taken from French) but it contains an accented character that
>>>> is not "native" to English. If your code encountered that word, would you
>>>> want it to judge that it contains a "non-English character"?
>>>>
>>>
>>
>> Let's start with the code table.
>> Characters in strings are just byte or integer values.
>> In old DOS, ASCII (IIRC: American Standard Code for Information
>> Interchange) was used, 7 data bits + 1 parity bit.
>> IBM created Extended ASCII (8 data bits, no parity bit) and used
>> the doubled capacity to code some European characters and graphic
>> characters (card symbols, lines...).
>> This extended ASCII finally became Code Page 437. Other code
>> pages like 850 (multilingual) and 865 (Scandinavian) used the same
>> code values for different characters. My first Vectra PC used the
>> Roman8 character set, also used by HP's 250, 1000 and 3000
>> systems.
>> With Windows, Microsoft switched to ANSI, still 8 bit, and
>> finally to Unicode (16 bit).
>>
>> So first you have to know how your text is coded, to determine
>> which codes are used for East Asian characters.
>>
>> HTH.
>>
>> Helmut.
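Phil's question about seeing the full 16-bit value of 'A' inside VB, and the hex codes Helmut lists, can be checked with a few lines of code. This is a minimal sketch: the names CodeUnitHex and DemoCodeUnitHex are hypothetical, and the function only looks at the first character of the string it is given (AscW raises a run-time error on an empty string).

```vb
Option Explicit

' Hypothetical helper (not from the thread): returns the 16-bit value of
' the first character of Ch as four hex digits.
Public Function CodeUnitHex(ByVal Ch As String) As String
    Dim Code As Long
    ' AscW returns a signed Integer, so values above 32767 come back
    ' negative; masking with &HFFFF& maps them into 0..65535.
    Code = AscW(Ch) And &HFFFF&
    ' Left-pad with zeros so the high byte is always visible.
    CodeUnitHex = Right$("0000" & Hex$(Code), 4)
End Function

Public Sub DemoCodeUnitHex()
    Debug.Print CodeUnitHex("A")             ' 0041
    Debug.Print CodeUnitHex(ChrW$(&H2122))   ' 2122 (the trademark sign)
End Sub
```

The Immediate window may not be able to display a character such as the trademark sign itself, but printing its code as hex sidesteps that.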
From: Jim Mack on 12 Feb 2010 20:42

Rick Rothstein wrote:
> Assuming any characters above ASCII 255 in a text string makes the
> text non-English, then does something like this work (note that is
> a space character after the exclamation point)?

You have to distinguish AscW() results from Asc() results. If you use AscW(), you will see 'English' characters with codes > 255. Using Asc() you won't, but you may then qualify some non-English characters as English (which may be OK depending on the circumstance).

I don't know if Like examines the Unicode characters... if so, then it will act the way AscW() does and fail some valid characters.

--
Jim
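The difference between Asc() and AscW() that Jim describes can be seen side by side by extending his Immediate window loop. This is only a sketch: the name DumpAnsiToUnicode is made up, the output depends on the system's ANSI code page, and a single-byte code page is assumed. On a Windows-1252 system, for example, code 128 is the euro sign, whose Unicode value is Hex20AC, well above 255.

```vb
Option Explicit

' Sketch only: for each ANSI code 128..160, print the code, the value
' Asc() reports, and the 16-bit value AscW() reports (as hex).
Public Sub DumpAnsiToUnicode()
    Dim Idx As Long
    Dim Ch As String
    For Idx = 128 To 160
        ' Chr$ converts the ANSI code to a character via the system code page
        Ch = Chr$(Idx)
        ' AscW returns a signed Integer; mask it into the range 0..65535
        Debug.Print Idx, Asc(Ch), Hex$(AscW(Ch) And &HFFFF&)
    Next Idx
End Sub
```

Rows where the third column is larger than FF are characters that a "greater than 255 means non-English" rule based on AscW() would reject, even though they are ordinary ANSI characters.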