From: Chand on 31 Dec 2009 03:09
I am developing a Unicode-to-general-text conversion program in VSTO (VB.NET). I am working with Gurmukhi/Punjabi Unicode, i.e. Raavi. Many characters in it are formed by combining two or more different characters, like ਕ + ਿ = ਕਿ or ਕ + ਿ + ਂ = ਕਿਂ. Earlier I was not able to find any single instance of ਿ or ਂ, but later I tried the wildcard option along with [ਿ]. The character is now found correctly, but replacing it with the general character "i" replaces the whole "ਕਿ" or "ਕਿਂ" with the single character "i". Can anyone help? How can I search for single characters and replace them with other characters, in order to convert Unicode text to a general font?
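To make the composition described above concrete, here is a minimal Python sketch (not the poster's VB.NET code) that lists the code points stored behind one displayed cluster. The helper name `describe` is made up for illustration; only the standard `unicodedata` module is used.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]

# ਕ + ਿ + ਂ is displayed as one cluster but stored as three code points
syllable = "\u0A15\u0A3F\u0A02"

for code_point, name in describe(syllable):
    print(code_point, name)
```

The point of the sketch is that the vowel sign and the bindi are separate characters in the stored text, even though they are drawn together with the consonant.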
From: Klaus Linke on 5 Jan 2010 12:06
"Chand" <Chand(a)discussions.microsoft.com> wrote:
> I am developing a Unicode-to-general-text program in VSTO (VB.NET).
> I am working with Gurmukhi/Punjabi Unicode, i.e. Raavi.
> Many characters in it are formed by combining two or more
> different characters, like ਕ + ਿ = ਕਿ or ਕ + ਿ + ਂ = ਕਿਂ.
> Earlier I was not able to find any single instance of ਿ or ਂ,
> but later I tried the wildcard option along with [ਿ].

That's a problem I know from diacritics in other languages. The diacritic alone, or the letter it's combined with alone, isn't found. The work-around I use is the same as yours. I filed a bug report years ago, but never heard back on whether it's going to be fixed.

> The character is found correctly, but replacing it with the general
> character "i" replaces the whole "ਕਿ" or "ਕਿਂ" with the single character "i".
> Can anyone help? How can I search for single characters and replace
> them with other characters, to convert Unicode text to a general font?

I'm not sure from your description what you want to replace with what. Since you're doing a wildcard replacement, you can re-use anything matched if you put parentheses around it in "Find what", and then use the appropriate placeholder in "Replace with" (\1 for the first expression in parentheses, \2 for the second, and so on).

I have no idea about Gurmukhi/Punjabi Unicode. In other scripts that use ligatures and diacritics, such as Arabic, the ligatures form automatically if a well-designed font is used. The glyphs for ligatures may be in Unicode only for compatibility reasons: because old fonts don't form the ligatures, or because old files used the ligature glyphs since fonts back then didn't form them automatically.

So maybe ask in a group with knowledgeable people (say microsoft.public.word.international.features) whether the replacements you are trying to make are sensible, or whether you can instead use a font that handles the ligatures automatically?

Regards,
Klaus
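The parenthesis-and-placeholder idea can be sketched outside Word as well. The following is a minimal Python illustration, not Word's wildcard engine and not VSTO code: it captures the character in front of the vowel sign and puts it back, so that only the sign itself is replaced. The function name `replace_sign` and the replacement letter "i" are assumptions for the example; the real target codes depend on the general font's own table.

```python
import re

VOWEL_SIGN_I = "\u0A3F"  # ਿ GURMUKHI VOWEL SIGN I

def replace_sign(text, replacement="i"):
    """Replace only the vowel sign, keeping the character before it.

    The preceding character is captured in group 1 and re-inserted,
    analogous to \\1 in Word's "Replace with" box.
    """
    pattern = re.compile("(.)" + VOWEL_SIGN_I)
    return pattern.sub(r"\g<1>" + replacement, text)

# ਕ + ਿ + ਂ : only the middle code point is swapped for "i"
converted = replace_sign("\u0A15\u0A3F\u0A02")
print([f"U+{ord(ch):04X}" for ch in converted])
```

Python's `re` module matches individual code points, so the sign can be addressed on its own here; whether Word's Find behaves the same way for a given script is exactly the issue discussed in this thread.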
From: Rich B. Rich on 11 Jan 2010 23:15
I'm unsure if this addresses the problem, but the code that I wrote to replace both ANSI and Unicode character strings with ligatures required:
1. Specify match case (prevents character variants from matching)
2. Exclude small caps
3. Enable format matching (to detect caps and bold/italic properly)

Cheers