From: Dennis Nedry on 16 Jun 2010 11:40

I have a routine for converting ANSI with "extended" IBM characters to HTML. It is as follows:

    EXTENDED_ANSI_TABLE = {
      227.chr => "<br>",
      32.chr => " ",
      128.chr => "Ç", #128 C, cedilla (199)
      129.chr => "ü", #129 u, umlaut (252)
      130.chr => "é", #130 e, acute accent (233)
      131.chr => "â", #131 a, circumflex accent (226)
      132.chr => "ä", #132 a, umlaut (228)
      133.chr => "à", #133 a, grave accent (224)
      134.chr => "å", #134 a, ring (229)
      135.chr => "ç", #135 c, cedilla (231)
      136.chr => "ê", #136 e, circumflex accent (234)
      137.chr => "ë", #137 e, umlaut (235)
      138.chr => "è", #138 e, grave accent (232)
      139.chr => "ï", #139 i, umlaut (239)
      140.chr => "î", #140 i, circumflex accent (238)
      141.chr => "ì", #141 i, grave accent (236)
      #big huge list continues for pages...
    }

    def parse_ansi_ext(str)
      EXTENDED_ANSI_TABLE.each_pair {|color, result| str = str.gsub(color,result) }
      return str
    end

This worked in 1.8, no problem. If the input contains a character above 127.chr, it now bombs with the error:

    "Encoding::CompatibilityError at / incompatible encoding regexp match (ASCII-8BIT regexp with ISO-8859-1 string)"

I've tried various acts of desperation to fix it, to no avail. I don't understand exactly what is wrong...

Thanks,
Dennis
From: Dennis Nedry on 17 Jun 2010 09:49

On Wed, Jun 16, 2010 at 6:30 PM, Michael Fellinger <m.fellinger(a)gmail.com> wrote:
> str has the encoding ISO-8859-1, probably inherited from your system locale.
> Convert it to ASCII-8BIT before processing it.
>
> http://blog.grayproductions.net/articles/ruby_19s_string

Thanks, that worked. I guess we should always specify file encoding from now on.

Take Care,
mark
--
"I've got ham but I'm not a hamster." -Bill Bailey
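
For anyone finding this thread later, here is a minimal sketch of what "convert it to ASCII-8BIT before processing" could look like. It is not necessarily the exact change made above. The shortened table and its ASCII-only entity values (`&Ccedil;` and so on) are assumptions made for illustration, as is the sample input. The idea is that `128.chr` and friends are raw-byte (ASCII-8BIT) strings, so the input is retagged as raw bytes with `force_encoding` before any `gsub` runs.

```ruby
# Hypothetical, shortened table: keys are single raw bytes (ASCII-8BIT),
# values are ASCII-only HTML entities chosen for illustration.
EXTENDED_ANSI_SKETCH = {
  128.chr => "&Ccedil;",
  129.chr => "&uuml;",
  130.chr => "&eacute;"
}

def parse_ansi_ext(str)
  # Work on a copy retagged as raw bytes, so the string and the table keys
  # share one encoding. force_encoding changes only the tag, not the bytes.
  bytes = str.dup.force_encoding(Encoding::ASCII_8BIT)
  EXTENDED_ANSI_SKETCH.each_pair do |byte, entity|
    bytes = bytes.gsub(byte, entity)
  end
  bytes
end

# Sample input: bytes 0x80, "a", " ", 0x82, tagged ISO-8859-1 as in the error.
input = [0x80, 0x61, 0x20, 0x82].pack("C*").force_encoding("ISO-8859-1")
puts parse_ansi_ext(input)
```

Any byte above 127 that is not in the table stays in the returned string as a raw byte, so the caller still has to decide how to tag or transcode the result before sending it out as HTML.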