From: Priyank Shah on 5 Aug 2010 02:55

Hi,

I read an HTML file using Nokogiri, and that works fine. But when I print the result, it shows an unknown character like "┬á" in place of the &nbsp; in <somestarting>hello&nbsp;</somecomplete>, so it looks like "hello┬á". The problem is caused by the &nbsp; before the closing tag. If anyone knows a solution, please help.

Thanks,
Priyank Shah
-- Posted via http://www.ruby-forum.com/.

From: Brian Candler on 5 Aug 2010 04:08

Try using

    p str
or
    puts str.inspect
or
    puts str.bytes.to_a.inspect

to get a better look at what character codes are in there.
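
A minimal sketch of what those inspection calls look at, assuming Ruby 1.9 or later. The sample string is an assumption standing in for whatever Nokogiri returned; it is not taken from the original poster's file.

```ruby
# Assumed sample: "hello" followed by U+00A0 (NO-BREAK SPACE), encoded as UTF-8.
str = "hello\u00A0"

p str                             # same as: puts str.inspect
puts str.encoding                 # the encoding Ruby has tagged the string with
puts str.bytes.to_a.inspect       # raw bytes; the last two are 194, 160
puts str.codepoints.to_a.inspect  # code points; the last one is 160
```

The byte list is the most reliable of the three, because it does not depend on how the terminal renders non-ASCII characters.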

From: Priyank Shah on 5 Aug 2010 04:18

Brian Candler wrote:
> Try using
>     p str
> or
>     puts str.inspect
> or
>     puts str.bytes.to_a.inspect
>
> to get a better look at what character codes are in there.

Hi,

Thanks for the reply, but that is not useful for me. If I use inspect, it converts the string to "hello\302\240". I want a simple space.

Thanks,
Priyank Shah

From: Brian Candler on 5 Aug 2010 09:38

Priyank Shah wrote:
> But it is not useful for me if i use inspect it convert "hello\302\240"

That is useful. It shows that the &nbsp; has been converted into the byte sequence \302\240 (octal) or \xc2\xa0 (hex).

That happens to be the UTF-8 encoding of a non-breaking space, codepoint 160:

    $ irb19
    >> 160.chr("UTF-8")
    => " "
    >> 160.chr("UTF-8").bytes.to_a
    => [194, 160]
    >> 160.chr("UTF-8").force_encoding("ASCII-8BIT")
    => "\xC2\xA0"

So the terminal you are trying to print it to is non-UTF-8. Perhaps a Windows box? You didn't say what your platform was. In that case, you need to re-encode it to the appropriate character set.
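
A minimal sketch of the two options discussed above, assuming Ruby 1.9 or later and a string that is valid UTF-8. The helper names `nbsp_to_space` and `encode_for_terminal` are hypothetical, and the IBM437 target is only a guess at the poster's console code page (the bytes C2 A0 do display as "┬á" in that code page); substitute whatever encoding the terminal actually uses.

```ruby
NBSP = "\u00A0" # U+00A0 NO-BREAK SPACE, bytes C2 A0 in UTF-8

# Hypothetical helper: swap every non-breaking space for a plain ASCII space.
def nbsp_to_space(str)
  str.gsub(NBSP, " ")
end

# Hypothetical helper: re-encode a UTF-8 string for a non-UTF-8 terminal.
# The default target (IBM437) is an assumption, not something stated in the thread.
# Characters the target cannot represent are replaced with "?".
def encode_for_terminal(str, target = "IBM437")
  str.encode(target, invalid: :replace, undef: :replace, replace: "?")
end

text = "hello" + NBSP

puts nbsp_to_space(text)                   # prints "hello" followed by a plain space
p encode_for_terminal(text).bytes.to_a     # bytes as the terminal will receive them
```

Replacing the character is the simpler route when a plain space is all that is wanted; re-encoding keeps the non-breaking space but delivers it in a form the terminal can display.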