From: born in USSR on 27 Jun 2010 08:33
I have the string '\u041f\u0440\u0438\u0432\u0435\u0442!' and I need to convert it to a string such as 'привет!'.

I can convert the string to '041f 0440 0438 0432 0435 0442', then convert each code to decimal, and finally convert each code to a character with:

> str.scan(/[0-9]+/).each {|x| result_str << x.to_i}

but I don't think that is the most rational way.

--
Posted via http://www.ruby-forum.com/.
From: Justin Collins on 27 Jun 2010 14:38
On 06/27/2010 05:33 AM, born in USSR wrote:
> I have string: '\u041f\u0440\u0438\u0432\u0435\u0442!' and i need to
> convert it to string such as 'привет!'.
> I can convert string to '041f 0440 0438 0432 0435 0442', then convert to
> decimal and at the end convert each code to character with function:
>
>> str.scan(/[0-9]+/).each {|x| result_str<< x.to_i}
>
> but i don't think that it is the most rational way.

irb(main):001:0> RUBY_VERSION
=> "1.9.1"
irb(main):002:0> puts '\u041f\u0440\u0438\u0432\u0435\u0442!'
\u041f\u0440\u0438\u0432\u0435\u0442!
=> nil
irb(main):003:0> puts "\u041f\u0440\u0438\u0432\u0435\u0442!"
Привет!
=> nil

Note the difference between single quotes and double quotes: escapes such as \uXXXX are only interpreted inside double-quoted literals.

-Justin
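To make the quoting difference concrete, here is a small sketch (not from Justin's post; the variable names are just for illustration, and it assumes Ruby 1.9 or later) comparing what the two literals actually contain:

```ruby
# Single quotes keep the backslash and the "u" as ordinary characters;
# double quotes let the parser turn each \uXXXX into one character.
single = '\u041f\u0440\u0438\u0432\u0435\u0442!'
double = "\u041f\u0440\u0438\u0432\u0435\u0442!"

puts single.length     # 37: six 6-character escapes plus "!"
puts double.length     # 7: six Cyrillic letters plus "!"
puts single == double  # false
```

So the original question is really about a string that already holds the backslashes as data, for example one read from a file or a network response, where changing the quotes is not an option.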
From: Gary Wright on 28 Jun 2010 00:22
On Jun 27, 2010, at 8:33 AM, born in USSR wrote:
> I have string: '\u041f\u0440\u0438\u0432\u0435\u0442!' and i need to
> convert it to string such as 'привет!'.
> I can convert string to '041f 0440 0438 0432 0435 0442', then convert to
> decimal and at the end convert each code to character with function:

If I understand you correctly, you can leverage Ruby's parser to interpret your string literal:

irb> x = '\u041f\u0440\u0438\u0432\u0435\u0442!'
=> "\\u041f\\u0440\\u0438\\u0432\\u0435\\u0442!"
irb> eval("\"#{x}\"")
=> "Привет!"

Be careful with eval, though: make sure the string to be evaluated doesn't contain any untrusted code.

Gary Wright
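To illustrate why the warning about eval matters, here is a harmless sketch (the input value is made up for the example): if the string contains interpolation syntax instead of an escape sequence, the expression inside it is executed.

```ruby
# Hypothetical input: not an escape sequence at all, but interpolation syntax.
x = '#{1 + 1}'

# The same eval call as above now runs the embedded expression.
result = eval("\"#{x}\"")
puts result  # prints 2, because 1 + 1 was evaluated
```

With untrusted input the embedded expression could be anything, which is the risk being pointed out.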
From: Markus Schirp on 28 Jun 2010 00:39
I think the JSON parser is able to decode these Unicode escapes correctly!

The JSON parser will not decode a bare string, so you have to wrap the string in array syntax and extract the element after parsing:

mbj(a)mbj ~ $ irb
irb(main):001:0> require 'json'
=> true
irb(main):002:0> x = '\u041f\u0440\u0438\u0432\u0435\u0442!'
=> "\\u041f\\u0440\\u0438\\u0432\\u0435\\u0442!"
irb(main):003:0> JSON.parse('["'+x+'"]')[0]
=> "Привет!"

IMHO better than eval ;)
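The JSON approach could be wrapped up roughly as follows. This is only a sketch: the method name decode_with_json is made up, and returning nil on a parse error is an assumption about what a caller might want.

```ruby
require 'json'

# Hypothetical helper name. Wraps the escaped text in a JSON array so that
# parsers which reject a bare string document can still handle it.
def decode_with_json(escaped)
  JSON.parse(%Q(["#{escaped}"])).first
rescue JSON::ParserError
  nil  # the input was not a valid JSON string body, e.g. it held a bare "
end

puts decode_with_json('\u041f\u0440\u0438\u0432\u0435\u0442!')
p decode_with_json('say "hi"')  # nil
```

Unlike eval, malformed input ends in a parse error here instead of being run as Ruby code.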
From: Benoit Daloze on 28 Jun 2010 09:24
On 28 June 2010 07:39, Markus Schirp <mbj(a)seonic.net> wrote:
> IMHO better than eval ;)

str = '\u041f\u0440\u0438\u0432\u0435\u0442!'
p str.gsub(/\\u(\h{4})/) { $1.to_i(16).chr('UTF-8') }

What do you think of this?

Well, I was looking for something along the lines of String#unpack, like

p str.gsub(/\\u(\h{4})/) { [$1.to_i(16)].pack('U') }

but as we are scanning the escapes one by one, it is not very interesting, and it needs an extra array as in the JSON version (but it is 1.8 compatible).

B.D.
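For reuse, the gsub idea could be put in a small method. The name unescape_unicode is made up for this sketch, and it assumes Ruby 1.9 or later for \h in the regexp. It only handles escapes with exactly four hex digits and does not combine surrogate pairs.

```ruby
# Hypothetical helper name. Replaces each \uXXXX (exactly four hex digits)
# with the character for that code point; all other text is left alone.
def unescape_unicode(str)
  str.gsub(/\\u(\h{4})/) { [$1.hex].pack('U') }
end

puts unescape_unicode('\u041f\u0440\u0438\u0432\u0435\u0442!')
puts unescape_unicode('no escapes here')
```

String#hex is used here in place of to_i(16); both read the captured digits as a hexadecimal number.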