From: Robert Klemme on 27 Jul 2010 17:16

On 27.07.2010 18:47, Mike Pe wrote:
> Robert Klemme wrote:
>> 2010/7/24 Mike Pe<mikep123(a)gmail.com>:
>>>>> puts doc.root.attributes["test"] --> nil
>>>> Can you show what exactly you did?
>>> The issue is that the first line of my input file:
>>>
>>> <?xml version="1.0" encoding="UTF-16"?>
>>>
>>> Causes the file to be read as an "xml application". Basically, I just
>>> want to be able to use REXML to parse out this xml file, but it does not
>>> parse properly with this line in the beginning of my input file.
>>> (otherwise it works fine).
>>
>> Please provide the code you are using so others can try this out
>> themselves. I asked for this already (see above).
>>
>>> I tried converting the files using iconv commands from your link, but it
>>> UTF-16 and UTF-8, the same error occurs, without regard for format.
>>>
>>> Why is this line interfering with the parser and how would I fix it?
>>> Thank you for your help.
>>
>> It seems there is no UTF-16 support:
>>
>> irb(main):009:0> f=File.open "x", "r:UTF-16"
>> (irb):9: warning: Unsupported encoding UTF-16 ignored
>> => #<File:x>
>>
>> So there is no point in trying to import a UTF-16 encoded file in Ruby.

> As for the code that I am using, I simplified the code in my original
> post. The first line:
>
> doc = REXML::Document.new error

What is "error"? How do you obtain it?

> Should parse in the XML document and recognize all of the roots,
> elements, attributes, etc. from the input document.
>
> i.e.:
> puts doc.root.attributes["test"]
>
> Should return "yes" because the attribute in the error xml file (see
> above) is "yes. With the extra line, it puts "nil". (because the parser
> did not do its job).
>
> I tried converting all of the files to UTF-8 and they still did not
> work. (If you remove the extra line, it does work) I do not think the
> problem with is in the unicode.

Hmm...

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
From: Mike Pe on 28 Jul 2010 02:23

Robert Klemme wrote:
> On 27.07.2010 18:47, Mike Pe wrote:
>>>> parse properly with this line in the beginning of my input file.
>>>
>>> It seems there is no UTF-16 support:
>>>
>>> irb(main):009:0> f=File.open "x", "r:UTF-16"
>>> (irb):9: warning: Unsupported encoding UTF-16 ignored
>>> => #<File:x>
>>>
>>> So there is no point in trying to import a UTF-16 encoded file in Ruby.
>
>> As for the code that I am using, I simplified the code in my original
>> post. The first line:
>>
>> doc = REXML::Document.new error
>
> What is "error"? How do you obtain it?

By "error", I meant my file called error from my first post:

error = <<EOF
<?xml version="1.0" encoding="UTF-16"?>
<document test="yes">
</document>
EOF

>
>> I tried converting all of the files to UTF-8 and they still did not
>> work. (If you remove the extra line, it does work) I do not think the
>> problem with is in the unicode.
>
> Hmm...
>
> robert

-- 
Posted via http://www.ruby-forum.com/.
From: brabuhr on 28 Jul 2010 09:49

>>>>> Can you show what exactly you did?
>>>
>>> Please provide the code you are using so others can try this out
>>> themselves. I asked for this already (see above).

Could you provide a link to a zip file that contains an original input
that fails, a re-encoded input file that fails, an input file that
does not fail, and a script that loads them?

Or, provide a more detailed step-by-step of what you did, e.g.:

# poke at the original file to see what it looks like
ls -l orig-utf16.xml
file orig-utf16.xml
wc -c orig-utf16.xml
enca orig-utf16.xml
head orig-utf16.xml

# convert the file
iconv -t UTF8 -f UTF16 < orig-utf16.xml > new-utf8.xml

# poke at the new file to see what it looks like
ls -l new-utf8.xml
file new-utf8.xml
wc -c new-utf8.xml
enca new-utf8.xml
head new-utf8.xml

# load the files in the script
cat rexmltest.rb
ruby rexmltest.rb orig-utf16.xml
ruby rexmltest.rb new-utf8.xml

Thanks.
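
For reference, the iconv conversion step listed above can also be sketched in plain Ruby. This is only an illustration under stated assumptions, not something posted in the thread: the helper name utf16_to_utf8 is hypothetical, it assumes a file without a byte-order mark is big-endian, and it goes one step further than iconv by rewriting the encoding="UTF-16" declaration to match the new bytes (iconv leaves the declaration text untouched). Whether the stale declaration is what causes the failure described here is not confirmed anywhere in the thread.

```ruby
# Hypothetical helper (not from the thread): decode UTF-16 bytes to a
# UTF-8 string and make the XML declaration agree with the new encoding.
def utf16_to_utf8(bytes)
  raw = bytes.dup.force_encoding(Encoding::BINARY)

  # Use the byte-order mark, if present, to pick the endianness.
  if raw.start_with?("\xFF\xFE".b)
    source_encoding = Encoding::UTF_16LE
    raw = raw.byteslice(2..-1)
  elsif raw.start_with?("\xFE\xFF".b)
    source_encoding = Encoding::UTF_16BE
    raw = raw.byteslice(2..-1)
  else
    # Assumption: no BOM means big-endian.
    source_encoding = Encoding::UTF_16BE
  end

  text = raw.force_encoding(source_encoding).encode(Encoding::UTF_8)

  # Rewrite encoding="UTF-16" in the leading declaration, if there is one.
  text.sub(/\A(<\?xml[^>]*encoding=["'])UTF-16(["'])/i, '\1UTF-8\2')
end
```

The raw bytes could be read with File.binread("orig-utf16.xml") and the result handed to the parser, which makes it easy to compare against the iconv output file.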