From: Robert Klemme on 13 May 2010 12:32

On 13.05.2010 16:34, Une Bévue wrote:
> Robert Klemme <shortcutter(a)googlemail.com> wrote:
>
>> There's also the flip flop operator:
>>
>> File.foreach "myfile" do |line|
>>   if /pattern/ =~ line .. false
>>     puts line
>>   end
>> end
>>
>> The trick I am using is that the FF operator starts to return true if
>> the first expression returns true and stays true until the last
>> expression returns true - in this case never since you want to read
>> until the end of the file.
>
> could that trick be used for start and stop tags ? like :
>
> File.foreach "myfile" do |line|
>   if /<body/ =~ line .. /<\/body/ =~ line
>     puts line
>   end
> end
>
> if true, that's clever !

Yes, that could be done. However, I would not use this for languages from the SGML family (XML, HTML) because there are no guarantees as to how many tags you'll find on a single line of text. There are better tools to deal with that (REXML, Nokogiri...).

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
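For anyone who wants to try the start/stop variant without a real file, here is a minimal sketch. The helper name `lines_between` and the sample text are made up for illustration; only the flip-flop condition comes from the posts above. It assumes each tag sits on its own line, which is exactly the guarantee that is missing for real HTML.

```ruby
require "stringio"

# Hypothetical helper (not from the thread): collects every line from the
# first one matching start_re through the next one matching stop_re.
def lines_between(io, start_re, stop_re)
  selected = []
  io.each_line do |line|
    # Flip-flop: false until start_re matches, then true up to and
    # including the line on which stop_re matches.
    if (start_re =~ line)..(stop_re =~ line)
      selected << line
    end
  end
  selected
end

# Made-up sample input standing in for "myfile".
sample = StringIO.new("junk\n<body>\nhello\n</body>\ntrailer\n")
puts lines_between(sample, /<body/, /<\/body/)
```

With the two-dot form the stop expression is also checked on the line that switches the flip-flop on, so a line containing both tags opens and closes the range at once.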
From: Une Bévue on 13 May 2010 14:21

Robert Klemme <shortcutter(a)googlemail.com> wrote:
> Yes, that could be done. However, I would not use this for languages
> from the SGML family (XML, HTML) because there are no guarantees as to
> how many tags you'll find on a single line of text. There are better
> tools to deal with that (REXML, Nokogiri...).

Right, however REXML doesn't work with badly balanced tags.

I did some tests of Nokogiri today; it works even better than tidy for the first step, cleaning up unbalanced tags.

The only question I have about Nokogiri is how to avoid the DOCTYPE, because it outputs:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">

even if I'm using #to_xhtml; in that case the DOCTYPE is wrong...

--
"Life can only be understood by looking backward, but it can only be lived forward." (Søren Kierkegaard)
From: Rick DeNatale on 12 May 2010 13:29

On Wed, May 12, 2010 at 1:20 PM, Vandana <nairvan(a)gmail.com> wrote:
> Hello All,
>
> I would like to read a file in ruby. It is a 2G file, but
> contain useless data in the beginning portion of the file.
>
> There is a particular pattern towards the middle of the file after
> which useful data begins. Is there a way to grep for this pattern and
> then read every line henceforth, but ignore all lines previous to line
> on which pattern found?

Grep is going to have to read the file to find that pattern anyway.

--
Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Github: http://github.com/rubyredrick
Twitter: @RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale
From: Roger Pack on 12 May 2010 13:31

> There is a particular pattern towards the middle of the file after
> which useful data begins. Is there a way to grep for this pattern and
> then read every line henceforth, but ignore all lines previous to line
> on which pattern found?

If you don't know where it is, then you'll probably have to parse each line until you reach it, then continue on.
-rp

--
Posted via http://www.ruby-forum.com/.
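The "parse each line until you reach it, then continue on" approach could look roughly like this. It is only a sketch: `lines_from_marker`, the `BEGIN DATA` marker and the sample text are invented for the example. It reads one line at a time, so a 2G file would not have to be held in memory; the skipped lines still get read, as noted above.

```ruby
require "stringio"

# Hypothetical helper (not from the thread): lazily yields the first line
# matching marker and every line after it, one line at a time.
def lines_from_marker(io, marker)
  io.each_line.lazy.drop_while { |line| !line.match?(marker) }
end

# Made-up sample input; with a real file, File.open(path) could take the
# place of the StringIO.
sample = StringIO.new("noise 1\nnoise 2\nBEGIN DATA\nrow 1\nrow 2\n")
lines_from_marker(sample, /BEGIN DATA/).each { |line| puts line }
```

The marker line itself is included in the output; if only the lines after it are wanted, the first yielded line can be skipped.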