From: Rolf Pedersen on 5 Jul 2010 04:48

Hi

I have a file with the following format (example):

    Save Format v3.0(19990112)
    @begin Libraries
    "felles.pbl" "";
    @end;
    @begin Objects
    "n_cst_xml_utils.sru" "felles.pbl";
    "n_melding.sru" "felles.pbl";
    @end;

The data in the two begin/end blocks are lists, which may be longer than shown. I'd like to extract an array of the filenames (the first quoted string on each line) in the @begin Objects ... @end; block. For the example above this should return

    ["n_cst_xml_utils.sru", "n_melding.sru"]

My initial idea was to treat the whole thing as one long string, extract the part within the begin-end block using a regexp, convert the result back to individual lines (split '\n'), and then use array.map and a regexp to single out the name in the first quote on each line. But I keep hitting the wall, especially with the first step in this approach... :o(

I know this should be easily done in a couple of lines of code, but I can't get it right. Appreciate any help!

Best regards,
Rolf
From: Brian Candler on 5 Jul 2010 06:12

Rolf Pedersen wrote:
> My initial idea was to treat the whole thing as one long string, extract
> the part within the begin-end block using a regexp, convert the result
> back to individual lines (split '\n'), and then use array.map and a regexp
> to single out the name in the first quote on each line.
> But I keep hitting the wall, especially with the first step in this
> approach... :o(

How about this for starters:

    p src.scan(/^@begin(.*?)^@end;/m)

--
Posted via http://www.ruby-forum.com/.
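To make the suggestion above concrete, here is a minimal sketch of what that `scan` call returns for the sample file. The variable name `src` comes from the post; loading the sample through a heredoc (which needs Ruby 2.3 or later) is an assumption made only to keep the example self-contained.

```ruby
# Sample input copied from the original question.
src = <<~TEXT
  Save Format v3.0(19990112)
  @begin Libraries
  "felles.pbl" "";
  @end;
  @begin Objects
  "n_cst_xml_utils.sru" "felles.pbl";
  "n_melding.sru" "felles.pbl";
  @end;
TEXT

# With the /m flag "." also matches newlines, and the lazy ".*?" stops at
# the nearest line that starts with "@end;". scan returns one inner array
# per match, holding the text captured by the group.
blocks = src.scan(/^@begin(.*?)^@end;/m)

# Print each captured block body; the block name ("Libraries", "Objects")
# is still part of the capture at this stage.
blocks.flatten.each { |body| p body }
```

Each captured string still starts with the block name, so a second step is needed to pick the Objects block and pull out the quoted names.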
From: Rolf Pedersen on 5 Jul 2010 07:52

Thanks Brian, that helped me a lot! :o)

The code now looks like this:

    filenames = File.open(filename).readlines.join.scan(/^@begin Objects\n(.*?)^@end;/m)[0][0].split("\n").map{|l| l.scan(/"(.*?)"/)[0][0]}

Probably far from optimal, but it seems to do the trick.

Best regards,
Rolf
From: Brian Candler on 5 Jul 2010 08:07

Rolf Pedersen wrote:
> The code now looks like this:
>
> filenames = File.open(filename).readlines.join.scan(/^@begin Objects\n(.*?)^@end;/m)[0][0].split("\n").map{|l| l.scan(/"(.*?)"/)[0][0]}
>
> Probably far from optimal, but it seems to do the trick.

That's the most important thing :-)

I actually misread your example. If there's only one @begin Objects section, then 'scan' is overkill; a simple regexp match will do.

    res = if File.read(filename) =~ /^@begin Objects$(.*?)^@end;$/m
      $1.scan(/^\s*"(.*?)"/).map { |r| r.first }
    end
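For anyone who wants to try the single-match approach without a file on disk, here is a rough sketch that wraps the same two regexps in a method. The method name `object_filenames`, the constant `SAMPLE_TEXT`, and the choice to return an empty array when no Objects block exists are all assumptions for illustration, not part of the posted code.

```ruby
# Sample input copied from the original question.
SAMPLE_TEXT = [
  'Save Format v3.0(19990112)',
  '@begin Libraries',
  '"felles.pbl" "";',
  '@end;',
  '@begin Objects',
  '"n_cst_xml_utils.sru" "felles.pbl";',
  '"n_melding.sru" "felles.pbl";',
  '@end;'
].join("\n") + "\n"

# Hypothetical helper: returns the first quoted string of every line
# inside the "@begin Objects" ... "@end;" block.
def object_filenames(text)
  # One match is enough when there is a single Objects block.
  m = text.match(/^@begin Objects$(.*?)^@end;$/m)
  return [] unless m # assumption: no block means no filenames

  # Anchoring at ^ keeps only the first quoted string on each line.
  m[1].scan(/^\s*"(.*?)"/).map(&:first)
end

p object_filenames(SAMPLE_TEXT)
```

With a real file, `object_filenames(File.read(filename))` would be the equivalent call.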
From: Robert Klemme on 5 Jul 2010 11:51
2010/7/5 Brian Candler <b.candler(a)pobox.com>:
> I actually misread your example. If there's only one @begin Objects
> section, then 'scan' is overkill; a simple regexp match will do.
>
> res = if File.read(filename) =~ /^@begin Objects$(.*?)^@end;$/m
>   $1.scan(/^\s*"(.*?)"/).map { |r| r.first }
> end

If files are large, then a line-based approach is usually more feasible. In this case you can use the flip-flop operator in an if condition to select the lines we want:

    17:31:49 Temp$ ./lextr.rb
    ["n_cst_xml_utils.sru", "n_melding.sru"]
    17:48:32 Temp$ cat lextr.rb
    #!/bin/env ruby19

    ar = []

    DATA.each_line do |line|
      # The flip-flop switches on at "@begin Objects" and off again at "@end;"
      if /^@begin Objects/ =~ line .. /^@end;/ =~ line
        name = line[/^\s*"([^"]*)"/, 1] and ar << name
      end
    end

    p ar

    __END__
    Save Format v3.0(19990112)
    @begin Libraries
    "felles.pbl" "";
    @end;
    @begin Objects
    "n_cst_xml_utils.sru" "felles.pbl";
    "n_melding.sru" "felles.pbl";
    @end;
    17:49:30 Temp$

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
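If the flip-flop operator feels too terse, the same line-by-line selection can be sketched with an explicit flag. This is a different technique from the one posted above, not a copy of it; the method name `object_filenames_by_line` and the constant `SAMPLE_LINES` are made up for illustration, and the method is assumed to receive anything that responds to `each_line` (a String or an open File).

```ruby
# Sample input copied from the original question.
SAMPLE_LINES = [
  'Save Format v3.0(19990112)',
  '@begin Libraries',
  '"felles.pbl" "";',
  '@end;',
  '@begin Objects',
  '"n_cst_xml_utils.sru" "felles.pbl";',
  '"n_melding.sru" "felles.pbl";',
  '@end;'
].join("\n") + "\n"

# Hypothetical helper: walks the input one line at a time and tracks
# whether the current line is inside the Objects block.
def object_filenames_by_line(io)
  names = []
  inside = false
  io.each_line do |line|
    if line =~ /^@begin Objects/
      inside = true
    elsif line =~ /^@end;/
      inside = false
    elsif inside && (name = line[/^\s*"([^"]*)"/, 1])
      # String#[] with a regexp and a group index returns that capture or nil.
      names << name
    end
  end
  names
end

p object_filenames_by_line(SAMPLE_LINES)
```

Because only one line is held at a time, passing an open File instead of a String should keep memory use flat even for large inputs.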