Prev: require, from within lib, files located under the local path
Next: Converting file from utf-16 to utf-8
From: Jesús Gabriel y Galán on 23 Mar 2010 08:30

Another approach:

irb(main):001:0> s = "car car ice ice house ice house tree"
=> "car car ice ice house ice house tree"
irb(main):002:0> h = Hash.new(0)
=> {}
irb(main):006:0> s.split.each {|x| h[x] += 1}
=> ["car", "car", "ice", "ice", "house", "ice", "house", "tree"]
irb(main):007:0> h
=> {"ice"=>3, "house"=>2, "car"=>2, "tree"=>1}
irb(main):008:0> h.sort_by {|k,v| -v}
=> [["ice", 3], ["house", 2], ["car", 2], ["tree", 1]]
irb(main):009:0> h.sort_by {|k,v| -v}.each {|k,v| puts "#{v} #{k}"}
3 ice
2 house
2 car
1 tree

With the uniq and the count you are traversing the array many times.

Jesus.

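To make the "traversing the array many times" remark concrete, here is a minimal sketch comparing the two styles. The helper names `counts_by_rescanning` and `counts_in_one_pass` are hypothetical and not from the thread, and `Array#to_h` assumes Ruby 2.1 or later.

```ruby
# Hypothetical helpers for illustration; the names are not from the thread.

# uniq + count: one extra scan of the whole array per distinct word.
def counts_by_rescanning(words)
  words.uniq.map { |w| [w, words.count(w)] }.to_h
end

# Hash.new(0): a single pass; missing keys start at 0, so += 1 just works.
def counts_in_one_pass(words)
  counts = Hash.new(0)
  words.each { |w| counts[w] += 1 }
  counts
end

words = "car car ice ice house ice house tree".split
# Both helpers should agree on the counts; only the amount of work differs.
p counts_by_rescanning(words) == counts_in_one_pass(words)
```

On Ruby 2.7 and later, `words.tally` should give the same hash as the one-pass version; that method postdates this thread.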
From: Phrogz on 23 Mar 2010 09:37

On Mar 23, 3:47 am, Juan Gf <juan...(a)gmail.com> wrote:
> Hello, I'm newbie so I apologize if my question it's stupid. I want to
> write a program that counts how many times a word appears in a text.

Here's another variation, just for the learning experience:

irb(main):001:0> s = "car car ice ice house ice house tree"
=> "car car ice ice house ice house tree"
irb(main):002:0> words = s.scan /\w+/
=> ["car", "car", "ice", "ice", "house", "ice", "house", "tree"]
irb(main):003:0> groups = words.group_by{ |word| word }
=> {"car"=>["car", "car"], "ice"=>["ice", "ice", "ice"], "house"=>["house", "house"], "tree"=>["tree"]}
irb(main):005:0> counted = groups.map{ |word,list| [list.length,word] }
=> [[2, "car"], [3, "ice"], [2, "house"], [1, "tree"]]
irb(main):007:0> sorted = counted.sort_by{ |count,word| [- count,word] }
=> [[3, "ice"], [2, "car"], [2, "house"], [1, "tree"]]
irb(main):008:0> sorted.each{ |count,word| puts "%d %s" % [ count, word ] }
3 ice
2 car
2 house
1 tree
=> [[3, "ice"], [2, "car"], [2, "house"], [1, "tree"]]

Of course you don't need all those intermediary variables if you don't want them and don't need to debug the results along the way:

s.scan(/\w+/).group_by{|w| w }.map{|w,l| [l.length,w] }.sort_by{ |c,w| [-c,w] }.each{ |a| puts "%d %s" % a }

But I'd really do it the way Jesús did.

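One visible difference between the two sessions so far: Jesús's output lists "house" before "car", while this one lists "car" before "house". Both words have a count of 2. Sorting on `-v` alone leaves tied entries in whatever order `sort_by` happens to produce (Ruby does not guarantee that `sort_by` is stable), whereas the `[-count, word]` key breaks ties alphabetically. A small sketch, with a hypothetical helper name that is not from the thread:

```ruby
# Hypothetical helper for illustration; the name is not from the thread.
# Sort key is [negated count, word]: highest count first, ties by word A-Z.
def by_count_then_word(counts)
  counts.sort_by { |word, count| [-count, word] }
end

counts = { "ice" => 3, "house" => 2, "car" => 2, "tree" => 1 }
by_count_then_word(counts).each { |word, count| puts "#{count} #{word}" }
```

Arrays compare element by element, so the word is only consulted when the negated counts are equal.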
From: Juan Gf on 23 Mar 2010 11:44

Thank you Jesús & Gavin, I now have enough concepts for studying this week!!! Pretty amazing how many different ways of doing the same thing :)

--
Posted via http://www.ruby-forum.com/.

From: Robert Klemme on 23 Mar 2010 13:23

2010/3/23 Juan Gf <juangf7(a)gmail.com>:
> Thank you Jesús & Gavin, I now have enough concepts for studying this
> week!!! pretty amazing how many different ways of doing the same thing
> :)

Welcome to the wonderful world of Ruby! Here's why:

http://en.wikipedia.org/wiki/TIMTOWTDI

Well, OK, that is not really an explanation - but it describes the situation with Ruby rather accurately. :-)

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

From: Ryan Davis on 23 Mar 2010 19:27

On Mar 23, 2010, at 03:35 , Juan Gf wrote:
> Ryan Davis wrote:
>> On Mar 23, 2010, at 02:47 , Juan Gf wrote:
>>
>>> 2 car
>>> 1 apple
>>> 1 tree
>>>
>>> Any ideas?
>>
>> Read aloud what the code says, translated to natural language (English
>> or otherwise, doesn't matter... just raise it to human thought level).

> a = 'Apple car caR house tree ice ice ice house'
> b = a.downcase.split(' ')
> b.uniq.each do |element|
>   puts "#{b.count(element)}\t#{element}"
> end

> CONVERT THE TEXT IN LOWER-CASE AND THEN SPLIT THE TEXT INTO SINGLE
> [a list of] WORDS! THEN COUNT [and print] HOW MANY TIMES EVERY SINGLE WORD APPEARS!

I'd say that is mostly correct. You're glossing over the uniq part:

Then walk over each unique word and print how many times it occurs in the list of words.

>> Then say aloud what you want it to do, step by step.
>
> CONVERT THE TEXT IN LOWER-CASE AND THEN SPLIT THE TEXT INTO [a list of] SINGLE WORDS! THEN COUNT HOW MANY TIMES EVERY SINGLE WORD APPEARS! THEN SORT
> [the list] THE RESULTS: FIRST THE MORE COMMON WORDS AND AFTER THE LESS COMMON WORDS

better.

>> What's the difference?
>
> the difference is "THEN SORT THE RESULTS: FIRST THE MORE COMMON WORDS
> AND AFTER THE LESS COMMON WORDS FINALLY BLOODY COMPUTER BRING ME A
> PIZZA!"
>
>> Translate that difference back down to code.
>
> I tried to use .sort like this:
>
> "b.uniq.each do |element|
>   puts "#{(b.count(element)).sort}\t#{element}"
> end"
>
> but obviously it doesn't work.

See how I modified your description to "THEN SORT [the list]"? That's what you're missing. You're not paying attention to what your each is iterating over.

As others have pointed out, there are a lot of ways to do this. My favorite is to change the description to:

Convert the text to lower-case and split into a list of words.
Create a hash to count the words (default to 0).
Enumerate the list of words and increment the hash by one for every word seen.
Enumerate the hash sorted by the word counts (descending) and name (ascending) and print the word and occurrences.

input = 'Apple car caR house tree ice ice ice house'
count = Hash.new 0
input.downcase.split(' ').each do |word|
  count[word] += 1
end
count.sort_by { |word, count| [-count, word] }.each do |word, count|
  puts "%4d: %s" % [count, word]
end

which outputs:

   3: ice
   2: car
   2: house
   1: apple
   1: tree

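For comparison, the quoted attempt fails because `b.count(element)` returns a single Integer, and an Integer has no `sort` method; the thing that needs ordering is the list that `each` walks over. The sketch below is one possible minimal change to Juan's own uniq/count code, not something posted in the thread. The method name `report_lines` is hypothetical, and the leading-dot method chain assumes Ruby 1.9 or later.

```ruby
# Hypothetical helper; the name is not from the thread.
# Keeps Juan's uniq/count structure but sorts the unique words first.
def report_lines(text)
  words = text.downcase.split(' ')
  words.uniq
       .sort_by { |word| [-words.count(word), word] } # order the list, not the count
       .map { |word| "#{words.count(word)}\t#{word}" }
end

puts report_lines('Apple car caR house tree ice ice ice house')
```

This still rescans the array for every unique word, so the hash-based version remains the better choice for long texts.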