From: Junhui Liao on 24 Jul 2010 06:24

Dear all,

Recently I have had to do the following job: reorganize some original text data and write it out to files.

The original data looks like this (TSV format):

First line: time_1.1 signal_1.1 time_2.1 signal_2.1 ... time_4096.1 signal_4096.1 (4096 pairs in total).
Second line: time_1.2 signal_1.2 time_2.2 signal_2.2 ... time_4096.2 signal_4096.2 (4096 pairs in total).
.......
Last line (2048 lines in total): time_1.2048 signal_1.2048 time_2.2048 signal_2.2048 ... time_4096.2048 signal_4096.2048 (4096 pairs in total).

What I need to do is:

Step 0: every time_n.* should have time_n.1 subtracted from it. That is to say,
time_1.1, time_1.2, ... time_1.2048 should have time_1.1 subtracted.
time_2.1, time_2.2, ... time_2.2048 should have time_2.1 subtracted.
....
time_4096.1, time_4096.2, ... time_4096.2048 should have time_4096.1 subtracted.

Step 1: collect all of the time_k.* and signal_k.* values, one pair from each line, and save them in a file, let's say file_k.tsv. Namely:

All of time_1.1, signal_1.1, time_1.2, signal_1.2 ...... time_1.2048, signal_1.2048 should be saved in file_1.tsv. The first line is time_1.1 signal_1.1; the second line is time_1.2 signal_1.2; ...... the last line is time_1.2048 signal_1.2048.

All of time_2.1, signal_2.1, time_2.2, signal_2.2 ...... time_2.2048, signal_2.2048 should be saved in file_2.tsv. The first line is time_2.1 signal_2.1; the second line is time_2.2 signal_2.2; ...... the last line is time_2.2048 signal_2.2048.

......

All of time_4096.1, signal_4096.1, time_4096.2, signal_4096.2 ...... time_4096.2048, signal_4096.2048 should be saved in file_4096.tsv. The first line is time_4096.1 signal_4096.1; the second line is time_4096.2 signal_4096.2; ...... the last line is time_4096.2048 signal_4096.2048.

I have already developed a script in C++, but it cost around 3 hours to deal with this job.
I am totally new to Ruby and Perl, and know a little Python.
So, my questions are:

1. How much time will it cost to do this job in Ruby?
If the time is less than one and a half hours, then it is worth studying for me. I am already attracted by the beautiful Ruby : ) .

2. Is there any similar example?

Best regards!
Junhui
--
Posted via http://www.ruby-forum.com/.
From: Colin Bartlett on 25 Jul 2010 17:30

[Note: parts of this message were removed to make it a legal post.]

On Sat, Jul 24, 2010 at 11:24 AM, Junhui Liao <junhui.liao(a)uclouvain.be> wrote:
> ...
> I have already developed a script in C++, but it cost around 3 hours
> to deal with this job.
> I am totally new to Ruby and Perl, and know a little Python.
> So, my questions are:
>
> 1. How much time will it cost to do this job in Ruby?
> If the time is less than one and a half hours, then it is worth
> studying for me. I am already attracted by the beautiful Ruby : ) .

I'm neither a Ruby expert nor an expert programmer, but I have been using Ruby (for my own purposes) for over 8 years, and as a thought exercise I tried this (not actually running anything), and it took me about 20 to 30 minutes, *provided* the computer memory is big enough to hold all the data. (I couldn't think of an easy way to do what I think you want without reading in all the data first, modifying it, then writing it out. That, or open 4096 files at the same time: neither way seems elegant.)

And if you can program this in C++ then I see no reason why you wouldn't be able to do it in Ruby, Perl, Python, etc. (It might look like C++ rewritten in Ruby, Perl or Python, but so what if you're trying things out.)

Personally, if I didn't have much time, and I wanted to try something out in another computer language, I'd go with a language that I knew a little about, so in my case that would be Ruby, Pascal, QBasic (!!!), and - in your case - maybe try something quick in Python. (But I'd also encourage you to look at Ruby sometime and try it.)

Maybe it partly depends on what standard methods/functions are available: for example, in Ruby you can read a line from a file into a String, and then use a builtin method on the String to split it into an array of values using a specified delimiter, so in your case a space character? But I'd be very surprised if there weren't similar builtins in Perl and Python.
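The builtin String method referred to above is `String#split`. The following is only a minimal sketch, assuming the delimiter is a tab (the original post describes the data as TSV) rather than a space; the helper name `parse_pairs` and the sample values are made up for illustration.

```ruby
# Hypothetical helper: turn one input line into an array of
# [time, signal] pairs of floats.
def parse_pairs(line)
  # chomp drops the trailing newline; split("\t") cuts at every tab.
  fields = line.chomp.split("\t")
  # Float() raises ArgumentError on malformed numbers instead of
  # silently returning 0.0 as String#to_f would.
  fields.map { |f| Float(f) }.each_slice(2).to_a
end

# A made-up line with 2 pairs instead of 4096.
p parse_pairs("0.5\t10\t0.75\t20\n")
```

Reading the lines themselves can be done with `File.foreach(path)`, which yields one line at a time without loading the whole file.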
From: Junhui Liao on 25 Jul 2010 18:15

> (I couldn't think of an easy way to do what I think you want without
> reading in all the data first, modifying it, then writing it out. That,
> or open 4096 files at the same time: neither way seems elegant.)

Actually, I developed two versions of the C++ script.
One opens 4096 files at the same time. This cost 3 hours.
The other saves all of the data in a big vector, then scans the vector to pick the right items to write to the files. This cost 2 hours and 45 minutes. :-).

> Personally, if I didn't have much time, and I wanted to try something
> out in another computer language, I'd go with a language that I knew a
> little about, so in my case that would be Ruby, Pascal, QBasic (!!!),
> and - in your case - maybe try something quick in Python. (But I'd also
> encourage you to look at Ruby sometime and try it.)

Thanks a lot for your encouragement. I have already tried reading something about Ruby, since the language is very simple and beautiful, whether or not it works for my case (but I hope it will).

> Maybe it partly depends on what standard methods/functions are
> available: for example, in Ruby you can read a line from a file into a
> String, and then use a builtin method on the String to split it into an
> array of values using a specified delimiter, so in your case a space
> character?

This is exactly the kind of comment I need: what knowledge is necessary and sufficient to do my job, and whether there are any special and powerful methods or approaches for this kind of task. Or, even better, an example very close to my case. I can get the details by reading books or googling.

Anyway, thanks a lot for your reply!

Best!
Junhui
--
Posted via http://www.ruby-forum.com/.
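For readers looking for an example close to this case, the "read everything in, modify it, then write it out" approach discussed in the thread could be sketched roughly as below. This is not anyone's actual script from the thread and it has not been timed, so it says nothing about whether Ruby would beat the C++ versions. It assumes tab-separated input, the same number of pairs on every line, and enough memory to hold all the data; the method name `reorganize` and its arguments are hypothetical.

```ruby
require "fileutils"

# Hypothetical helper: applies step 0 and step 1 from the original post
# and writes out_dir/file_1.tsv, out_dir/file_2.tsv, ...
def reorganize(input_path, out_dir)
  # rows[j] holds input line j+1 as an array of [time, signal] pairs.
  # Signals are kept as strings because they are copied unchanged.
  lines = File.foreach(input_path).reject { |l| l.strip.empty? }
  rows = lines.map do |line|
    line.chomp.split("\t").each_slice(2).map { |t, s| [Float(t), s] }
  end
  return if rows.empty?

  FileUtils.mkdir_p(out_dir)

  # transpose regroups the data by pair index k instead of by line.
  # It raises IndexError if the lines have different lengths.
  rows.transpose.each_with_index do |series, k|
    t0 = series.first.first # time_k.1, the baseline for step 0
    File.open(File.join(out_dir, "file_#{k + 1}.tsv"), "w") do |out|
      series.each { |time, signal| out.puts "#{time - t0}\t#{signal}" }
    end
  end
end
```

A call such as `reorganize("data.tsv", "out")` (both names are placeholders) would then produce one file per pair index. Only one output file is open at a time, at the price of holding the whole data set in memory.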
From: Colin Bartlett on 27 Jul 2010 18:07

I'm putting this at the top of my post because I think the basic problem here may be intensive numeric calculations, and - even more so - disk (input and) output of about 16 Mi values (4096 pairs x 2 values x 2048 lines) x N bytes each, where N is 8 bytes (? for floating point numbers), so about 128 MiB in total, and other people will have a better knowledge of some possibly useful links.

On Sun, Jul 25, 2010 at 11:15 PM, Junhui Liao <junhui.liao(a)uclouvain.be> wrote:
> Actually, I developed two versions of the C++ script.
> One opens 4096 files at the same time. This cost 3 hours.
> The other saves all of the data in a big vector,
> then scans the vector to pick the right items to write
> to the files. This cost 2 hours and 45 minutes. :-).
>

Sorry - in my post I misunderstood what you meant by "cost": I took it to mean development time rather than running time.

I think it is (very?) unlikely that any Ruby (or Perl or Python, etc?) program will run faster than your C++ scripts. Where Ruby (or Python - I'm not so sure about Perl, I haven't used it) does have an advantage is that I think development may be quicker. So there are trade-offs. (Incidentally, I'm not an expert, but those timings suggest to me that the major processing cost may be in writing the results out to disk, so changing the language for all or part of the processing is unlikely to make a large difference?)

But I'm open to correction: there are people who have used Ruby for fairly intensive processing of large data sets, but my understanding is that they use a mixture of Ruby as "glue" with any intensive calculations in C, etc. For example, from some limited experience I have, the speed of Ruby reading strings of bytes in from files is similar to the speed of Java or compiled Pascal, but for calculating CRCs of files, pure Ruby calculating the CRCs once the bytes had been read in was much slower than Java or compiled Pascal: so I used Ruby (or rather JRuby) to read in the strings of bytes from the files, and then called Java code from Ruby to calculate the CRC from the bytes. Overall the speed of this was similar to a pure Java or pure compiled Pascal program.

Piet Hut and Jun Makino have been using Ruby to model dense star clusters. (Note that this is something I know nothing about! I'm just intrigued by the underlying principle of using Ruby for intensive numerical calculations: developing in Ruby without worrying about speed by using smaller unrealistic models, and then moving to more realistic models by translating part (or all!) of the Ruby code to a faster language.)

http://www.kira.org/index.php?option=com_content&task=view&id=124&Itemid=154

...MODEST is the new name for the Stellar Dynamics workshop. It stands for: MOdeling DEnse STellar systems ... The basic idea is to start a kind of N-body wikipedia, as a group's process