From: Rui Maciel on 21 Dec 2009 16:33
Ryan Chan wrote:
> Hello,
>
> Consider the case:
>
> You have 200 lines of mapping to replace, in a csv format, e.g.
>
> apple,orange
> boy,girl
> ...
>
> You have a 500MB file, you want to replace all 200 lines of mapping,
> what would be the most efficient way to do it?

You could try awk. It doesn't hurt.

Rui Maciel
From: John Hasler on 21 Dec 2009 16:59
Rui Maciel writes:
> You could try awk. It doesn't hurt.

I don't know... it's what my parrot says when something hurts.
--
John "Awk: bailing out near line 10" Hasler
jhasler(a)newsguy.com
Dancing Horse Hill
Elmwood, WI USA
From: Ryan Chan on 22 Dec 2009 11:35
On Dec 22, 3:04 am, unruh <un...(a)wormhole.physics.ubc.ca> wrote:
> Why run it multiple times? sed or even ed can run as many commands as
> you like in a single invocation.
>

It seems I found the answer, though I am not sure it is exactly what you meant. Do you mean using a sed script file, such as

    1,/^END/{
    s/1/a/g
    s/2/b/g
    }

and then running the replacement with

    sed -f replace.sed source.txt
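With 200 mappings, the script file for sed -f does not have to be written by hand. The sketch below generates it from the CSV mapping. It assumes both fields are plain words with no characters that are special to sed; the function name make_sed_script and the file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: turn a CSV mapping (old,new per line) into a sed script.
# Assumption: both fields are plain words with no characters that are
# special to sed (/ & \ . * [ ] ^ $), as in the apple,orange example.
make_sed_script() {
    # $1 is the mapping file; emit one substitute command per valid line
    awk -F, 'NF == 2 && $1 != "" { printf "s/%s/%s/g\n", $1, $2 }' "$1"
}

# Demo with throwaway files (the names are hypothetical)
tmp=$(mktemp -d)
printf 'apple,orange\nboy,girl\n' > "$tmp/map.csv"
printf 'an apple for the boy\n' > "$tmp/source.txt"
make_sed_script "$tmp/map.csv" > "$tmp/replace.sed"
sed -f "$tmp/replace.sed" "$tmp/source.txt"   # prints: an orange for the girl
rm -rf "$tmp"
```

Note that sed applies the commands in order to each line, so text produced by one substitution can be changed again by a later one (a mapping of a,b followed by b,c turns every "a" into "c").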
From: hongkonger on 25 Dec 2009 21:32
This is exactly what you want.
http://www.linuxask.com/questions/replace-multiple-strings-using-sed

On Dec 21, 12:07 pm, pk <p....(a)pk.invalid> wrote:
> Ryan Chan wrote:
> > Consider the case:
> >
> > You have 200 lines of mapping to replace, in a csv format, e.g.
> >
> > apple,orange
> > boy,girl
> > ...
> >
> > You have a 500MB file, you want to replace all 200 lines of mapping,
> > what would be the most efficient way to do it?
>
> Not sure about "most efficient", but with awk you can do all of that in a
> single pass (almost) over the data:
>
> awk -F, 'NR==FNR{a[$1]=$2;next}
> {for(i in a)gsub(i,a[i]); print}' mapfile datafile
>
> However, that has at least two problems, which may or may not be relevant
> for your scenario:
>
> 1) Does not know about "words", so if "pineapple" appears in the data, it
> will become "pineorange";
>
> 2) assumes that all the strings don't contain regex metacharacters, and that
> will likely produce wrong outcomes if one of the words to replace is, say
> "a.*b" or similar.
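One possible way around both problems pk lists is to stop using gsub and look up each whole word in the mapping array instead. The sketch below is not from the thread. It assumes a "word" is a run of letters, digits, or underscores and that every key in the mapping is a single such word. The function name replace_words and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: whole-word, literal replacement driven by a CSV mapping.
# Assumption: a "word" is a run of letters, digits, or underscores,
# and every key in the mapping file is one such word.
replace_words() {
    # $1: mapping file (old,new per line), $2: data file
    awk -F, '
        NR == FNR { map[$1] = $2; next }   # first file: load the mapping
        {
            rest = $0; out = ""
            # walk the line one word at a time, copying separators unchanged
            while (match(rest, /[A-Za-z0-9_]+/)) {
                word = substr(rest, RSTART, RLENGTH)
                out = out substr(rest, 1, RSTART - 1) ((word in map) ? map[word] : word)
                rest = substr(rest, RSTART + RLENGTH)
            }
            print out rest
        }' "$1" "$2"
}

# Demo with throwaway files (the names are hypothetical)
tmp=$(mktemp -d)
printf 'apple,orange\nboy,girl\n' > "$tmp/mapfile"
printf 'pineapple and apple\n' > "$tmp/datafile"
replace_words "$tmp/mapfile" "$tmp/datafile"   # prints: pineapple and orange
rm -rf "$tmp"
```

Because each word is compared as a plain string, keys are never interpreted as regular expressions, and each line costs one array lookup per word rather than 200 gsub calls. Replaced text is also never rescanned, so one mapping cannot feed into another. The trade-off is that keys containing spaces or punctuation will never match under this definition of a word.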