From: George Orwell on
"RC" == Ryan Chan <ryanchan...(a)gmail.com>:
RC> You have 200 lines of mapping to replace, in a csv format, e.g.
RC> apple,orange
RC> boy,girl
RC> ....
RC>
RC> You have a 500MB file, you want to replace all 200 lines of mapping,
RC> what would be the most efficient way to do it?

Use Perl, as it was invented for jobs like this one.

Hint: if you want to replace whole words only, the character classes
\W and \s, or the word-boundary anchor \b, might help.
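
The whole-word hint can be sketched as follows, in Python rather than Perl (a swap made only for illustration; the regex idea is the same). This is a minimal sketch, assuming the mapping is plain "old,new" CSV and that keys are ordinary words; the helper names load_mapping, compile_whole_words and replace_words are hypothetical. All keys are combined into one alternation wrapped in \b anchors, so the text is scanned once rather than once per mapping line.

```python
import csv
import io
import re


def load_mapping(csv_text):
    """Parse 'old,new' lines into a dict, skipping blank or malformed rows."""
    rows = csv.reader(io.StringIO(csv_text))
    return {row[0]: row[1] for row in rows if len(row) == 2}


def compile_whole_words(mapping):
    """Build one regex that matches any key as a whole word."""
    # Longest keys first, so a longer key wins over a key that is its prefix.
    keys = sorted(mapping, key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b")


def replace_words(text, mapping, pattern=None):
    """Replace every whole-word key in text in a single pass."""
    pattern = pattern or compile_whole_words(mapping)
    return pattern.sub(lambda m: mapping[m.group(0)], text)


mapping = load_mapping("apple,orange\nboy,girl\n")
print(replace_words("an apple for the boy, not a pineapple", mapping))
```

Because each match is looked up in the dictionary and replaced exactly once, a replaced word is not replaced again by another rule. For a large file, the compiled pattern could be reused on each line instead of on the whole text.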


The sender address of this message is not related to a real person
but to a fake address of an anonymous system.
For more info: https://www.mixmaster.it
From: Ryan Chan on
On Dec 21, 12:21 am, John Hasler <jhas...(a)newsguy.com> wrote:
> man sed
> --
> John Hasler
> jhas...(a)newsguy.com
> Dancing Horse Hill
> Elmwood, WI USA

Yes, I have tried replacing with sed, and it works quite fast for a
SINGLE replacement.
But if I run sed multiple times, it becomes slow.

So I am asking here whether there is a faster way to apply a mapping
stored in a file. (I can write some scripts, but I am not sure whether
an existing tool can do the trick.)
From: John Hasler on
Ryan writes:
> Yes, I have tried to replace using sed, and work quite fast for a
> SINGLE replacement. But if I run the sed multiple times, then it will
> be slow.

You must, of course, convert your "mapping" file into sed commands,
either by running a script over the file or by running sed inside a
script which reads each "mapping" line, generates a sed command, and
runs it. Alternatively, you could use Awk or Perl or Python.
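
Converting the mapping file into sed commands might look like the sketch below. It assumes plain "old,new" CSV rows, literal (non-regex) search strings, and sed's basic regular expression syntax; the function names sed_pattern, sed_replacement and build_sed_script are hypothetical.

```python
import csv
import io
import re


def sed_pattern(text):
    """Escape characters special in a sed basic regex, plus the / delimiter."""
    return re.sub(r"([\\/.*\[\]^$])", r"\\\1", text)


def sed_replacement(text):
    """Escape characters special on the replacement side of s///."""
    return re.sub(r"([\\/&])", r"\\\1", text)


def build_sed_script(csv_text):
    """Turn 'old,new' CSV rows into one sed substitution command per line."""
    lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        old, new = row
        lines.append(f"s/{sed_pattern(old)}/{sed_replacement(new)}/g")
    return "\n".join(lines) + "\n"


print(build_sed_script("apple,orange\nboy,girl\n"), end="")
```

Saved to a file (say mapping.sed, a name chosen here only for illustration), the output could be applied in one run with sed -f mapping.sed. Note that sed applies the commands in order to each line, so a word produced by one rule can be rewritten by a later rule.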
--
John Hasler
jhasler(a)newsguy.com
Dancing Horse Hill
Elmwood, WI USA
From: unruh on
On 2009-12-21, Ryan Chan <ryanchan404(a)gmail.com> wrote:
> On Dec 21, 12:21 am, John Hasler <jhas...(a)newsguy.com> wrote:
>> man sed
>> --
>> John Hasler
>> jhas...(a)newsguy.com
>> Dancing Horse Hill
>> Elmwood, WI USA
>
> Yes, I have tried to replace using sed, and work quite fast for a
> SINGLE replacement.
> But if I run the sed multiple times, then it will be slow.

Why run it multiple times? sed or even ed can run as many commands as
you like in a single invocation.
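
The same single-pass idea, sketched in Python instead of sed as an illustration only: each line is read once and every replacement is applied to it before it is written out, so the large file is never reread or held in memory as a whole. The function name replace_stream is hypothetical, and the replacements here are plain substring substitutions applied in order, much like a list of sed s commands.

```python
import io


def replace_stream(src, dst, mapping):
    """Copy src to dst line by line, applying every replacement to each line."""
    for line in src:
        for old, new in mapping.items():
            line = line.replace(old, new)
        dst.write(line)


mapping = {"apple": "orange", "boy": "girl"}
src = io.StringIO("apple pie\nboy scout\nplain line\n")
dst = io.StringIO()
replace_stream(src, dst, mapping)
print(dst.getvalue(), end="")
```

With real data, src and dst would be file objects returned by open() rather than the in-memory buffers used here.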


>
> So I ask here to know if any faster method to replace a mapping stored
> in a file. (I can write some scripts, but not sure if any existing way
> can do the tricks)
From: Grant Edwards on
On 2009-12-21, unruh <unruh(a)wormhole.physics.ubc.ca> wrote:
> On 2009-12-21, Ryan Chan <ryanchan404(a)gmail.com> wrote:
>> On Dec 21, 12:21 am, John Hasler <jhas...(a)newsguy.com> wrote:
>>> man sed
>>> --
>>> John Hasler
>>> jhas...(a)newsguy.com
>>> Dancing Horse Hill
>>> Elmwood, WI USA
>>
>> Yes, I have tried to replace using sed, and work quite fast for a
>> SINGLE replacement.
>> But if I run the sed multiple times, then it will be slow.
>
> Why run it multiple times? sed or even ed can run as many
> commands as you like in a single invocation.

Even if you do run it multiple times, if you do so in a
pipeline I don't think you'll notice any additional time (I'd
bet real money that the job is going to be disk-I/O bound, not
CPU bound).

--
Grant Edwards (grante at visi.com)
Yow! Are we live or on tape?