From: Adam Kellas on 2 Mar 2010 11:39

Hi,

Looking for suggestions on how to speed up the function below. It's intended to "re-macroize" the output of make; in other words given a sequence of command lines generated by make, and a set of make macros, I need to substitute in the make variables such that "gcc -c -g -O2 -Wall -DNDEBUG" might become (say) "$(CC) -c $(CFLAGS) $(DFLAGS)", as much as possible as it was in the original Makefile. The %Variables hash maps strings to macro names; thus with the above example we would have

    $Variables{'gcc'} = '$(CC)';
    $Variables{'-g -O2 -Wall'} = '$(CFLAGS)';
    $Variables{'-DNDEBUG'} = '$(DFLAGS)';

Anyway, the function below seems to work but scales very badly and becomes unusable when enough variables are in use. Any ideas on how to write it better?

    sub varify {
        my $word = shift;
        $word =~ s%\$%\$\$%g;
        for my $substr (keys %Variables) {
            while ((my $start = index($word, $substr)) >= 0) {
                substr($word, $start, length($substr)) = $Variables{$substr};
            }
        }
        return $word;
    }

Thanks,
AK

From: Jim Gibson on 2 Mar 2010 12:57

In article <688f37dd-7719-4944-a19f-77a60c572804(a)d2g2000yqa.googlegroups.com>, Adam Kellas <adam.kellas(a)gmail.com> wrote:

> Hi,
>
> Looking for suggestions on how to speed up the function below. It's
> intended to "re-macroize" the output of make; in other words given a
> sequence of command lines generated by make, and a set of make macros,
> I need to substitute in the make variables such that
> "gcc -c -g -O2 -Wall -DNDEBUG" might become (say)
> "$(CC) -c $(CFLAGS) $(DFLAGS)", as much as possible as it was in the
> original Makefile. The %Variables hash maps strings to macro names;
> thus with the above example we would have
>
>     $Variables{'gcc'} = '$(CC)';
>     $Variables{'-g -O2 -Wall'} = '$(CFLAGS)';
>     $Variables{'-DNDEBUG'} = '$(DFLAGS)';
>
> Anyway, the function below seems to work but scales very badly and
> becomes unusable when enough variables are in use. Any ideas on how to
> write it better?
>
> sub varify {
>     my $word = shift;
>     $word =~ s%\$%\$\$%g;
>     for my $substr (keys %Variables) {
>         while ((my $start = index($word, $substr)) >= 0) {
>             substr($word, $start, length($substr)) = $Variables{$substr};
>         }
>     }
>     return $word;
> }

I don't see how you are going to get much of a speedup. It seems you are already doing the minimum amount of work with no wasted steps.

You might try using the each() function instead of keys. That saves generating the key array and the hash lookup for the replacement string:

    while( my($key,$replace) = each( %Variables) ) {
        ...
        substr($word, $start, length($key)) = $replace;
    }

You could also pre-compute the search string lengths so you don't have to call the length() function for each replacement. Thus, you might be better off using three arrays or a two-dimensional (N,3) array to hold (key,replacement,length(key)) values.

How many key:replacement pairs are you using? I am surprised this doesn't scale very well. It would appear to be O(n) in the number of search strings.

As always, only benchmarking can ensure you are getting any speedups or tell you where the actual bottlenecks are.

-- 
Jim Gibson
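
The benchmarking advice above can be tried with the core Benchmark module. The following is only a sketch: the sample %Variables contents, the input line, and the sub names with_index and with_subst are made up for illustration, and the `$` escaping step from the original function is left out to keep the comparison focused on the replacement loop.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data; the real hash would be built from the Makefile.
my %Variables = (
    'gcc'          => '$(CC)',
    '-g -O2 -Wall' => '$(CFLAGS)',
    '-DNDEBUG'     => '$(DFLAGS)',
);
my $line = 'gcc -c -g -O2 -Wall -DNDEBUG';

# index/substr loop, using each() and a length computed once per key.
sub with_index {
    my $word = shift;
    while (my ($key, $replace) = each %Variables) {
        my $len = length $key;
        while ((my $start = index($word, $key)) >= 0) {
            substr($word, $start, $len) = $replace;
        }
    }
    return $word;
}

# One global substitution per key.
sub with_subst {
    my $word = shift;
    while (my ($key, $replace) = each %Variables) {
        $word =~ s/\Q$key\E/$replace/g;
    }
    return $word;
}

# Run each variant a fixed number of times and print a comparison table.
cmpthese(20_000, {
    index_substr => sub { with_index($line) },
    subst        => sub { with_subst($line) },
});
```

The numbers depend heavily on how many keys there are and how long the input lines get, so it is worth running this against data of realistic size.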

From: Uri Guttman on 2 Mar 2010 13:06

>>>>> "AK" == Adam Kellas <adam.kellas(a)gmail.com> writes:

AK> Hi,
AK> Looking for suggestions on how to speed up the function below. It's
AK> intended to "re-macroize" the output of make; in other words given a
AK> sequence of command lines generated by make, and a set of make macros,
AK> I need to substitute in the make variables such that
AK> "gcc -c -g -O2 -Wall -DNDEBUG" might become (say)
AK> "$(CC) -c $(CFLAGS) $(DFLAGS)", as much as possible as it was in the
AK> original Makefile. The %Variables hash maps strings to macro names;
AK> thus with the above example we would have

AK>     $Variables{'gcc'} = '$(CC)';
AK>     $Variables{'-g -O2 -Wall'} = '$(CFLAGS)';
AK>     $Variables{'-DNDEBUG'} = '$(DFLAGS)';

AK> sub varify {
AK>     my $word = shift;
AK>     $word =~ s%\$%\$\$%g;
AK>     for my $substr (keys %Variables) {
AK>         while ((my $start = index($word, $substr)) >= 0) {
AK>             substr($word, $start, length($substr)) = $Variables{$substr};
AK>         }
AK>     }
AK>     return $word;
AK> }

can you make a pattern that would match ANY of the strings you want to match? even an alternation might do well enough. then you can do a single s/// with the replacement value being looked up in the $variables (poor name) hash.

also why would speed be an issue here? it looks like it would be done off line to fix up makefile output.

uri

-- 
Uri Guttman ------ uri(a)stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
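
A minimal sketch of the single-pattern idea suggested above, assuming the search strings are literal text rather than regexes. The sample hash contents, the sub name varify_once, and the choice to sort the alternatives longest first (so a longer value wins over a shorter one that shares its prefix) are assumptions, not something stated in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real hash would be built from the Makefile.
my %Variables = (
    'gcc'          => '$(CC)',
    '-g -O2 -Wall' => '$(CFLAGS)',
    '-DNDEBUG'     => '$(DFLAGS)',
);

# Build one alternation from all search strings, quoting regex
# metacharacters and putting the longest alternatives first.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) }
    keys %Variables;
my $pattern = qr/($alternation)/;

sub varify_once {
    my $word = shift;
    $word =~ s/\$/\$\$/g;              # escape literal '$' for make
    return $word unless %Variables;    # an empty alternation would match everywhere
    # Single pass over the string; the replacement is looked up by the matched text.
    $word =~ s/$pattern/$Variables{$1}/g;
    return $word;
}

print varify_once('gcc -c -g -O2 -Wall -DNDEBUG'), "\n";
```

Because the string is scanned once and the inserted text is never rescanned, the cost no longer grows with one full pass per key.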

From: Jürgen Exner on 2 Mar 2010 13:07

Adam Kellas <adam.kellas(a)gmail.com> wrote:

>Looking for suggestions on how to speed up the function below. It's
>intended to "re-macroize" the output of make; in other words given a
>sequence of command lines generated by make, and a set of make macros,
>I need to substitute in the make variables such that
>"gcc -c -g -O2 -Wall -DNDEBUG" might become (say)
>"$(CC) -c $(CFLAGS) $(DFLAGS)", as much as possible as it was in the
>original Makefile. The %Variables hash maps strings to macro names;
>thus with the above example we would have
>
>    $Variables{'gcc'} = '$(CC)';
>    $Variables{'-g -O2 -Wall'} = '$(CFLAGS)';
>    $Variables{'-DNDEBUG'} = '$(DFLAGS)';
>
>Anyway, the function below seems to work but scales very badly and
>becomes unusable when enough variables are in use. Any ideas on how to
>write it better?
>
>sub varify {
>    my $word = shift;
>    $word =~ s%\$%\$\$%g;
>    for my $substr (keys %Variables) {
>        while ((my $start = index($word, $substr)) >= 0) {
>            substr($word, $start, length($substr)) = $Variables{$substr};

Probably I am missing the obvious, but why are you doing the replacements manually, thus incurring a lot of string copying, instead of simply doing an s///? I would replace the whole while loop with a straightforward

    s/$substr/$Variables{$substr}/g;

May have to add \Q...\E if needed.

BTW: $substr is an awful name considering there is a function substr(), and capitalized names ($Variables) normally indicate file handles.

jue
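
For reference, the per-key s/// suggested above might look like the sketch below. The sample hash and the sub name varify_per_key are assumptions for illustration; \Q...\E is included because values such as compiler flags can contain characters that are special in a regex.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real hash would be built from the Makefile.
my %Variables = (
    'gcc'          => '$(CC)',
    '-g -O2 -Wall' => '$(CFLAGS)',
    '-DNDEBUG'     => '$(DFLAGS)',
);

sub varify_per_key {
    my $word = shift;
    $word =~ s/\$/\$\$/g;    # escape literal '$' for make
    for my $key (keys %Variables) {
        # \Q...\E treats the search string as literal text.
        # /g replaces every occurrence in one pass for this key.
        $word =~ s/\Q$key\E/$Variables{$key}/g;
    }
    return $word;
}

print varify_per_key('gcc -c -g -O2 -Wall -DNDEBUG'), "\n";
```

This still makes one pass over the string per key, so it scales with the number of keys, but each pass moves forward through the string and does not look again at text it has just inserted.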

From: Adam Kellas on 2 Mar 2010 13:36

On Mar 2, 1:06 pm, "Uri Guttman" <u...(a)StemSystems.com> wrote:

> can you make a pattern that would match ANY of the strings you want to
> match? even an alternation might do well enough. then you can do a
> single s/// with the replacement value being looked up in the $variables
> (poor name) hash.

Thanks, I will try this.

> also why would speed be an issue here? it looks like it would be done
> off line to fix up makefile output.

You're right, this is done more or less off line and in theory should not be too performance sensitive. But this exhibits really pathological behavior - for reasons I don't understand, though it works fine in small unit-test setups it can appear to hang for hours on end in real-world situations. During that time strace shows that perl is calling the brk() system call over and over.

Thanks,
AK
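
One possible explanation for a hang with repeated brk() calls, not confirmed anywhere in this thread, is a search string that also occurs inside its own replacement. In that case the inner while loop of the original function finds the string again after every substitution and the line grows without bound. The sketch below uses a made-up entry ('CC' mapped to '$(CC)') and caps the loop at five passes so that it terminates.

```perl
use strict;
use warnings;

# Hypothetical entry: the variable's value appears inside its own macro reference.
my ($key, $replace) = ('CC', '$(CC)');

my $word   = 'CC -c foo.c';
my $passes = 0;

# Same loop shape as the original varify(), with an artificial cap.
while ((my $start = index($word, $key)) >= 0) {
    substr($word, $start, length($key)) = $replace;
    last if ++$passes >= 5;    # without this cap the loop would never end
}
print "after $passes passes: $word\n";

# A single s///g pass does not rescan the text it has just inserted.
(my $fixed = 'CC -c foo.c') =~ s/\Q$key\E/$replace/g;
print "single pass: $fixed\n";
```

Whether any entry in the real %Variables has this property can only be checked against the actual data, for example by testing index($Variables{$k}, $k) >= 0 for each key.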