FAQ 3.16 How can I make my Perl program take less memory? [Perl]

From: PerlFAQ Server on 21 May 2010 12:00

This is an excerpt from the latest version of perlfaq3.pod, which
comes with the standard Perl distribution. These postings aim to
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

3.16: How can I make my Perl program take less memory?

When it comes to time-space tradeoffs, Perl nearly always prefers to
throw memory at a problem. Scalars in Perl use more memory than strings
in C, arrays take more than that, and hashes use even more. While
there's still a lot to be done, recent releases have been addressing
these issues. For example, as of 5.004, duplicate hash keys are shared
amongst all hashes using them, so require no reallocation.

In some cases, using substr() or vec() to simulate arrays can be highly
beneficial. For example, an array of a thousand booleans will take at
least 20,000 bytes of space, but it can be turned into one 125-byte bit
vector, a considerable memory saving. The standard Tie::SubstrHash
module can also help for certain types of data structure. If you're
working with specialist data structures (matrices, for instance), modules
that implement these in C may use less memory than equivalent Perl
modules.
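
As a rough sketch of the bit-vector idea above, the following packs a
thousand boolean flags into one string with vec(). The variable names
are illustrative only.

```perl
use strict;
use warnings;

# Store 1,000 boolean flags in a single string, one bit per flag.
my $flags = '';
my $count = 1000;

# Pre-extend the vector by assigning to the highest index.
vec($flags, $count - 1, 1) = 0;

# Set a few flags.
vec($flags, $_, 1) = 1 for (3, 42, 999);

# Collect the indexes whose bit is set.
my @set = grep { vec($flags, $_, 1) } 0 .. $count - 1;

printf "bytes used: %d\n", length $flags;   # prints "bytes used: 125"
print  "flags set: @set\n";                 # prints "flags set: 3 42 999"
```

The 125 bytes are the string payload only; the scalar that holds the
string still carries its usual fixed overhead.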

Another thing to try is learning whether your Perl was compiled with the
system malloc or with Perl's builtin malloc. Whichever one it is, try
using the other one and see whether this makes a difference. Information
about malloc is in the INSTALL file in the source distribution. You can
find out whether you are using Perl's malloc by typing
"perl -V:usemymalloc".
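
The same setting can also be read from inside a program through the
standard Config module. This is only a sketch: it assumes the value is
"y" when Perl's own malloc is in use and treats anything else as the
system malloc.

```perl
use strict;
use warnings;
use Config;

# %Config holds the build-time settings that "perl -V:name" reports.
my $setting = defined $Config{usemymalloc} ? $Config{usemymalloc} : 'undef';

# Assumption: 'y' means Perl's malloc; any other value means the system one.
my $which = $setting eq 'y' ? "Perl's builtin malloc" : "the system malloc";
print "usemymalloc='$setting': this perl probably uses $which\n";
```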

Of course, the best way to save memory is to not do anything to waste it
in the first place. Good programming practices can go a long way toward
this:

* Don't slurp!

Don't read an entire file into memory if you can process it line by
line. Or more concretely, use a loop like this:

    #
    # Good Idea
    #
    while (<FILE>) {
        # ...
    }

instead of this:

    #
    # Bad Idea
    #
    @data = <FILE>;
    foreach (@data) {
        # ...
    }

When the files you're processing are small, it doesn't much matter
which way you do it, but it makes a huge difference when they start
getting larger.
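
A self-contained version of the line-by-line loop might look like the
sketch below. It reads from an in-memory filehandle so that it runs
without any external file; with real data you would open a path
instead. The sample text and variable names are made up for
illustration.

```perl
use strict;
use warnings;

# Stand-in for a large file: an in-memory filehandle keeps the sketch
# self-contained. With a real file, pass a path instead of \$text.
my $text = join '', map { "line $_\n" } 1 .. 5;
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my ($lines, $chars) = (0, 0);
while (my $line = <$fh>) {      # only one line is held at a time
    $lines++;
    $chars += length $line;
}
close $fh;

print "$lines lines, $chars characters\n";   # prints "5 lines, 35 characters"
```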

* Use map and grep selectively

Remember that both map and grep expect a LIST argument, so doing
this:

    @wanted = grep {/pattern/} <FILE>;

will cause the entire file to be slurped. For large files, it's
better to loop:

    while (<FILE>) {
        push(@wanted, $_) if /pattern/;
    }
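
Here is a runnable sketch of the looping form, again using an
in-memory filehandle and invented sample data. Only the matching lines
are kept, so memory use follows the size of the result rather than the
size of the input.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for a large log file.
my $log = "ok 1\nerror: disk full\nok 2\nerror: timeout\n";
open my $fh, '<', \$log or die "Cannot open in-memory file: $!";

my @wanted;
while (my $line = <$fh>) {
    push @wanted, $line if $line =~ /^error/;   # keep only matching lines
}
close $fh;

print scalar(@wanted), " matching lines\n";     # prints "2 matching lines"
```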

* Avoid unnecessary quotes and stringification

Don't quote large strings unless absolutely necessary:

    my $copy = "$large_string";

makes two copies of $large_string (one for $copy and another for the
interpolated string that the quotes build), whereas

    my $copy = $large_string;

only makes one copy.

Ditto for stringifying large arrays:

    {
        local $, = "\n";
        print @big_array;
    }

is much more memory-efficient than either

    print join "\n", @big_array;

or

    {
        local $" = "\n";
        print "@big_array";
    }
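
The list-separator approach can be tried out with the sketch below. It
prints to an in-memory filehandle purely so the result can be checked;
in real use you would print straight to STDOUT or a file. Setting $\
as well is an addition here, so that the last element also ends with a
newline.

```perl
use strict;
use warnings;

my @big_array = map { "item $_" } 1 .. 5;   # stand-in for a large array

# Capture the output in a string only so the result can be inspected.
my $out = '';
open my $fh, '>', \$out or die "Cannot open in-memory file: $!";
{
    local $, = "\n";          # separator printed between list elements
    local $\ = "\n";          # terminator printed after the last element
    print {$fh} @big_array;   # elements are written in turn, no joined copy
}
close $fh;

print $out;
```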

* Pass by reference

Pass arrays and hashes by reference, not by value. For one thing,
it's the only way to pass multiple lists or hashes (or both) in a
single call/return. It also avoids creating a copy of all the
contents. This requires some judgement, however, because any changes
will be propagated back to the original data. If you really want to
mangle (er, modify) a copy, you'll have to sacrifice the memory
needed to make one.
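
The tradeoff can be sketched as follows. Both subroutines are
hypothetical helpers written for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: takes an array reference, so it does not need
# to copy the elements with something like "my @numbers = @_;".
sub total {
    my ($numbers) = @_;
    my $sum = 0;
    $sum += $_ for @$numbers;
    return $sum;
}

# Hypothetical helper that modifies its argument in place; the
# caller's array changes because only a reference was passed.
sub double_in_place {
    my ($numbers) = @_;
    $_ *= 2 for @$numbers;
    return;
}

my @data = (1 .. 5);
print total(\@data), "\n";     # prints "15"
double_in_place(\@data);
print "@data\n";               # prints "2 4 6 8 10"

# To leave the original untouched, pay for an explicit copy.
my @copy = @data;
double_in_place(\@copy);
print "@data | @copy\n";       # prints "2 4 6 8 10 | 4 8 12 16 20"
```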

* Tie large variables to disk

For "big" data stores (i.e. ones that exceed available memory),
consider using one of the DB modules to store them on disk instead of
in RAM. This will incur a penalty in access time, but that's
probably better than causing your hard disk to thrash due to massive
swapping.
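
As one possible sketch, the code below ties a hash to an SDBM file in
a temporary directory. SDBM_File is chosen here only because it ships
with Perl; the text above does not name a particular module, and SDBM
limits the size of each key/value pair, so another DB module may suit
real data better. The file name is hypothetical.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/store";         # hypothetical name; SDBM adds its own suffixes

# Every read and write on %store now goes to disk, not to RAM.
tie my %store, 'SDBM_File', $file, O_RDWR | O_CREAT, 0666
    or die "Cannot tie $file: $!";

$store{"key$_"} = "value $_" for 1 .. 1000;

my $value = $store{key42};
print "key42 => $value\n";       # prints "key42 => value 42"

untie %store;
```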

--------------------------------------------------------------------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much relevant information as possible in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in
perlfaq.pod.
