From: Matthias Andree on
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

(blind carbon copy to Lasse Collin, maintainer of XZ Utils upstream)

Greetings,

I've just had xz break my devel/libtool22 FreeBSD port build on a
low-memory computer (128 MB).

The reason is that xz by default caps memory use at ~40% of physical
RAM (this is documented) and refuses to decompress files when that
does not suffice. Options to override this are provided, but I question
this behaviour:

- - This xz feature may seem reasonable where we do have a choice about
memory use, that is, during compression. There we can reduce dictionary
size, search depth, and so on, trading compression ratio for lower
memory use. No big deal.

- - This feature is, in my perception, INADEQUATE during decompression. If
I have a .xz file (downloaded from the Internet) that needs 90 MB of RAM
to decompress, then I need to use those 90 MB whether that is nice or
not; it is simply necessary.

I am proposing to Lasse Collin to drop memory capping functionality in
xz/lzma in decompress mode, and in lzmadec/xzdec.


This is what happens in practice:

> # cd /usr/ports/devel/libtool22
> # make
> ===> Vulnerability check disabled, database not found
> ===> License check disabled, port has not defined LICENSE
> ===> Extracting for libtool-2.2.6b
> => MD5 Checksum OK for libtool-2.2.6b.tar.lzma.
> => SHA256 Checksum OK for libtool-2.2.6b.tar.lzma.
> ===> libtool-2.2.6b depends on file: /usr/local/bin/xz - found
> /usr/local/bin/xz: /usr/ports/distfiles//libtool-2.2.6b.tar.lzma: Memory usage limit reached
> /usr/local/bin/xz: Limit was 46 MiB, but 65 MiB would have been needed
> ===> Patching for libtool-2.2.6b
> ===> Applying FreeBSD patches for libtool-2.2.6b
> patch: **** can't cd to /usr/ports/devel/libtool22/work/libtool-2.2.6b: No such file or directory
> => Patch patch-libltdl-Makefile.in failed to apply cleanly.
> *** Error code 1
>
> Stop in /usr/ports/devel/libtool22.

Investigating this (on FreeBSD 6.4-RELEASE-p10 i386):

- - The ports system generates this command line and executes it:

> for file in libtool-2.2.6b.tar.lzma; do if ! (cd /usr/ports/devel/libtool22/work && /usr/local/bin/xz -dc /usr/ports/distfiles//$file | /usr/bin/tar -xf - --no-same-owner --no-same-permissions); then exit 1; fi; done

- - xz fails with the error message shown above and exits with return
code 1, without having written any bytes to stdout.

- - tar then reads a 0-byte input stream, which it treats as success.

- - The FreeBSD ports system does not detect that the tar input was empty
and proceeds to the patch phase, which fails because tar extracted no
files. This misleads end users about the actual error.
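
The masking happens because a shell pipeline's exit status is that of
its last command. A hedged sketch of the effect, and of how shells that
support pipefail (such as bash; the stock /bin/sh may not) can surface
the failure:

```shell
# Illustrative only: a failing first stage is masked, because the
# pipeline's status is the LAST command's status (tar's, above).
false | cat >/dev/null
echo "plain pipeline: $?"       # prints "plain pipeline: 0" -- hidden

# Shells that support pipefail (e.g. bash) surface the failure instead:
bash -c 'set -o pipefail; false | cat >/dev/null; echo "pipefail: $?"'
# prints "pipefail: 1"
```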


2. The xz port, or its use in the framework, needs to disable the memory
capping for decompression. We simply must try to decompress, no matter what.

I'll wait a couple of days hoping for Lasse's response, but generally,
we have two alternatives to proceed:

a. directly patch archivers/xz so that the memory capping is not applied
during decompression. This should be considered upstream -- because
during decompression you cannot tweak parameters to reduce memory usage;
you need to proceed with whatever the file requires to build its
dictionaries and decompress. (CC'ing port maintainer Christian Weisgerber)

b. ports framework: pass command-line arguments or environment variables
to defeat the capping via the -M option. Note that -M 0 sets the default
cap of 40% of physical RAM. Example (the large figure is 2^31-1, i.e.
INT_MAX):

> # export XZ_OPT=-M2147483647
> # xz -dc >/dev/null /usr/ports/distfiles/libtool*.tar.lzma
> # echo $?
> 0

I'd propose using XZ_OPT+=-M$(getconf LLONG_MAX) for now.

We can't use unsigned types because FreeBSD getconf, as of 6.4, returns
those as negative numbers (due to internal use of intmax_t even where
variables are of unsigned type and may lie in the range (INTMAX_MAX,
UINTMAX_MAX]), which causes (justified!) xz complaints. Also, INTMAX_MAX
and INT64_MAX cannot be queried via getconf(1) on FreeBSD (undefined).

The getconf fix is non-trivial, so I expect we had better not rely on it now.

So for FreeBSD, I propose to patch this Mk/bsd.port.mk line #2382 (as of
CVS rev. 1.642):

EXTRACT_CMD?= ${XZ_CMD}

to

EXTRACT_CMD?= ${XZ_CMD} -M$$(getconf LLONG_MAX)


Other proposals?

Best regards
Matthias

- --
Matthias Andree (ports/ committer)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.12 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org/

iEYEARECAAYFAkwbpNMACgkQvmGDOQUufZXWJQCgmSOmo5a0rnFR3aNa7/4NEwGk
pQ8AoJ2liNx8o62j5Z6ON1Lh22n60hia
=pYpJ
-----END PGP SIGNATURE-----
_______________________________________________
freebsd-ports(a)freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to "freebsd-ports-unsubscribe(a)freebsd.org"

From: Xin LI on
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 2010/06/18 09:54, Matthias Andree wrote:
> So for FreeBSD, I propose to patch this Mk/bsd.port.mk line #2382 (as of
> CVS rev. 1.642)
>
> EXTRACT_CMD?= ${XZ_CMD}
>
> to
>
> EXTRACT_CMD?= ${XZ_CMD} $$(getconf LLONG_MAX)
>
>
> Other proposals?

Will ${XZ_CMD} -M max work for you? This should have the same effect, I
think (assuming the ports xz version supports it).
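
A sketch of how that could be wired up via the environment (assuming,
as above, that the installed xz accepts -M max):

```shell
# Sketch: export the option so every xz invocation in the build sees it
# (whether "-M max" is accepted depends on the installed xz version).
XZ_OPT="-M max"
export XZ_OPT
# The usual extraction pipeline then runs unchanged, e.g.:
# xz -dc distfile.tar.lzma | tar -xf - --no-same-owner --no-same-permissions
```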

Cheers,
- --
Xin LI <delphij(a)delphij.net> http://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.15 (FreeBSD)

iQEcBAEBCAAGBQJMHBKWAAoJEATO+BI/yjfBqukIALoVYkI17tqAIsJ7OTne4dRL
Jk+VYgNawha4Db9KRKOyjG6u+t0UpV+5gJl+mTvf0FWqh4WcReVfTBGMibjn7HHq
n43ufR9dcHj1AD1LxBHu6UQRlGgb93AEqoQv+gj7cNYtxz6pxaNhGIxwW2gDRaT9
umAnuqIwBvZP8T1j3FYPYixZyrfxOtwEJ9OehkdAkRI7Ip4JPp8nf23Lv1oagzU9
UCCaosFgbqXDUV8O7RREkJm2brQuSBLdQiaojg/ZlV1V8UHjRn30qOkA6yGSZHd3
ydPSDMl4P8rqSwo7LHoJSHOezjfc3y5ZA+ZTRQiiglVxPXp3nm8aEJqMJv4KIvA=
=0vOp
-----END PGP SIGNATURE-----

From: Matthias Andree on
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Am 19.06.2010, 02:43 Uhr, schrieb Xin LI:

> Will ${XZ_CMD} -M max work for you? This should have the same effect I
> think (assuming the ports xz version supports it)

Thanks. Yes, it does. I wonder how I missed it in the manpage. This is
so much easier than getconf fiddling. Still, for decompression in ports,
I maintain that the manpage recommendations (such as 90%) are off, and
that decompression should always set this.

- --
Matthias Andree
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.12 (GNU/Linux)

iEYEARECAAYFAkwcF1MACgkQvmGDOQUufZVzRQCg+62dbT7llIgvU5jHuV7OdAFm
HQYAoIRLxrnh7bIp8YK4qpYgN71S7GHz
=0ZwM
-----END PGP SIGNATURE-----

From: Lasse Collin on
On 2010-06-18 Matthias Andree wrote:
> I've just had xz break my devel/libtool22 FreeBSD port build on a low
> memory computer (128 MB).
>
> Reason is that xz by default caps memory use at ~40% of physical RAM
> (this is documented), and skips decompressing files if that doesn't
> suffice.

A snapshot of XZ Utils newer than the 4.999.9beta release has a
different default limit (I know that an official release would be nice,
etc., but there isn't one right now):

- If 40 % of RAM is at least 80 MiB, 40 % of RAM is used as
the limit.
- If 80 % of RAM is less than 80 MiB, 80 % of RAM is used as
the limit.
- Otherwise 80 MiB is used as the limit.

The above avoids the problem on most systems since 80 MiB is enough for
all typical .xz files.
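
The three rules above read more simply as code; a throwaway sketch in
shell arithmetic (MiB units; the real logic of course lives inside
liblzma):

```shell
# Sketch of the snapshot's default memory limit (argument: RAM in MiB).
default_limit_mib() {
    ram=$1
    forty=$(( ram * 40 / 100 ))
    eighty=$(( ram * 80 / 100 ))
    if [ "$forty" -ge 80 ]; then
        echo "$forty"       # plenty of RAM: 40 % of it
    elif [ "$eighty" -lt 80 ]; then
        echo "$eighty"      # very little RAM: 80 % of it
    else
        echo 80             # in between: a flat 80 MiB
    fi
}

default_limit_mib 128       # prints 80: the 128 MiB machine gets 80 MiB
```

With 128 MiB of RAM this yields 80 MiB, so the libtool case above
(65 MiB needed) would no longer fail.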

The limit was a problem on Gentoo too. There's still a problem with the
above default limit if you have 16 MiB RAM (that's a real-world
example), so Gentoo put XZ_OPT=--memory=max to their build system. They
too think it is better to let the system be slow and swap very heavily
for an hour or two than refuse decompression in a critical script.

> - This feature is, in my perception, INADEQUATE during decompression.
> If I have a .xz file (downloaded from the Internet) that needs 90 MB
> RAM to decompress, then I need to use those 90 MB no matter if
> that's nice or not, it's just critical.
>
> I am proposing to Lasse Collin to drop memory capping functionality
> in xz/lzma in decompress mode, and in lzmadec/xzdec.

Naturally the limiter functionality won't be removed, but a different
default value can be considered, including no limit by default.

Would you find no limit OK after xz allocated and used 1 GiB of memory
without a warning after you tried to decompress a relatively big file
you just downloaded on a slightly older system with 512 MiB RAM? I guess
if it is a critical file decompressed by a critical script, you don't
mind it swapping a couple of hours, because you just want it done no
matter how long it takes. But in normal command line use some people
would prefer to get an error first so that it is possible to consider
e.g. using another system to do the decompression (possibly
recompressing with lower settings or with another tool) instead of just
overriding the limit.

One possibility could be to make the limit for decompression e.g. max(80
MiB, 40 % of RAM), since all typical files will decompress with 80 MiB
(you need to use advanced options to create files needing more). That
way also systems with less than 128 MiB RAM would decompress all typical
files by default, possibly slowly with heavy swapping, and systems with
more RAM would be protected from unexpected memory usage of very rarely
occurring .xz files.

Determining a good limit has been quite a bit of a problem for me.
Obviously a DoS protection mechanism shouldn't be a DoS itself.
Disabling the limiter completely by default doesn't seem like an option,
because it would only change who will be annoyed. Comments are very
welcome. Thanks.

--
Lasse Collin | IRC: Larhzu @ IRCnet & Freenode

From: Matthias Andree on
Greetings,

first my thanks to Lasse for the prompt reply.

Lasse Collin wrote on Jun 19:

> On 2010-06-18 Matthias Andree wrote:
>
>> I've just had xz break my devel/libtool22 FreeBSD port build on a low
>> memory computer (128 MB).
>>
>> Reason is that xz by default caps memory use at ~40% of physical RAM
>> (this is documented), and skips decompressing files if that doesn't
>> suffice.
>>
> A snapshot of XZ Utils newer than the 4.999.9beta release has different
> default limit (I know that an official release would be nice etc. but
> there isn't one now.):
>
> - If 40 % of RAM is at least 80 MiB, 40 % of RAM is used as
> the limit.
> - If 80 % of RAM is less than 80 MiB, 80 % of RAM is used as
> the limit.
> - Otherwise 80 MiB is used as the limit.
>
> The above avoids the problem on most systems since 80 MiB is enough for
> all typical .xz files.
>
> The limit was a problem on Gentoo too. There's still a problem with the
> above default limit if you have 16 MiB RAM (that's a real-world
> example), so Gentoo put XZ_OPT=--memory=max to their build system. They
> too think it is better to let the system be slow and swap very heavily
> for an hour or two than refuse decompression in a critical script.

It is /not the application's responsibility/ to sidestep an inadequate
system configuration.

<rant>Just because noobs can use Unix today thanks to GNOME and KDE
does not mean we should target command-line utility defaults for them,
or turn the average command line utility like xz into a sissy.</rant>

It might be useful if build systems (such as Gentoo's emerge, the
FreeBSD ports, or pkgsrc, which originated on NetBSD) could know that
on a low-memory computer they may want to download the bigger .gz
instead, because for /them/ it is orders of magnitude faster to
decompress than .bz2 or .xz, which might cause swap thrashing.

Decompression is *not* an optional job. If I need the uncompressed
contents, I need them, period.
I don't decompress a file on a whim or for the fun of it, but because
I need to.

We have system facilities for limiting resources, including those that
limit virtual memory. Don't work against or around them. Remember that
if you play too nicely, others will starve you. There's no guarantee
that the "40% of RAM" will actually be in RAM unless you use privileged
operations (think mlock/mlockall). If there are tons of other
applications allocating memory, the OS might decide to swap 15% of
RAM's worth of xz's address space out to disk. Oops.

If I want to avoid thrashing, I can tune my swap configuration or
memory limits properly, and I can abort an ongoing decompression
(SIGINT) if it swaps too much.

A point to counter your defaults: on some of my systems, xz wouldn't
even get the default 40% of RAM, because I often impose 300 MB virtual
memory limits on computers with 1 GB of RAM.

>> - This feature is, in my perception, INADEQUATE during decompression.
>> If I have a .xz file (downloaded from the Internet) that needs 90 MB
>> RAM to decompress, then I need to use those 90 MB no matter if
>> that's nice or not, it's just critical.
>>
>> I am proposing to Lasse Collin to drop memory capping functionality
>> in xz/lzma in decompress mode, and in lzmadec/xzdec.
>>
> Naturally the limiter functionality won't be removed, but a different
> default value can be considered, including no limit by default.

Please do that. For decompression, -M max should be the default.

For compression, it's less critical because service is degraded, not
denied, but I'd still think -M max would be the better default. I can
always put "export XZ_OPT=-3" in /etc/profile.d/local.sh or wherever
it belongs on the OS of the day.

I still think utilities and applications should /not/ impose
arbitrarily lower limits by default though.

I see that a feature to limit memory for compression or decompression
can be useful, particularly if I compress for other systems (think
compressing on my workstation for an embedded system).
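
One way to do that (a sketch; not something discussed in this thread)
is to pin the dictionary size explicitly, since decoder memory is
roughly the dictionary size plus a small overhead:

```shell
# Sketch: compress for a small target by capping the dictionary at
# 1 MiB, so the decoder needs only a few MiB (at some ratio cost).
# Filter syntax per xz's --lzma2 option.
printf 'example payload' > /tmp/payload             # stand-in input
xz --lzma2=preset=9,dict=1MiB -c /tmp/payload > /tmp/payload.xz
xz -dc /tmp/payload.xz                              # prints: example payload
```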

I propose to put the essence of this discussion into the manpage and
have the memory limiter warnings point to the --memory option in the
manpage as a usability improvement.

> Would you find no limit OK after xz allocated and used 1 GiB of memory
> without a warning after you tried to decompress a relatively big file
> you just downloaded on a slightly older system with 512 MiB RAM? I guess

Yes, I absolutely want that.

> if it is a critical file decompressed by a critical script, you don't
> mind it swapping a couple of hours, because you just want it done no
> matter how long it takes. But in normal command line use some people
> would prefer to get an error first so that it is possible to consider
> e.g. using another system to do the decompression (possibly
> recompressing with lower settings or with another tool) instead of just
> overriding the limit.
They can use ulimit(1) -- or whatever their favourite login shell
offers -- to set soft memory limits.
Use the system facilities; don't duplicate the standard operating
system knobs in applications.
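
A sketch of that approach (ulimit -v takes KiB in most shells; flag
support and units vary by shell and OS, so treat this as illustrative):

```shell
# Sketch: bound the decompressor with the OS's own resource limits
# instead of an in-application cap. The subshell keeps the limit from
# affecting the rest of the script.
(
    ulimit -v $(( 300 * 1024 ))        # ~300 MB of address space
    printf 'hello' | xz -c | xz -dc    # stand-in for the real pipeline
)
```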

> One possibility could be to make the limit for decompression e.g. max(80
> MiB, 40 % of RAM), since all typical files will decompress with 80 MiB
> (you need to use advanced options to create files needing more). That
> way also systems with less than 128 MiB RAM would decompress all typical
> files by default, possibly slowly with heavy swapping, and systems with
> more RAM would be protected from unexpected memory usage of very rarely
> occurring .xz files.

You're overcomplicating matters, and that's giving you headaches:

> Determining a good limit has been quite a bit of a problem for me.
> Obviously a DoS protection mechanism shouldn't be a DoS itself.

Agreed.

You are facing those difficulties because you are mixing up the
machinery (the --memory feature per se, which does not belong in the
application because the OS already has it) with the rules/policies
(such as the default setting for memory).

In short, you are overcomplicating things by trying to be smart.

> Disabling the limiter completely by default doesn't seem like an option,
> because it would only change who will be annoyed. Comments are very
> welcome. Thanks.

It is a necessity to change it. Applications do not have the freedom
to decide whether they are in the mood to accept or refuse a crucial
operation. If I tell my Unix utility to decompress, I expect it to do
exactly that. If I want it to ask questions, I add --interactive on
the command line.

Feel free to pop up a million obnoxious "do you really want" requesters
in your MS Windows GUI application, but xz is for Unix, and it should
behave the Unix way: do what you're told, and try as hard as you can.
Don't second-guess the user.

I find xz a very useful tool, but the -M default of "if this, 40%; if
that, 80%; or if foo, then X MiB" for decompression is a major glitch
and should be remedied: change it to -M max, at least for decompression.

Best regards
Matthias

--
Matthias Andree (ports/ committer)