From: Tom Lane on
Robert Haas <robertmhaas(a)gmail.com> writes:
> There could well be moving parts if the user wants to adjust the value
> being written to oom_adj, and can't because it's compiled in. I don't
> see why we can't just add a GUC for this and be done with it.

The number of users who will want to do that might be different from
epsilon, but not by enough to justify a GUC. Furthermore, we don't have
any reasonable infrastructure for supporting platform-specific GUCs,
which means that the amount of effort you're proposing is extremely
large.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Alex Hunsaker on
On Fri, Jan 8, 2010 at 10:24, Stephen Frost <sfrost(a)snowman.net> wrote:
> As I recall, oom_adj wasn't visible in the container because you
> explicitly set what proc entries processes can have access to when using
> VServers, and OpenSSH didn't handle that cleanly.  Guess what I'm just
> saying is "don't just assume everything is as it would be on a 'normal'
> system when dealing with /proc and friends", and, of course, test, test,
> test when you're talking about back-porting things.

Sure this was openssh? I just looked through the entire cvs history
for openssh and found 0 references to 'oom' let alone 'oom_adj'.
Maybe something distro specific?


[ for the curious here is what I tried ]
$ git clone git://git.infradead.org/openssh.git
$ cd openssh
$ git grep oom_adj
$ git grep 'oom'
$ git grep oom | grep -vi loomis | grep -v room
$ git log -p | grep oom | grep -vi loomis | grep -v room | grep -v tsoome


From: Tom Lane on
Alex Hunsaker <badalex(a)gmail.com> writes:
> Sure this was openssh? I just looked through the entire cvs history
> for openssh and found 0 references to 'oom' let alone 'oom_adj'.
> Maybe something distro specific?

FWIW, I see no evidence that sshd on Fedora does anything to change its
oom score --- the oom_adj file reads as zero for both the parent daemon
and its children. Kinda scary to realize the OOM killer could easily
lock me out of boxes I run headless.
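
For anyone who wants to repeat that check, a minimal sketch follows. The function name print_oom_adj and the PROC_ROOT override are made up for illustration (the override exists only so the function can be pointed at a scratch directory); the one real interface it assumes is the per-process /proc/<pid>/oom_adj file discussed above.

```shell
#!/bin/sh
# print_oom_adj PID...
# Hypothetical helper: print "<pid> <oom_adj>" for each PID given.
# PROC_ROOT is an assumption added for testing; it defaults to /proc.
print_oom_adj() {
    for pid in "$@"; do
        f="${PROC_ROOT:-/proc}/$pid/oom_adj"
        if [ -r "$f" ]; then
            printf '%s %s\n' "$pid" "$(cat "$f")"
        else
            # The file can be missing, e.g. when the process has exited
            # or the entry is hidden from this environment.
            printf '%s unavailable\n' "$pid"
        fi
    done
}

# Example (assumes pgrep from procps is installed):
#   print_oom_adj $(pgrep sshd)
```

A value of 0 in the second column means no adjustment has been applied to that process.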

regards, tom lane


From: Alex Hunsaker on
On Fri, Jan 8, 2010 at 12:45, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Alex Hunsaker <badalex(a)gmail.com> writes:
>> Sure this was openssh? I just looked through the entire cvs history
>> for openssh and found 0 references to 'oom' let alone 'oom_adj'.
>> Maybe something distro specific?
>
> FWIW, I see no evidence that sshd on Fedora does anything to change its
> oom score --- the oom_adj file reads as zero for both the parent daemon
> and its children.  Kinda scary to realize the OOM killer could easily
> lock me out of boxes I run headless.

[ OT, CC trimmed]

Yeah, for me sshd has a score of 24, the 13th lowest oom_score on my
box, while postgres with 0 connections and shared_buffers = 28MB has a
score of 26558, the 5th highest. Only chromium and xmonad are above it.
With 5 connections it just about doubles its score (to 47238). Had I
been running headless, postgres would certainly be the one to die on
OOM. But even something like alsamixer or bash scores higher than sshd
*shrug*.

For the curious, below is the raw data and how I generated it [ yes,
it's far from perfect... ]:
(for file in /proc/*/; do echo `cat $file/oom_score` `cat $file/cmdline` $file; done) | sort -n

6 /sbin/agetty-838400tty1linux /proc/2204/
6 /sbin/agetty-838400tty2linux /proc/1977/
6 /sbin/agetty-838400tty3linux /proc/1978/
6 /usr/sbin/crond /proc/1954/
6 /usr/sbin/uptimed /proc/1971/
10 /usr/sbin/irqbalance /proc/1951/
12 /usr/sbin/ntpd-s /proc/1967/
12 /usr/sbin/smartd /proc/2128/
13 hald-addon-input: Listening on /dev/input/event2 /dev/input/event1 /proc/2044/
13 hald-addon-storage: polling /dev/sr0 (every 2 sec) /proc/2088/
13 /usr/lib/hal/hald-addon-cpufreq /proc/2092/
19 /usr/sbin/syslog-ng /proc/1639/
24 /usr/sbin/sshd /proc/1943/
26 /usr/lib/sa/sadc-F-L-SDISK6006- /proc/2716/
27 supervising syslog-ng /proc/1638/
38 hald-runner /proc/1992/
53 cat/proc/self//cmdline /proc/self/
64 /usr/lib/postfix/master /proc/2089/
90 /usr/bin/X-nolistentcp /proc/2173/
102 -bash /proc/10997/
108 /home/alex/.cabal/bin/xmobar-x1 /proc/2190/
140 /usr/bin/dbus-daemon--fork--print-pid5--print-address7--session /proc/2256/
140 /usr/bin/dbus-daemon--system /proc/1983/
191 /usr/lib/hal/hald-addon-acpi /proc/2093/
194 dbus-launch--autolaunch004d7f457c373938f22d796a4ae05b60--binary-syntax--close-stderr /proc/2255/
199 /usr/sbin/ntpd-s /proc/1960/
206 ssh-agentxmonad /proc/2188/
287 tail-f/var/log/httpd/error.log /proc/8688/
310 firefox /proc/18571/
339 xbindkeys /proc/2192/
341 -bash /proc/10589/
354 /usr/local/bin/cmus /proc/2205/
400 /bin/sh/usr/bin/startx /proc/2155/
456 sort-n /proc/10998/
487 /usr/sbin/hald /proc/1991/
525 qmgr-l-tfifo-u /proc/2097/
639 /bin/sh/home/alex/.xinitrc /proc/2180/
736 /usr/lib/GConf/gconfd-2 /proc/18578/
861 tail-f/var/log/httpd/error.log /proc/1621/
1122 urxvtd-q-f-o /proc/2189/
1151 /usr/lib/chromium/chromium /proc/2220/
1172 -bash /proc/22276/
1351 mutt-y /proc/27637/
1497 alsamixer /proc/3528/
1528 ssh192.168.0.15 /proc/1523/
1534 -bash /proc/2863/
1534 -bash /proc/2881/
1534 -bash /proc/2891/
1793 /usr/lib/chromium/chromium /proc/2219/
1828 ssh70.98.186.4 /proc/2860/
2066 pickup-l-tfifo-u /proc/26254/
2195 postgres: stats collector process /proc/10602/
2281 -bash /proc/3521/
2299 -bash /proc/1516/
2447 -bash /proc/23990/
2573 -bash /proc/1538/
3082 /usr/lib/chromium/chromium --channel=2219.aa9eb00.196295212 --type=renderer --lang=en-US --force-fieldtest=AsyncSlowStart/_AsyncSlowStart_off/CacheSize/CacheSizeGroup_6/DnsImpact/_max_500ms_queue_prefetch/GlobalSdch/_global_disable_sdch/SocketLateBinding/_enable_late_binding/ /proc/2392/
3305 -bash /proc/20490/
3796 /usr/sbin/httpd-kstart /proc/1553/
4697 /usr/bin/knotify4 /proc/31084/
6117 /usr/lib/chromium/chromium --channel=2219.afcb450.251437212 --type=renderer --lang=en-US --force-fieldtest=AsyncSlowStart/_AsyncSlowStart_off/CacheSize/CacheSizeGroup_6/DnsImpact/_max_500ms_queue_prefetch/GlobalSdch/_global_disable_sdch/SocketLateBinding/_enable_late_binding/ /proc/18311/
6564 /proc/self/exe--channel=2219.a684ce8.115295409--type=extension--lang=en-US--force-fieldtest=AsyncSlowStart/_AsyncSlowStart_off/DnsImpact/_max_500ms_queue_prefetch/GlobalSdch/_global_disable_sdch/SocketLateBinding/_enable_late_binding/ /proc/2242/
6921 /proc/self/exe--channel=2219.a6862e0.261342415--type=extension--lang=en-US--force-fieldtest=AsyncSlowStart/_AsyncSlowStart_off/DnsImpact/_max_500ms_queue_prefetch/GlobalSdch/_global_disable_sdch/SocketLateBinding/_enable_late_binding/ /proc/2245/
7453 /proc/self/exe--type=plugin--plugin-path=/usr/lib/mozilla/plugins/libflashplayer.so--lang=en-US--plugin-data-dir=/home/alex/.config/chromium/Default--channel=2219.afd16a58.340779570 /proc/18316/
10132 -bash /proc/2195/
10154 postgres: wal writer process /proc/10600/
10154 postgres: writer process /proc/10599/
10299 postgres: autovacuum launcher process /proc/10601/
11019 init [3] /proc/1/
17332 /usr/sbin/httpd-kstart /proc/1616/
17365 /usr/sbin/httpd-kstart /proc/1611/
17366 /usr/sbin/httpd-kstart /proc/1612/
17398 /usr/sbin/httpd-kstart /proc/1613/
17567 /usr/sbin/httpd-kstart /proc/1614/
19954 xinit/home/alex/.xinitrc--/etc/X11/xinit/xserverrc:0-auth/tmp/serverauth.NHLLZS74xg /proc/2172/
26558 bin/postgres-Dblah /proc/10597/
26882 /proc/self/exe--channel=2219.b1692d18.1483224477--type=extension--lang=en-US--force-fieldtest=AsyncSlowStart/_AsyncSlowStart_off/CacheSize/CacheSizeGroup_6/DnsImpact/_max_500ms_queue_prefetch/GlobalSdch/_global_disable_sdch/SocketLateBinding/_enable_late_binding/ /proc/1597/
52201 /usr/lib/chromium/chromium--type=zygote /proc/2221/
53839 /home/alex/.xmonad/xmonad-i386-linux /proc/2187/
61530 /usr/lib/chromium/chromium --channel=2219.aa993d0.1197730980 --type=renderer --lang=en-US --force-fieldtest=AsyncSlowStart/_AsyncSlowStart_off/CacheSize/CacheSizeGroup_6/DnsImpact/_max_500ms_queue_prefetch/GlobalSdch/_global_disable_sdch/SocketLateBinding/_enable_late_binding/ /proc/1527/
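
The one-liner above runs the arguments together (e.g. "bin/postgres-Dblah") because the shell drops the NUL separators in /proc/<pid>/cmdline. A slightly more careful variant might look like the sketch below; it is not what produced the listing above. The function name list_oom_scores and its optional proc-root argument are made up for illustration, and the output columns are reordered to "<score> <pid> <cmdline>".

```shell
#!/bin/sh
# list_oom_scores [PROC_ROOT]
# Hypothetical helper: print "<score> <pid> <cmdline>" sorted by score.
# The optional PROC_ROOT argument is an assumption added for testing.
list_oom_scores() {
    root=${1:-/proc}
    for dir in "$root"/[0-9]*/; do
        # Skip entries that vanished or that we are not allowed to read.
        [ -r "${dir}oom_score" ] && [ -r "${dir}cmdline" ] || continue
        score=$(cat "${dir}oom_score" 2>/dev/null) || continue
        # cmdline is NUL-separated: turn NULs into spaces so arguments
        # stay apart, then trim the trailing space left by the last NUL.
        cmd=$(tr '\0' ' ' < "${dir}cmdline" | sed 's/ *$//')
        pid=$(basename "$dir")
        printf '%s %s %s\n' "$score" "$pid" "$cmd"
    done | sort -n
}

# Example: show the ten most likely OOM victims.
#   list_oom_scores | tail -n 10
```

Restricting the glob to numeric directories also keeps /proc/self and other non-process entries out of the list.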


From: Stephen Frost on
* Tom Lane (tgl(a)sss.pgh.pa.us) wrote:
> Alex Hunsaker <badalex(a)gmail.com> writes:
> > Sure this was openssh? I just looked through the entire cvs history
> > for openssh and found 0 references to 'oom' let alone 'oom_adj'.
> > Maybe something distro specific?
>
> FWIW, I see no evidence that sshd on Fedora does anything to change its
> oom score --- the oom_adj file reads as zero for both the parent daemon
> and its children. Kinda scary to realize the OOM killer could easily
> lock me out of boxes I run headless.

There were a few issues, as it turns out. The particularly annoying one
was in the init script, which caused upgrades to fail due to sshd not
being restarted; bug report here:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=473573

The other issue was with a Debian-specific patch applied to OpenSSH,
which basically just created noise in the log file; bug report here:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=487325

In the end, the problem was with errors being returned from attempts to
modify oom_adj. As long as we can just ignore those, hopefully there
won't be any issues.
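
To illustrate what "just ignore those" could look like, here is a rough shell sketch of a best-effort write. It is not the proposed patch: the function name set_oom_adj, the optional file argument, and the example value are all assumptions made for illustration.

```shell
#!/bin/sh
# set_oom_adj VALUE [FILE]
# Hypothetical helper: try to write VALUE to the oom_adj file and
# swallow any error (file hidden in a container, insufficient
# privilege, and so on). FILE defaults to /proc/self/oom_adj; the
# override is an assumption added for testing.
set_oom_adj() {
    value=$1
    file=${2:-/proc/self/oom_adj}
    # Redirecting stderr around the whole group also hides the shell's
    # own "cannot create" message when the file cannot be opened.
    { printf '%s\n' "$value" > "$file"; } 2>/dev/null || :
    # Always report success: a failed adjustment must not block startup
    # and must not add noise to the logs.
    return 0
}

# Example (the value is illustrative; lowering it normally needs root):
#   set_oom_adj -17
```

The point of the design is that the caller never sees a failure, so a restricted /proc behaves the same as a normal one from the caller's side.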

Thanks,

Stephen