From: Tom Lane on
Magnus Hagander <magnus(a)hagander.net> writes:
> Do we need to make the value configurable? I'd certainly find it
> interesting to set backends to say 5 or something like that, that
> makes them less likely to be killed than any old "oops opened too big
> file in an editor"-process, but still possible to kill if the system
> is *really* running out of memory.

I don't want to go to the trouble of creating (and documenting) a
configure option for this. Much less a GUC ;-)

What I suggest is that we do something like

#ifdef LINUX_OOM_ADJ
...
fprintf(oom, "%d\n", LINUX_OOM_ADJ);
...
#endif

Then, somebody who wants the feature would build with, say,
-DLINUX_OOM_ADJ=0
or another value if they want that.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Bruce Momjian on
Alex Hunsaker wrote:
> On Thu, Jan 7, 2010 at 20:26, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> > Alex Hunsaker <badalex(a)gmail.com> writes:
>
> > We can either drop this in core (with a lot of #ifdef LINUX added)
>
> Any thoughts on doing something like (in fork_process.c)
>
> #ifdef LINUX
> void oom_adjust()
> {
> ...
> }
> #else
> void oom_adjust() {}
> #endif
>
> So there is only one #ifdef? It still leaves the ugly calls to the function...

The usual solution for this kind of thing is:

#ifdef LINUX
#define OOM_ADJUST oom_adjust()
#else
#define OOM_ADJUST do {} while (0)
#endif

so there is no call or dummy function and you reference it in the code
as:

OOM_ADJUST;

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Alex Hunsaker on
On Fri, Jan 8, 2010 at 07:53, Bruce Momjian <bruce(a)momjian.us> wrote:
> Alex Hunsaker wrote:
>> On Thu, Jan 7, 2010 at 20:26, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> > Alex Hunsaker <badalex(a)gmail.com> writes:

> The usual solution for this kind of thing is:
>
>        #ifdef LINUX
>        #define OOM_ADJUST oom_adjust()
>        #else
>        #define OOM_ADJUST do {} while (0)
>        #endif
>
> so there is no call or dummy function and you reference it in the code
> as:

Surely any compiler worth its salt would turn a call to an empty void
function into a noop? Then again maybe I just hate macros :)

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Alex Hunsaker on
On Fri, Jan 8, 2010 at 07:27, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Then, somebody who wants the feature would build with, say,
>        -DLINUX_OOM_ADJ=0
> or another value if they want that.

Here is a stab at that.

It sets oom_adj for:
autovacuum workers
archivers (pgarch.c)
regular backends

Also it updates the contrib linux starup script to start under an oom_adj of -17

Comments?

*** a/contrib/start-scripts/linux
--- b/contrib/start-scripts/linux
***************
*** 53,58 **** DAEMON="$prefix/bin/postmaster"
--- 53,63 ----
# What to use to shut down the postmaster
PGCTL="$prefix/bin/pg_ctl"

+ # Adjust oom_adj on linux to avoid the postmaster from be killed
+ # note you probably want to compile postgres with -DLINUX_OOM_ADJ=0
+ # so that regular backends will be killed on oom
+ OOM_ADJ=-17
+
set -e

# Only start if we can find the postmaster.
***************
*** 62,67 **** test -x $DAEMON || exit 0
--- 67,73 ----
case $1 in
start)
echo -n "Starting PostgreSQL: "
+ echo $OOM_ADJ > /proc/self/oom_adj
su - $PGUSER -c "$DAEMON -D '$PGDATA' &" >>$PGLOG 2>&1
echo "ok"
;;
*** a/src/backend/postmaster/autovacuum.c
--- b/src/backend/postmaster/autovacuum.c
***************
*** 1403,1408 **** StartAutoVacWorker(void)
--- 1403,1411 ----
/* Lose the postmaster's on-exit routines */
on_exit_reset();

+ /* allow us to be killed on oom */
+ oom_adjust();
+
AutoVacWorkerMain(0, NULL);
break;
#endif
*** a/src/backend/postmaster/fork_process.c
--- b/src/backend/postmaster/fork_process.c
***************
*** 66,68 **** fork_process(void)
--- 66,97 ----
}

#endif /* ! WIN32 */
+
+ #if defined(__linux__) && defined(LINUX_OOM_ADJ)
+ /*
+ * By default linux really likes to kill the postmaster on oom.
+ * Because linux does not take SYSV shared mem into account it
+ * (almost) always will SIGKILL the postmaster on an oom event.
+ *
+ * In the event you started the postmaster under a low (negative)
+ * oom_adj. This will adjust regular backends, autovac and archivers
+ * to LINUX_OOM_ADJ on fork(). Allowing them to be killed in an oom
+ * event.
+ *
+ * Later we might add other OS oom type stuff here.
+ */
+ void
+ oom_adjust(void)
+ {
+ FILE *oom = fopen("/proc/self/oom_adj", "w");
+
+ /* ignore errors we dont really care */
+ if (oom)
+ {
+ fprintf(oom, "%d\n", LINUX_OOM_ADJ);
+ fclose(oom);
+ }
+ }
+ #else
+ void oom_adjust(void) { }
+ #endif
*** a/src/backend/postmaster/pgarch.c
--- b/src/backend/postmaster/pgarch.c
***************
*** 170,175 **** pgarch_start(void)
--- 170,178 ----
/* Drop our connection to postmaster's shared memory, as well */
PGSharedMemoryDetach();

+ /* allow us to be killed on oom */
+ oom_adjust();
+
PgArchiverMain(0, NULL);
break;
#endif
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
***************
*** 3076,3081 **** BackendStartup(Port *port)
--- 3076,3084 ----
/* Perform additional initialization and collect startup packet */
BackendInitialize(port);

+ /* allow us to be killed on oom */
+ oom_adjust();
+
/* And run the backend */
proc_exit(BackendRun(port));
}
*** a/src/include/postmaster/fork_process.h
--- b/src/include/postmaster/fork_process.h
***************
*** 13,17 ****
--- 13,18 ----
#define FORK_PROCESS_H

extern pid_t fork_process(void);
+ extern void oom_adjust(void);

#endif /* FORK_PROCESS_H */
From: Tom Lane on
Alex Hunsaker <badalex(a)gmail.com> writes:
> On Fri, Jan 8, 2010 at 07:27, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> Then, somebody who wants the feature would build with, say,
>>        -DLINUX_OOM_ADJ=0
>> or another value if they want that.

> Here is a stab at that.

Anybody have an objection to this basic approach? I'm in a bit of a
hurry to get something like this into the Fedora RPMs, so barring
objections I'm going to review this, commit it into HEAD, and then
make a back-ported patch I can use with 8.4 in Fedora.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers