From: "Kevin Grittner" on
"Kevin Grittner" <Kevin.Grittner(a)wicourts.gov> wrote:

> So I'm not sure whether I can get to a state suitable for starting
> the desired test, but I'll stay with it for a while.

I have other commitments today, so I'm going to leave the VACUUM
ANALYZE running and come back tomorrow morning to try the pg_dump.

-Kevin

From: Tom Lane
"Kevin Grittner" <Kevin.Grittner(a)wicourts.gov> writes:
> ... After a few minutes that left me curious just how big
> the database was, so I tried:

> select pg_size_pretty(pg_database_size('test'));

> I did a Ctrl+C after about five minutes and got:

> Cancel request sent

> but it didn't return for 15 or 20 minutes.

Hm, we probably are lacking CHECK_FOR_INTERRUPTS in the inner loops in
dbsize.c ...
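
Something along these lines in the per-segment loop of
calculate_relation_size() (a sketch from memory, not a tested patch)
would let a pending cancel take effect between stat() calls:

    for (segcount = 0;; segcount++)
    {
        struct stat fst;

        CHECK_FOR_INTERRUPTS();     /* the proposed addition */

        if (segcount == 0)
            snprintf(pathname, MAXPGPATH, "%s", relationpath);
        else
            snprintf(pathname, MAXPGPATH, "%s.%u",
                     relationpath, segcount);

        if (stat(pathname, &fst) < 0)
        {
            if (errno == ENOENT)
                break;              /* past the last segment */
            ereport(ERROR,
                    (errcode_for_file_access(),
                     errmsg("could not stat file \"%s\": %m",
                            pathname)));
        }
        totalsize += fst.st_size;
    }

The directory-walking loop in db_dir_size() would presumably want the
same treatment.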

regards, tom lane

From: Tom Lane
KaiGai Kohei <kaigai(a)kaigai.gr.jp> writes:
> (2010/01/23 5:12), Tom Lane wrote:
>> Now the argument against that is that it won't scale terribly well
>> to situations with very large numbers of blobs.

> Even if the database contains a massive number of large objects, all
> that pg_dump has to manage in RAM is their metadata, not the data
> contents.

I'm not so worried about the amount of RAM needed as whether pg_dump's
internal algorithms will scale to large numbers of TOC entries. Any
O(N^2) behavior would be pretty painful, for example. No doubt we could
fix any such problems, but it might take more work than we want to do
right now.
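
For scale: with N around 5.5 million TOC entries, an O(N^2) step means
on the order of 3 x 10^13 comparisons; even at 10^8 comparisons per
second that works out to several days of CPU time, so anything much
worse than O(N log N) is effectively unusable at that size.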

regards, tom lane

From: "Kevin Grittner" on
Tom Lane <tgl(a)sss.pgh.pa.us> wrote:

> I'm not so worried about the amount of RAM needed as whether
> pg_dump's internal algorithms will scale to large numbers of TOC
> entries. Any O(N^2) behavior would be pretty painful, for
> example. No doubt we could fix any such problems, but it might
> take more work than we want to do right now.

I'm afraid pg_dump didn't get very far with this before failing:

pg_dump: WARNING: out of shared memory
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.
pg_dump: The command was: LOCK TABLE public.test2672 IN ACCESS SHARE MODE

Given how fast it happened, I suspect that it was 2672 tables into
the dump, versus 26% of the way through 5.5 million tables.
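
For reference, the shared lock table has room for roughly

    max_locks_per_transaction * (max_connections + max_prepared_transactions)

locks, so the defaults (64 and 100) allow only about 6,400 -- which is
consistent with running out a couple of thousand tables in.
Accommodating an ACCESS SHARE lock on all 5.5 million tables in one
transaction would take something like this (illustrative values only;
the shared memory cost would be substantial):

    max_connections = 100
    max_locks_per_transaction = 55000   # ~5.5 million total lock slots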

A sampling of the vmstat 1 output lines in "baseline" state --
before the dump started:

procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
 r  b   swpd   free   buff     cache   si   so    bi     bo   in    cs us sy id wa st
 1  0 319804 583656  23372 124473248    0    0 17224     10 1742 18995  9  1 88  2  0
 3  1 319804 595840  23368 124458856    0    0 17016     10 2014 22965  9  1 89  1  0
 1  0 319804 586912  23376 124469128    0    0 16808    158 1807 19181  8  1 89  2  0
 2  0 319804 576304  23368 124479416    0    0 16840      5 1764 19136  8  1 90  1  0

 0  1 319804 590480  23364 124459888    0    0  1488    130 3449 13844  2  1 93  3  0
 0  1 319804 589476  23364 124460912    0    0  1456    115 3328 11800  2  1 94  4  0
 1  0 319804 588468  23364 124461944    0    0  1376    146 3156 11770  2  1 95  2  0
 1  1 319804 587836  23364 124465024    0    0  1576    133 3599 14797  3  1 94  3  0

While it was running:

procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
 r  b   swpd   free   buff     cache   si   so    bi     bo   in    cs us sy id wa st
 2  1 429080 886244  23308 111242464    0    0 25684     38 2920 18847  7  3 85  5  0
 2  1 429080 798172  23308 111297976    0    0 40024     26 1342 16967 13  2 82  4  0
 2  1 429080 707708  23308 111357600    0    0 42520     34 1588 19148 13  2 81  4  0
 0  5 429080 620700  23308 111414144    0    0 40272  73863 1434 18077 12  2 80  6  0
 1  5 429080 605616  23308 111425448    0    0  6920 131232  729  5187  3  1 66 31  0
 0  6 429080 582852  23316 111442912    0    0 10840 131248  665  4987  3  1 66 30  0
 2  4 429080 584976  23308 111433672    0    0  9776 139416  693  7890  4  1 66 29  0
 0  5 429080 575752  23308 111436752    0    0 10776 131217  647  6157  3  1 66 30  0
 1  3 429080 583768  23308 111420304    0    0 13616  90352 1043 13047  6  1 68 25  0
 4  0 429080 578888  23300 111397696    0    0 40000     44 1347 25329 12  2 79  6  0
 2  1 429080 582368  23292 111367896    0    0 40320     76 1517 28628 13  2 80  5  0
 2  0 429080 584960  23276 111338096    0    0 40240    163 1374 26988 13  2 80  5  0
 6  0 429080 576176  23268 111319600    0    0 40328    170 1465 27229 13  2 80  5  0
 4  0 429080 583212  23212 111288816    0    0 39568    138 1418 27296 13  2 80  5  0

This box has 16 CPUs, so the jump from 3% user CPU to 13% (roughly two
cores' worth), against an increase in I/O wait from only 3% to 5%,
suggests that pg_dump was primarily CPU bound in user code before it
failed.

I can leave this database around for a while if there are other
things you would like me to try.

-Kevin

From: Tom Lane
"Kevin Grittner" <Kevin.Grittner(a)wicourts.gov> writes:
> I'm afraid pg_dump didn't get very far with this before:

> pg_dump: WARNING: out of shared memory
> pg_dump: SQL command failed

> Given how fast it happened, I suspect that it was 2672 tables into
> the dump, versus 26% of the way through 5.5 million tables.

Yeah, I didn't think about that. You'd have to bump
max_locks_per_transaction up awfully far to get to where pg_dump
could dump millions of tables, because it wants to lock each one.

It might be better to try a test case with lighter-weight objects,
say 5 million simple functions.
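
For instance (a sketch with made-up names; the DO command is new in
9.0/HEAD), something like this would create the objects without any
lock-table pressure at dump time:

    DO $$
    BEGIN
        FOR i IN 1..5000000 LOOP
            EXECUTE 'CREATE FUNCTION test_fn' || i ||
                    '() RETURNS integer LANGUAGE sql AS ''SELECT ' || i || '''';
        END LOOP;
    END;
    $$;

pg_dump takes no heavyweight lock on functions, so such a run would
exercise the TOC machinery rather than the lock table.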

regards, tom lane
