From: Tom Lane on 12 Sep 2006 09:34

Simon Riggs <simon(a)2ndquadrant.com> writes:
> On Mon, 2006-09-11 at 11:29 -0400, Tom Lane wrote:
>> Great, thanks. The thing to twiddle is LOG2_NUM_LOCK_PARTITIONS in
>> src/include/storage/lwlock.h. You need a full backend recompile
>> after changing it, but you shouldn't need to initdb, if that helps.

> IIRC we did that already and the answer was 16...

No, no one has shown me any numbers from any "real" tests (anything more than pgbench on a Dell PC ...).

			regards, tom lane
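For reference, the knobs under discussion live in src/include/storage/lwlock.h. In the 8.2-era tree they look roughly like the excerpt below (approximate and from memory; check your own source tree before editing):

/* src/include/storage/lwlock.h (approximate 8.2-era excerpt) */

/* Number of partitions of the shared buffer mapping hashtable */
#define NUM_BUFFER_PARTITIONS  16

/* Number of partitions of the shared lock tables */
#define LOG2_NUM_LOCK_PARTITIONS  4
#define NUM_LOCK_PARTITIONS  (1 << LOG2_NUM_LOCK_PARTITIONS)

Raising LOG2_NUM_LOCK_PARTITIONS to 7 is what gives the 128 lock partitions discussed below; as Tom notes, changing it requires a full backend recompile but no initdb.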
From: "Strong, David" on 12 Sep 2006 11:46 We can pass on what we've seen when running tests here with different BufMapping and LockMgr partition sizes. We use a TPC-C inspired benchmark. Currently it is configured to run 25 backend processes. The test runs for 16 minutes as this is the minimum amount of time we can run and obtain useful information. This gives us 24,000 seconds (25 * 16 * 60) of processing time. The following timings have been rounded to the nearest second and represent the amount of time amongst all backend processes to acquire and release locks. For example, a value of 2500 seconds would mean each backend process (25) took ~100 seconds to acquire or release a lock. Although, in reality, the time spent locking or releasing each partition entry is not uniform and there are some definite hotspot entries. We can pass on some of the lock output if anyone is interested. When using 16 buffer and 16 lock partitions, we see that BufMapping takes 809 seconds to acquire locks and 174 seconds to release locks. The LockMgr takes 362 seconds to acquire locks and 26 seconds to release locks. When using 128 buffer and 128 lock partitions, we see that BufMapping takes 277 seconds (532 seconds improvement) to acquire locks and 78 seconds (96 seconds improvement) to release locks. The LockMgr takes 235 seconds (127 seconds improvement) to acquire locks and 22 seconds (4 seconds improvement) to release locks. Overall, 128 BufMapping partitions improves locking/releasing by 678 seconds, 128 LockMgr partitions improves locking/releasing by 131 seconds. With the improvements in the various locking times, one might expect an improvement in the overall benchmark result. However, a 16 partition run produces a result of 198.74 TPS and a 128 partition run produces a result of 203.24 TPS. Part of the time saved from BufMapping and LockMgr partitions is absorbed into the WALInsertLock lock. For a 16 partition run, the total time to lock/release the WALInsertLock lock is 5845 seconds. For 128 partitions, the WALInsertLock lock takes 6172 seconds, an increase of 327 seconds. Perhaps we have our WAL configured incorrectly? Other static locks are also affected, but not as much as the WALInsertLock lock. For example, the ProcArrayLock lock increases from 337 seconds to 348 seconds. The SInvalLock lock increases from 317 seconds to 331 seconds. Due to expansion of time in other locks, a 128 partition run only spends 403 seconds less in locking than a 16 partition run. We can generate some OProfile statistics, but most of the time saved is probably absorbed into functions such as HeapTupleSatisfiesSnapshot and PinBuffer which seem to have a very high overhead. David -----Original Message----- From: pgsql-hackers-owner(a)postgresql.org [mailto:pgsql-hackers-owner(a)postgresql.org] On Behalf Of Simon Riggs Sent: Tuesday, September 12, 2006 1:37 AM To: Tom Lane Cc: Mark Wong; Bruce Momjian; PostgreSQL-development Subject: Re: [HACKERS] Lock partitions On Mon, 2006-09-11 at 11:29 -0400, Tom Lane wrote: > Mark Wong <markw(a)osdl.org> writes: > > Tom Lane wrote: > >> It would be nice to see some results from the OSDL tests with, say, 4, > >> 8, and 16 lock partitions before we forget about the point though. > >> Anybody know whether OSDL is in a position to run tests for us? > > > Yeah, I can run some dbt2 tests in the lab. I'll get started on it. > > We're still a little bit away from getting the automated testing for > > PostgreSQL going again though. > > Great, thanks. 
The thing to twiddle is LOG2_NUM_LOCK_PARTITIONS in > src/include/storage/lwlock.h. You need a full backend recompile > after changing it, but you shouldn't need to initdb, if that helps. IIRC we did that already and the answer was 16... -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo(a)postgresql.org so that your message can get through to the mailing list cleanly
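As a rough sanity check of what those aggregate figures mean per backend, here is a small back-of-the-envelope program (the figures are taken from the message above; the program itself is only an illustration):

#include <stdio.h>

int
main(void)
{
    const double backends = 25;
    const double run_seconds = 16 * 60;            /* 960 s of wall time per backend */
    const double budget = backends * run_seconds;  /* 24,000 s total processing time */

    /* BufMapping acquire times reported above */
    const double acquire_16 = 809;                 /* 16 partitions */
    const double acquire_128 = 277;                /* 128 partitions */

    printf("16 partitions:  %.1f s/backend, %.1f%% of the run\n",
           acquire_16 / backends, 100.0 * acquire_16 / budget);
    printf("128 partitions: %.1f s/backend, %.1f%% of the run\n",
           acquire_128 / backends, 100.0 * acquire_128 / budget);
    return 0;
}

This prints roughly 32.4 s/backend (3.4% of the run) at 16 partitions versus 11.1 s/backend (1.2%) at 128, which is consistent with the fairly modest change in overall TPS.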
From: Tom Lane on 12 Sep 2006 12:40

"Strong, David" <david.strong(a)unisys.com> writes:
> When using 16 buffer and 16 lock partitions, we see that BufMapping
> takes 809 seconds to acquire locks and 174 seconds to release locks. The
> LockMgr takes 362 seconds to acquire locks and 26 seconds to release
> locks.

> When using 128 buffer and 128 lock partitions, we see that BufMapping
> takes 277 seconds (532 seconds improvement) to acquire locks and 78
> seconds (96 seconds improvement) to release locks. The LockMgr takes 235
> seconds (127 seconds improvement) to acquire locks and 22 seconds (4
> seconds improvement) to release locks.

While I don't see any particular penalty to increasing NUM_BUFFER_PARTITIONS, increasing NUM_LOCK_PARTITIONS carries a very significant penalty (increasing PGPROC size as well as the work needed during LockReleaseAll, which is executed at every transaction end). I think 128 lock partitions is probably verging on the ridiculous ... particularly if your benchmark only involves touching half a dozen tables. I'd be more interested in comparisons between 4 and 16 lock partitions. Also, please vary the two settings independently rather than confusing the issue by changing them both at once.

> With the improvements in the various locking times, one might expect an
> improvement in the overall benchmark result. However, a 16 partition run
> produces a result of 198.74 TPS and a 128 partition run produces a
> result of 203.24 TPS.

> Part of the time saved from BufMapping and LockMgr partitions is
> absorbed into the WALInsertLock lock. For a 16 partition run, the total
> time to lock/release the WALInsertLock lock is 5845 seconds. For 128
> partitions, the WALInsertLock lock takes 6172 seconds, an increase of
> 327 seconds. Perhaps we have our WAL configured incorrectly?

I fear this throws your entire measurement procedure into question. For a fixed workload the number of acquisitions of WALInsertLock ought to be fixed, so you shouldn't see any more contention for WALInsertLock if the transaction rate didn't change materially.

			regards, tom lane
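To make the cost Tom describes concrete, here is a toy model (not actual PostgreSQL source; the real structures live in proc.h and lock.c) of why the lock partition count is not free: each backend's PGPROC carries per-partition bookkeeping, and a LockReleaseAll-style sweep visits every partition at each transaction end, even when only a handful of tables were touched.

/* Toy model only; sizes and names are illustrative, not PostgreSQL's. */
#include <stdio.h>

typedef struct { void *prev, *next; } shm_queue;    /* stand-in list header */

#define LOG2_NUM_LOCK_PARTITIONS 4                   /* 4 -> 16 partitions; 7 -> 128 */
#define NUM_LOCK_PARTITIONS (1 << LOG2_NUM_LOCK_PARTITIONS)

typedef struct
{
    shm_queue   held_locks[NUM_LOCK_PARTITIONS];     /* grows with the partition count */
} toy_pgproc;

int
main(void)
{
    int         partition;
    int         partitions_swept = 0;

    /* LockReleaseAll-style sweep: O(partitions) work at every transaction end */
    for (partition = 0; partition < NUM_LOCK_PARTITIONS; partition++)
        partitions_swept++;

    printf("per-backend overhead: %lu bytes; %d partitions swept per transaction end\n",
           (unsigned long) sizeof(toy_pgproc), partitions_swept);
    return 0;
}

At 128 partitions both the per-backend structure and the per-commit sweep are eight times larger than at 16, which is the trade-off behind Tom's suggestion to compare 4 and 16 instead.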
From: "Strong, David" on 12 Sep 2006 13:03 Tom, Thanks for the feedback. We'll run a few tests with differing buffer and lock partition sizes in the range you're interested in and we'll let you know what we see. Our workload is not fixed, however. Our benchmark does not follow the strict TPC-C guideline of using think times etc. We throw as many transactions at the database as we can. So, when any time is freed up, we will fill it with another transaction. We simply want to stress as much as we can. As one bottleneck is removed, the time saved obviously flows to the next. Postgres 8.2 moves some of the time that used to be consumed by single BufMappingLock and LockMGRLock locks to the WALInsertLock lock. We have run tests where we made XLogInsert a NOP, because we wanted to see where the next bottleneck would be, and some of the time occupied by WALInsertLock lock was absorbed by the SInvalLock lock. We have not tried to remove the SInvalLock lock to see where time flows to next, but we might. David -----Original Message----- From: Tom Lane [mailto:tgl(a)sss.pgh.pa.us] Sent: Tuesday, September 12, 2006 9:40 AM To: Strong, David Cc: PostgreSQL-development Subject: Re: [HACKERS] Lock partitions "Strong, David" <david.strong(a)unisys.com> writes: > When using 16 buffer and 16 lock partitions, we see that BufMapping > takes 809 seconds to acquire locks and 174 seconds to release locks. The > LockMgr takes 362 seconds to acquire locks and 26 seconds to release > locks. > When using 128 buffer and 128 lock partitions, we see that BufMapping > takes 277 seconds (532 seconds improvement) to acquire locks and 78 > seconds (96 seconds improvement) to release locks. The LockMgr takes 235 > seconds (127 seconds improvement) to acquire locks and 22 seconds (4 > seconds improvement) to release locks. While I don't see any particular penalty to increasing NUM_BUFFER_PARTITIONS, increasing NUM_LOCK_PARTITIONS carries a very significant penalty (increasing PGPROC size as well as the work needed during LockReleaseAll, which is executed at every transaction end). I think 128 lock partitions is probably verging on the ridiculous .... particularly if your benchmark only involves touching half a dozen tables. I'd be more interested in comparisons between 4 and 16 lock partitions. Also, please vary the two settings independently rather than confusing the issue by changing them both at once. > With the improvements in the various locking times, one might expect an > improvement in the overall benchmark result. However, a 16 partition run > produces a result of 198.74 TPS and a 128 partition run produces a > result of 203.24 TPS. > Part of the time saved from BufMapping and LockMgr partitions is > absorbed into the WALInsertLock lock. For a 16 partition run, the total > time to lock/release the WALInsertLock lock is 5845 seconds. For 128 > partitions, the WALInsertLock lock takes 6172 seconds, an increase of > 327 seconds. Perhaps we have our WAL configured incorrectly? I fear this throws your entire measurement procedure into question. For a fixed workload the number of acquisitions of WALInsertLock ought to be fixed, so you shouldn't see any more contention for WALInsertLock if the transaction rate didn't change materially. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend
From: Simon Riggs on 13 Sep 2006 05:22

On Tue, 2006-09-12 at 12:40 -0400, Tom Lane wrote:
> "Strong, David" <david.strong(a)unisys.com> writes:
> > When using 16 buffer and 16 lock partitions, we see that BufMapping
> > takes 809 seconds to acquire locks and 174 seconds to release locks. The
> > LockMgr takes 362 seconds to acquire locks and 26 seconds to release
> > locks.
>
> > When using 128 buffer and 128 lock partitions, we see that BufMapping
> > takes 277 seconds (532 seconds improvement) to acquire locks and 78
> > seconds (96 seconds improvement) to release locks. The LockMgr takes 235
> > seconds (127 seconds improvement) to acquire locks and 22 seconds (4
> > seconds improvement) to release locks.
>
> While I don't see any particular penalty to increasing
> NUM_BUFFER_PARTITIONS, increasing NUM_LOCK_PARTITIONS carries a very
> significant penalty (increasing PGPROC size as well as the work needed
> during LockReleaseAll, which is executed at every transaction end).
> I think 128 lock partitions is probably verging on the ridiculous
> ... particularly if your benchmark only involves touching half a dozen
> tables. I'd be more interested in comparisons between 4 and 16 lock
> partitions. Also, please vary the two settings independently rather
> than confusing the issue by changing them both at once.

Good thinking, David. Even if 128 is fairly high, it does seem worth exploring higher values - I was just stuck in "fewer == better" thoughts.

> > With the improvements in the various locking times, one might expect an
> > improvement in the overall benchmark result. However, a 16 partition run
> > produces a result of 198.74 TPS and a 128 partition run produces a
> > result of 203.24 TPS.
>
> > Part of the time saved from BufMapping and LockMgr partitions is
> > absorbed into the WALInsertLock lock. For a 16 partition run, the total
> > time to lock/release the WALInsertLock lock is 5845 seconds. For 128
> > partitions, the WALInsertLock lock takes 6172 seconds, an increase of
> > 327 seconds. Perhaps we have our WAL configured incorrectly?
>
> I fear this throws your entire measurement procedure into question. For
> a fixed workload the number of acquisitions of WALInsertLock ought to be
> fixed, so you shouldn't see any more contention for WALInsertLock if the
> transaction rate didn't change materially.

David's results were to do with lock acquire/release time, not the number of acquisitions, so that in itself doesn't make me doubt these measurements. Perhaps we can ask whether there was a substantially different number of lock acquisitions? As Tom says, that would be an issue.

It seems reasonable that relieving the bottleneck on the BufMapping and LockMgr locks would simply make us queue longer on the next bottleneck, WALInsertLock. So again, those tests seem reasonable to me so far.

These seem to be the beginnings of accurate wait time analysis, so I'm listening closely. Are you using a lightweight timer?

--
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
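On the timer question: a "lightweight timer" here presumably means something cheaper than a kernel call per lock operation. One hypothetical approach (purely illustrative, x86/GCC-specific, and not necessarily what David's instrumentation does) is to read the CPU timestamp counter around each acquire and convert cycles to seconds offline:

#include <stdint.h>
#include <stdio.h>

/* Read the x86 timestamp counter; far cheaper than a gettimeofday() call. */
static inline uint64_t
read_tsc(void)
{
    uint32_t lo, hi;

    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t) hi << 32) | lo;
}

/* Hypothetical per-backend accumulators, one slot per lock of interest. */
#define MAX_TRACKED_LOCKS 256
static uint64_t acquire_cycles[MAX_TRACKED_LOCKS];
static uint64_t acquire_counts[MAX_TRACKED_LOCKS];

/* Wrap a lock-acquire call and accumulate its cost. */
static void
timed_acquire(int lockid, void (*acquire) (int))
{
    uint64_t    start = read_tsc();

    acquire(lockid);
    acquire_cycles[lockid] += read_tsc() - start;
    acquire_counts[lockid]++;
}

/* demo stub so the sketch compiles and runs on its own */
static void dummy_acquire(int lockid) { (void) lockid; }

int
main(void)
{
    timed_acquire(0, dummy_acquire);
    printf("lock 0: %llu cycles over %llu acquisitions\n",
           (unsigned long long) acquire_cycles[0],
           (unsigned long long) acquire_counts[0]);
    return 0;
}

Keeping an acquisition count alongside the accumulated time, as the sketch does, would also answer Tom's question about whether the number of WALInsertLock acquisitions actually changed between the 16- and 128-partition runs.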