From: "Strong, David" on 13 Sep 2006 13:52

Simon,

In the 16/16 (16 buffer partitions/16 lock partitions) test, the WALInsertLock lock had 14643080 acquisition attempts and 12057678 successful acquisitions on the lock. That's 2585402 retries on the lock. That is to say that PGSemaphoreLock was invoked 2585402 times.

In the 128/128 test, the WALInsertLock lock had 14991208 acquisition attempts and 12324765 successful acquisitions. That's 2666443 retries.

The 128/128 test attempted 348128 more lock acquisitions than the 16/16 test and retried 81041 more times than the 16/16 test. We attribute the rise in WALInsertLock lock accesses to the reduction in time on acquiring the BufMapping and LockMgr partition locks. Does this seem reasonable?

The overhead of any monitoring is of great concern to us. We've tried both clock_gettime() and gettimeofday() calls. They both seem to have the same overhead, ~1 us/call (measured against the TSC of the CPU), and both seem to be accurate. We realize this can be a delicate point and so we would be happy to rerun any tests with a different timing mechanism.

David

-----Original Message-----
From: Simon Riggs [mailto:simon(a)2ndquadrant.com]
Sent: Wednesday, September 13, 2006 2:22 AM
To: Tom Lane
Cc: Strong, David; PostgreSQL-development
Subject: Re: [HACKERS] Lock partitions

On Tue, 2006-09-12 at 12:40 -0400, Tom Lane wrote:
> "Strong, David" <david.strong(a)unisys.com> writes:
> > When using 16 buffer and 16 lock partitions, we see that BufMapping
> > takes 809 seconds to acquire locks and 174 seconds to release locks. The
> > LockMgr takes 362 seconds to acquire locks and 26 seconds to release
> > locks.
> >
> > When using 128 buffer and 128 lock partitions, we see that BufMapping
> > takes 277 seconds (532 seconds improvement) to acquire locks and 78
> > seconds (96 seconds improvement) to release locks. The LockMgr takes 235
> > seconds (127 seconds improvement) to acquire locks and 22 seconds (4
> > seconds improvement) to release locks.
>
> While I don't see any particular penalty to increasing
> NUM_BUFFER_PARTITIONS, increasing NUM_LOCK_PARTITIONS carries a very
> significant penalty (increasing PGPROC size as well as the work needed
> during LockReleaseAll, which is executed at every transaction end).
> I think 128 lock partitions is probably verging on the ridiculous
> ... particularly if your benchmark only involves touching half a dozen
> tables. I'd be more interested in comparisons between 4 and 16 lock
> partitions. Also, please vary the two settings independently rather
> than confusing the issue by changing them both at once.

Good thinking, David. Even if 128 is fairly high, it does seem worth exploring higher values - I was just stuck in "fewer == better" thoughts.

> > With the improvements in the various locking times, one might expect an
> > improvement in the overall benchmark result. However, a 16 partition run
> > produces a result of 198.74 TPS and a 128 partition run produces a
> > result of 203.24 TPS.
> >
> > Part of the time saved from BufMapping and LockMgr partitions is
> > absorbed into the WALInsertLock lock. For a 16 partition run, the total
> > time to lock/release the WALInsertLock lock is 5845 seconds. For 128
> > partitions, the WALInsertLock lock takes 6172 seconds, an increase of
> > 327 seconds. Perhaps we have our WAL configured incorrectly?
>
> I fear this throws your entire measurement procedure into question. For
> a fixed workload the number of acquisitions of WALInsertLock ought to be
> fixed, so you shouldn't see any more contention for WALInsertLock if the
> transaction rate didn't change materially.

David's results were to do with lock acquire/release time, not the number of acquisitions, so that in itself doesn't make me doubt these measurements. Perhaps we can ask whether there was a substantially different number of lock acquisitions? As Tom says, that would be an issue.

It seems reasonable that, having relieved the bottleneck on the BufMapping and LockMgr locks, we would then queue longer on the next bottleneck, WALInsertLock. So again, those tests seem reasonable to me so far.

These seem to be the beginnings of accurate wait time analysis, so I'm listening closely. Are you using a lightweight timer?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
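Editorial sketch, not part of the original thread: the retry arithmetic in David's message can be rechecked directly from the counters he quotes. The short Python snippet below does that, and also makes a rough estimate of the timer overhead. The estimate rests on two assumptions that the thread does not state: that each acquisition attempt is bracketed by two timer calls, and that each call costs the ~1 us reported above. The variable names are illustrative only.

```python
# Counters quoted in the message above: (acquisition attempts, successful acquisitions).
WAL_INSERT_LOCK = {
    "16/16": (14_643_080, 12_057_678),
    "128/128": (14_991_208, 12_324_765),
}


def retries(attempts, successes):
    """Attempts minus successes, which the message equates with PGSemaphoreLock calls."""
    return attempts - successes


attempts_16, ok_16 = WAL_INSERT_LOCK["16/16"]
attempts_128, ok_128 = WAL_INSERT_LOCK["128/128"]

retries_16 = retries(attempts_16, ok_16)
retries_128 = retries(attempts_128, ok_128)
extra_attempts = attempts_128 - attempts_16
extra_retries = retries_128 - retries_16

# ASSUMPTIONS (not stated in the thread): two timer calls per attempt,
# each costing the ~1 us per call reported in the message.
TIMER_CALLS_PER_ATTEMPT = 2
TIMER_COST_US = 1.0
timer_overhead_s = attempts_16 * TIMER_CALLS_PER_ATTEMPT * TIMER_COST_US / 1e6

# Total WALInsertLock lock/release time quoted for the 16-partition run.
WAL_LOCK_TIME_16_S = 5845
overhead_share = timer_overhead_s / WAL_LOCK_TIME_16_S

print(f"retries: 16/16={retries_16}, 128/128={retries_128}")
print(f"extra attempts={extra_attempts}, extra retries={extra_retries}")
print(f"retry rate: 16/16={retries_16 / attempts_16:.1%}, "
      f"128/128={retries_128 / attempts_128:.1%}")
print(f"estimated timer overhead (16/16): {timer_overhead_s:.1f} s "
      f"({overhead_share:.2%} of {WAL_LOCK_TIME_16_S} s)")
```

Under those assumptions, the timer calls would account for only a small fraction of the reported WALInsertLock time, but the real figure depends on how the instrumentation was actually placed.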
From: "Strong, David" on 13 Sep 2006 14:46

Tom,

We have some results for you. We left the buffer partition locks at 128 as this did not seem to be a concern and we're still using 25 backend processes. We ran tests for 4, 8 and 16 lock partitions.

For 4 lock partitions, it took 620 seconds to acquire locks and 32 seconds to release locks. The test produced 199.95 TPS.

For 8 lock partitions, it took 505 seconds to acquire locks and 31 seconds to release locks. The test produced 201.16 TPS.

For 16 lock partitions, it took 362 seconds to acquire locks and 22 seconds to release locks. The test produced 200.75 TPS.

And, just for grins, using 128 buffer and 128 lock partitions, took 235 seconds to acquire locks and 22 seconds to release locks. The test produced 203.24 TPS.

Let me know if we can provide any additional information from these tests and if there are any other tests that we can run.

David
From: Tom Lane on 13 Sep 2006 16:35

"Strong, David" <david.strong(a)unisys.com> writes:
> We have some results for you. We left the buffer partition locks at 128
> as this did not seem to be a concern and we're still using 25 backend
> processes. We ran tests for 4, 8 and 16 lock partitions.

> For 4 lock partitions, it took 620 seconds to acquire locks and 32
> seconds to release locks. The test produced 199.95 TPS.

> For 8 lock partitions, it took 505 seconds to acquire locks and 31
> seconds to release locks. The test produced 201.16 TPS.

> For 16 lock partitions, it took 362 seconds to acquire locks and 22
> seconds to release locks. The test produced 200.75 TPS.

> And, just for grins, using 128 buffer and 128 lock partitions, took 235
> seconds to acquire locks and 22 seconds to release locks. The test
> produced 203.24 TPS.

[ itch... ] I can't help thinking there's something wrong with this; the wait-time measurements seem sane, but why is there essentially no change in the TPS result?

The above numbers are only for the lock-partition LWLocks, right? What are the totals --- that is, how much time is spent blocked vs. processing overall?

regards, tom lane
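Editorial sketch, not part of the original thread: the mismatch Tom points at is easier to see when the quoted runs are expressed as relative changes against the 4-partition run. The Python snippet below only rearranges the numbers quoted above; the helper name `pct_change` and the choice of the 4-partition run as the baseline are illustrative assumptions.

```python
# (lock partitions, acquire seconds, release seconds, TPS), copied from the
# quoted results. Buffer partitions were 128 in every run listed here.
RUNS = [
    (4, 620, 32, 199.95),
    (8, 505, 31, 201.16),
    (16, 362, 22, 200.75),
    (128, 235, 22, 203.24),
]


def pct_change(new, old):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Baseline chosen for illustration: the 4-lock-partition run.
_, base_acquire, _, base_tps = RUNS[0]

rows = []
for partitions, acquire_s, release_s, tps in RUNS:
    rows.append((partitions,
                 pct_change(acquire_s, base_acquire),
                 pct_change(tps, base_tps)))

print("partitions  acquire-time change  TPS change")
for partitions, acquire_delta, tps_delta in rows:
    print(f"{partitions:>10}  {acquire_delta:>18.1f}%  {tps_delta:>9.2f}%")
```

The acquire time changes by tens of percent across the runs while TPS moves by a percent or two at most, which is consistent with the question of how much total time is spent blocked versus processing.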
From: Jim Nasby on 13 Sep 2006 16:37

On Sep 13, 2006, at 2:46 PM, Strong, David wrote:
> We have some results for you. We left the buffer partition locks at 128
> as this did not seem to be a concern and we're still using 25 backend
> processes. We ran tests for 4, 8 and 16 lock partitions.

Isn't having more lock partitions than buffer partitions pointless?

--
Jim Nasby jim(a)nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
From: Tom Lane on 13 Sep 2006 17:18

Jim Nasby <jim(a)nasby.net> writes:
> Isn't having more lock partitions than buffer partitions pointless?

AFAIK they're pretty orthogonal. It's true though that a typical transaction doesn't hold all that many locks, which is why I don't see a need for a large number of lock partitions.

regards, tom lane