From: Simon Riggs on 25 May 2010 06:12

Some performance problems on HS have been reported by two users: Erik and Stefan. The characteristics of those issues have been that
* performance is sporadically reduced, though mostly runs at full speed
* context switch storms were reported as being associated

So we're looking for something that doesn't always happen, but when it does it involves lots of processes and context switching.

Unfortunately neither test reporter has been able to re-run the tests, leaving me not much to go on. Since I know the code well, though, I can focus on likely suspects fairly easily; in this case I think I have a root cause.

Earlier this year I added deadlock detection into the Startup process when it waits for a buffer pin. The deadlock detection was simplified: it doesn't wait for deadlock_timeout before acting, it just immediately sends a signal to all active processes to resolve the deadlock, even if the buffer pin is released very soon afterwards. Heikki questioned this implementation at the time, though I said it was easier to start simple and add more code if problems arose and time allowed.

It's clear that with 100+ connections and reasonably frequent buffer pin waits, as would occur when accessing the same data blocks on both primary and standby, the current too-simple coding would cause performance issues, as Heikki implied. Actual deadlocks are certainly much rarer than buffer pin waits, so the current coding is wasteful.

The following patch adds some simple logic to make the Startup process wait for deadlock_timeout before it sends the deadlock resolution signals. It does that by refactoring the API to enable_standby_sigalrm(), though it doesn't change other behaviour or add new features.

Viewpoints?

-- 
Simon Riggs
www.2ndQuadrant.com
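
A minimal standalone sketch of the deferred-signalling idea described above (plain C with POSIX setitimer; this is not the actual patch, and deadlock_timeout_ms, wait_for_buffer_pin() and send_deadlock_resolution_signals() are hypothetical stand-ins for the real Startup-process code and enable_standby_sigalrm()). The point it illustrates: arm a one-shot SIGALRM for deadlock_timeout when the pin wait begins, and only interrupt other backends if the timer actually fires; if the pin is released sooner, no signals are sent at all.

    /*
     * Illustrative sketch only, not PostgreSQL source.
     * Defer deadlock-resolution signalling by deadlock_timeout.
     */
    #include <signal.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <sys/time.h>
    #include <unistd.h>

    static volatile sig_atomic_t deadlock_timeout_fired = 0;

    /* Stand-in for the deadlock_timeout GUC, in milliseconds. */
    static const int deadlock_timeout_ms = 1000;

    /* SIGALRM handler: just record that the timeout expired. */
    static void
    standby_sigalrm_handler(int signo)
    {
        (void) signo;
        deadlock_timeout_fired = 1;
    }

    /* Hypothetical stand-in for signalling all active backends. */
    static void
    send_deadlock_resolution_signals(void)
    {
        printf("pin still held after deadlock_timeout: signalling backends\n");
    }

    /*
     * Simulate the Startup process waiting for a buffer pin that is
     * released after pin_hold_ms.  Signals are sent only if the wait
     * outlasts deadlock_timeout.
     */
    static void
    wait_for_buffer_pin(int pin_hold_ms)
    {
        struct itimerval timer = {0};
        struct itimerval off = {0};
        bool        signalled = false;
        int         elapsed_ms = 0;

        /* Arm a one-shot timer; in spirit, what enable_standby_sigalrm() does. */
        deadlock_timeout_fired = 0;
        signal(SIGALRM, standby_sigalrm_handler);
        timer.it_value.tv_sec = deadlock_timeout_ms / 1000;
        timer.it_value.tv_usec = (deadlock_timeout_ms % 1000) * 1000;
        setitimer(ITIMER_REAL, &timer, NULL);

        /* Wait for the (simulated) pin release. */
        while (elapsed_ms < pin_hold_ms)
        {
            if (deadlock_timeout_fired && !signalled)
            {
                /* Only now do we interrupt the other backends. */
                send_deadlock_resolution_signals();
                signalled = true;
            }
            usleep(10 * 1000);
            elapsed_ms += 10;
        }

        /* Pin released: disarm the timer so a late SIGALRM cannot fire. */
        setitimer(ITIMER_REAL, &off, NULL);
        printf("pin held %d ms: %s\n", pin_hold_ms,
               signalled ? "backends were signalled" : "no signals needed");
    }

    int
    main(void)
    {
        wait_for_buffer_pin(200);   /* released quickly: no signals sent */
        wait_for_buffer_pin(1500);  /* held past deadlock_timeout: signal once */
        return 0;
    }

With frequent but short buffer pin waits, the common case takes the first path and sends nothing, which is the saving the patch is after; only waits that genuinely exceed deadlock_timeout pay the cost of interrupting every active backend.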