From: "Luke Lonergan" on 5 Mar 2007 03:30

Hi Tom,

> Now this may only prove that the disk subsystem on this
> machine is too cheap to let the system show any CPU-related
> issues.

Try it with a warm IO cache. As I posted before, we see double the performance of a VACUUM on a table in IO cache when the shared buffer cache isn't being polluted. The speed with a large buffer cache should be about 450 MB/s, and the speed with a buffer cache smaller than L2 should be about 800 MB/s.

The real issue here isn't the L2 behavior, though that's important when trying to reach very high IO speeds. The issue is that we're seeing the buffer cache pollution in the first place. When we instrument the blocks selected by the buffer page selection algorithm, we see that they iterate sequentially, filling the shared buffer cache. That's the source of the problem here.

Do we have a regression test somewhere for this?

- Luke

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq
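[Editor's note: the pollution Luke describes is easy to reproduce in a toy model. The sketch below is a hypothetical simulation using a plain LRU pool, not PostgreSQL's actual clock-sweep replacement code, and all sizes are made up: a 1,000-page pool holding a hot working set gets fully evicted by one sequential scan of a 10,000-page relation.]

```python
# Toy model (NOT PostgreSQL internals): an LRU buffer pool holding a hot
# working set, then one large sequential scan. The scan's pages iterate
# through the whole pool and evict every hot page.
from collections import OrderedDict

class LRUBufferPool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page id -> None, ordered by recency

    def access(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)    # cache hit: mark most recent
            return True
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict least recently used
        self.pages[page] = None
        return False

pool = LRUBufferPool(capacity=1000)
hot_set = [("hot", i) for i in range(500)]
for p in hot_set:            # warm the pool with a hot working set
    pool.access(p)

for blk in range(10000):     # one large sequential scan
    pool.access(("scan", blk))

survivors = sum(1 for p in hot_set if p in pool.pages)
print(f"hot pages still cached after the seqscan: {survivors}")  # -> 0
```

Every hot page is gone afterward: the scan touches 10,000 distinct pages through a 1,000-page pool, so nothing else survives.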
From: Grzegorz Jaskiewicz on 5 Mar 2007 03:17

On Mar 5, 2007, at 2:36 AM, Tom Lane wrote:
> n into account.
>
> I'm also less than convinced that it'd be helpful for a big seqscan:
> won't reading a new disk page into memory via DMA cause that memory to
> get flushed from the processor cache anyway?

Nope. DMA writes directly into main memory. If the area was in the L1/L2 cache, it will get invalidated; but if it isn't there, it is okay.

--
Grzegorz Jaskiewicz
gj(a)pointblue.com.pl
From: Tom Lane on 5 Mar 2007 03:45

"Luke Lonergan" <LLonergan(a)greenplum.com> writes:
>> So either way, it isn't in processor cache after the read.
>> So how can there be any performance benefit?

> It's the copy from kernel IO cache to the buffer cache that is L2
> sensitive. When the shared buffer cache is polluted, it thrashes the L2
> cache. When the number of pages being written to in the kernel->user
> space writes fits in L2, then the L2 lines are "written through" (see
> the link below on page 264 for the write combining features of the
> opteron for example) and the writes to main memory are deferred.

That makes absolutely zero sense. The data coming from the disk was certainly not in processor cache to start with, and I hope you're not suggesting that it matters whether the *target* page of a memcpy was already in processor cache. If the latter, it is not our bug to fix.

> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/
> 25112.PDF

Even granting that your conclusions are accurate, we are not in the business of optimizing Postgres for a single CPU architecture.

regards, tom lane
From: "Luke Lonergan" on 5 Mar 2007 03:51

Hi Tom,

> Even granting that your conclusions are accurate, we are not
> in the business of optimizing Postgres for a single CPU architecture.

I think you're missing my/our point:

The Postgres shared buffer cache algorithm appears to have a bug. When there is a sequential scan, the blocks are filling the entire shared buffer cache. This should be "fixed".

My proposal for a fix: ensure that when relations larger (much larger?) than the buffer cache are scanned, they are mapped to a single page in the shared buffer cache.

- Luke
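[Editor's note: extending the earlier toy model, Luke's proposal amounts to routing a large scan through a tiny dedicated ring of buffers instead of the main pool. The sketch below is hypothetical, with made-up sizes; it is not how PostgreSQL implemented this, though a ring-buffer access strategy along these lines was later adopted.]

```python
# Toy model of the proposed fix: a relation much larger than the buffer
# pool is scanned through a tiny ring (here a single recycled slot, per
# the "single page" suggestion), leaving the main LRU pool untouched.
from collections import OrderedDict

class LRUBufferPool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def access(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)
            return True
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)
        self.pages[page] = None
        return False

def seqscan(pool, relation_pages, ring_size=1):
    """Read a relation; bypass the main pool if it is larger than it."""
    if relation_pages > pool.capacity:            # large scan: tiny ring
        ring = [None] * ring_size
        for blk in range(relation_pages):
            ring[blk % ring_size] = ("scan", blk)  # recycle the same slot
    else:                                          # small scan: normal path
        for blk in range(relation_pages):
            pool.access(("scan", blk))

pool = LRUBufferPool(capacity=1000)
hot_set = [("hot", i) for i in range(500)]
for p in hot_set:
    pool.access(p)

seqscan(pool, relation_pages=10000)   # large scan goes through the ring
survivors = sum(1 for p in hot_set if p in pool.pages)
print(f"hot pages surviving with the ring strategy: {survivors}")  # -> 500
```

With the ring in place, the same 10,000-page scan leaves the entire hot working set cached.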
From: Heikki Linnakangas on 5 Mar 2007 04:09
Luke Lonergan wrote:
> The Postgres shared buffer cache algorithm appears to have a bug. When
> there is a sequential scan the blocks are filling the entire shared
> buffer cache. This should be "fixed".
>
> My proposal for a fix: ensure that when relations larger (much larger?)
> than buffer cache are scanned, they are mapped to a single page in the
> shared buffer cache.

It's not that simple. Using the whole buffer cache for a single seqscan is OK if there's currently no better use for the buffer cache. Running a single select will indeed use the whole cache, but if you run any other smaller queries, the pages they need should stay in cache and the seqscan will loop through the other buffers.

In fact, the pages that are left in the cache after the seqscan finishes would be useful for the next seqscan of the same table, if we were smart enough to read those pages first. That'd make a big difference for seqscanning a table that's, say, 1.5x your RAM size. Hmm, I wonder if Jeff's sync seqscan patch addresses that.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
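[Editor's note: Heikki's 1.5x-RAM example can be made concrete with the same toy LRU model (hypothetical sizes, not real PostgreSQL behavior). After a front-to-back scan of a 1,500-page table through a 1,000-page cache, the cache holds the table's tail; a second scan that starts again at page 0 gets zero hits, while one that reads the cached tail first hits on two-thirds of the table.]

```python
# Toy model: second-scan hit counts for a table 1.5x the cache size,
# comparing a naive restart at page 0 against reading the cached tail
# (pages 500..1499) first.
from collections import OrderedDict

CACHE = 1000     # buffer pool size in pages
TABLE = 1500     # relation size: 1.5x the pool

class LRU:
    def __init__(self, cap):
        self.cap, self.d = cap, OrderedDict()

    def access(self, p):
        hit = p in self.d
        if hit:
            self.d.move_to_end(p)
        else:
            if len(self.d) >= self.cap:
                self.d.popitem(last=False)
            self.d[p] = None
        return hit

def second_scan_hits(order):
    cache = LRU(CACHE)
    for p in range(TABLE):        # first scan, front to back
        cache.access(p)           # leaves pages 500..1499 cached
    return sum(cache.access(p) for p in order)

naive = second_scan_hits(range(TABLE))                         # restart at 0
smart = second_scan_hits(list(range(500, TABLE)) + list(range(500)))

print(f"second-scan hits, naive order:       {naive}/{TABLE}")  # -> 0/1500
print(f"second-scan hits, cached-tail first: {smart}/{TABLE}")  # -> 1000/1500
```

The naive restart chases its own evictions and never hits; the cache-aware order recovers every cached page, which is the gain Heikki is pointing at.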