From: Hannu Krosing on 5 Mar 2007 04:10

On Mon, 2007-03-05 at 03:51, Luke Lonergan wrote:
> Hi Tom,
>
> > Even granting that your conclusions are accurate, we are not
> > in the business of optimizing Postgres for a single CPU architecture.
>
> I think you're missing my/our point:
>
> The Postgres shared buffer cache algorithm appears to have a bug. When
> there is a sequential scan the blocks are filling the entire shared
> buffer cache. This should be "fixed".
>
> My proposal for a fix: ensure that when relations larger (much larger?)
> than buffer cache are scanned, they are mapped to a single page in the
> shared buffer cache.

How will this approach play together with the synchronized scan patches?
Or should synchronized scan rely on the system cache only?

> - Luke

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo(a)postgresql.org so that your
       message can get through to the mailing list cleanly
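[Editor's note: Luke's proposed fix — confining a large sequential scan to a tiny, recycled slice of the buffer pool instead of letting it sweep the whole cache — can be illustrated with a toy simulation. This is a hypothetical sketch, not PostgreSQL's actual bufmgr code; the `BufferPool` class, `ring_size` parameter, and page-stream model are invented for illustration.]

```python
# Sketch: a buffer pool where a scan of a relation much larger than the
# pool recycles a small fixed "ring" of buffers, leaving the LRU-managed
# part of the pool (and any hot pages in it) untouched.
from collections import OrderedDict

class BufferPool:
    def __init__(self, nbuffers, ring_size=1):
        self.nbuffers = nbuffers
        self.ring_size = ring_size
        self.lru = OrderedDict()   # page -> None, in LRU order
        self.ring = []             # buffers reserved for big scans

    def read_normal(self, page):
        """Normal access path: standard LRU replacement."""
        hit = page in self.lru
        if hit:
            self.lru.move_to_end(page)
        else:
            if len(self.lru) >= self.nbuffers - self.ring_size:
                self.lru.popitem(last=False)  # evict least recently used
            self.lru[page] = None
        return hit

    def read_scan(self, page):
        """Large sequential scan: recycle the small ring only."""
        if page in self.ring:
            return True
        if len(self.ring) >= self.ring_size:
            self.ring.pop(0)
        self.ring.append(page)
        return False

pool = BufferPool(nbuffers=100, ring_size=1)
for p in range(50):                # warm 50 "hot" pages
    pool.read_normal(p)
for p in range(10_000, 20_000):    # a 10000-page scan streams through
    pool.read_scan(p)
hot_hits = sum(pool.read_normal(p) for p in range(50))
print(hot_hits)                    # all 50 hot pages survived the scan
```

Under this policy the 10000-page scan touches only the single ring buffer, so all 50 previously cached pages are still hits afterwards.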
From: Tom Lane on 5 Mar 2007 04:15

"Luke Lonergan" <LLonergan(a)greenplum.com> writes:
> I think you're missing my/our point:
> The Postgres shared buffer cache algorithm appears to have a bug. When
> there is a sequential scan the blocks are filling the entire shared
> buffer cache. This should be "fixed".

No, this is not a bug; it is operating as designed. The point of the
current bufmgr algorithm is to replace the page least recently used,
and that's what it's doing.

If you want to lobby for changing the algorithm, then you need to
explain why one test case on one platform justifies de-optimizing for a
lot of other cases. In almost any concurrent-access situation I think
that what you are suggesting would be a dead loss --- for instance we
might as well forget about Jeff Davis' synchronized-scan work.

In any case, I'm still not convinced that you've identified the problem
correctly, because your explanation makes no sense to me. How can the
processor's L2 cache improve access to data that it hasn't got yet?

			regards, tom lane
From: Florian Weimer on 5 Mar 2007 04:20

* Tom Lane:

> That makes absolutely zero sense. The data coming from the disk was
> certainly not in processor cache to start with, and I hope you're not
> suggesting that it matters whether the *target* page of a memcpy was
> already in processor cache. If the latter, it is not our bug to fix.

Uhm, if it's not in the cache, you typically need to evict some cache
lines to make room for the data, so I'd expect an indirect performance
hit. I could be mistaken, though.

-- 
Florian Weimer                <fweimer(a)bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99
From: Hannu Krosing on 5 Mar 2007 04:41

On Mon, 2007-03-05 at 04:15, Tom Lane wrote:
> "Luke Lonergan" <LLonergan(a)greenplum.com> writes:
> > I think you're missing my/our point:
>
> > The Postgres shared buffer cache algorithm appears to have a bug. When
> > there is a sequential scan the blocks are filling the entire shared
> > buffer cache. This should be "fixed".
>
> No, this is not a bug; it is operating as designed.

Maybe he means that there is an oversight (aka "bug") in the design ;)

> The point of the
> current bufmgr algorithm is to replace the page least recently used,
> and that's what it's doing.
>
> If you want to lobby for changing the algorithm, then you need to
> explain why one test case on one platform justifies de-optimizing
> for a lot of other cases.

If you know beforehand that you will definitely overflow the cache and
not reuse it anytime soon, then it seems quite reasonable to not even
start polluting the cache. Especially if you get a noticeable boost in
performance by doing so.

> In almost any concurrent-access situation
> I think that what you are suggesting would be a dead loss

Only if the concurrent access pattern is over data mostly fitting in
the buffer cache. If we can avoid polluting the buffer cache with data
we know we will use only once, more useful data will be available.

> --- for
> instance we might as well forget about Jeff Davis' synchronized-scan
> work.

Depends on the ratio of system cache to shared buffer cache. I don't
think Jeff's patch is anywhere near the point where it needs to start
worrying about data swapping between the system cache and shared
buffers, or about L2 cache usage.

> In any case, I'm still not convinced that you've identified the problem
> correctly, because your explanation makes no sense to me. How can the
> processor's L2 cache improve access to data that it hasn't got yet?
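[Editor's note: Hannu's pollution argument — a page stream that is used only once evicts every useful page under pure LRU — is easy to demonstrate with a toy simulation. This is an illustrative assumption-laden sketch (pure LRU, one page per access), not PostgreSQL's actual clock-sweep replacement code.]

```python
# Toy demonstration: one pass over a relation larger than the cache
# evicts the entire previously-hot working set under plain LRU, even
# though none of the scanned pages will ever be read again.
from collections import OrderedDict

def lru_read(cache, capacity, page):
    hit = page in cache
    if hit:
        cache.move_to_end(page)
    else:
        if len(cache) >= capacity:
            cache.popitem(last=False)  # evict least recently used
        cache[page] = None
    return hit

CAPACITY = 1000
cache = OrderedDict()
hot = range(500)

for p in hot:                         # warm the 500-page working set
    lru_read(cache, CAPACITY, p)
before = sum(lru_read(cache, CAPACITY, p) for p in hot)

for p in range(10_000, 15_000):       # one 5000-page scan, used once
    lru_read(cache, CAPACITY, p)
after = sum(lru_read(cache, CAPACITY, p) for p in hot)

print(before, after)                  # 500 hits before, 0 after
```

Every hot-page access hits before the scan and misses after it, which is exactly the "more useful data will be available" trade-off being argued about.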
From: Mark Kirkwood on 5 Mar 2007 05:00
Gavin Sherry wrote:
> On Mon, 5 Mar 2007, Mark Kirkwood wrote:
>
>> To add a little to this - forgetting the scan resistant point for the
>> moment... cranking down shared_buffers to be smaller than the L2 cache
>> seems to help *any* sequential scan immensely, even on quite modest HW:
>>
> (snipped)
>> When I've profiled this activity, I've seen a lot of time spent
>> searching for/allocating a new buffer for each page being fetched.
>> Obviously having less of them to search through will help, but having
>> less than the L2 cache-size worth of 'em seems to help a whole lot!
>
> Could you demonstrate that point by showing us timings for shared_buffers
> sizes from 512K up to, say, 2 MB? The two numbers you give there might
> just have to do with managing a large buffer.

Yeah - good point:

PIII 1.26 GHz, 512KB L2 cache, 2G RAM

Test is elapsed time for: SELECT count(*) FROM lineitem

lineitem has 1535724 pages (11997 MB)

Shared Buffers   Elapsed   IO rate (from vmstat)
--------------   -------   ---------------------
400MB            101 s     122 MB/s
2MB              100 s
1MB               97 s
768KB             93 s
512KB             86 s
256KB             77 s
128KB             74 s     166 MB/s

I've added the observed IO rate for the two extreme cases (the rest can
be pretty much deduced via interpolation). Note that the system will do
about 220 MB/s with the now (in)famous dd test, so we have a bit of
headroom (not too bad for a PIII).

Cheers

Mark
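[Editor's note: Mark does not give a formula for "deduced via interpolation"; one plausible reading is a simple linear interpolation of IO rate against elapsed time between his two measured endpoints (101 s, 122 MB/s) and (74 s, 166 MB/s). The sketch below fills in the intermediate rows under that assumption.]

```python
# Linear interpolation of IO rate between the two measured endpoints.
# This is one plausible reading of "deduced via interpolation", not a
# formula from the original post.
def interp_rate(t, t0=101.0, r0=122.0, t1=74.0, r1=166.0):
    return r0 + (r1 - r0) * (t - t0) / (t1 - t0)

for shared_buffers, t in [("2MB", 100), ("1MB", 97), ("768KB", 93),
                          ("512KB", 86), ("256KB", 77)]:
    print(f"{shared_buffers:>6}: {t} s -> ~{interp_rate(t):.0f} MB/s")
```

The interpolated rates rise smoothly from roughly the 122 MB/s end of the table toward the 166 MB/s end as elapsed time falls.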