per table random-page-cost? [PgSql]

Prev: Application name patch - v2
Next: Controlling changes in plpgsql variableresolution

From: Alvaro Herrera on 22 Oct 2009 21:38

Greg Stark escribi�:

> There is another use case which perhaps needs to be addressed: if the
> user has some queries which are very latency sensitive and others
> which are not latency sensitive. In that case it might be very
> important to keep the pages of data used by the high priority queries
> in the cache. That's something we should have a high level abstract
> interface for, not depend on low level system features.

Yeah.

I wonder if the right thing for this is to mark the objects (pages), or
the queries, as needing special attention. If you mark the queries,
then perhaps they could behave slightly differently like adding +2 or so
to buffer usage count instead of +1, so that they take longer than a
normal buffer in getting evicted. This way you don't force the admin to
figure out what's the right size ratio for different named caches, etc.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: =?utf-8?q?C=C3=A9dric_Villemain?= on 23 Oct 2009 05:23

Le vendredi 23 octobre 2009 01:08:15, Joshua D. Drake a Ã©crit :
> On Thu, 2009-10-22 at 11:33 -0400, Robert Haas wrote:
> > On Thu, Oct 22, 2009 at 11:16 AM, CÃ©dric Villemain
> >
> > <cedric.villemain(a)dalibo.com> wrote:
> > > Le lundi 19 octobre 2009 23:27:20, Greg Stark a Ã©crit :
> > >> On Mon, Oct 19, 2009 at 2:08 PM, marcin mank <marcin.mank(a)gmail.com>
wrote:
> > >> > Currently random_page_cost is a GUC. I propose that this could be
> > >> > set per-table.
> > >>
> > >> Or per-tablespace.
> > >>
> > >> Yes, I think there are a class of GUCs which describe the physical
> > >> attributes of the storage system which should be per-table or
> > >> per-tablespace. random_page_cost, sequential_page_cost,
> > >> effective_io_concurrency come to mind.
> > >
> > > and, perhaps effective_cache_size.
> > >
> > > You can have situation where you don't want some tables go to OS memory
> > > (you can disabled that at filesystem level, ... l'd like to be able to
> > > do that at postgres level but it is another point)
> > >
> > > So you put those tables in a separate tablespace, and tell postgresql
> > > that the effective_cache_size is 0 (for this tablespace), up to
> > > postgres to do the right thing with that ;)
> >
> > Why would you ever want to set effective_cache_size to 0?
>
> I think this is a misunderstanding of how effective_cache_size works. I
> can't think of any reason to do that. I could see a reason to tell the
> OS to not throw a relation into cache but that is a different thing.

Well the effective_cache_size in this context is OS cache memory (0 in my case)
+ estimation of shared_buffer.. ah so DBA should estimate the amount in the
shared_buffer only, ok.

So consider effective_cache_size = 0 + what pg_buffer_cache will tell.

My case is a table containing 29 GB of bytea in a database of 52 GB. Every row
on the 29GB table is grab only few times. And it will just renew OS cache
memory every time (the server have only 8GB of ram).
So when I remove this table (not the index) from the OS cache memory, I keep
more interesting blocks in the OS cache memory.

And disk + raid are quick enought to bypass the OS cache memory for this
tablespace.

Are things a bit clearer and usage not so silly ?

>
> Joshua D. Drake
>
> > ...Robert
>

--
CÃ©dric Villemain
Administrateur de Base de DonnÃ©es
Cel: +33 (0)6 74 15 56 53
http://dalibo.com - http://dalibo.org

From: Robert Haas on 23 Oct 2009 08:23

On Fri, Oct 23, 2009 at 5:23 AM, Cédric Villemain
<cedric.villemain(a)dalibo.com> wrote:
> Le vendredi 23 octobre 2009 01:08:15, Joshua D. Drake a écrit :
>> On Thu, 2009-10-22 at 11:33 -0400, Robert Haas wrote:
>> > On Thu, Oct 22, 2009 at 11:16 AM, Cédric Villemain
>> >
>> > <cedric.villemain(a)dalibo.com> wrote:
>> > > Le lundi 19 octobre 2009 23:27:20, Greg Stark a écrit :
>> > >> On Mon, Oct 19, 2009 at 2:08 PM, marcin mank <marcin.mank(a)gmail.com>
> wrote:
>> > >> > Currently random_page_cost is a GUC. I propose that this could be
>> > >> > set per-table.
>> > >>
>> > >> Or per-tablespace.
>> > >>
>> > >> Yes, I think there are a class of GUCs which describe the physical
>> > >> attributes of the storage system which should be per-table or
>> > >> per-tablespace. random_page_cost, sequential_page_cost,
>> > >> effective_io_concurrency come to mind.
>> > >
>> > > and, perhaps effective_cache_size.
>> > >
>> > > You can have situation where you don't want some tables go to OS memory
>> > > (you can disabled that at filesystem level, ... l'd like to be able to
>> > > do that at postgres level but it is another point)
>> > >
>> > > So you put those tables in a separate tablespace, and tell postgresql
>> > > that the effective_cache_size is 0 (for this tablespace), up to
>> > > postgres to do the right thing with that ;)
>> >
>> > Why would you ever want to set effective_cache_size to 0?
>>
>> I think this is a misunderstanding of how effective_cache_size works. I
>> can't think of any reason to do that. I could see a reason to tell the
>> OS to not throw a relation into cache but that is a different thing.
>
> Well the effective_cache_size in this context is OS cache memory (0 in my case)
> + estimation of shared_buffer.. ah so DBA should estimate the amount in the
> shared_buffer only, ok.
>
> So consider effective_cache_size = 0 + what pg_buffer_cache will tell.
>
> My case is a table containing 29 GB of bytea in a database of 52 GB. Every row
> on the 29GB table is grab only few times. And it will just renew OS cache
> memory every time (the server have only 8GB of ram).
> So when I remove this table (not the index) from the OS cache memory, I keep
> more interesting blocks in the OS cache memory.
>
> And disk + raid are quick enought to bypass the OS cache memory for this
> tablespace.
>
>
> Are things a bit clearer and usage not so silly ?

Well, I think you're vastly overestimating the power of
effective_cache_size. effective_cache_size changes the planner's
estimation of how likely a repeated partial index scan is to find the
same block in cache. So it only affects
nested-loop-with-inner-indexscan plans, and if effective_cache_size is
set to a value larger than the size of the index (or maybe the
relation, I'm too lazy to go reread the code right now), one value is
as good as another. For a typical user, I think you could set
effective_cache_size to, say, a terabyte, and it wouldn't make a bit
of difference. Heck, why not 2TB.

As far as I can see, the only possible value of setting this knob to a
value other than positive-infinity is that if you have a huge dataset
that's not close to fitting in memory, this might cause the planner to
pick a merge join over a nested loop with inner indexscan, which might
be better if it makes the I/O sequential rather than random. Anyone
think I'm a pessimist?

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: =?iso-8859-1?q?C=E9dric_Villemain?= on 23 Oct 2009 09:23

Le vendredi 23 octobre 2009 14:23:09, Robert Haas a écrit :
> On Fri, Oct 23, 2009 at 5:23 AM, Cédric Villemain
>
> <cedric.villemain(a)dalibo.com> wrote:
> > Le vendredi 23 octobre 2009 01:08:15, Joshua D. Drake a écrit :
> >> On Thu, 2009-10-22 at 11:33 -0400, Robert Haas wrote:
> >> > On Thu, Oct 22, 2009 at 11:16 AM, Cédric Villemain
> >> >
> >> > <cedric.villemain(a)dalibo.com> wrote:
> >> > > Le lundi 19 octobre 2009 23:27:20, Greg Stark a écrit :
> >> > >> On Mon, Oct 19, 2009 at 2:08 PM, marcin mank
> >> > >> <marcin.mank(a)gmail.com>
> >
> > wrote:
> >> > >> > Currently random_page_cost is a GUC. I propose that this could be
> >> > >> > set per-table.
> >> > >>
> >> > >> Or per-tablespace.
> >> > >>
> >> > >> Yes, I think there are a class of GUCs which describe the physical
> >> > >> attributes of the storage system which should be per-table or
> >> > >> per-tablespace. random_page_cost, sequential_page_cost,
> >> > >> effective_io_concurrency come to mind.
> >> > >
> >> > > and, perhaps effective_cache_size.
> >> > >
> >> > > You can have situation where you don't want some tables go to OS
> >> > > memory (you can disabled that at filesystem level, ... l'd like to
> >> > > be able to do that at postgres level but it is another point)
> >> > >
> >> > > So you put those tables in a separate tablespace, and tell
> >> > > postgresql that the effective_cache_size is 0 (for this tablespace),
> >> > > up to postgres to do the right thing with that ;)
> >> >
> >> > Why would you ever want to set effective_cache_size to 0?
> >>
> >> I think this is a misunderstanding of how effective_cache_size works. I
> >> can't think of any reason to do that. I could see a reason to tell the
> >> OS to not throw a relation into cache but that is a different thing.
> >
> > Well the effective_cache_size in this context is OS cache memory (0 in my
> > case) + estimation of shared_buffer.. ah so DBA should estimate the
> > amount in the shared_buffer only, ok.
> >
> > So consider effective_cache_size = 0 + what pg_buffer_cache will tell.
> >
> > My case is a table containing 29 GB of bytea in a database of 52 GB.
> > Every row on the 29GB table is grab only few times. And it will just
> > renew OS cache memory every time (the server have only 8GB of ram).
> > So when I remove this table (not the index) from the OS cache memory, I
> > keep more interesting blocks in the OS cache memory.
> >
> > And disk + raid are quick enought to bypass the OS cache memory for this
> > tablespace.
> >
> >
> > Are things a bit clearer and usage not so silly ?
>
> Well, I think you're vastly overestimating the power of
> effective_cache_size. effective_cache_size changes the planner's
> estimation of how likely a repeated partial index scan is to find the
> same block in cache. So it only affects
> nested-loop-with-inner-indexscan plans, and if effective_cache_size is
> set to a value larger than the size of the index (or maybe the
> relation, I'm too lazy to go reread the code right now), one value is
> as good as another. For a typical user, I think you could set
> effective_cache_size to, say, a terabyte, and it wouldn't make a bit
> of difference. Heck, why not 2TB.

Ok. I don't care too much on this parameter so.
As we were talking about some parameters that can be tablespace specific, I
thought this one can have different values too.

>
> As far as I can see, the only possible value of setting this knob to a
> value other than positive-infinity is that if you have a huge dataset
> that's not close to fitting in memory, this might cause the planner to
> pick a merge join over a nested loop with inner indexscan, which might
> be better if it makes the I/O sequential rather than random. Anyone
> think I'm a pessimist?
>
> ...Robert
>

--
Cédric Villemain
Administrateur de Base de Données
Cel: +33 (0)6 74 15 56 53
http://dalibo.com - http://dalibo.org

From: Josh Berkus on 23 Oct 2009 19:04

Cedric,

> ase is a table containing 29 GB of bytea in a database of 52 GB. Every row
> on the 29GB table is grab only few times. And it will just renew OS cache
> memory every time (the server have only 8GB of ram).
> So when I remove this table (not the index) from the OS cache memory, I keep
> more interesting blocks in the OS cache memory.

effective_cache_size doesn't control what gets cached, it just tells the
planner about it.

Now, if we had an OS which could be convinced to handle caching
differently for different physical devices, then I could see wanting
this setting to be per-tablespace. For example, it would make a lot of
sense not to FS-cache any data which is on a ramdisk or superfast SSD
array. The same with archive data which you expected to be slow and
infrequently accessed on a NAS device. If your OS can do that, while
caching data from other sources, then it would make sense.

However, I don't know any current OS which allows for this. Does anyone
else?

--Josh Berkus

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8
Prev: Application name patch - v2
Next: Controlling changes in plpgsql variableresolution