From: Daisuke Nishimura on 16 Mar 2010 03:50

On Mon, 15 Mar 2010 00:26:39 +0100, Andrea Righi <arighi(a)develer.com> wrote:
> Document cgroup dirty memory interfaces and statistics.
>
> Signed-off-by: Andrea Righi <arighi(a)develer.com>
> ---
>  Documentation/cgroups/memory.txt |   36 ++++++++++++++++++++++++++++++++++++
>  1 files changed, 36 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
> index 49f86f3..38ca499 100644
> --- a/Documentation/cgroups/memory.txt
> +++ b/Documentation/cgroups/memory.txt
> @@ -310,6 +310,11 @@ cache          - # of bytes of page cache memory.
>  rss            - # of bytes of anonymous and swap cache memory.
>  pgpgin         - # of pages paged in (equivalent to # of charging events).
>  pgpgout        - # of pages paged out (equivalent to # of uncharging events).
> +filedirty      - # of pages that are waiting to get written back to the disk.
> +writeback      - # of pages that are actively being written back to the disk.
> +writeback_tmp  - # of pages used by FUSE for temporary writeback buffers.
> +nfs            - # of NFS pages sent to the server, but not yet committed to
> +                 the actual storage.
>  active_anon    - # of bytes of anonymous and swap cache memory on active
>                   lru list.
>  inactive_anon  - # of bytes of anonymous memory and swap cache memory on
> @@ -345,6 +350,37 @@ Note:
>    - a cgroup which uses hierarchy and it has child cgroup.
>    - a cgroup which uses hierarchy and not the root of hierarchy.
>
> +5.4 dirty memory
> +
> +  Control the maximum amount of dirty pages a cgroup can have at any given time.
> +
> +  Limiting dirty memory is like fixing the max amount of dirty (hard to
> +  reclaim) page cache used by any cgroup. So, in case of multiple cgroup writers,
> +  they will not be able to consume more than their designated share of dirty
> +  pages and will be forced to perform write-out if they cross that limit.
> +
> +  The interface is equivalent to the procfs interface: /proc/sys/vm/dirty_*.
> +  It is possible to configure a limit to trigger both a direct writeback or a
> +  background writeback performed by per-bdi flusher threads.
> +
> +  Per-cgroup dirty limits can be set using the following files in the cgroupfs:
> +
> +  - memory.dirty_ratio: contains, as a percentage of cgroup memory, the
> +    amount of dirty memory at which a process which is generating disk writes
> +    inside the cgroup will start itself writing out dirty data.
> +
> +  - memory.dirty_bytes: the amount of dirty memory of the cgroup (expressed in
> +    bytes) at which a process generating disk writes will start itself writing
> +    out dirty data.
> +
> +  - memory.dirty_background_ratio: contains, as a percentage of the cgroup
> +    memory, the amount of dirty memory at which background writeback kernel
> +    threads will start writing out dirty data.
> +
> +  - memory.dirty_background_bytes: the amount of dirty memory of the cgroup (in
> +    bytes) at which background writeback kernel threads will start writing out
> +    dirty data.
> +
>

It would be better to note what those files mean in the root cgroup.
We cannot write any value to them; in other words, we cannot control the dirty
limit of the root cgroup.
They also show the same value as the global one (strictly speaking, that is not
true because the global values can change; do we need a hook in
mem_cgroup_dirty_read()?).

Thanks,
Daisuke Nishimura.

> 6. Hierarchy support
>
> --
> 1.6.3.3
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
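As an illustration of the four limits described in the quoted patch, here is a minimal user-space sketch in Python of how a ratio or byte setting could translate into a threshold. It is a model only, not the kernel code. The function names are hypothetical, and the rule that a non-zero byte value takes precedence over the percentage is an assumption carried over from the global vm.dirty_bytes / vm.dirty_ratio pair, which the patch says the interface is equivalent to.

```python
def dirty_threshold_bytes(cgroup_memory_bytes, dirty_ratio=0, dirty_bytes=0):
    """Return the dirty-memory threshold of one cgroup, in bytes.

    Assumption: as with the global sysctls, a non-zero byte value wins
    over the percentage of cgroup memory.
    """
    if dirty_bytes < 0 or not 0 <= dirty_ratio <= 100:
        raise ValueError("dirty_bytes must be >= 0 and dirty_ratio in 0..100")
    if dirty_bytes:
        return dirty_bytes
    return cgroup_memory_bytes * dirty_ratio // 100


def writeback_action(dirty_now, foreground_limit, background_limit):
    """Classify the current amount of dirty memory against both limits.

    "direct"     -> the writing process itself starts writing out dirty data
    "background" -> only the background flusher threads start writing out
    "none"       -> below both limits
    """
    if dirty_now >= foreground_limit:
        return "direct"
    if dirty_now >= background_limit:
        return "background"
    return "none"


if __name__ == "__main__":
    memory = 100 * 2**20  # hypothetical cgroup with 100 MiB of memory
    foreground = dirty_threshold_bytes(memory, dirty_ratio=10)
    background = dirty_threshold_bytes(memory, dirty_ratio=5)
    print(foreground, background, writeback_action(6_000_000, foreground, background))
```

With these hypothetical numbers, 6,000,000 dirty bytes fall between the background limit and the foreground limit, so only background writeback would start.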
From: Greg Thelen on 17 Mar 2010 13:50

On Mon, Mar 15, 2010 at 11:41 PM, Daisuke Nishimura <nishimura(a)mxp.nes.nec.co.jp> wrote:
> On Mon, 15 Mar 2010 00:26:39 +0100, Andrea Righi <arighi(a)develer.com> wrote:
>> Document cgroup dirty memory interfaces and statistics.
>>
>> Signed-off-by: Andrea Righi <arighi(a)develer.com>
>> ---
>>  Documentation/cgroups/memory.txt |   36 ++++++++++++++++++++++++++++++++++++
>>  1 files changed, 36 insertions(+), 0 deletions(-)
>>
>> diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
>> index 49f86f3..38ca499 100644
>> --- a/Documentation/cgroups/memory.txt
>> +++ b/Documentation/cgroups/memory.txt
>> @@ -310,6 +310,11 @@ cache          - # of bytes of page cache memory.
>>  rss            - # of bytes of anonymous and swap cache memory.
>>  pgpgin         - # of pages paged in (equivalent to # of charging events).
>>  pgpgout        - # of pages paged out (equivalent to # of uncharging events).
>> +filedirty      - # of pages that are waiting to get written back to the disk.
>> +writeback      - # of pages that are actively being written back to the disk.
>> +writeback_tmp  - # of pages used by FUSE for temporary writeback buffers.
>> +nfs            - # of NFS pages sent to the server, but not yet committed to
>> +                 the actual storage.

Should these new memory.stat counters (filedirty, etc.) report byte
counts rather than page counts? I am thinking that byte counters
would make the reporting more obvious, depending on how heterogeneous
page sizes are used. Byte counters would also agree with /proc/meminfo.
Within the kernel we could still maintain page counts. The only
change would be to the reporting routine, mem_cgroup_get_local_stat(),
which would scale the page counts by PAGE_SIZE as it does for
cache, rss, etc.

--
Greg
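The scaling suggested above can be sketched in user space. The Python below is illustrative only: it parses text in the "name value" layout of memory.stat and multiplies the page-counted fields from the patch by a page size. The function name, the 4096-byte page size, and the sample input are assumptions; pgpgin and pgpgout are left untouched as event counts.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; it differs between systems

# Fields that the quoted patch documents as page counts.
PAGE_COUNTED = ("filedirty", "writeback", "writeback_tmp", "nfs")


def stats_in_bytes(stat_text, page_size=PAGE_SIZE):
    """Parse "name value" lines and report page-counted fields in bytes."""
    result = {}
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, value = fields[0], int(fields[1])
        if name in PAGE_COUNTED:
            value *= page_size  # scale a page count to a byte count
        result[name] = value
    return result


if __name__ == "__main__":
    sample = "cache 8192\nfiledirty 3\nwriteback 1\npgpgin 7\n"  # made-up values
    for name, value in stats_in_bytes(sample).items():
        print(name, value)
```

Keeping page counts internally and converting only at reporting time, as proposed in the message, means the conversion lives in one place.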
From: Balbir Singh on 17 Mar 2010 15:10

* Greg Thelen <gthelen(a)google.com> [2010-03-17 09:48:18]:

> On Mon, Mar 15, 2010 at 11:41 PM, Daisuke Nishimura
> <nishimura(a)mxp.nes.nec.co.jp> wrote:
> > On Mon, 15 Mar 2010 00:26:39 +0100, Andrea Righi <arighi(a)develer.com> wrote:
> >> Document cgroup dirty memory interfaces and statistics.
> >>
> >> Signed-off-by: Andrea Righi <arighi(a)develer.com>
> >> ---
> >>  Documentation/cgroups/memory.txt |   36 ++++++++++++++++++++++++++++++++++++
> >>  1 files changed, 36 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
> >> index 49f86f3..38ca499 100644
> >> --- a/Documentation/cgroups/memory.txt
> >> +++ b/Documentation/cgroups/memory.txt
> >> @@ -310,6 +310,11 @@ cache          - # of bytes of page cache memory.
> >>  rss            - # of bytes of anonymous and swap cache memory.
> >>  pgpgin         - # of pages paged in (equivalent to # of charging events).
> >>  pgpgout        - # of pages paged out (equivalent to # of uncharging events).
> >> +filedirty      - # of pages that are waiting to get written back to the disk.
> >> +writeback      - # of pages that are actively being written back to the disk.
> >> +writeback_tmp  - # of pages used by FUSE for temporary writeback buffers.
> >> +nfs            - # of NFS pages sent to the server, but not yet committed to
> >> +                 the actual storage.
>
> Should these new memory.stat counters (filedirty, etc) report byte
> counts rather than page counts? I am thinking that byte counters
> would make reporting more obvious depending on how heterogeneous page
> sizes are used. Byte counters would also agree with /proc/meminfo.
> Within the kernel we could still maintain page counts. The only
> change would be to the reporting routine, mem_cgroup_get_local_stat(),
> which would scale the page counts by PAGE_SIZE as it does for for
> cache,rss,etc.
>

I agree, byte counts would be better than page counts. pgpgin and
pgpgout are special cases where the number of pages matters and the
size does not, due to the nature of the operation.

--
Three Cheers,
Balbir
From: Andrea Righi on 17 Mar 2010 18:50

On Tue, Mar 16, 2010 at 04:41:21PM +0900, Daisuke Nishimura wrote:
> On Mon, 15 Mar 2010 00:26:39 +0100, Andrea Righi <arighi(a)develer.com> wrote:
> > Document cgroup dirty memory interfaces and statistics.
> >
> > Signed-off-by: Andrea Righi <arighi(a)develer.com>
> > ---
> >  Documentation/cgroups/memory.txt |   36 ++++++++++++++++++++++++++++++++++++
> >  1 files changed, 36 insertions(+), 0 deletions(-)
> >
> > diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
> > index 49f86f3..38ca499 100644
> > --- a/Documentation/cgroups/memory.txt
> > +++ b/Documentation/cgroups/memory.txt
> > @@ -310,6 +310,11 @@ cache          - # of bytes of page cache memory.
> >  rss            - # of bytes of anonymous and swap cache memory.
> >  pgpgin         - # of pages paged in (equivalent to # of charging events).
> >  pgpgout        - # of pages paged out (equivalent to # of uncharging events).
> > +filedirty      - # of pages that are waiting to get written back to the disk.
> > +writeback      - # of pages that are actively being written back to the disk.
> > +writeback_tmp  - # of pages used by FUSE for temporary writeback buffers.
> > +nfs            - # of NFS pages sent to the server, but not yet committed to
> > +                 the actual storage.
> >  active_anon    - # of bytes of anonymous and swap cache memory on active
> >                   lru list.
> >  inactive_anon  - # of bytes of anonymous memory and swap cache memory on
> > @@ -345,6 +350,37 @@ Note:
> >    - a cgroup which uses hierarchy and it has child cgroup.
> >    - a cgroup which uses hierarchy and not the root of hierarchy.
> >
> > +5.4 dirty memory
> > +
> > +  Control the maximum amount of dirty pages a cgroup can have at any given time.
> > +
> > +  Limiting dirty memory is like fixing the max amount of dirty (hard to
> > +  reclaim) page cache used by any cgroup. So, in case of multiple cgroup writers,
> > +  they will not be able to consume more than their designated share of dirty
> > +  pages and will be forced to perform write-out if they cross that limit.
> > +
> > +  The interface is equivalent to the procfs interface: /proc/sys/vm/dirty_*.
> > +  It is possible to configure a limit to trigger both a direct writeback or a
> > +  background writeback performed by per-bdi flusher threads.
> > +
> > +  Per-cgroup dirty limits can be set using the following files in the cgroupfs:
> > +
> > +  - memory.dirty_ratio: contains, as a percentage of cgroup memory, the
> > +    amount of dirty memory at which a process which is generating disk writes
> > +    inside the cgroup will start itself writing out dirty data.
> > +
> > +  - memory.dirty_bytes: the amount of dirty memory of the cgroup (expressed in
> > +    bytes) at which a process generating disk writes will start itself writing
> > +    out dirty data.
> > +
> > +  - memory.dirty_background_ratio: contains, as a percentage of the cgroup
> > +    memory, the amount of dirty memory at which background writeback kernel
> > +    threads will start writing out dirty data.
> > +
> > +  - memory.dirty_background_bytes: the amount of dirty memory of the cgroup (in
> > +    bytes) at which background writeback kernel threads will start writing out
> > +    dirty data.
> > +
> >
> It would be better to note what those files mean in the root cgroup.
> We cannot write any value to them; in other words, we cannot control the dirty
> limit of the root cgroup.

OK.

> They also show the same value as the global one (strictly speaking, that is not
> true because the global values can change; do we need a hook in
> mem_cgroup_dirty_read()?).

OK, we can just return the system-wide value if mem_cgroup_is_root() in
mem_cgroup_dirty_read(). I will change this in the next version.

Thanks,
-Andrea
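The root-cgroup behaviour agreed on in this message can be modelled in a few lines. The following Python sketch is a user-space illustration, not the kernel implementation: the Cgroup class and both functions are hypothetical stand-ins for mem_cgroup_is_root() and mem_cgroup_dirty_read(), and raising PermissionError on a write to the root cgroup is an assumption about how "we cannot write any value to them" might surface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cgroup:
    """Hypothetical stand-in for a memory cgroup; the root has no parent."""
    parent: Optional["Cgroup"] = None
    dirty_ratio: int = 0


def is_root(cgroup):
    return cgroup.parent is None


def read_dirty_ratio(cgroup, global_dirty_ratio):
    """Return the system-wide value for the root cgroup, else the cgroup's own."""
    if is_root(cgroup):
        # Looked up at read time, so a later change to the global value
        # is reflected without any extra hook.
        return global_dirty_ratio
    return cgroup.dirty_ratio


def write_dirty_ratio(cgroup, value):
    """Set the per-cgroup ratio; the root cgroup is not writable in this model."""
    if is_root(cgroup):
        raise PermissionError("dirty limits of the root cgroup cannot be set")
    if not 0 <= value <= 100:
        raise ValueError("ratio must be in 0..100")
    cgroup.dirty_ratio = value


if __name__ == "__main__":
    root = Cgroup()
    child = Cgroup(parent=root)
    write_dirty_ratio(child, 5)
    print(read_dirty_ratio(root, 20), read_dirty_ratio(child, 20))
```

Reading the global value on every access, rather than caching a copy in the root cgroup, is what avoids the stale-value problem raised earlier in the thread.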