Prev: Oops while running fs_racer test on a POWER6 box against latest git
Next: [PATCH] OMAP2: powerdomain: Add break in switch statement
From: KAMEZAWA Hiroyuki on 11 Jul 2010 20:20 On Fri, 09 Jul 2010 19:06:28 -0400 Munehiro Ikeda <m-ikeda(a)ds.jp.nec.com> wrote: > OK, we can do it like: > > do { > old = pc->flags; > new = old & ((1UL << PAGE_TRACKING_ID_SHIFT) - 1); > new |= (unsigned long)(id << PAGE_TRACKING_ID_SHIFT); > } while (cmpxchg(&pc->flags, old, new) != old); > > > > IIUC, there is an generic version now even if it's heavy. > > I couldn't find it out...is there already? Or you mean we should > have generic one? > generic cmpxchg in /include/asm-generic/cmpxchg.h But...ahh...in some arch, you can't use cmpxchg, sorry. How about start from "a new field" for I/O cgroup in page_cgroup ? struct page_cgroup { unsigned long flags; struct mem_cgroup *mem_cgroup; unsigned short blkio_cgroup_id; struct page *page; struct list_head lru; /* per cgroup LRU list */ }; We can consider how we optimize it out, later. (And, it's easier to debug and make development smooth.) For example, If we can create a vmalloced-array of mem_cgroup, id->mem_cgroup lookup can be very fast and we can replace pc->mem_cgroup link with pc->mem_cgroup_id. > > >> +unsigned long page_cgroup_get_owner(struct page *page); > >> +int page_cgroup_set_owner(struct page *page, unsigned long id); > >> +int page_cgroup_copy_owner(struct page *npage, struct page *opage); > >> + > >> void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat); > >> > >> #ifdef CONFIG_SPARSEMEM > >> diff --git a/init/Kconfig b/init/Kconfig > >> index 2e40f2f..337ee01 100644 > >> --- a/init/Kconfig > >> +++ b/init/Kconfig > >> @@ -650,7 +650,7 @@ endif # CGROUPS > >> > >> config CGROUP_PAGE > >> def_bool y > >> - depends on CGROUP_MEM_RES_CTLR > >> + depends on CGROUP_MEM_RES_CTLR || GROUP_IOSCHED_ASYNC > >> > >> config MM_OWNER > >> bool > >> diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c > >> index 6c00814..69e080c 100644 > >> --- a/mm/page_cgroup.c > >> +++ b/mm/page_cgroup.c > >> @@ -9,6 +9,7 @@ > >> #include<linux/vmalloc.h> > >> #include<linux/cgroup.h> > >> #include<linux/swapops.h> > >> +#include<linux/blk-iotrack.h> > >> > >> static void __meminit > >> __init_page_cgroup(struct page_cgroup *pc, unsigned long pfn) > >> @@ -74,7 +75,7 @@ void __init page_cgroup_init_flatmem(void) > >> > >> int nid, fail; > >> > >> - if (mem_cgroup_disabled()) > >> + if (mem_cgroup_disabled()&& blk_iotrack_disabled()) > >> return; > >> > >> for_each_online_node(nid) { > >> @@ -83,12 +84,13 @@ void __init page_cgroup_init_flatmem(void) > >> goto fail; > >> } > >> printk(KERN_INFO "allocated %ld bytes of page_cgroup\n", total_usage); > >> - printk(KERN_INFO "please try 'cgroup_disable=memory' option if you" > >> - " don't want memory cgroups\n"); > >> + printk(KERN_INFO "please try 'cgroup_disable=memory,blkio' option" > >> + " if you don't want memory and blkio cgroups\n"); > >> return; > >> fail: > >> printk(KERN_CRIT "allocation of page_cgroup failed.\n"); > >> - printk(KERN_CRIT "please try 'cgroup_disable=memory' boot option\n"); > >> + printk(KERN_CRIT > >> + "please try 'cgroup_disable=memory,blkio' boot option\n"); > >> panic("Out of memory"); > >> } > > > > Hmm, io-track is always done if blkio-cgroup is used, Right ? > > No, iotrack is enabled only when CONFIG_GROUP_IOSCHED_ASYNC=y. > If =n, iotrack is disabled even if blkio-cgroup is enabled. > > > > Then, why you have extra config as CONFIG_GROUP_IOSCHED_ASYNC ? > > Is it necessary ? > > Current purpose of the option is only for debug because it is > experimental functionality. > It can be removed if this work is completed, or dynamic switch > might be useful. > > In fact, just "cgroup_disable=memory" is enough for the failure > case. Let me think right messages. > IMHO, once you add boot-option or sysctl, it's very hard to remove it, lator. So, if you think you'll remove it lator, don't add it or just add CONFIG. Regards, -Kame -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Munehiro IKEDA on 14 Jul 2010 10:50
KAMEZAWA Hiroyuki wrote: > On Fri, 09 Jul 2010 19:06:28 -0400 > Munehiro Ikeda <m-ikeda(a)ds.jp.nec.com> wrote: > >> OK, we can do it like: >> >> do { >> old = pc->flags; >> new = old & ((1UL << PAGE_TRACKING_ID_SHIFT) - 1); >> new |= (unsigned long)(id << PAGE_TRACKING_ID_SHIFT); >> } while (cmpxchg(&pc->flags, old, new) != old); >> >> >>> IIUC, there is an generic version now even if it's heavy. >> I couldn't find it out...is there already? Or you mean we should >> have generic one? >> > > generic cmpxchg in /include/asm-generic/cmpxchg.h > But...ahh...in some arch, you can't use cmpxchg, sorry. > > How about start from "a new field" for I/O cgroup in page_cgroup ? > > struct page_cgroup { > unsigned long flags; > struct mem_cgroup *mem_cgroup; > unsigned short blkio_cgroup_id; > struct page *page; > struct list_head lru; /* per cgroup LRU list */ > }; > > We can consider how we optimize it out, later. > (And, it's easier to debug and make development smooth.) > > For example, If we can create a vmalloced-array of mem_cgroup, > id->mem_cgroup lookup can be very fast and we can replace > pc->mem_cgroup link with pc->mem_cgroup_id. Nice suggestion. Thanks Kame-san. >>>> +unsigned long page_cgroup_get_owner(struct page *page); >>>> +int page_cgroup_set_owner(struct page *page, unsigned long id); >>>> +int page_cgroup_copy_owner(struct page *npage, struct page *opage); >>>> + >>>> void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat); >>>> >>>> #ifdef CONFIG_SPARSEMEM >>>> diff --git a/init/Kconfig b/init/Kconfig >>>> index 2e40f2f..337ee01 100644 >>>> --- a/init/Kconfig >>>> +++ b/init/Kconfig >>>> @@ -650,7 +650,7 @@ endif # CGROUPS >>>> >>>> config CGROUP_PAGE >>>> def_bool y >>>> - depends on CGROUP_MEM_RES_CTLR >>>> + depends on CGROUP_MEM_RES_CTLR || GROUP_IOSCHED_ASYNC >>>> >>>> config MM_OWNER >>>> bool >>>> diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c >>>> index 6c00814..69e080c 100644 >>>> --- a/mm/page_cgroup.c >>>> +++ b/mm/page_cgroup.c >>>> @@ -9,6 +9,7 @@ >>>> #include<linux/vmalloc.h> >>>> #include<linux/cgroup.h> >>>> #include<linux/swapops.h> >>>> +#include<linux/blk-iotrack.h> >>>> >>>> static void __meminit >>>> __init_page_cgroup(struct page_cgroup *pc, unsigned long pfn) >>>> @@ -74,7 +75,7 @@ void __init page_cgroup_init_flatmem(void) >>>> >>>> int nid, fail; >>>> >>>> - if (mem_cgroup_disabled()) >>>> + if (mem_cgroup_disabled()&& blk_iotrack_disabled()) >>>> return; >>>> >>>> for_each_online_node(nid) { >>>> @@ -83,12 +84,13 @@ void __init page_cgroup_init_flatmem(void) >>>> goto fail; >>>> } >>>> printk(KERN_INFO "allocated %ld bytes of page_cgroup\n", total_usage); >>>> - printk(KERN_INFO "please try 'cgroup_disable=memory' option if you" >>>> - " don't want memory cgroups\n"); >>>> + printk(KERN_INFO "please try 'cgroup_disable=memory,blkio' option" >>>> + " if you don't want memory and blkio cgroups\n"); >>>> return; >>>> fail: >>>> printk(KERN_CRIT "allocation of page_cgroup failed.\n"); >>>> - printk(KERN_CRIT "please try 'cgroup_disable=memory' boot option\n"); >>>> + printk(KERN_CRIT >>>> + "please try 'cgroup_disable=memory,blkio' boot option\n"); >>>> panic("Out of memory"); >>>> } >>> Hmm, io-track is always done if blkio-cgroup is used, Right ? >> No, iotrack is enabled only when CONFIG_GROUP_IOSCHED_ASYNC=y. >> If =n, iotrack is disabled even if blkio-cgroup is enabled. >> > > >>> Then, why you have extra config as CONFIG_GROUP_IOSCHED_ASYNC ? >>> Is it necessary ? >> Current purpose of the option is only for debug because it is >> experimental functionality. >> It can be removed if this work is completed, or dynamic switch >> might be useful. >> >> In fact, just "cgroup_disable=memory" is enough for the failure >> case. Let me think right messages. >> > > IMHO, once you add boot-option or sysctl, it's very hard to remove it, lator. > So, if you think you'll remove it lator, don't add it or just add CONFIG. OK. I understand we need to seriously think over to add a new boot option. Thanks, Muuhh -- IKEDA, Munehiro NEC Corporation of America m-ikeda(a)ds.jp.nec.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |