From: KOSAKI Motohiro on 27 Jun 2010 23:20 > Now we're sure noone is using old cpumask operators, nor *cpumask, we can > allocate less bits safely. This reduces the memory usage of off-stack > cpumasks when CONFIG_CPUMASK_OFFSTACK=y but we don't have NR_CPUS actual > cpus. I have to say I'm sorry. Probably I broke your assumption. If this patch applied, we reintroduce exposing nr_cpu_ids issue and break libnuma again. I think following change is necessary too. Or, Am I missing something? ================================================================== diff --git a/kernel/compat.c b/kernel/compat.c index 5adab05..5fbee3e 100644 --- a/kernel/compat.c +++ b/kernel/compat.c @@ -506,7 +506,8 @@ asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len, ret = sched_getaffinity(pid, mask); if (ret == 0) { - size_t retlen = min_t(size_t, len, cpumask_size()); + size_t retlen = min_t(size_t, len, + BITS_TO_LONGS(NR_CPUS) * sizeof(long)); if (compat_put_bitmap(user_mask_ptr, cpumask_bits(mask), retlen * 8)) ret = -EFAULT; diff --git a/kernel/sched.c b/kernel/sched.c index 18faf4d..c14acad 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -4823,7 +4823,9 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len, ret = sched_getaffinity(pid, mask); if (ret == 0) { - size_t retlen = min_t(size_t, len, cpumask_size()); + size_t retlen = min_t(size_t, len, + BITS_TO_LONGS(NR_CPUS) * sizeof(long)); > Signed-off-by: Rusty Russell <rusty(a)rustcorp.com.au> > Cc: Arnd Bergmann <arnd(a)arndb.de> > Cc: anton(a)samba.org > Cc: KOSAKI Motohiro <kosaki.motohiro(a)jp.fujitsu.com> > Cc: Mike Travis <travis(a)sgi.com> > --- > include/linux/cpumask.h | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h > --- a/include/linux/cpumask.h > +++ b/include/linux/cpumask.h > @@ -1014,13 +1014,11 @@ static inline int cpulist_parse(const ch > /** > * cpumask_size - size to allocate for a 'struct cpumask' in bytes > * > - * This will eventually be a runtime variable, depending on nr_cpu_ids. > + * This can be a runtime variable, depending on nr_cpu_ids. > */ > static inline size_t cpumask_size(void) > { > - /* FIXME: Once all cpumask assignments are eliminated, this > - * can be nr_cpumask_bits */ > - return BITS_TO_LONGS(NR_CPUS) * sizeof(long); > + return BITS_TO_LONGS(nr_cpumask_bits) * sizeof(long); > } > > /* > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Rusty Russell on 28 Jun 2010 06:30 On Mon, 28 Jun 2010 12:42:23 pm KOSAKI Motohiro wrote: > > Now we're sure noone is using old cpumask operators, nor *cpumask, we can > > allocate less bits safely. This reduces the memory usage of off-stack > > cpumasks when CONFIG_CPUMASK_OFFSTACK=y but we don't have NR_CPUS actual > > cpus. > > I have to say I'm sorry. Probably I broke your assumption. > If this patch applied, we reintroduce exposing nr_cpu_ids issue and > break libnuma again. I think following change is necessary too. > > Or, Am I missing something? I cc'd you because I remembered you being involved in that libnuma issue and couldn't remember the details. Unfortunately, this solution doesn't work: > diff --git a/kernel/sched.c b/kernel/sched.c > index 18faf4d..c14acad 100644 > --- a/kernel/sched.c > +++ b/kernel/sched.c > @@ -4823,7 +4823,9 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len, > > ret = sched_getaffinity(pid, mask); > if (ret == 0) { > - size_t retlen = min_t(size_t, len, cpumask_size()); > + size_t retlen = min_t(size_t, len, > + BITS_TO_LONGS(NR_CPUS) * sizeof(long)); > Since mask is a cpumask_var_t, only cpumask_size() is allocated. We can't copy NR_CPUS bits. But I think it's OK, anyway. libnuma is broken because it gets upset if the number of cpus it reads from /sys/.../cpumap is more than the cpumask size returned from sys_sched_getaffinity. Currently, getaffinity returns cpumask_size() (ie. based on NR_CPUS), and the printing routines use nr_cpumask_bits (ie. based on NR_CPUS for !CPUMASK_OFFSTACK, nr_cpu_ids for CPUMASK_OFFSTACK). (libnuma is OK on CONFIG_CPUMASK_OFFSTACK=y because the sysfs output is *shorter* than expected. I checked the code). With this patch, cpumask_size() becomes based on nr_cpumask_bits, so both getaffinity and sysfs are using the same basis. Do you agree? Rusty. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: KOSAKI Motohiro on 28 Jun 2010 06:40 > On Mon, 28 Jun 2010 12:42:23 pm KOSAKI Motohiro wrote: > > > Now we're sure noone is using old cpumask operators, nor *cpumask, we can > > > allocate less bits safely. This reduces the memory usage of off-stack > > > cpumasks when CONFIG_CPUMASK_OFFSTACK=y but we don't have NR_CPUS actual > > > cpus. > > > > I have to say I'm sorry. Probably I broke your assumption. > > If this patch applied, we reintroduce exposing nr_cpu_ids issue and > > break libnuma again. I think following change is necessary too. > > > > Or, Am I missing something? > > I cc'd you because I remembered you being involved in that libnuma issue > and couldn't remember the details. > > Unfortunately, this solution doesn't work: > > > diff --git a/kernel/sched.c b/kernel/sched.c > > index 18faf4d..c14acad 100644 > > --- a/kernel/sched.c > > +++ b/kernel/sched.c > > @@ -4823,7 +4823,9 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len, > > > > ret = sched_getaffinity(pid, mask); > > if (ret == 0) { > > - size_t retlen = min_t(size_t, len, cpumask_size()); > > + size_t retlen = min_t(size_t, len, > > + BITS_TO_LONGS(NR_CPUS) * sizeof(long)); > > > > Since mask is a cpumask_var_t, only cpumask_size() is allocated. We can't > copy NR_CPUS bits. Ahh, yes. It's purely broken. > But I think it's OK, anyway. libnuma is broken because it gets upset if the > number of cpus it reads from /sys/.../cpumap is more than the cpumask size > returned from sys_sched_getaffinity. > > Currently, getaffinity returns cpumask_size() (ie. based on NR_CPUS), and > the printing routines use nr_cpumask_bits (ie. based on NR_CPUS for > !CPUMASK_OFFSTACK, nr_cpu_ids for CPUMASK_OFFSTACK). > > (libnuma is OK on CONFIG_CPUMASK_OFFSTACK=y because the sysfs output is > *shorter* than expected. I checked the code). > > With this patch, cpumask_size() becomes based on nr_cpumask_bits, so both > getaffinity and sysfs are using the same basis. > > Do you agree? Sure. I agree I missed. Thank you for very kindful explanation! -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
|
Pages: 1 Prev: GOOD DAY Next: Staging: otus: fix coding style issues in ioctl.c |