Prev: [RFC] transfer ASYNC vmscan writeback IO to the flusher threads
Next: [PATCH] block: move del_gendisk() from fs/partitions/check.c to block/genhd.c
From: Pekka Enberg on 30 Jul 2010 05:30 On Fri, Jul 30, 2010 at 12:47 AM, John Johansen <john.johansen(a)canonical.com> wrote: > +/** > + * kvmalloc - do allocation preferring kmalloc but falling back to vmalloc > + * @size: size of allocation > + * > + * Return: allocated buffer or NULL if failed > + * > + * It is possible that policy being loaded from the user is larger than > + * what can be allocated by kmalloc, in those cases fall back to vmalloc. > + */ > +void *kvmalloc(size_t size) > +{ > + � � � void *buffer = NULL; > + > + � � � if (size == 0) > + � � � � � � � return NULL; > + > + � � � /* do not attempt kmalloc if we need more than 16 pages at once */ > + � � � if (size <= (16*PAGE_SIZE)) > + � � � � � � � buffer = kmalloc(size, GFP_NOIO | __GFP_NOWARN); 16 pages is a lot of memory for 64 K pages. What's the purpose of GFP_NOIO here? vmalloc() will do GFP_KERNEL allocations anyway. > + � � � if (!buffer) { > + � � � � � � � /* see kvfree for why size must be at least work_struct size > + � � � � � � � �* when allocated via vmalloc > + � � � � � � � �*/ > + � � � � � � � if (size < sizeof(struct work_struct)) > + � � � � � � � � � � � size = sizeof(struct work_struct); > + � � � � � � � buffer = vmalloc(size); > + � � � } > + � � � return buffer; > +} Please don't hide this into apparmor internals. People have invented this function in the past so maybe it's time to put it in mm/util.c? > + > +/** > + * do_vfree - workqueue routine for freeing vmalloced memory > + * @work: data to be freed > + * > + * The work_struct is overlaid to the data being freed, as at the point > + * the work is scheduled the data is no longer valid, be its freeing > + * needs to be delayed until safe. > + */ > +static void do_vfree(struct work_struct *work) > +{ > + � � � vfree(work); > +} > + > +/** > + * kvfree - free an allocation do by kvmalloc > + * @buffer: buffer to free (MAYBE_NULL) > + * > + * Free a buffer allocated by kvmalloc > + */ > +void kvfree(void *buffer) > +{ > + � � � if (is_vmalloc_addr(buffer)) { > + � � � � � � � /* Data is no longer valid so just use the allocated space > + � � � � � � � �* as the work_struct > + � � � � � � � �*/ > + � � � � � � � struct work_struct *work = (struct work_struct *) buffer; > + � � � � � � � INIT_WORK(work, do_vfree); > + � � � � � � � schedule_work(work); I don't understand this part here. Is it needed for interrupt contexts or does vfree() sleep somewhere? If it's for the former, I think we can just add a comment saying that kvmalloc/kvfree is not safe from interrupt context and remove the schedule_work() parts here. > + � � � } else > + � � � � � � � kfree(buffer); > +} -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: John Johansen on 30 Jul 2010 06:10 On 07/30/2010 02:20 AM, Pekka Enberg wrote: > On Fri, Jul 30, 2010 at 12:47 AM, John Johansen > <john.johansen(a)canonical.com> wrote: >> +/** >> + * kvmalloc - do allocation preferring kmalloc but falling back to vmalloc >> + * @size: size of allocation >> + * >> + * Return: allocated buffer or NULL if failed >> + * >> + * It is possible that policy being loaded from the user is larger than >> + * what can be allocated by kmalloc, in those cases fall back to vmalloc. >> + */ >> +void *kvmalloc(size_t size) >> +{ >> + void *buffer = NULL; >> + >> + if (size == 0) >> + return NULL; >> + >> + /* do not attempt kmalloc if we need more than 16 pages at once */ >> + if (size <= (16*PAGE_SIZE)) >> + buffer = kmalloc(size, GFP_NOIO | __GFP_NOWARN); > > 16 pages is a lot of memory for 64 K pages. What's the purpose of yes it is, and I don't expect it will every allocate that much, though it will occassionally with large policies do allocations larger than 16*4K. The figure here is some what arbitrary, and I would certainly be willing to shrink it. Basically it is there to put a clamp on allocating precious physically contiguous memory. > GFP_NOIO here? vmalloc() will do GFP_KERNEL allocations anyway. > yep, and it used to be GFP_KERNEL too, looking back GFP_NOIO happend when poking at a bug where apparmor was trigger a IO when it was allocating its memory. Turned out the bug wasn't apparmor related just being triggered while apparmor was loading policy, but the GFP_NOIO flag stuck here. I am more than willing to flip it back. >> + if (!buffer) { >> + /* see kvfree for why size must be at least work_struct size >> + * when allocated via vmalloc >> + */ >> + if (size < sizeof(struct work_struct)) >> + size = sizeof(struct work_struct); >> + buffer = vmalloc(size); >> + } >> + return buffer; >> +} > > Please don't hide this into apparmor internals. People have invented > this function in the past so maybe it's time to put it in mm/util.c? > sure, I would be more than willing to replace this with a generic system fn. The last attempt I saw at adding generic routines of this nature was here http://www.spinics.net/lists/linux-fsdevel/msg31407.html >> + >> +/** >> + * do_vfree - workqueue routine for freeing vmalloced memory >> + * @work: data to be freed >> + * >> + * The work_struct is overlaid to the data being freed, as at the point >> + * the work is scheduled the data is no longer valid, be its freeing >> + * needs to be delayed until safe. >> + */ >> +static void do_vfree(struct work_struct *work) >> +{ >> + vfree(work); >> +} >> + >> +/** >> + * kvfree - free an allocation do by kvmalloc >> + * @buffer: buffer to free (MAYBE_NULL) >> + * >> + * Free a buffer allocated by kvmalloc >> + */ >> +void kvfree(void *buffer) >> +{ >> + if (is_vmalloc_addr(buffer)) { >> + /* Data is no longer valid so just use the allocated space >> + * as the work_struct >> + */ >> + struct work_struct *work = (struct work_struct *) buffer; >> + INIT_WORK(work, do_vfree); >> + schedule_work(work); > > I don't understand this part here. Is it needed for interrupt contexts > or does vfree() sleep somewhere? If it's for the former, I think we > can just add a comment saying that kvmalloc/kvfree is not safe from > interrupt context and remove the schedule_work() parts here. > vfree can sleep, and skipping the schedule_work parts won't work for apparmor as many of these allocations are being freed via rcu callbacks as most of our object life cycles are dependent on cred refcounting. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Pekka Enberg on 30 Jul 2010 07:00 On Fri, Jul 30, 2010 at 1:01 PM, John Johansen <john.johansen(a)canonical.com> wrote: >>> +/** >>> + * kvfree - free an allocation do by kvmalloc >>> + * @buffer: buffer to free (MAYBE_NULL) >>> + * >>> + * Free a buffer allocated by kvmalloc >>> + */ >>> +void kvfree(void *buffer) >>> +{ >>> + � � � if (is_vmalloc_addr(buffer)) { >>> + � � � � � � � /* Data is no longer valid so just use the allocated space >>> + � � � � � � � �* as the work_struct >>> + � � � � � � � �*/ >>> + � � � � � � � struct work_struct *work = (struct work_struct *) buffer; >>> + � � � � � � � INIT_WORK(work, do_vfree); >>> + � � � � � � � schedule_work(work); >> >> I don't understand this part here. Is it needed for interrupt contexts >> or does vfree() sleep somewhere? If it's for the former, I think we >> can just add a comment saying that kvmalloc/kvfree is not safe from >> interrupt context and remove the schedule_work() parts here. > > vfree can sleep, and skipping the schedule_work parts won't work for > apparmor as many of these allocations are being freed via rcu callbacks > as most of our object life cycles are dependent on cred refcounting. Can someone point me to where vfree() actually sleeps? I'm unable to find the exact spot. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Changli Gao on 30 Jul 2010 10:30 On Fri, Jul 30, 2010 at 6:53 PM, Pekka Enberg <penberg(a)cs.helsinki.fi> wrote: > On Fri, Jul 30, 2010 at 1:01 PM, John Johansen >> >> vfree can sleep, and skipping the schedule_work parts won't work for >> apparmor as many of these allocations are being freed via rcu callbacks >> as most of our object life cycles are dependent on cred refcounting. > > Can someone point me to where vfree() actually sleeps? I'm unable to > find the exact spot. > http://lxr.linux.no/linux+v2.6.34.1/mm/vmalloc.c#L1405 . vfree -> __vunmap, vunmap -> __vunmap, and there is a might_sleep() function in vunmap. BTW: I'll respin the kvmalloc patch later. Thanks. -- Regards, Changli Gao(xiaosuo(a)gmail.com) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Pekka Enberg on 30 Jul 2010 11:10
On Fri, Jul 30, 2010 at 5:24 PM, Changli Gao <xiaosuo(a)gmail.com> wrote: > On Fri, Jul 30, 2010 at 6:53 PM, Pekka Enberg <penberg(a)cs.helsinki.fi> wrote: >> On Fri, Jul 30, 2010 at 1:01 PM, John Johansen >>> >>> vfree can sleep, and skipping the schedule_work parts won't work for >>> apparmor as many of these allocations are being freed via rcu callbacks >>> as most of our object life cycles are dependent on cred refcounting. >> >> Can someone point me to where vfree() actually sleeps? I'm unable to >> find the exact spot. > > http://lxr.linux.no/linux+v2.6.34.1/mm/vmalloc.c#L1405 . vfree -> > __vunmap, vunmap -> __vunmap, and there is a might_sleep() function in > vunmap. Yes, but that doesn't answer my question. Where's the actual call-site that can sleep? I can't find it! > BTW: I'll respin the kvmalloc patch later. Great! -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |