request_firmware API exhaust memory [Kernel]

Prev: uml: i386: Avoid redefinition of NR_syscalls
Next: rcu: slim down rcutiny by removing rcu_scheduler_active and friends

From: Tomas Winkler on 26 Apr 2010 13:00

On Mon, Apr 26, 2010 at 6:19 PM, Kay Sievers <kay.sievers(a)vrfy.org> wrote:
> On Mon, 2010-04-26 at 12:38 +0200, Kay Sievers wrote:
>> On Sun, Apr 25, 2010 at 22:09, Tomas Winkler <tomasw(a)gmail.com> wrote:
>> > Said thing is that I don't see where the memory goes.... Anyhow I will
>> > try to run valgrin on udev just to be sure.
>>
>> Nah, that memory would be freed, if you kill all udev processes, which
>> it doesn't.
>>
>> The many udev worker processes you see for a few seconds was caused by
>> udevd handling events with TIMEOUT= set special. We need to make sure,
>> that firmware events run immediately and don't wait for other
>> processes to finish. The logic who does that was always creating a new
>> worker. I changed this now, but this will not affect the underlying
>> problem you are seeing, it will just make the udev workers not grow in
>> a timeframe of less than 10 seconds. The change is here:
>> http://git.kernel.org/?p=linux/hotplug/udev.git;a=commit;h=665ee17def2caa6811ae032ae68ebf8239a18cf8
>> but as mentioned, this change is unrelated to the memory leak you are seeing.
>>
>> > I'll be glad If someone can run my simple driver I posted and confirm
>> > that sees the same problem
>>
>> I can confirm that memory gets lost. I suspect for some reason the
>> firmware does not get properly cleaned up. If you increase the size of
>> the firmware image, it will leak memory much faster.
>
> I guess, the assumption that vfree() will free pages which are allocated
> by custom code, and not by vmalloc(), is not true.
>
> The attached changes seem to fix the issue for me.
>
> The custom page array mangling was introduced by David as an optimization
> with commit 6e03a201bbe8137487f340d26aa662110e324b20 and this should be
> checked, and if needed be fixed.
>
> Cheers,
> Kay
>
>
> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> index 985da11..fe4e872 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -162,7 +162,7 @@ static ssize_t firmware_loading_store(struct device *dev,
> mutex_unlock(&fw_lock);
> break;
> }
> - vfree(fw_priv->fw->data);
> + vunmap(fw_priv->fw->data);
> fw_priv->fw->data = NULL;
> for (i = 0; i < fw_priv->nr_pages; i++)
> __free_page(fw_priv->pages[i]);
> @@ -176,7 +176,7 @@ static ssize_t firmware_loading_store(struct device *dev,
> break;
> case 0:
> if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
> - vfree(fw_priv->fw->data);
> + vunmap(fw_priv->fw->data);
> fw_priv->fw->data = vmap(fw_priv->pages,
> fw_priv->nr_pages,
> 0, PAGE_KERNEL_RO);
> @@ -184,9 +184,6 @@ static ssize_t firmware_loading_store(struct device *dev,
> dev_err(dev, "%s: vmap() failed\n", __func__);
> goto err;
> }
> - /* Pages will be freed by vfree() */
> - fw_priv->page_array_size = 0;
> - fw_priv->nr_pages = 0;
> complete(&fw_priv->completion);
> clear_bit(FW_STATUS_LOADING, &fw_priv->status);
> break;
> @@ -578,7 +575,7 @@ release_firmware(const struct firmware *fw)
> if (fw->data == builtin->data)
> goto free_fw;
> }
> - vfree(fw->data);
> + vunmap(fw->data);
> free_fw:
> kfree(fw);
> }

Thanks for your effort I will test it tomorrow
Tomas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Sujith Manoharan on 27 Apr 2010 00:20

On Mon, Apr 26, 2010 at 8:49 PM, Kay Sievers <kay.sievers(a)vrfy.org> wrote:
> I guess, the assumption that vfree() will free pages which are allocated
> by custom code, and not by vmalloc(), is not true.
>
> The attached changes seem to fix the issue for me.

I have the same issue with ath9k_htc. Repeated module load/unload
results in an eventual panic.

I checked your patch (on top of John Linville's wireless-testing tree)
and was unable to load the driver as FW download to the target failed
after the first iteration.

> - � � � � � � � � � � � /* Pages will be freed by vfree() */
> - � � � � � � � � � � � fw_priv->page_array_size = 0;
> - � � � � � � � � � � � fw_priv->nr_pages = 0;

Am completely ignorant about this code, but this seemed a bit fishy,
so I removed this chunk from the patch and tried. It seemed to be working.

But then, I got this trace. (Which was present before your patch too).
I do see ath9k_htc_txep() in the trace, but am still not sure it
there's a memory leak in the driver.

udevd version: 151

[ 606.888511] udevd[4744]: segfault at 7f0fdb19bf80 ip
00007f0fdb4f8231 sp 00007fff9a0cd590 error 5 in
ld-2.11.1.so[7f0fdb4ef000+1e000]
[ 606.888556] BUG: Bad page map in process udevd
pte:ffff88007b8b18e0 pmd:7c7e9067
[ 606.888559] addr:00007f0fdb095000 vm_flags:08000070 anon_vma:(null)
mapping:ffff88007cc4c998 index:108
[ 606.888565] vma->vm_ops->fault: filemap_fault+0x0/0x480
[ 606.888569] vma->vm_file->f_op->mmap: generic_file_mmap+0x0/0x60
[ 606.888573] Pid: 4744, comm: udevd Tainted: G D 2.6.34-rc5-wl #32
[ 606.888575] Call Trace:
[ 606.888581] [<ffffffff810ad644>] print_bad_pte+0x1a4/0x210
[ 606.888584] [<ffffffff810ae6e1>] unmap_vmas+0x4c1/0x970
[ 606.888589] [<ffffffff810b3b8d>] exit_mmap+0xdd/0x1a0
[ 606.888594] [<ffffffff810396d8>] mmput+0x48/0x100
[ 606.888597] [<ffffffff8103e1c1>] exit_mm+0x101/0x130
[ 606.888600] [<ffffffff8104058c>] do_exit+0x6ac/0x770
[ 606.888603] [<ffffffff8104069c>] do_group_exit+0x4c/0xc0
[ 606.888607] [<ffffffff8104c6f7>] get_signal_to_deliver+0x307/0x4c0
[ 606.888612] [<ffffffff81002180>] do_signal+0x70/0x7c0
[ 606.888616] [<ffffffff81026b9a>] ? do_page_fault+0xca/0x3e0
[ 606.888621] [<ffffffff81301575>] ? printk+0x3c/0x3f
[ 606.888624] [<ffffffff810d9790>] ? do_unlinkat+0x60/0x1c0
[ 606.888628] [<ffffffff81002925>] do_notify_resume+0x55/0x80
[ 606.888632] [<ffffffff81304942>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 606.888635] [<ffffffff81305a66>] retint_signal+0x46/0x90
[ 606.888638] swap_free: Bad swap file entry c07fffffffd023c3
[ 606.888641] BUG: Bad page map in process udevd
pte:ffffffffa04786b0 pmd:7c7e9067
[ 606.888644] addr:00007f0fdb096000 vm_flags:08000070 anon_vma:(null)
mapping:ffff88007cc4c998 index:109
[ 606.888646] vma->vm_ops->fault: filemap_fault+0x0/0x480
[ 606.888649] vma->vm_file->f_op->mmap: generic_file_mmap+0x0/0x60
[ 606.888652] Pid: 4744, comm: udevd Tainted: G B D 2.6.34-rc5-wl #32
[ 606.888654] Call Trace:
[ 606.888657] [<ffffffff810ad644>] print_bad_pte+0x1a4/0x210
[ 606.888660] [<ffffffff810ae6e1>] unmap_vmas+0x4c1/0x970
[ 606.888668] [<ffffffffa04786b0>] ? ath9k_htc_txep+0x0/0xe0 [ath9k_htc]
[ 606.888672] [<ffffffff810b3b8d>] exit_mmap+0xdd/0x1a0
[ 606.888675] [<ffffffff810396d8>] mmput+0x48/0x100
[ 606.888678] [<ffffffff8103e1c1>] exit_mm+0x101/0x130
[ 606.888681] [<ffffffff8104058c>] do_exit+0x6ac/0x770
[ 606.888684] [<ffffffff8104069c>] do_group_exit+0x4c/0xc0
[ 606.888688] [<ffffffff8104c6f7>] get_signal_to_deliver+0x307/0x4c0
[ 606.888692] [<ffffffff81002180>] do_signal+0x70/0x7c0
[ 606.888695] [<ffffffff81026b9a>] ? do_page_fault+0xca/0x3e0
[ 606.888698] [<ffffffff81301575>] ? printk+0x3c/0x3f
[ 606.888701] [<ffffffff810d9790>] ? do_unlinkat+0x60/0x1c0
[ 606.888705] [<ffffffff81002925>] do_notify_resume+0x55/0x80
[ 606.888708] [<ffffffff81304942>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 606.888711] [<ffffffff81305a66>] retint_signal+0x46/0x90

Sujith
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Tomas Winkler on 27 Apr 2010 07:20

On Mon, Apr 26, 2010 at 6:19 PM, Kay Sievers <kay.sievers(a)vrfy.org> wrote:
> On Mon, 2010-04-26 at 12:38 +0200, Kay Sievers wrote:
>> On Sun, Apr 25, 2010 at 22:09, Tomas Winkler <tomasw(a)gmail.com> wrote:
>> > Said thing is that I don't see where the memory goes.... Anyhow I will
>> > try to run valgrin on udev just to be sure.
>>
>> Nah, that memory would be freed, if you kill all udev processes, which
>> it doesn't.
>>
>> The many udev worker processes you see for a few seconds was caused by
>> udevd handling events with TIMEOUT= set special. We need to make sure,
>> that firmware events run immediately and don't wait for other
>> processes to finish. The logic who does that was always creating a new
>> worker. I changed this now, but this will not affect the underlying
>> problem you are seeing, it will just make the udev workers not grow in
>> a timeframe of less than 10 seconds. The change is here:
>> http://git.kernel.org/?p=linux/hotplug/udev.git;a=commit;h=665ee17def2caa6811ae032ae68ebf8239a18cf8
>> but as mentioned, this change is unrelated to the memory leak you are seeing.
>>
>> > I'll be glad If someone can run my simple driver I posted and confirm
>> > that sees the same problem
>>
>> I can confirm that memory gets lost. I suspect for some reason the
>> firmware does not get properly cleaned up. If you increase the size of
>> the firmware image, it will leak memory much faster.
>
> I guess, the assumption that vfree() will free pages which are allocated
> by custom code, and not by vmalloc(), is not true.
>
> The attached changes seem to fix the issue for me.
>
> The custom page array mangling was introduced by David as an optimization
> with commit 6e03a201bbe8137487f340d26aa662110e324b20 and this should be
> checked, and if needed be fixed.
>
> Cheers,
> Kay
>
>
> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> index 985da11..fe4e872 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -162,7 +162,7 @@ static ssize_t firmware_loading_store(struct device *dev,
> mutex_unlock(&fw_lock);
> break;
> }
> - vfree(fw_priv->fw->data);
> + vunmap(fw_priv->fw->data);
> fw_priv->fw->data = NULL;
> for (i = 0; i < fw_priv->nr_pages; i++)
> __free_page(fw_priv->pages[i]);
> @@ -176,7 +176,7 @@ static ssize_t firmware_loading_store(struct device *dev,
> break;
> case 0:
> if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
> - vfree(fw_priv->fw->data);
> + vunmap(fw_priv->fw->data);
> fw_priv->fw->data = vmap(fw_priv->pages,
> fw_priv->nr_pages,
> 0, PAGE_KERNEL_RO);
> @@ -184,9 +184,6 @@ static ssize_t firmware_loading_store(struct device *dev,
> dev_err(dev, "%s: vmap() failed\n", __func__);
> goto err;
> }
> - /* Pages will be freed by vfree() */
> - fw_priv->page_array_size = 0;
> - fw_priv->nr_pages = 0;
> complete(&fw_priv->completion);
> clear_bit(FW_STATUS_LOADING, &fw_priv->status);
> break;
> @@ -578,7 +575,7 @@ release_firmware(const struct firmware *fw)
> if (fw->data == builtin->data)
> goto free_fw;
> }
> - vfree(fw->data);
> + vunmap(fw->data);
> free_fw:
> kfree(fw);
> }
>

The difference between vfree and vunmap is that vfree request for
deallocating the pages while vunmap leaves the pages allocated. I
don't think you can replace vfree with vunmap the way you did.
The transition from vmalloc to alloc_pages were done by the patch
bellow. I think this needs some more review.

commit 6e03a201bbe8137487f340d26aa662110e324b20
Author: David Woodhouse <dwmw2(a)infradead.org>
Date: Thu Apr 9 22:04:07 2009 -0700

firmware: speed up request_firmware(), v3

Rather than calling vmalloc() repeatedly to grow the firmware image as
we receive data from userspace, just allocate and fill individual pages.
Then vmap() the whole lot in one go when we're done.

A quick test with a 337KiB iwlagn firmware shows the time taken for
request_firmware() going from ~32ms to ~5ms after I apply this patch.

[v2: define PAGE_KERNEL_RO as PAGE_KERNEL where necessary, use min_t()]
[v3: kunmap() takes the struct page *, not the virtual address]

Signed-off-by: David Woodhouse <David.Woodhouse(a)intel.com>
Tested-by: Sachin Sant <sachinp(a)in.ibm.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Tomas Winkler on 27 Apr 2010 08:00

On Tue, Apr 27, 2010 at 2:18 PM, Tomas Winkler <tomasw(a)gmail.com> wrote:
> On Mon, Apr 26, 2010 at 6:19 PM, Kay Sievers <kay.sievers(a)vrfy.org> wrote:
>> On Mon, 2010-04-26 at 12:38 +0200, Kay Sievers wrote:
>>> On Sun, Apr 25, 2010 at 22:09, Tomas Winkler <tomasw(a)gmail.com> wrote:
>>> > Said thing is that I don't see where the memory goes.... Anyhow I will
>>> > try to run valgrin on udev just to be sure.
>>>
>>> Nah, that memory would be freed, if you kill all udev processes, which
>>> it doesn't.
>>>
>>> The many udev worker processes you see for a few seconds was caused by
>>> udevd handling events with TIMEOUT= set special. We need to make sure,
>>> that firmware events run immediately and don't wait for other
>>> processes to finish. The logic who does that was always creating a new
>>> worker. I changed this now, but this will not affect the underlying
>>> problem you are seeing, it will just make the udev workers not grow in
>>> a timeframe of less than 10 seconds. The change is here:
>>> http://git.kernel.org/?p=linux/hotplug/udev.git;a=commit;h=665ee17def2caa6811ae032ae68ebf8239a18cf8
>>> but as mentioned, this change is unrelated to the memory leak you are seeing.
>>>
>>> > I'll be glad If someone can run my simple driver I posted and confirm
>>> > that sees the same problem
>>>
>>> I can confirm that memory gets lost. I suspect for some reason the
>>> firmware does not get properly cleaned up. If you increase the size of
>>> the firmware image, it will leak memory much faster.
>>
>> I guess, the assumption that vfree() will free pages which are allocated
>> by custom code, and not by vmalloc(), is not true.
>>
>> The attached changes seem to fix the issue for me.
>>
>> The custom page array mangling was introduced by David as an optimization
>> with commit 6e03a201bbe8137487f340d26aa662110e324b20 and this should be
>> checked, and if needed be fixed.
>>
>> Cheers,
>> Kay
>>
>>
>> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
>> index 985da11..fe4e872 100644
>> --- a/drivers/base/firmware_class.c
>> +++ b/drivers/base/firmware_class.c
>> @@ -162,7 +162,7 @@ static ssize_t firmware_loading_store(struct device *dev,
>> mutex_unlock(&fw_lock);
>> break;
>> }
>> - vfree(fw_priv->fw->data);
>> + vunmap(fw_priv->fw->data);
>> fw_priv->fw->data = NULL;
>> for (i = 0; i < fw_priv->nr_pages; i++)
>> __free_page(fw_priv->pages[i]);
>> @@ -176,7 +176,7 @@ static ssize_t firmware_loading_store(struct device *dev,
>> break;
>> case 0:
>> if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
>> - vfree(fw_priv->fw->data);
>> + vunmap(fw_priv->fw->data);
>> fw_priv->fw->data = vmap(fw_priv->pages,
>> fw_priv->nr_pages,
>> 0, PAGE_KERNEL_RO);
>> @@ -184,9 +184,6 @@ static ssize_t firmware_loading_store(struct device *dev,
>> dev_err(dev, "%s: vmap() failed\n", __func__);
>> goto err;
>> }
>> - /* Pages will be freed by vfree() */
>> - fw_priv->page_array_size = 0;
>> - fw_priv->nr_pages = 0;
>> complete(&fw_priv->completion);
>> clear_bit(FW_STATUS_LOADING, &fw_priv->status);
>> break;
>> @@ -578,7 +575,7 @@ release_firmware(const struct firmware *fw)
>> if (fw->data == builtin->data)
>> goto free_fw;
>> }
>> - vfree(fw->data);
>> + vunmap(fw->data);
>> free_fw:
>> kfree(fw);
>> }
>>
>
> The difference between vfree and vunmap is that vfree request for
> deallocating the pages while vunmap leaves the pages allocated. I
> don't think you can replace vfree with vunmap the way you did.
> The transition from vmalloc to alloc_pages were done by the patch
> bellow. I think this needs some more review.
>
> commit 6e03a201bbe8137487f340d26aa662110e324b20
> Author: David Woodhouse <dwmw2(a)infradead.org>
> Date: Thu Apr 9 22:04:07 2009 -0700
>
> firmware: speed up request_firmware(), v3
>
> Rather than calling vmalloc() repeatedly to grow the firmware image as
> we receive data from userspace, just allocate and fill individual pages.
> Then vmap() the whole lot in one go when we're done.
>
> A quick test with a 337KiB iwlagn firmware shows the time taken for
> request_firmware() going from ~32ms to ~5ms after I apply this patch.
>
> [v2: define PAGE_KERNEL_RO as PAGE_KERNEL where necessary, use min_t()]
> [v3: kunmap() takes the struct page *, not the virtual address]
>
> Signed-off-by: David Woodhouse <David.Woodhouse(a)intel.com>
> Tested-by: Sachin Sant <sachinp(a)in.ibm.com

It looks like to me that fw_realloc_buffer missing some 'vREmap' code

Tomas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Kay Sievers on 27 Apr 2010 08:10

On Tue, Apr 27, 2010 at 13:18, Tomas Winkler <tomasw(a)gmail.com> wrote:

> The difference between vfree and vunmap is that vfree request for
> deallocating the pages while vunmap leaves the pages allocated.

No, that's not the problem, as described in the earlier mail. Only if
the are vmalloc'd they will ever get free'd. The kernel will not take
over custom pages you just mapped. They will never get touched by
vfree. The page count in the vm area will always be zero.

> I
> don't think you can replace vfree with vunmap the way you did.
> The transition from vmalloc to alloc_pages were done by the patch
> bellow.

Sure, if you do vmap, you need to do vunmap, not vfree. But the pages
you need to take care of yourself, like you have allocated them. vfree
in that context makes only sense if you use vmalloc.

The patch I posted makes the issue go away. It's still not the right
fix, because the pages are only get freed when the device id cleaned
up, not on calling release_firmware. But it should illustrate the
underlying issue, and that there is no leaked memory anymore.

> I think this needs some more review.

If David does not fix it, it probably just needs to be reverted. And
instead of implementing our own "memory management", we should rather
add a vrealloc(), and the firmware loader should use that.

Thanks,
Kay
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

First | Prev | Next | Last
Pages: 1 2 3 4
Prev: uml: i386: Avoid redefinition of NR_syscalls
Next: rcu: slim down rcutiny by removing rcu_scheduler_active and friends