From: Dave Wright on 20 Apr 2010 09:50

The current calculation of VM overcommit (particularly with the default vm.overcommit_ratio==50) seems to be a hold-over from the days when we had more swap than physical memory. For example, 1/2 phys mem + swap made sense when you had 1GB of memory and 2GB of swap; however, I recently ran into an issue on a server that had 8GB RAM and 2GB swap. The OOM killer was getting triggered as VM commit hit 6GB, even though there was plenty of RAM available. Once I figured out what was going on, I manually tweaked the ratio to 110%.

It looks like current distro recommendations are still "have as much swap as you have RAM", in which case the current calculation is fine, but with SSDs becoming more common on boot drives, I think many users will end up with less swap than RAM - consider a desktop user who might have 4GB RAM and 1GB swap. I don't think you would expect desktop users to understand or tweak overcommit_ratio, but I also don't think having the distro simply change the default from 50 (to 100 or something else) would cover all the cases well.

Would it make more sense to have the overcommit formula be calculated as the following?

    max commit = min(swap, ram) * overcommit_ratio + max(swap, ram)

When swap >= ram, this works exactly the same as it does now, but when ram >> swap, you are guaranteed to always be able to use your full RAM (even when swap = 0).

-Dave Wright
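[Editor's note: a minimal user-space sketch of how the two limits compare, not kernel code. It assumes the existing overcommit_memory=2 limit is roughly ram * overcommit_ratio / 100 + swap (as computed in __vm_enough_memory(), ignoring hugetlb pages and the small reserve kept for root); the helper names and the MB units are made up for illustration.]

/*
 * Sketch only: compare the existing commit limit with the formula
 * proposed above.  All values are in MB.
 */
#include <stdio.h>

static unsigned long current_limit(unsigned long ram_mb, unsigned long swap_mb,
				   unsigned long ratio)
{
	/* CommitLimit = ram * overcommit_ratio / 100 + swap */
	return ram_mb * ratio / 100 + swap_mb;
}

static unsigned long proposed_limit(unsigned long ram_mb, unsigned long swap_mb,
				    unsigned long ratio)
{
	/* max commit = min(swap, ram) * overcommit_ratio / 100 + max(swap, ram) */
	unsigned long lo = ram_mb < swap_mb ? ram_mb : swap_mb;
	unsigned long hi = ram_mb < swap_mb ? swap_mb : ram_mb;

	return lo * ratio / 100 + hi;
}

int main(void)
{
	/* The 8GB RAM / 2GB swap server from the report, ratio = 50. */
	printf("current:  %lu MB\n", current_limit(8192, 2048, 50));	/* 6144 */
	printf("proposed: %lu MB\n", proposed_limit(8192, 2048, 50));	/* 9216 */
	return 0;
}

With the proposed formula the same machine could commit 9GB instead of 6GB, and with swap=0 the limit degenerates to the full RAM size regardless of the ratio.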
From: Dirk Geschke on 20 Apr 2010 17:10
Hi all,

I am not on the mailing list and a friend pointed me to this thread... Probably we had the same problem: we had a Linux computer with 16GB of RAM and no swap. There was only one big job running on it, which did a lot of I/O. This program failed to allocate much memory, and we thought this was due to the high amount of cached memory in use. To avoid problems with overcommit we had set overcommit_memory to 2.

Now that I have seen this thread it becomes clear: the default value of overcommit_ratio is 50, therefore one program cannot allocate more than 8GB of memory at all.

After reading this thread I wrote a little program to allocate memory in 512MB blocks and fill it with zeros. My test system has 4GB of RAM, and so I started:

qfix:~# free
             total       used       free     shared    buffers     cached
Mem:       4052376     338124    3714252          0          0      17992
-/+ buffers/cache:      320132    3732244
Swap:            0          0          0

geschke(a)qfix:~$ ./a.out
got 1 * 512MB
got 2 * 512MB
got 3 * 512MB
malloc failure after 3 * 512 MB

So 1.5GB is fine, but 2GB of a possible 4GB is not. I guess some of the 4GB is not usable at all, and therefore the limit is slightly below 2GB with overcommit_ratio=50.

The next step is to set overcommit_ratio=100:

qfix:~# echo 100 >/proc/sys/vm/overcommit_ratio

and run the program again:

geschke(a)qfix:~$ ./a.out
got 1 * 512MB
got 2 * 512MB
got 3 * 512MB
got 4 * 512MB
got 5 * 512MB
got 6 * 512MB
malloc failure after 6 * 512 MB

That is more than 3GB, but I would have expected to get at least 3.5GB:

geschke(a)qfix:~$ free
             total       used       free     shared    buffers     cached
Mem:       4052376     344976    3707400          0          0      22472
-/+ buffers/cache:      322504    3729872
Swap:            0          0          0

Maybe this is due to a reserved percentage for the root user? However, if I set overcommit_ratio=110 I get more than 3.5GB:

geschke(a)qfix:~$ ./a.out
got 1 * 512MB
got 2 * 512MB
got 3 * 512MB
got 4 * 512MB
got 5 * 512MB
got 6 * 512MB
got 7 * 512MB
malloc failure after 7 * 512 MB

Next I tested this with a high usage of cached memory; I did a lot of I/O first:

geschke(a)qfix:~$ free
             total       used       free     shared    buffers     cached
Mem:       4052376    1945512    2106864          0          0    1621200
-/+ buffers/cache:      324312    3728064
Swap:            0          0          0

A new run gives:

geschke(a)qfix:~$ ./a.out
got 1 * 512MB
got 2 * 512MB
got 3 * 512MB
got 4 * 512MB
got 5 * 512MB
got 6 * 512MB
got 7 * 512MB
malloc failure after 7 * 512 MB

and:

qfix:~# free
             total       used       free     shared    buffers     cached
Mem:       4052376     346928    3705448          0          0      26008
-/+ buffers/cache:      320920    3731456
Swap:            0          0          0

So the cached memory is not really a problem for malloc. But since I was testing anyway, I tried what happens if a lot of memory is already in use. So I opened a large file with "vi":

geschke(a)qfix:~$ free
             total       used       free     shared    buffers     cached
Mem:       4052376    1597168    2455208          0          0     391364
-/+ buffers/cache:     1205804    2846572
Swap:            0          0          0

Now I start the program again:

geschke(a)qfix:~$ ./a.out
got 1 * 512MB
got 2 * 512MB
got 3 * 512MB
got 4 * 512MB
got 5 * 512MB
malloc failure after 5 * 512 MB

Fine: it seems there is not really a problem with increasing overcommit_ratio to 100 if there is no swap in the system and one has set overcommit_memory=2. So I think it is not really a problem to run with these settings.

Best regards

Dirk

--
Dr. Dirk Geschke / Plankensteinweg 61 / 85435 Erding
Telefon: 08122-559448 / Mobil: 0176-96906350 / Fax: 08122-9818106
dirk(a)geschke-online.de / dirk(a)lug-erding.de / kontakt(a)lug-erding.de
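[Editor's note: Dirk's test program itself is not included in the thread. A minimal sketch of that kind of tester, assuming plain malloc() plus memset(), might look like the following; with overcommit_memory=2 the commit limit shows up directly as malloc() returning NULL, which matches the output quoted above.]

/*
 * Sketch of the tester described above (the actual program is not shown
 * in the thread): allocate 512MB blocks until malloc() fails, zero-fill
 * each block so the pages are really touched, and report how far we got.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE (512UL * 1024 * 1024)	/* 512MB per allocation */

int main(void)
{
	unsigned long n = 0;

	for (;;) {
		void *p = malloc(BLOCK_SIZE);

		if (!p) {
			printf("malloc failure after %lu * 512 MB\n", n);
			break;
		}
		memset(p, 0, BLOCK_SIZE);	/* fill with zeros, as described */
		n++;
		printf("got %lu * 512MB\n", n);
	}
	return 0;
}

On a 4GB machine with no swap, overcommit_ratio=50 caps CommitLimit at roughly 2GB and ratio=100 at roughly 4GB, which is consistent with failing after 3 and 6-7 blocks once the memory already committed by other processes (and, presumably, the small reserve kept back for root) is accounted for.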