From: Wu Fengguang on 21 Feb 2010 20:30

> Christian, with this patch and more patches to scale down readahead
> size on small memory/device size, I guess it's no longer necessary to
> introduce a CONFIG_READAHEAD_SIZE?

This is the memory size based readahead limit :)

Thanks,
Fengguang
---
readahead: limit readahead size for small memory systems

When lifting the default readahead size from 128KB to 512KB,
make sure it won't add memory pressure to small memory systems.

For read-ahead, the memory pressure is mainly readahead buffers consumed
by too many concurrent streams. The context readahead can adapt
readahead size to the thrashing threshold well. So in principle we don't
need to adapt the default _max_ read-ahead size to memory pressure.

For read-around, the memory pressure is mainly read-around misses on
executables/libraries, which could be reduced by scaling down
read-around size on fast "reclaim passes".

This patch presents a straightforward solution: to limit default
readahead size proportional to available system memory, i.e.

	512MB mem => 512KB readahead size
	128MB mem => 128KB readahead size
	 32MB mem =>  32KB readahead size (minimal)

Strictly speaking, only read-around size has to be limited. However we
don't bother to separate read-around size from read-ahead size for now.
CC: Matt Mackall <mpm(a)selenic.com>
Signed-off-by: Wu Fengguang <fengguang.wu(a)intel.com>
---
 mm/readahead.c |   25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

--- linux.orig/mm/readahead.c	2010-02-21 22:42:15.000000000 +0800
+++ linux/mm/readahead.c	2010-02-21 23:43:14.000000000 +0800
@@ -19,6 +19,9 @@
 #include <linux/pagevec.h>
 #include <linux/pagemap.h>

+#define MIN_READAHEAD_PAGES DIV_ROUND_UP(VM_MIN_READAHEAD*1024, PAGE_CACHE_SIZE)
+
+static int __init user_defined_readahead_size;
 static int __init config_readahead_size(char *str)
 {
 	unsigned long bytes;
@@ -36,11 +39,33 @@ static int __init config_readahead_size(
 			bytes = 128 << 20;
 	}

+	user_defined_readahead_size = 1;
 	default_backing_dev_info.ra_pages = bytes / PAGE_CACHE_SIZE;
 	return 0;
 }
 early_param("readahead", config_readahead_size);

+static int __init readahead_init(void)
+{
+	/*
+	 * Scale down default readahead size for small memory systems.
+	 * For example, a 64MB box will do 64KB read-ahead/read-around
+	 * instead of the default 512KB.
+	 *
+	 * Note that the default readahead size will also be scaled down
+	 * for small devices in add_disk().
+	 */
+	if (!user_defined_readahead_size) {
+		unsigned long max = roundup_pow_of_two(totalram_pages / 1024);
+		if (default_backing_dev_info.ra_pages > max)
+			default_backing_dev_info.ra_pages = max;
+		if (default_backing_dev_info.ra_pages < MIN_READAHEAD_PAGES)
+			default_backing_dev_info.ra_pages = MIN_READAHEAD_PAGES;
+	}
+	return 0;
+}
+fs_initcall(readahead_init);
+
 /*
  * Initialise a struct file's readahead state.  Assumes that the caller has
  * memset *ra to zero.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Wu Fengguang on 21 Feb 2010 20:40

> +unsigned long max_readahead_pages = VM_MAX_READAHEAD * 1024 / PAGE_CACHE_SIZE;
> +
> +static int __init readahead(char *str)
> +{
> +	unsigned long bytes;
> +
> +	if (!str)
> +		return -EINVAL;
> +	bytes = memparse(str, &str);
> +	if (*str != '\0')
> +		return -EINVAL;
> +
> +	if (bytes) {
> +		if (bytes < PAGE_CACHE_SIZE)	/* missed 'k'/'m' suffixes? */
> +			return -EINVAL;
> +		if (bytes > 128 << 20)		/* limit to 128MB */
> +			bytes = 128 << 20;
> +	}
> +
> +	max_readahead_pages = bytes / PAGE_CACHE_SIZE;
> +	default_backing_dev_info.ra_pages = max_readahead_pages;
> +	return 0;
> +}
> +
> +early_param("readahead", readahead);

This further optimizes away max_readahead_pages :)

---
make default readahead size a kernel parameter

From: Nikanth Karthikesan <knikanth(a)suse.de>

Add new kernel parameter "readahead", which allows user to override
the static VM_MAX_READAHEAD=512kb.

CC: Ankit Jain <radical(a)gmail.com>
CC: Dave Chinner <david(a)fromorbit.com>
CC: Christian Ehrhardt <ehrhardt(a)linux.vnet.ibm.com>
Signed-off-by: Nikanth Karthikesan <knikanth(a)suse.de>
Signed-off-by: Wu Fengguang <fengguang.wu(a)intel.com>
---
 Documentation/kernel-parameters.txt |    4 ++++
 block/blk-core.c                    |    3 +--
 fs/fuse/inode.c                     |    2 +-
 mm/readahead.c                      |   22 ++++++++++++++++++++++
 4 files changed, 28 insertions(+), 3 deletions(-)

--- linux.orig/Documentation/kernel-parameters.txt	2010-02-21 22:41:29.000000000 +0800
+++ linux/Documentation/kernel-parameters.txt	2010-02-21 22:41:30.000000000 +0800
@@ -2174,6 +2174,10 @@ and is between 256 and 4096 characters.
 			Run specified binary instead of /init from the ramdisk,
 			used for early userspace startup. See initrd.

+	readahead=nn[KM]
+			Default max readahead size for block devices.
+			Range: 0; 4k - 128m
+
 	reboot=		[BUGS=X86-32,BUGS=ARM,BUGS=IA-64] Rebooting mode
 			Format: <reboot_mode>[,<reboot_mode2>[,...]]
 			See arch/*/kernel/reboot.c or arch/*/kernel/process.c
--- linux.orig/block/blk-core.c	2010-02-21 22:41:29.000000000 +0800
+++ linux/block/blk-core.c	2010-02-21 22:41:30.000000000 +0800
@@ -498,8 +498,7 @@ struct request_queue *blk_alloc_queue_no

 	q->backing_dev_info.unplug_io_fn = blk_backing_dev_unplug;
 	q->backing_dev_info.unplug_io_data = q;
-	q->backing_dev_info.ra_pages =
-			(VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
+	q->backing_dev_info.ra_pages = default_backing_dev_info.ra_pages;
 	q->backing_dev_info.state = 0;
 	q->backing_dev_info.capabilities = BDI_CAP_MAP_COPY;
 	q->backing_dev_info.name = "block";
--- linux.orig/fs/fuse/inode.c	2010-02-21 22:41:29.000000000 +0800
+++ linux/fs/fuse/inode.c	2010-02-21 22:41:30.000000000 +0800
@@ -870,7 +870,7 @@ static int fuse_bdi_init(struct fuse_con
 	int err;

 	fc->bdi.name = "fuse";
-	fc->bdi.ra_pages = (VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
+	fc->bdi.ra_pages = default_backing_dev_info.ra_pages;
 	fc->bdi.unplug_io_fn = default_unplug_io_fn;
 	/* fuse does it's own writeback accounting */
 	fc->bdi.capabilities = BDI_CAP_NO_ACCT_WB;
--- linux.orig/mm/readahead.c	2010-02-21 22:41:29.000000000 +0800
+++ linux/mm/readahead.c	2010-02-21 22:42:15.000000000 +0800
@@ -19,6 +19,28 @@
 #include <linux/pagevec.h>
 #include <linux/pagemap.h>

+static int __init config_readahead_size(char *str)
+{
+	unsigned long bytes;
+
+	if (!str)
+		return -EINVAL;
+	bytes = memparse(str, &str);
+	if (*str != '\0')
+		return -EINVAL;
+
+	if (bytes) {
+		if (bytes < PAGE_CACHE_SIZE)	/* missed 'k'/'m' suffixes? */
+			return -EINVAL;
+		if (bytes > 128 << 20)		/* limit to 128MB */
+			bytes = 128 << 20;
+	}
+
+	default_backing_dev_info.ra_pages = bytes / PAGE_CACHE_SIZE;
+	return 0;
+}
+early_param("readahead", config_readahead_size);
+
 /*
  * Initialise a struct file's readahead state.  Assumes that the caller has
  * memset *ra to zero.
From: Christian Ehrhardt on 22 Feb 2010 03:20

Wu Fengguang wrote:
> Nikanth,
>
>> I didn't want to impose artificial restrictions. I think Wu's patch set would
>> be adding some restrictions, like minimum readahead. He could fix it when he
>> modifies the patch to include in his patch set.
>
> OK, I imposed a larger bound -- 128MB.
> And values 1-4095 (more exactly: PAGE_CACHE_SIZE) are prohibited mainly to
> catch "readahead=128" where the user really means to do 128 _KB_ readahead.
>
> Christian, with this patch and more patches to scale down readahead
> size on small memory/device size, I guess it's no longer necessary to
> introduce a CONFIG_READAHEAD_SIZE?

Yes as I mentioned before a kernel parameter supersedes a config symbol in my
opinion too.
-> agreed

> Thanks,
> Fengguang
> ---

--

Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, System z Linux Performance
From: Dave Chinner on 22 Feb 2010 21:30

On Sun, Feb 21, 2010 at 10:26:00PM +0800, Wu Fengguang wrote:
> Nikanth,
>
> > > > +	readahead=	Default readahead value for block devices.
> > > > +
> > >
> > > I think the description should define the units (kb) and valid value
> > > ranges e.g. page size to something not excessive - say 65536kb. The
> > > above description is, IMO, useless without referring to the source to
> > > find out this information....
> > >
> >
> > The parameter can be specified with/without any suffix(k/m/g) that memparse()
> > helper function can accept. So it can take 1M, 1024k, 1050620. I checked other
> > parameters that use memparse() to get similar values and they didn't document
> > it. May be this should be described here.
>
> Hope this helps clarify things to user:
>
> +	readahead=nn[KM]
> +			Default max readahead size for block devices.
> +			Range: 0; 4k - 128m

Yes, that is exactly what I was thinking of. Thanks.

Cheers,

Dave.
--
Dave Chinner
david(a)fromorbit.com