From: Nick Piggin on 26 May 2010 12:20

On Tue, May 25, 2010 at 06:53:04PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner(a)redhat.com>
>
> The inode unused list is currently a global LRU. This does not match
> the other global filesystem cache - the dentry cache - which uses
> per-superblock LRU lists. Hence we have related filesystem object
> types using different LRU reclamation schemes.

Is this an improvement, I wonder? The dcache is using per-sb lists
because it specifically requires sb traversal.

What allocation/reclaim really wants (for good scalability and NUMA
characteristics) is per-zone lists for these things. It's easy to
convert a single list into per-zone lists. It is much harder to
convert per-sb lists into per-sb x per-zone lists.
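[Editor's sketch: the per-zone arrangement Nick describes - splitting the
single global unused-inode list into one LRU per memory zone - could look
roughly like the C below. All names here (inode_lru_zone, inode_lru_add,
the zone_idx parameter) are assumptions for illustration, not the kernel
interface of the time; i_list was the inode LRU linkage in that era.]

/*
 * Illustrative only: one unused-inode LRU per memory zone instead of a
 * single global list, so reclaim can target the zone under pressure.
 */
#include <linux/fs.h>
#include <linux/list.h>
#include <linux/mmzone.h>
#include <linux/spinlock.h>

struct inode_lru_zone {
	spinlock_t		lock;		/* protects the list below */
	struct list_head	unused;		/* unused inodes in this zone */
	unsigned long		nr_items;	/* for the shrinker's count pass */
};

static struct inode_lru_zone inode_lru[MAX_NR_ZONES];

/* Add an inode to the LRU of the zone its backing memory came from. */
static void inode_lru_add(struct inode *inode, int zone_idx)
{
	struct inode_lru_zone *lru = &inode_lru[zone_idx];

	spin_lock(&lru->lock);
	list_add(&inode->i_list, &lru->unused);
	lru->nr_items++;
	spin_unlock(&lru->lock);
}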
From: Dave Chinner on 26 May 2010 19:10

On Thu, May 27, 2010 at 02:17:33AM +1000, Nick Piggin wrote:
> On Tue, May 25, 2010 at 06:53:04PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner(a)redhat.com>
> >
> > The inode unused list is currently a global LRU. This does not match
> > the other global filesystem cache - the dentry cache - which uses
> > per-superblock LRU lists. Hence we have related filesystem object
> > types using different LRU reclamation schemes.
>
> Is this an improvement, I wonder? The dcache is using per-sb lists
> because it specifically requires sb traversal.

Right - I originally implemented the per-sb dentry lists for
scalability purposes, i.e. to avoid monopolising the dentry lock
during unmount looking for dentries on a specific sb and hanging the
system for several minutes.

However, the reason for doing this to the inode cache is not for
scalability, it's because we have a tight relationship between the
dentry and inode caches. That is, reclaim from the dentry LRU grows
the inode LRU. Like the registration of the shrinkers, this is kind
of an implicit, undocumented behaviour of the current shrinker
implementation.

What this patch series does is take that implicit relationship and
make it explicit. It also allows other filesystem caches to tie
into the relationship if they need to (e.g. the XFS inode cache).
What it _doesn't do_ is change the macro-level behaviour of the
shrinkers...

> What allocation/reclaim really wants (for good scalability and NUMA
> characteristics) is per-zone lists for these things. It's easy to
> convert a single list into per-zone lists.
>
> It is much harder to convert per-sb lists into per-sb x per-zone lists.

No it's not. Just convert the s_{dentry,inode}_lru lists on each
superblock and call the shrinker with a new zone mask field to pick
the correct LRU. That's no harder than converting a global LRU.
Anyway, you'd still have to do per-sb x per-zone lists for the dentry
LRUs, so changing the inode cache to per-sb makes no difference.

However, this is a moot point because we don't have per-zone shrinker
interfaces. That's an entirely separate discussion because of the
macro-level behavioural changes it implies...

Cheers,

Dave.
--
Dave Chinner
david(a)fromorbit.com
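[Editor's sketch: Dave's argument amounts to hanging the LRU lists off each
superblock and letting one per-sb shrinker prune dentries first and inodes
second, with a zone argument as the hook for per-zone reclaim later. The
names below (shrink_sb_caches, prune_sb_dentries, prune_sb_inodes, zone_idx)
are placeholders for illustration, not the interface that was proposed or
merged.]

#include <linux/fs.h>
#include <linux/list.h>

/* Placeholders standing in for the real per-sb pruning logic. */
static int prune_sb_dentries(struct super_block *sb, int nr, int zone_idx)
{
	return 0;
}

static int prune_sb_inodes(struct super_block *sb, int nr, int zone_idx)
{
	return 0;
}

/*
 * Pruning dentries drops the last references to their inodes, which moves
 * those inodes onto the (per-sb) inode LRU; pruning that LRU immediately
 * afterwards makes the previously implicit dentry -> inode reclaim ordering
 * explicit.  A zone_idx (or zone mask) argument is one way per-zone reclaim
 * could be layered on top of per-sb lists.
 */
static int shrink_sb_caches(struct super_block *sb, int nr_to_scan, int zone_idx)
{
	int freed = 0;

	freed += prune_sb_dentries(sb, nr_to_scan, zone_idx);
	freed += prune_sb_inodes(sb, nr_to_scan, zone_idx);
	return freed;
}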
From: Nick Piggin on 26 May 2010 22:10

On Thu, May 27, 2010 at 09:01:29AM +1000, Dave Chinner wrote:
> On Thu, May 27, 2010 at 02:17:33AM +1000, Nick Piggin wrote:
> > On Tue, May 25, 2010 at 06:53:04PM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner(a)redhat.com>
> > >
> > > The inode unused list is currently a global LRU. This does not match
> > > the other global filesystem cache - the dentry cache - which uses
> > > per-superblock LRU lists. Hence we have related filesystem object
> > > types using different LRU reclamation schemes.
> >
> > Is this an improvement, I wonder? The dcache is using per-sb lists
> > because it specifically requires sb traversal.
>
> Right - I originally implemented the per-sb dentry lists for
> scalability purposes, i.e. to avoid monopolising the dentry lock
> during unmount looking for dentries on a specific sb and hanging the
> system for several minutes.
>
> However, the reason for doing this to the inode cache is not for
> scalability, it's because we have a tight relationship between the
> dentry and inode caches. That is, reclaim from the dentry LRU grows
> the inode LRU. Like the registration of the shrinkers, this is kind
> of an implicit, undocumented behaviour of the current shrinker
> implementation.

Right, that's why I wonder whether it is an improvement. It would
be interesting to see some tests (showing at least parity).

> What this patch series does is take that implicit relationship and
> make it explicit. It also allows other filesystem caches to tie
> into the relationship if they need to (e.g. the XFS inode cache).
> What it _doesn't do_ is change the macro-level behaviour of the
> shrinkers...
>
> > What allocation/reclaim really wants (for good scalability and NUMA
> > characteristics) is per-zone lists for these things. It's easy to
> > convert a single list into per-zone lists.
> >
> > It is much harder to convert per-sb lists into per-sb x per-zone lists.
>
> No it's not. Just convert the s_{dentry,inode}_lru lists on each
> superblock and call the shrinker with a new zone mask field to pick
> the correct LRU. That's no harder than converting a global LRU.
> Anyway, you'd still have to do per-sb x per-zone lists for the dentry
> LRUs, so changing the inode cache to per-sb makes no difference.

Right, it just makes it harder to do. By much harder, I did mostly mean
the extra memory overhead. If there is *no* benefit from doing per-sb
icache then I would question whether we should.

> However, this is a moot point because we don't have per-zone shrinker
> interfaces. That's an entirely separate discussion because of the
> macro-level behavioural changes it implies...

Yep. I have some patches for it, but they're currently behind the
other fine-grained locking stuff. But it's something that really
needs to be implemented, IMO.
From: Dave Chinner on 27 May 2010 00:10

On Thu, May 27, 2010 at 12:04:45PM +1000, Nick Piggin wrote:
> On Thu, May 27, 2010 at 09:01:29AM +1000, Dave Chinner wrote:
> > On Thu, May 27, 2010 at 02:17:33AM +1000, Nick Piggin wrote:
> > > On Tue, May 25, 2010 at 06:53:04PM +1000, Dave Chinner wrote:
> > > > From: Dave Chinner <dchinner(a)redhat.com>
> > > >
> > > > The inode unused list is currently a global LRU. This does not match
> > > > the other global filesystem cache - the dentry cache - which uses
> > > > per-superblock LRU lists. Hence we have related filesystem object
> > > > types using different LRU reclamation schemes.
> > >
> > > Is this an improvement, I wonder? The dcache is using per-sb lists
> > > because it specifically requires sb traversal.
> >
> > Right - I originally implemented the per-sb dentry lists for
> > scalability purposes, i.e. to avoid monopolising the dentry lock
> > during unmount looking for dentries on a specific sb and hanging the
> > system for several minutes.
> >
> > However, the reason for doing this to the inode cache is not for
> > scalability, it's because we have a tight relationship between the
> > dentry and inode caches. That is, reclaim from the dentry LRU grows
> > the inode LRU. Like the registration of the shrinkers, this is kind
> > of an implicit, undocumented behaviour of the current shrinker
> > implementation.
>
> Right, that's why I wonder whether it is an improvement. It would
> be interesting to see some tests (showing at least parity).

I've done some testing showing parity. They've been along the lines of:
- populate the cache with 1m dentries + inodes
- run 'time echo 2 > /proc/sys/vm/drop_caches'

I've used different methods of populating the caches to have them
non-sequential in the LRU (i.e. trigger fragmentation), have dirty
backing inodes (e.g. the VFS inode clean, the xfs inode dirty
because transactions haven't completed), etc.

The variation on the test is around +-10%, with the per-sb shrinkers
averaging about 5% lower time to reclaim. This is within the error
margin of the test, so it's not really a conclusive win, but it
certainly shows that it does not slow anything down. If you've got a
better way to test it, then I'm all ears...

> > What this patch series does is take that implicit relationship and
> > make it explicit. It also allows other filesystem caches to tie
> > into the relationship if they need to (e.g. the XFS inode cache).
> > What it _doesn't do_ is change the macro-level behaviour of the
> > shrinkers...
> >
> > > What allocation/reclaim really wants (for good scalability and NUMA
> > > characteristics) is per-zone lists for these things. It's easy to
> > > convert a single list into per-zone lists.
> > >
> > > It is much harder to convert per-sb lists into per-sb x per-zone lists.
> >
> > No it's not. Just convert the s_{dentry,inode}_lru lists on each
> > superblock and call the shrinker with a new zone mask field to pick
> > the correct LRU. That's no harder than converting a global LRU.
> > Anyway, you'd still have to do per-sb x per-zone lists for the dentry
> > LRUs, so changing the inode cache to per-sb makes no difference.
>
> Right, it just makes it harder to do. By much harder, I did mostly mean
> the extra memory overhead.

You've still got to allocate that extra memory on the per-sb dentry
LRUs, so it's not really a valid argument. IOWs, if it's too much
memory for per-sb inode LRUs, then it's too much memory for the
per-sb dentry LRUs as well...

> If there is *no* benefit from doing per-sb
> icache then I would question whether we should.

The same vague questions wondering about the benefit of per-sb
dentry LRUs were raised when I first proposed them years ago, and
look where we are now. Besides, focussing on whether this one patch
is a benefit or not is really missing the point, because it's the
benefits of this patchset as a whole that need to be considered...

Cheers,

Dave.
--
Dave Chinner
david(a)fromorbit.com
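[Editor's sketch: the reclaim timing test Dave describes above is just a
timed write of "2" (reclaim dentries and inodes) to
/proc/sys/vm/drop_caches. A minimal userspace harness, assuming root and
caches that have already been populated, might look like this.]

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	struct timespec start, end;
	int fd = open("/proc/sys/vm/drop_caches", O_WRONLY);

	if (fd < 0) {
		perror("open");
		return EXIT_FAILURE;
	}

	clock_gettime(CLOCK_MONOTONIC, &start);
	if (write(fd, "2\n", 2) != 2) {	/* "2" == drop dentries and inodes */
		perror("write");
		close(fd);
		return EXIT_FAILURE;
	}
	clock_gettime(CLOCK_MONOTONIC, &end);
	close(fd);

	printf("drop_caches took %.3f seconds\n",
	       (end.tv_sec - start.tv_sec) +
	       (end.tv_nsec - start.tv_nsec) / 1e9);
	return EXIT_SUCCESS;
}

[Running this on the same populated cache before and after the patch series
is the kind of parity comparison being discussed.]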
From: Nick Piggin on 27 May 2010 00:30

On Thu, May 27, 2010 at 02:02:10PM +1000, Dave Chinner wrote:
> On Thu, May 27, 2010 at 12:04:45PM +1000, Nick Piggin wrote:
> > On Thu, May 27, 2010 at 09:01:29AM +1000, Dave Chinner wrote:
> > > On Thu, May 27, 2010 at 02:17:33AM +1000, Nick Piggin wrote:
> > > > On Tue, May 25, 2010 at 06:53:04PM +1000, Dave Chinner wrote:
> > > > > From: Dave Chinner <dchinner(a)redhat.com>
> > > > >
> > > > > The inode unused list is currently a global LRU. This does not match
> > > > > the other global filesystem cache - the dentry cache - which uses
> > > > > per-superblock LRU lists. Hence we have related filesystem object
> > > > > types using different LRU reclamation schemes.
> > > >
> > > > Is this an improvement, I wonder? The dcache is using per-sb lists
> > > > because it specifically requires sb traversal.
> > >
> > > Right - I originally implemented the per-sb dentry lists for
> > > scalability purposes, i.e. to avoid monopolising the dentry lock
> > > during unmount looking for dentries on a specific sb and hanging the
> > > system for several minutes.
> > >
> > > However, the reason for doing this to the inode cache is not for
> > > scalability, it's because we have a tight relationship between the
> > > dentry and inode caches. That is, reclaim from the dentry LRU grows
> > > the inode LRU. Like the registration of the shrinkers, this is kind
> > > of an implicit, undocumented behaviour of the current shrinker
> > > implementation.
> >
> > Right, that's why I wonder whether it is an improvement. It would
> > be interesting to see some tests (showing at least parity).
>
> I've done some testing showing parity. They've been along the lines of:
> - populate the cache with 1m dentries + inodes
> - run 'time echo 2 > /proc/sys/vm/drop_caches'
>
> I've used different methods of populating the caches to have them
> non-sequential in the LRU (i.e. trigger fragmentation), have dirty
> backing inodes (e.g. the VFS inode clean, the xfs inode dirty
> because transactions haven't completed), etc.
>
> The variation on the test is around +-10%, with the per-sb shrinkers
> averaging about 5% lower time to reclaim. This is within the error
> margin of the test, so it's not really a conclusive win, but it
> certainly shows that it does not slow anything down. If you've got a
> better way to test it, then I'm all ears...

I guess the problem is that the inode LRU cache isn't very useful as
long as there are dentries in the way (which is most of the time,
isn't it?). I think nfsd will exercise them better? Don't know of any
other cases.

> > Right, it just makes it harder to do. By much harder, I did mostly mean
> > the extra memory overhead.
>
> You've still got to allocate that extra memory on the per-sb dentry
> LRUs, so it's not really a valid argument.

Well, it would be per-zone, per-sb lists, but I don't think that makes
it an invalid point.

> IOWs, if it's too much
> memory for per-sb inode LRUs, then it's too much memory for the
> per-sb dentry LRUs as well...

It's not about how much is too much, it's about more cost or memory
usage for what benefit? I guess it isn't a lot more memory, though.

> > If there is *no* benefit from doing per-sb
> > icache then I would question whether we should.
>
> The same vague questions wondering about the benefit of per-sb
> dentry LRUs were raised when I first proposed them years ago, and
> look where we are now.

To be fair, that is because there were specific needs to do per-sb
pruning. This isn't the case with the icache.

> Besides, focussing on whether this one patch
> is a benefit or not is really missing the point, because it's the
> benefits of this patchset as a whole that need to be considered...

I would indeed like to focus on the benefits of the patchset as a
whole. Leaving aside the XFS changes, it would be interesting to have
at least a few numbers for dcache/icache heavy workloads.
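[Editor's sketch: a dcache/icache-heavy workload of the kind being asked for
can be approximated by creating and stat()ing a large number of empty files
before running a reclaim or lookup benchmark. The scratch path and file
count below are arbitrary assumptions.]

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

#define NR_FILES	1000000		/* roughly 1m dentries + inodes */
#define SCRATCH_DIR	"/mnt/scratch/populate"

int main(void)
{
	char path[256];
	struct stat st;
	long i;

	if (mkdir(SCRATCH_DIR, 0755) && errno != EEXIST) {
		perror("mkdir");
		return EXIT_FAILURE;
	}

	for (i = 0; i < NR_FILES; i++) {
		int fd;

		snprintf(path, sizeof(path), SCRATCH_DIR "/f%ld", i);
		fd = open(path, O_CREAT | O_WRONLY, 0644);
		if (fd < 0) {
			perror("open");
			return EXIT_FAILURE;
		}
		close(fd);
		stat(path, &st);	/* pulls the dentry and inode into cache */
	}
	return EXIT_SUCCESS;
}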