From: Tim Chen on 17 Jun 2010 20:10

This patch series helps to resolve a scalability problem in tmpfs. With these patches, Aim7 fserver throughput for tmpfs improved by 270% on a 4 socket, 32 core NHM-EX system.

In the current implementation of tmpfs, stat_lock in shmem_sb_info must be acquired whenever we get a new page. This causes heavy lock contention when multiple threads use tmpfs simultaneously, so systems with a large number of CPUs scale poorly. Almost 75% of CPU time was spent contending on stat_lock when we ran the Aim7 fserver load with 128 threads on a 4 socket, 32 core NHM-EX system.

We use the percpu_counter library for used-blocks accounting, so that blocks can be taken from and returned to a local per-CPU counter without acquiring a lock.

The first patch in the series adds a function that provides a fast but accurate comparison to the percpu_counter library. The second patch updates the shmem code of tmpfs to use the percpu_counter library to improve tmpfs performance.

Version 3 Changes:
1. Use percpu_counter instead of adding a new qtoken library for fast block accounting.
2. Change accounting from free_blocks to used_blocks so we do not need to reset the percpu_counter when we remount tmpfs.

Regards,
Tim Chen
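To illustrate the idea behind the message above, here is a minimal user-space sketch of a batched per-CPU counter with a "fast but accurate" comparison. It is not the kernel's percpu_counter code and not the code from these patches: the names `sketch_counter`, `sketch_add`, `sketch_sum`, `sketch_compare`, `NR_SLOTS` and `BATCH` are all hypothetical, the "CPU" is passed in as a plain index, and the sketch is single-threaded, so it omits the locking and per-CPU access primitives a real implementation needs.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SLOTS 4   /* hypothetical number of CPUs */
#define BATCH    32  /* hypothetical per-slot batch size */

struct sketch_counter {
    int64_t count;            /* approximate global count */
    int32_t local[NR_SLOTS];  /* per-slot deltas not yet folded into count */
};

/*
 * Add `amount` on behalf of slot `cpu`. The shared count is touched only
 * when the local delta reaches BATCH in magnitude; in a real concurrent
 * implementation that fold is the only step that would need a lock.
 */
static void sketch_add(struct sketch_counter *c, int cpu, int32_t amount)
{
    assert(cpu >= 0 && cpu < NR_SLOTS);
    int32_t v = c->local[cpu] + amount;
    if (v >= BATCH || v <= -BATCH) {
        c->count += v;
        c->local[cpu] = 0;
    } else {
        c->local[cpu] = v;
    }
}

/* Exact value: global count plus every unfolded local delta (slow path). */
static int64_t sketch_sum(const struct sketch_counter *c)
{
    int64_t sum = c->count;
    for (int i = 0; i < NR_SLOTS; i++)
        sum += c->local[i];
    return sum;
}

/*
 * Compare the counter with `rhs`; returns -1, 0 or 1.
 * Each local delta is smaller than BATCH in magnitude, so the approximate
 * count is off by less than BATCH * NR_SLOTS. If it differs from rhs by
 * more than that, the cheap answer is already correct; otherwise fall
 * back to the exact sum.
 */
static int sketch_compare(const struct sketch_counter *c, int64_t rhs)
{
    int64_t rough = c->count;
    int64_t slack = (int64_t)BATCH * NR_SLOTS;

    if (rough - rhs > slack)
        return 1;
    if (rhs - rough > slack)
        return -1;

    int64_t exact = sketch_sum(c);
    if (exact > rhs)
        return 1;
    if (exact < rhs)
        return -1;
    return 0;
}
```

In a tmpfs-like setting, a caller would charge used blocks with `sketch_add` and check a size limit with `sketch_compare(&used, max_blocks)`. The exact sum is only needed when usage is within `BATCH * NR_SLOTS` of the limit, which is why the common case stays cheap.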