From: Randy Dunlap on 4 Jul 2010 16:40

On Sun, 4 Jul 2010 00:44:18 -0700 Kent Overstreet wrote:

>  Documentation/bcache.txt |   75 ++++++++++++++++++++++++++++++++++++++++++++++
>  block/Kconfig            |   14 ++++++++
>  2 files changed, 89 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/bcache.txt
>
> diff --git a/Documentation/bcache.txt b/Documentation/bcache.txt
> new file mode 100644
> index 0000000..53079a7
> --- /dev/null
> +++ b/Documentation/bcache.txt
> @@ -0,0 +1,75 @@
> +Say you've got a big slow raid 6, and an X-25E or three. Wouldn't it be
> +nice if you could use them as cache... Hence bcache.
> +
> +It's designed around the performance characteristics of SSDs - it only allocates
> +in erase block sized buckets, and it uses a bare minimum btree to track cached
> +extents (which can be anywhere from a single sector to the bucket size). It's
> +also designed to be very lazy, and use garbage collection to clean stale
> +pointers.
> +
> +Cache devices are used as a pool; all available cache devices are used for all
> +the devices that are being cached. The cache devices store the UUIDs of
> +devices they have, allowing caches to safely persist across reboots. There's
> +space allocated for 256 UUIDs right after the superblock - which means for now
> +that there's a hard limit of 256 devices being cached.
> +
> +Currently only writethrough caching is supported; data is transparently added
> +to the cache on writes but the write is not returned as completed until it has
> +reached the underlying storage. Writeback caching will be supported when
> +journalling is implemented.
> +
> +To protect against stale data, the entire cache is invalidated if it wasn't
> +cleanly shutdown, and if caching is turned on or off for a device while it is
> +opened read/write, all data for that device is invalidated.
> +
> +Caching can be transparently enabled and disabled for devices while they are in
> +use. All configuration is done via sysfs. To use our SSD sde to cache our
> +raid md1:
> +
> +  make-bcache /dev/sde
> +  echo "/dev/sde" > /sys/kernel/bcache/register_cache
> +  echo "<UUID> /dev/md1" > /sys/kernel/bcache/register_dev

Hi,

Where does one find 'make-bcache'?
Maybe that info could be added here.

> +And that's it.
> +
> +If md1 was a raid 1 or 10, that's probably all you want to do; there's no point
> +in caching multiple copies of the same data. However, if you have a raid 5 or
> +6, caching the raw devices will allow the p and q blocks to be cached, which
> +will help your random write performance:
> +  echo "<UUID> /dev/sda1" > /sys/kernel/bcache/register_dev
> +  echo "<UUID> /dev/sda2" > /sys/kernel/bcache/register_dev
> +  etc.
> +
> +To script the UUID lookup, you could do something like:
> +  echo "`find /dev/disk/by-uuid/ -lname "*md1"|cut -d/ -f5` /dev/md1"\
> +    > /sys/kernel/bcache/register_dev
> +
> +Of course, if you were already referencing your devices by UUID, you could do:
> +  echo "$UUID /dev/disk/by-uuid/$UUID"\
> +    > /sys/kernel/bcache/register_dev
> +
> +There are a number of other files in sysfs, some that provide statistics,
> +others that allow tweaking of heuristics. Directories are also created
> +for both cache devices and devices that are being cached, for per device
> +statistics and device removal.
> +
> +Statistics: cache_hits, cache_misses, cache_hit_ratio
> +These should be fairly obvious, they're simple counters.
> +
> +Cache hit heuristics: cache_priority_seek contributes to the new bucket
> +priority once per cache hit; this lets us bias in favor of random IO.
> +The file cache_priority_hit is scaled by the size of the cache hit, so
> +we can give a 128k cache hit a higher weighting than a 4k cache hit.
> +
> +When new data is added to the cache, the initial priority is taken from
> +cache_priority_initial. Every so often, we must rescale the priorities of
> +all the in use buckets, so that the priority of stale data gradually goes to
> +zero: this happens every N sectors, taken from cache_priority_rescale. The
> +rescaling is currently hard coded at priority *= 7/8.
> +
> +For cache devices, there are a few more files. Most should be obvious;
> +min_priority shows the priority of the bucket that will next be pulled off
> +the heap, and tree_depth shows the current btree height.
> +
> +Writing to the unregister file in a device's directory will trigger the
> +closing of that device.
> diff --git a/block/Kconfig b/block/Kconfig
> index 9be0b56..ae2be2d 100644
> --- a/block/Kconfig
> +++ b/block/Kconfig
> @@ -77,6 +77,20 @@ config BLK_DEV_INTEGRITY
>      T10/SCSI Data Integrity Field or the T13/ATA External Path
>      Protection. If in doubt, say N.
>
> +config BLK_CACHE
> +    tristate "Block device as cache"
> +    default m

We try not to add (enable) non-core drivers to the kernel build.
OTOH, in a year or a few, this could be a core driver.

> +    ---help---
> +      Allows a block device to be used as cache for other devices; uses
> +      a btree for indexing and the layout is optimized for SSDs.
> +
> +      Caches are persistent, and store the UUID of devices they cache.
> +      Hence, to open a device as cache, use
> +      echo /dev/foo > /sys/kernel/bcache/register_cache
> +      And to enable caching for a device
> +      echo "<UUID> /dev/bar" > /sys/kernel/bcache/register_dev
> +      See Documentation/bcache.txt for details.
> +
>  endif # BLOCK
>
>  config BLOCK_COMPAT
> --

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
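
A note on the rescaling rule in the quoted documentation: the text says only that, every cache_priority_rescale sectors, the priority of each in-use bucket is multiplied by 7/8 so that stale data drifts toward zero. The sketch below models that rule in shell. It assumes priorities are non-negative integers and that the division truncates, which the patch text does not confirm; rescale and passes_to_zero are hypothetical helper names, not part of bcache.

```shell
#!/bin/sh
# Illustration only: models "priority *= 7/8" from the quoted documentation.
# Assumption (not confirmed by the patch text): priorities are non-negative
# integers and the division truncates toward zero.

# rescale PRIORITY: print the priority after one rescale pass.
rescale() {
    echo $(( $1 * 7 / 8 ))
}

# passes_to_zero PRIORITY: print how many rescale passes it takes for a
# bucket that is never hit again to reach priority 0.
passes_to_zero() {
    p=$1
    n=0
    while [ "$p" -gt 0 ]; do
        p=$(rescale "$p")
        n=$(( n + 1 ))
    done
    echo "$n"
}
```

Under these assumptions the truncation is what lets a priority actually reach zero; with exact fractions it would only approach zero. For example, `passes_to_zero 8` steps through 7, 6, 5, 4, 3, 2, 1, 0 and prints 8.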
From: Kent Overstreet on 4 Jul 2010 21:30

On 07/04/2010 01:34 PM, Randy Dunlap wrote:
> On Sun, 4 Jul 2010 00:44:18 -0700 Kent Overstreet wrote:
>> +  make-bcache /dev/sde
>> +  echo "/dev/sde"> /sys/kernel/bcache/register_cache
>> +  echo "<UUID> /dev/md1"> /sys/kernel/bcache/register_dev
>
> Hi,
>
> Where does one find 'make-bcache'?
> Maybe that info could be added here.

Yeah, that should be there - git://evilpiepirate.org/~kent/bcache-tools.git

Added it to the documentation, probably well past time to add some
documentation to the user space stuff too..

>> --- a/block/Kconfig
>> +++ b/block/Kconfig
>> @@ -77,6 +77,20 @@ config BLK_DEV_INTEGRITY
>>      T10/SCSI Data Integrity Field or the T13/ATA External Path
>>      Protection. If in doubt, say N.
>>
>> +config BLK_CACHE
>> +    tristate "Block device as cache"
>> +    default m
>
> We try not to add (enable) non-core drivers to the kernel build.
> OTOH, in a year or a few, this could be a core driver.

Alright, turned that off.

It's certainly getting to be quite a lot of code, is there anything I
can do to make it easier to look at? I really appreciate you taking the
time.
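
For the "<UUID>" placeholder in the register_dev commands quoted above, the find-and-cut lookup from the documentation can also be written as a small shell function. This is a sketch under stated assumptions, not part of bcache or bcache-tools: uuid_for_dev is a hypothetical name, readlink is assumed to be installed, and the directory (by default /dev/disk/by-uuid) is assumed to hold symlinks named after UUIDs. Unlike the "*md1" glob in the original, it matches the basename of the link target exactly.

```shell
#!/bin/sh
# uuid_for_dev NAME [DIR]
# Print the name of the symlink in DIR (default /dev/disk/by-uuid) whose
# target's basename is exactly NAME; return 1 if there is none.
# Hypothetical helper for illustration; assumes readlink is available.
uuid_for_dev() {
    name=$1
    dir=${2:-/dev/disk/by-uuid}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        if [ "${target##*/}" = "$name" ]; then
            printf '%s\n' "${link##*/}"
            return 0
        fi
    done
    return 1
}

# Possible use with the sysfs file named in the patch (needs root and a
# kernel carrying this patch set):
#   uuid=$(uuid_for_dev md1) &&
#       echo "$uuid /dev/md1" > /sys/kernel/bcache/register_dev
```

The exact basename match means a lookup for md1 will not also pick up a link pointing at, say, md11, and the nonzero return lets a script stop before writing an empty UUID to sysfs.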