From: annalissa on 20 Jul 2010 06:12

Hi all,

The following is what I have read in a magazine named "Linux For You". To what extent is this true?

ideally dedicate a set of high performance disks spread across two or more controllers.

if swap space resides on a busy disk, then to reduce latency, it should be located as close to a busy partition like (/var) as possible, to reduce seek time for the drive head.

while using different partitions or hard disks of different speeds for swap, you can assign priorities to each partition so that the kernel will use the higher priority hard disk first.

in addition, the kernel will distribute visit counts in a round robin fashion across all devices with equal priorities.

Ex:-
/dev/sda6   /swap   swap   pri=4   0 0
/dev/sda8   /swap   swap   pri=4   0 0
/dev/sdb3   /swap   swap   pri=1   0 0
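For anyone who wants to try what the magazine describes, a minimal sketch of equal-priority swap entries and how to verify them follows. The device names are just the article's placeholders; note that the mount-point field for swap is conventionally written as "none" or "swap" rather than "/swap".

  # /etc/fstab -- two equal-priority swap areas on sda, a slower one on sdb
  /dev/sda6   none   swap   sw,pri=4   0 0
  /dev/sda8   none   swap   sw,pri=4   0 0
  /dev/sdb3   none   swap   sw,pri=1   0 0

  # activate everything marked as swap in fstab and check the priorities
  swapon -a
  cat /proc/swaps        # shows Filename, Type, Size, Used, Priority
  swapon -s              # same information in summary form

With equal priorities the kernel round-robins pages across the two sda areas; the sdb3 area is only used once the higher-priority ones are full.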
From: Aragorn on 20 Jul 2010 07:01

On Tuesday 20 July 2010 12:12 in comp.os.linux.setup, somebody identifying as annalissa wrote...

> Hi all,
>
> The following is what I have read in a magazine named "Linux For
> You". To what extent is this true?
>
> ideally dedicate a set of high performance disks spread across two or
> more controllers.

This in itself is a rather vague description. I presume that the above suggests the use of a RAID solution, although this need not necessarily be the case. But let's handle RAID first before we get into the other aspects...

When using a RAID solution with hard disks of the IDE/PATA variety, you will typically be using software RAID - true hardware PATA RAID controllers do exist, but they are few and far between. With such a software RAID set-up, it is generally advised to use separate disk controllers, due to the limited throughput of PATA - i.e. 133 MB/sec for the entire disk controller with UDMA enabled. In other words, if you have multiple disks connected to the same PATA controller, then the controller will be the bottleneck.

The same is true for parallel SCSI. I haven't followed the latest developments in parallel SCSI since I switched to SAS ("Serial Attached SCSI") for my own SCSI implementations, but as far as I know, the fastest parallel SCSI standard at the moment is Ultra 320 - it is possible that there's already Ultra 640 by now; again, I have not been following the evolution there - and that means that every SCSI channel has a maximum throughput of 320 MB per second.

For SAS, SATA and FireWire, however, things are different. These types of disks are connected via a point-to-point connection, so the controller itself no longer forms a bottleneck, except in the event that the controller has a higher throughput capacity than its PCI, PCI-X or PCIe bus allows - but then it is the bus that forms the bottleneck, not the controller.

However, as explained by several people in reply to an earlier post from you on this subject, an important thing to take into account is caching. Not only do the disks have a cache, but in the event of a hardware RAID solution, the RAID controller will also have a cache, and on top of all that, the Linux kernel also caches and buffers, on some filesystem types more than on others. (XFS and reiser4, for instance, are aggressively caching filesystem types.)

A lot also depends on the type of RAID that you choose to use. RAID 0 (striping) is the ultimate throughput solution, but RAID 0 is not redundant. RAID 1 (mirroring), on the other hand, offers RAID 0-like performance during reads, but not during writes, as the data has to be written to both disks - assuming we are talking about two disks here - in its entirety when the kernel flushes it to disk. During reads, the RAID controller - or the software RAID code in the Linux kernel, depending on your set-up - will behave somewhat like RAID 0 in that part of the data will be read from one disk and part from the other.

Now, all the above said, if you do not have a RAID set-up, but just a bunch of hard disks on your controller - and by this I do not mean a JBOD set-up, but just the regular scenario where a person has multiple hard disks in their machine - then you /may/ theoretically gain a bit in performance if you spread the filesystems with the highest I/O demands across multiple disks, e.g. "/var" on one disk and "/usr" on another disk.
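As an aside to the software RAID discussion above, a minimal sketch of what a two-disk mdadm mirror looks like in practice; the device names /dev/sdb1 and /dev/sdc1 and the mount point /mnt/raid are placeholders, not anything taken from this thread.

  # create a two-disk mirror (RAID 1); both partitions should be roughly
  # the same size and marked as Linux RAID members
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1

  # watch the initial resync and inspect the array state
  cat /proc/mdstat
  mdadm --detail /dev/md0

  # put a filesystem on the array and mount it
  mkfs.ext3 /dev/md0
  mount /dev/md0 /mnt/raid

A RAID 0 stripe would use --level=0 instead, trading the redundancy for throughput as described above.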
But as explained in the replies to your earlier inquiry about this subject, and as I have mentioned myself higher up, this is all theoretical, because you have to take caching into account.

However, when you have two hard disks in your computer and your machine does not have a lot of RAM in it - in other words, if your machine has to use swap a lot - then you can set up a swap partition on each disk and give them equal priority, in which case the kernel will swap the data to each swap partition in an alternating way. Or you can set a higher priority for one of the swap partitions - e.g. if one of the two disks is faster than the other.

> if swap space resides on a busy disk, then to reduce latency, it
> should be located as close to a busy partition like (/var) as possible,
> to reduce seek time for the drive head.

Hmm... I think that's rather far-fetched. The Linux kernel has a couple of very good I/O schedulers, e.g. cfq ("completely fair queueing") and the anticipatory scheduler. The latter was the default for a long time and may still be the default on systems that for some reason require an older kernel, but for newer kernels "cfq" has become the default, and the older kernels support it as well, although you may have to tell the kernel to use it, either via a kernel commandline parameter at boot or via your bootloader configuration. In LILO, add "elevator=cfq" to the "append" line for your kernel's stanza. In GRUB, just add it to the kernel's boot options on the "kernel" line.

Note: All recent kernels use "cfq" by default, but if you're running one of the pre-2.6.26 kernels - I'm not sure of the exact kernel version in which "cfq" became the default, but it must have been around that release - then you may need to tell the kernel to use "cfq" instead of "anticipatory". To find out what your kernel is using, run...

dmesg | grep -i scheduler

On my machine here, this provides me with the following output...:

[12:52:54][localhost:/home/aragorn]
[aragorn] $> sudo dmesg | grep -i scheduler
Password:
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)

(Note: Whether you need superuser privileges for the use of "dmesg" depends on your distribution.)

And just to be clear on what kernel I'm running...:

[12:53:47][localhost:/home/aragorn]
[aragorn] $> uname -a
Linux localhost 2.6.26.8.tex3 #1 SMP Mon Jan 12 04:33:38 CST 2009 i686 AMD Athlon(TM) XP 2800+ GNU/Linux

> while using different partitions or hard disks of different speeds for
> swap, you can assign priorities to each partition so that the kernel
> will use the higher priority hard disk first.

True.

> in addition, the kernel will distribute visit counts in a round robin
> fashion across all devices with equal priorities.
>
> Ex:-
> /dev/sda6   /swap   swap   pri=4   0 0
> /dev/sda8   /swap   swap   pri=4   0 0
> /dev/sdb3   /swap   swap   pri=1   0 0

True. But, again, if you need to gain performance by distributing the swap across multiple partitions on multiple hard disks, then you've got a problem, i.e. your system has too little RAM. These days, RAM is relatively cheap, and the performance gain from putting a sufficiently large amount of RAM in your machine to avoid swapping is a lot better than the performance gain you get from distributing your swap space across multiple disks. ;-)

--
*Aragorn*
(registered GNU/Linux user #223157)
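As a side note to the dmesg check above: on kernels of that era the active I/O scheduler can also be inspected and switched per disk at runtime through sysfs, without rebooting. A minimal sketch, with sda as a placeholder device:

  # show the schedulers available for sda; the active one is in brackets
  cat /sys/block/sda/queue/scheduler
  # e.g.: noop anticipatory deadline [cfq]

  # switch sda to cfq for the running session (root required); this does
  # not persist across reboots - use the elevator=cfq boot option for that
  echo cfq > /sys/block/sda/queue/scheduler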
From: The Natural Philosopher on 20 Jul 2010 08:12

annalissa wrote:
> Hi all,
>
> The following is what I have read in a magazine named "Linux For
> You". To what extent is this true?
>
> ideally dedicate a set of high performance disks spread across two or
> more controllers.
>
> if swap space resides on a busy disk, then to reduce latency, it
> should be located as close to a busy partition like (/var) as possible,
> to reduce seek time for the drive head.
>
> while using different partitions or hard disks of different speeds for
> swap, you can assign priorities to each partition so that the kernel
> will use the higher priority hard disk first.
>
> in addition, the kernel will distribute visit counts in a round robin
> fashion across all devices with equal priorities.
>
> Ex:-
> /dev/sda6   /swap   swap   pri=4   0 0
> /dev/sda8   /swap   swap   pri=4   0 0
> /dev/sdb3   /swap   swap   pri=1   0 0

If you need to tune swap, you are already in such bad trouble that it's time to fit more RAM.

My machine has now got to the stage that if I run two graphics-intensive apps, it has to start swapping. Performance drops by a factor of about 10,000.

No amount of swap tuning is going to compensate for the fact that it's basically too small a machine for the jobs I am now asking it to do.

So I don't run two apps loaded with bitmaps together. End of story.
From: Grant on 20 Jul 2010 09:45

On Tue, 20 Jul 2010 13:12:45 +0100, The Natural Philosopher <tnp(a)invalid.invalid> wrote:

> annalissa wrote:
>> Hi all,
>>
>> The following is what I have read in a magazine named "Linux For
>> You". To what extent is this true?
>>
>> ideally dedicate a set of high performance disks spread across two or
>> more controllers.
>>
>> if swap space resides on a busy disk, then to reduce latency, it
>> should be located as close to a busy partition like (/var) as possible,
>> to reduce seek time for the drive head.
>>
>> while using different partitions or hard disks of different speeds for
>> swap, you can assign priorities to each partition so that the kernel
>> will use the higher priority hard disk first.
>>
>> in addition, the kernel will distribute visit counts in a round robin
>> fashion across all devices with equal priorities.
>>
>> Ex:-
>> /dev/sda6   /swap   swap   pri=4   0 0
>> /dev/sda8   /swap   swap   pri=4   0 0
>> /dev/sdb3   /swap   swap   pri=1   0 0
>
> If you need to tune swap, you are already in such bad trouble that it's
> time to fit more RAM.
>
> My machine has now got to the stage that if I run two graphics-intensive
> apps, it has to start swapping. Performance drops by a factor of about
> 10,000.
>
> No amount of swap tuning is going to compensate for the fact that it's
> basically too small a machine for the jobs I am now asking it to do.
>
> So I don't run two apps loaded with bitmaps together. End of story.

Not the end of story for some who run memory-intensive computations that rely on swap. Not everyone is interested in a nicely tuned interactive desktop ;)

The kernel will treat swaps on different drives like RAID 0 if they're set to the same priority. I usually put swap at partition five, the first of the logicals, on each drive, then run them at the same priority.

Large swap rarely comes in handy, but it is good for the occasional large or silly task. Better than having the kernel start killing off processes in response to out-of-memory. Yes, if memory usage goes into swap, things really slow down, but that's what computing used to be like all the time some years ago (hmm, maybe last century?).

The way I set up disks is that the OS lives on primaries at the fast end of the disk, and the slower archival stuff goes at the slow end. For example, on a two-disk machine I may put / on sda2 and /usr on sdb2. I allow for at least two OS installs so I can update the OS but keep the old one about until the new one is bedded down. Share the swaps, of course. If the box has windoze, I'll put the paging file on the other drive to the one where the OS is installed.

If, for some reason, I need lots of extra swap, it's easy to add some swapfiles somewhere convenient -- that happens rarely, but I have done it while processing quite large database tables a while back.

Grant.
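Grant's "add some swapfiles somewhere convenient" needs no repartitioning. A minimal sketch, with the path /var/tmp/swapfile1 and the 2 GB size chosen purely as placeholders:

  # create a 2 GB file, restrict its permissions, and format it as swap
  dd if=/dev/zero of=/var/tmp/swapfile1 bs=1M count=2048
  chmod 600 /var/tmp/swapfile1
  mkswap /var/tmp/swapfile1

  # enable it at a low priority so the existing swap partitions are used first
  swapon -p 1 /var/tmp/swapfile1

  # check that it shows up alongside the partition-based swap
  cat /proc/swaps

  # when the big job is done, drop it again
  swapoff /var/tmp/swapfile1
  rm /var/tmp/swapfile1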
From: Doug Freyburger on 20 Jul 2010 11:00
The Natural Philosopher wrote:
> If you need to tune swap, you are already in such bad trouble that it's
> time to fit more RAM.

Incidentally, this includes memory leaks. Since memory leaks are slow, there is no motivation to tune swap space to optimize for them. The solution is to patch the executable that's leaking memory, not to optimize swap.

> My machine has now got to the stage that if I run two graphics-intensive
> apps, it has to start swapping. Performance drops by a factor of about
> 10,000.

Cache inside the CPU chip may be one order of magnitude faster than main memory. Virtual memory on disk may be four orders of magnitude slower than main memory.

> No amount of swap tuning is going to compensate for the fact that it's
> basically too small a machine for the jobs I am now asking it to do.

If possible, keep enough RAM to not swap. Buy more machines to run the extra applications on. Hosts are cheap to add to the data center.

It's not always possible. A use can always be found that blows any RAM. The question to ask yourself is: should what I am doing blow the maximum RAM I can put in this host? If it shouldn't but it does, then it's time to trim down the application. If it should, that's when it is worth tuning swap.

Uses that will automatically blow any installed RAM are rare. If you can't easily explain why your use will blow any installed RAM, then your use shouldn't.

Early in my career I did VLSI CAD development. That's the sort of use that will blow any conceivable installed RAM. As with any other tuning operation, we got 1000 times the speed improvement from carefully tuning the internal loops to reduce paging than we got from any effort at tuning where the paging took place.

If you need more swap for memory leaks, backing store, reserved image pages (an HP-UX and AIX issue that does not seem to occur on Linux) and occasional spikes of usage, there will be no benefit to tuning swap. If you know exactly why your application should blow any conceivable installed RAM, then it's worth some effort.

At one point I supported engineers doing a very big mechanical engineering CADAM project. The RAM available was big enough for the designers of the individual parts; it took some initial benchmarking to ensure that. The machines that were swapping were the NASTRAN simulators and the ones used by the assembly testers.

For the NASTRAN simulators we got a better performance increase by rebuilding the hosts with a single larger swap partition than by adding extra swap partitions. That result surprised me, because I expected that a swap partition on each of the 4 internal drives would do better. It turned out the reinstall reorganized the files and slightly decreased the amount of data read into memory in the first place. Even a small decrease in the number of page faults to disk overwhelmed the benefit from using multiple spindles. Most likely what happened is that I had learned what I wanted to do with those hosts, so when I rebuilt them the second time I rebuilt them based on the previous experience, and the build was better in general for the exact usage.

So know your applications and consider how to rebuild your host specifically for that use. Making it a specialist box will work better than adding a random application to a generalist box and then worrying about performance.
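A practical way to act on "know your applications" is to measure whether the host actually pages under its real workload before spending any effort on swap layout. A minimal sketch with standard tools (the 5-second interval is arbitrary):

  # watch memory and paging activity while the workload runs;
  # sustained non-zero "si"/"so" (swap-in/swap-out) columns mean real paging
  vmstat 5

  # summary of RAM and swap actually in use
  free -m

  # per-swap-area usage and priorities
  cat /proc/swaps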