From: Maxwell Lol on 23 May 2010 22:02

unruh <unruh(a)wormhole.physics.ubc.ca> writes:

>> One of the things you can do to protect yourself is to minimize the
>> number of NFS-mounted directories in your searchpath. I suggest you
>> eliminate all of them.
>
> This is absurd. Computers are here to help us, not to bend our behaviour
> to their whims. If nfs refuses to unmount even when the remote end has
> died, it is a bug in nfs.

Perhaps. I am just telling you of my experience, on a site where we have
hundreds of NFS servers with thousands of users.

I agree that clients should not freeze. If they do, then what I suggested
will reduce or eliminate the problem.

>> I created a directory on the local system that had symlinks to the
>> remote executables on all of the remote servers. That way, my shell
>> only hangs when I executed a command on a remote server.
>
> And what do symlinks do for you. If you need something from the nfs
> filesystem the symlink does not prevent the attempt to access the nfs
> partition. Or the "in use" error.

Let me explain in detail.

Suppose you need to occasionally execute the program
/home/apollo/bin/cad on the NFS server called apollo.

Configuration #1)
    You have /home/apollo/bin in your searchpath

Configuration #2)
    You place /local/cache/bin in your searchpath
    /local/cache/bin/cad is a symlink to /home/apollo/bin/cad

Now suppose apollo goes off-line. If you execute /home/apollo/bin/cad,
that process will "hang" waiting for a response from the NFS server.

What's the difference?

In Configuration #1, every time you open a new shell, that shell will
freeze and time out.

In Configuration #2, you can open a new shell window. It will ONLY hang
when you execute the "cad" program. If you do not, then you may not
notice the server is down, because it does not affect you.
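Configuration #2 above could be set up along the following lines. This is only a sketch under stated assumptions: the function name link_remote_bins is hypothetical, the paths are the ones used in the post, and the helper has to be run while the server is reachable, because listing the remote directory is itself an NFS access.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): create one local symlink
# per executable found in a remote bin directory.
# Usage: link_remote_bins REMOTE_BIN LOCAL_CACHE
link_remote_bins() {
    remote_bin=$1
    local_cache=$2
    mkdir -p "$local_cache" || return 1
    for f in "$remote_bin"/*; do
        # Skip anything that is not an executable regular file.
        [ -f "$f" ] && [ -x "$f" ] || continue
        # -f replaces a stale link of the same name.
        ln -sf "$f" "$local_cache/$(basename "$f")"
    done
}

# Example, using the paths from the post (run while apollo is up):
#   link_remote_bins /home/apollo/bin /local/cache/bin
#   PATH=/local/cache/bin:$PATH
```

The symlinks live on the local disk, so a path lookup that only touches /local/cache/bin does not wait on apollo. Only following a link, that is, actually running cad, reaches the NFS server.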
From: unruh on 24 May 2010 00:01

On 2010-05-24, Robert Heller <heller(a)deepsoft.com> wrote:
> At Sun, 23 May 2010 22:30:13 GMT unruh <unruh(a)wormhole.physics.ubc.ca> wrote:
>
>> On 2010-05-23, Maxwell Lol <nospam(a)com.invalid> wrote:
>> > unruh <unruh(a)wormhole.physics.ubc.ca> writes:
>> >
>> >>> for any command that might access the file descriptor on the remote
>> >>> machine, run it in the background. (i.e. type & at the end of the
>> >>> line).
>> >>
>> >> ??? Since almost any program could, that would essentially mean running
>> >> all programs in background, including the ones from the gui.
>> >
>> > Yes, any shell program COULD access the NFS mounted directory.
>> >
>> > One of the things you can do to protect yourself is to minimize the
>> > number of NFS-mounted directories in your searchpath. I suggest you
>> > eliminate all of them.
>>
>> This is absurd. Computers are here to help us, not to bend our behaviour
>> to their whims. If nfs refuses to unmount even when the remote end has
>> died, it is a bug in nfs. The purpose of nfs mounted drives is precisely
>> to have access to the files, and to run programs. Demanding that all
>> programs be run in the background (useless if the purpose of the program
>> is to produce console or gui output) or that nfs mounts not be in your
>> searchpath is simply idiotic.
>
> NFS works fine, so long as the NFS server is 'well armored' -- that is
> the server needs to be a solidly reliable system. NFS does not work
> well with a server system that might crash randomly or can be shutdown
> on a whim. I would not use NFS for occasional access -- using scp or
> some on-demand / as needed method is better.

None of my servers are shut down on a whim. But crashes are almost always
random. If they were predictable, they would not be allowed to happen.
Things happen: cleaners plug their floor polishers into your powerbar,
power glitches happen, etc. NFS should be able to handle them. It does
not. That is a bug.

>> > I created a directory on the local system that had symlinks to the
>> > remote executables on all of the remote servers. That way, my shell
>> > only hangs when I executed a command on a remote server.
>>
>> And what do symlinks do for you. If you need something from the nfs
>> filesystem the symlink does not prevent the attempt to access the nfs
>> partition. Or the "in use" error.
From: Doctor J. Frink on 24 May 2010 04:44

On 2010-05-22, James H. Markowitz <noone(a)nowhere.net> wrote:
> I have an NFS partition P mounted locally from a remote machine.
> The problem is that the remote machine has died. When I do
>
>     umount -f P
>
> I get a diagnostic whereby somebody is still using P. I try to find out
> who by means of fuser, and fuser hangs, apparently forever.
>
> How can I get rid of P without rebooting my box?

If you really *must* remove the mount while a process is still accessing
it you can use

    umount -l P

This 'lazy' unmounting will remove the mount even if it's busy.

Adding 'intr' to the mount options (before mounting) will also allow you
to interrupt/kill programs blocking on a dead mount. I don't think that
always works though.

Frink

-- 
Doctor J. Frink : 'Rampant Ribald Ringtail'
See his mind here : http://www.cmp.liv.ac.uk/frink/
Annoy his mind here : pjf at cmp dot liv dot ack dot ook
"No sir, I didn't like it!" - Mr Horse
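Since tools such as fuser can themselves hang on a dead mount, one way to decide whether to reach for the lazy unmount is to probe the mount point under a time limit first. The sketch below is an illustration only: the name mount_responds is hypothetical, it relies on the coreutils timeout command, and whether the probe can actually be killed depends on the kernel and the mount options in use.

```shell
# Hypothetical helper (not from the thread): succeed only if stat on DIR
# completes within SECONDS (default 5).
# Usage: mount_responds DIR [SECONDS]
mount_responds() {
    dir=$1
    limit=${2:-5}
    # SIGKILL is requested because a process blocked on NFS may not react
    # to milder signals. A missing DIR also counts as "not responding".
    timeout -s KILL "$limit" stat "$dir" >/dev/null 2>&1
}

# Example, with P standing for the mount point as in the question:
#   mount_responds P 5 || umount -l P
```

The probe only reports whether the mount point answered in time. It does not say which processes hold the mount busy, so processes already blocked on it still need to be dealt with separately.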
From: Kevin D. Snodgrass on 24 May 2010 16:35

unruh wrote:
> This is not "crashing". Have an external machine from which you mount
> a directory by nfs on your machines. cd into that directory.
> Switch off that external machine, or
> unplug its ethernet cable. It is now impossible to unmount the directory
> on your local machine. df may well go into an eternal wait for the
> output from that external machine. etc.
> If you have some way of avoiding that, I at least would be extremely
> grateful.

From a local LUG member:

There are 2 kinds of nfs mounts : soft mounts and hard mounts. With a
soft mount, if the server goes away (reboot/lost network/whatever) the
client can gracefully close the connection, unmount the share and
continue on. However, in order for the client to maintain this ability,
it has to do more caching of the reads and writes, so performace is
lower and it's possible to lose data if the server goes away
unexpectedly. A hard mount means that the client never gives up trying
to hit the server. Ever. Eventually the load on the client will be so
high due to backed up i/o requests, you'll have to reboot it. Not all
that good, but hard mounts don't have the caching overhead soft mounts
do, so you have less chance of losing data.

All typos are his...

From my understanding of "man 5 nfs", the "intr" option will allow you
to interrupt (kill -15?) any process that is indefinitely waiting on a
lost nfs mount, once the timeouts have expired. Check on timeo= and
retrans= in the man page.
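The options mentioned above (hard or soft, intr, timeo=, retrans=) are passed together as one comma-separated string. As a rough sketch, with a hypothetical helper name and purely illustrative numbers: nfs(5) gives timeo in tenths of a second and retrans as the number of retries before the client takes further recovery action. The same man page is the place to check whether intr is still honoured on a given kernel.

```shell
# Hypothetical helper (not from the thread): build an NFS option string.
# Usage: nfs_opts MODE TIMEO RETRANS
nfs_opts() {
    mode=$1      # "hard" or "soft"
    timeo=$2     # tenths of a second to wait before retrying a request
    retrans=$3   # number of retries before further recovery action
    case $mode in
        hard|soft) ;;
        *) echo "nfs_opts: mode must be hard or soft" >&2; return 1 ;;
    esac
    printf '%s,intr,timeo=%s,retrans=%s\n' "$mode" "$timeo" "$retrans"
}

# Example (server name and paths are placeholders, values are not
# recommendations):
#   mount -t nfs -o "$(nfs_opts hard 600 2)" server:/export /mnt/export
```

With soft, a request fails with an error once the retries are used up. With hard, the client keeps retrying, which is why intr matters for getting a stuck process back.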
From: Grant on 24 May 2010 23:29

On Mon, 24 May 2010 15:35:10 -0500, "Kevin D. Snodgrass" <kdsnodgrass(a)yahoo.com> wrote:

>unruh wrote:
>> This is not "crashing". Have an external machine from which you mount
>> a directory by nfs on your machines. cd into that directory.
>> Switch off that external machine, or
>> unplug its ethernet cable. It is now impossible to unmount the directory
>> on your local machine. df may well go into an eternal wait for the
>> output from that external machine. etc.
>> If you have some way of avoiding that, I at least would be extremely
>> grateful.
>
> From a local LUG member:
>
>There are 2 kinds of nfs mounts : soft mounts and hard mounts. With a
>soft mount, if the server goes away (reboot/lost network/whatever) the
>client can gracefully close the connection, unmount the share and
>continue on. However, in order for the client to maintain this ability,
>it has to do more caching of the reads and writes, so performace is
>lower and it's possible to lose data if the server goes away
>unexpectedly. A hard mount means that the client never gives up trying
>to hit the server. Ever. Eventually the load on the client will be so
>high due to backed up i/o requests, you'll have to reboot it. Not all
>that good, but hard mounts don't have the caching overhead soft mounts
>do, so you have less chance of losing data.
>
>All typos are his...
>
> From my understanding of "man 5 nfs", the "intr" option will allow you
>to interrupt (kill -15?) any process that is indefinitely waiting on a
>lost nfs mount, once the timeouts have expired. Check on timeo= and
>retrans= in the man page.

Yes, I already mentioned intr. I have mounted nfs stuff 'hard,intr' here
for years and cannot remember nfs getting wedged.

Grant.
-- 
http://bugs.id.au/