From: Bruce Esquibel on
Gary Mills <mills(a)cc.umanitoba.ca> wrote:

> I'm running an anonymous FTP server using the stock in.ftpd on Solaris
> 10. It works nicely, except that an idle session occasionally
> remains. Eventually, these reach the session limit, preventing new
> sessions from starting. This may take a month. I have to disable and
> enable the service when that happens to clear the idle sessions.

> This sort of idle session is always blocked on the same read():

> # truss -p 21257
> *** SUID: ruid/euid/suid = 49 / 49 / 0 ***
> read(9, 0x08045934, 1) (sleeping...)


This isn't much of an answer, but is whatever directory the anonymous account
has access to mounted over nfs by any chance?

That truss output and where it ends reminds me of a problem we ran into on
S10 when we were setting up a shell box with home and mail mounted from a
file server (not local directories). Everything was fine for a few days, a
week, sometimes a month, then for no cause I ever found, all kinds of simple
user processes would get stuck like that.

By trial and error everything pointed to the nfs directories and I think I
posted a couple times about nfsd on the file server racking up a lot (and I
mean a lot) of cpu time unlike the S9 server we used to use.

Judging from where the truss is stuck, it seemed to have something to do
with the extended attributes for nfs, but it was never clear how to properly
disable them or even if that would have helped.

In our case, since the shell box sees so little use, it was easier to put it
in a zone on the file server itself and mount everything lofs, which stopped
the problem cold.

-bruce
bje(a)ripco.com
From: Gary Mills on
In <hfm1vl$a0o$1(a)remote5bge0.ripco.com> Bruce Esquibel <bje(a)ripco.com> writes:

>Gary Mills <mills(a)cc.umanitoba.ca> wrote:

>> I'm running an anonymous FTP server using the stock in.ftpd on Solaris
>> 10. It works nicely, except that an idle session occasionally
>> remains. Eventually, these reach the session limit, preventing new
>> sessions from starting. This may take a month. I have to disable and
>> enable the service when that happens to clear the idle sessions.

>> This sort of idle session is always blocked on the same read():

>> # truss -p 21257
>> *** SUID: ruid/euid/suid = 49 / 49 / 0 ***
>> read(9, 0x08045934, 1) (sleeping...)


>This isn't much of an answer, but is whatever directory the anonymous account
>has access to mounted over nfs by any chance?

No, they are on local disk. The problem seems to be that the client
has disconnected but the server never times out.
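
For illustration, a minimal Python sketch of the kind of idle timeout one
would expect on the control connection. This is not how in.ftpd is
implemented; `read_command_byte` and the timeout values are hypothetical,
picked only for the example. It reads one byte at a time, like the read() in
the truss output, but gives up after a fixed idle period instead of sleeping
forever.

```python
import socket


def read_command_byte(conn, idle_timeout):
    """Read one byte from conn with an idle timeout.

    Returns the byte read, b"" if the peer closed the connection cleanly,
    or None if nothing arrived within idle_timeout seconds.
    """
    conn.settimeout(idle_timeout)
    try:
        return conn.recv(1)
    except socket.timeout:
        # No data and no close notification within the idle period.
        return None
```

A caller would treat both None (idle too long) and b"" (peer closed) as a
reason to end the session and free the slot.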

--
-Gary Mills- -Unix Group- -Computer and Network Services-
From: Bruce Esquibel on
Gary Mills <mills(a)cc.umanitoba.ca> wrote:

> No, they are on local disk. The problem seems to be that the client
> has disconnected but the server never times out.

I don't know then; it was a guess.

Have you considered replacing the in.ftpd with something else like proftpd?

On the machines that we still support ftp on (we try to get people to use
sftp instead), I sort of like running the proftpd in paranoid log mode.

The only question is: you said in the original message that this was
happening with the anonymous ftp sessions. Does that imply you are using
the same daemon for normal user logins and it's not happening with them?

The only thought I have now is, maybe there is no problem at all. I noticed
in the past couple years with ftp, it's not that common for people to even
use an ftp program anymore. On the Macs, I think, you can mount an ftp
server as a directory just through the Finder. I'd guess there is some kind
of keep alive going on.

Even with the windows stuff, something was changed lately. I noticed a half
dozen logins now that continuously send a keep alive. They do what they have
to do and probably stick it in the background, but whatever program it is
just doesn't time out the connection anymore.

I'm not sure any of this applies if the socket is in fact closed, but besides
trying another ftp daemon, I don't see what else you can try.

-bruce
bje(a)ripco.com
From: Gary Mills on
In <hfoovu$ium$1(a)remote5bge0.ripco.com> Bruce Esquibel <bje(a)ripco.com> writes:

>Gary Mills <mills(a)cc.umanitoba.ca> wrote:

>> No, they are on local disk. The problem seems to be that the client
>> has disconnected but the server never times out.

>I don't know then; it was a guess.

>Have you considered replacing the in.ftpd with something else like proftpd?

No, I prefer to use Sun's FTP server.

>The only question is: you said in the original message that this was
>happening with the anonymous ftp sessions. Does that imply you are using
>the same daemon for normal user logins and it's not happening with them?

I don't know. The problem appears on our anonymous FTP server where
all of the sessions are anonymous.

>Even with the windows stuff, something was changed lately. I noticed a half
>dozen logins now that continuously send a keep alive. They do what they have
>to do and probably stick it in the background, but whatever program it is
>just doesn't time out the connection anymore.

The FTP server configuration itself specifies TCP keepalives, but I
don't know why they would affect connections in the FIN_WAIT_2 state.
Shouldn't it still time out?
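
For reference, a minimal Python sketch of what enabling keepalives on a
socket looks like at the API level. `enable_keepalive` is a hypothetical
helper, and this does not show how in.ftpd or its configuration does it; it
only turns the option on with whatever timers the OS uses by default.

```python
import socket


def enable_keepalive(sock):
    """Turn on TCP keepalive probes for sock, using the OS default timers.

    Returns True if the option reads back as enabled.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```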

--
-Gary Mills- -Unix Group- -Computer and Network Services-
From: Bruce Esquibel on
Gary Mills <mills(a)cc.umanitoba.ca> wrote:

> The FTP server configuration itself specifies TCP keepalives, but I
> don't know why they would affect connections in the FIN_WAIT_2 state.
> Shouldn't it still time out?

It looks like it to me, at least from using the google finger for seeing
what FIN_WAIT_2 is supposed to be doing. That info that Oscar passed along
with ndd and tcp_fin_wait_2_flush_interval seems to indicate after 11
minutes, it should close everything up.
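
As a quick check on the 11-minute figure, here is the conversion as a small
Python sketch. It assumes the tunable is expressed in milliseconds and is set
to 675000, which is the commonly documented Solaris 10 default; that number
is an assumption here, not a value taken from this thread, so check it with
ndd on your own box.

```python
def ms_to_minutes(interval_ms):
    """Convert a millisecond tunable value to minutes."""
    return interval_ms / 60_000


# Assumed default for tcp_fin_wait_2_flush_interval; verify on your system.
ASSUMED_FLUSH_INTERVAL_MS = 675_000

print(ms_to_minutes(ASSUMED_FLUSH_INTERVAL_MS))  # 11.25
```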

But it does seem like FIN_WAIT_2 has been a plague on mankind, at least
since the mid/late '90s, on different operating systems. Same sort of
problem: hundreds of them pile up and eventually choke the daemon, both ftp
and httpd.

It seems to point to something I mentioned, the client software and broken
implementations of how the disconnect is supposed to be handled.

Since FIN_WAIT_2 is sort of global for any tcp connection, you really should
reconsider the loyalty to the Sun-built in.ftpd and try something else. It
appears setting the FIN_WAIT_2 timeout value to under 10 minutes is a general
no-no across most OSes. Since it doesn't seem to be working anyway, changing
the values likely isn't going to help.

-bruce
bje(a)ripco.com