From: Peter Duniho on 14 Dec 2009 02:43 Goran Sliskovic wrote: > CLOSE_WAIT means that the TCP connection has been terminated, but the > socket descriptor has not been closed (by calling close by application, > of course). There is no possibility that there are messages still in > transit (at least messages that could do any damage). TIME_WAIT state is > there to ensure this. Sorry�I have confused the two. Thanks. > The socket is bound in OP's case because it is not closed, for whatever > reason. Connection is closed, however socket is not. The OP mentions another case where the connection indefinitely remains in CLOSE_WAIT even after the process exits. Can you think of a reason that might happen? I couldn't. Pete
From: Goran Sliskovic on 14 Dec 2009 17:20 "Peter Duniho" <no.peted.spam(a)no.nwlink.spam.com> wrote in message news:eUPItEJfKHA.5020(a)TK2MSFTNGP02.phx.gbl... .... > The OP mentions another case where the connection indefinitely remains in > CLOSE_WAIT even after the process exits. Can you think of a reason that > might happen? I couldn't. > > Pete Once I saw this with WebRequest class. Either .NET uses some other process that cleans up the socket (or some other class) or process is not shown in task manager but is still somehow active.But this would be internals of windows/.net that I'm not familiar with. I'm pretty sure that exiting the process should close all sockets. The OP problem looks to me as a case of deadlock, possibly in the finalizer. The only way to see what is going on is to attach debugger (eg. Windbg with SOS.dll). Regards, Goran
From: Michael Ober on 14 Dec 2009 20:34 "Goran Sliskovic" <gsliskov(a)yahoo.com> wrote in message news:%23BVPjuQfKHA.5608(a)TK2MSFTNGP05.phx.gbl... > > "Peter Duniho" <no.peted.spam(a)no.nwlink.spam.com> wrote in message > news:eUPItEJfKHA.5020(a)TK2MSFTNGP02.phx.gbl... > ... >> The OP mentions another case where the connection indefinitely remains in >> CLOSE_WAIT even after the process exits. Can you think of a reason that >> might happen? I couldn't. >> >> Pete > > Once I saw this with WebRequest class. Either .NET uses some other process > that cleans up the socket (or some other class) or process is not shown in > task manager but is still somehow active.But this would be internals of > windows/.net that I'm not familiar with. I'm pretty sure that exiting the > process should close all sockets. > The OP problem looks to me as a case of deadlock, possibly in the > finalizer. The only way to see what is going on is to attach debugger (eg. > Windbg with SOS.dll). > > Regards, > Goran Unfortunately I can't attach a debugger to the XP system, but I'll try the updated server code on Wednesday to see if the problem is corrected. Mike.
From: Michael Ober on 14 Dec 2009 21:09 Peter, I think I found it this morning. It actually wasn't in the IPServer namespace at all. Using Remote Desktop, I started the server inside the debugger on my workstation yesterday and checked it a couple of times during the day several hours apart. No problems - making the "long idle" period unlikely. This morning I put a breakpoint on the OnAccept method and connected, single stepping through it. Inside this method there is a call to WriteLog, which creates consistent log files across multiple Windows development environments (not dotNet logging) and during the change of day processing an infinite recursive loop occurred resulting in a stack overflow. I fixed that problem and tested by changing the clock on my system. I'll check tomorrow morning to verify. What baffles me is why I didn't get a Stack Overflow error on the console, even in a Debug release. The My Namespace event handlers were a late addition to the application and I would have expected a Stack Overflow error to be reported prior to my adding them. That would have made it a lot quicker to find and fix. I think I corrected all but the _SendResults and _bufOut issues that you pointed out and I'm simply not sure how to fix that one, short of using SyncLock. Assuming the server is working tomorrow morning, I'll post the updated code tomorrow evening. I also realized that creating a disconnect timer for each connection was extremely inefficient so I recoded this feature to use a single timer in the IPServer class and a decrement counter in each ClientInformation object. Since timeouts are always measured in minutes and the exact timeout period isn't critical, this change too hard to do. The decrement counter is set to the max inactivity time passed to the IPServer.New method and decremented once a minute by a timertick handler in the IPServer class itself. The Send and Receive methods of the ClientInformation class reset this counter to the time out value. Yes, there can be a race condition here, but if it occurs, one of two things will happen. Either the server will break down the connection, which still complies with the server's contract with its clients, or the connection timeout will be reset, which will simply keep the connection alive longer. In either case, if the client disconnects, the EndRecieve method of the socket will return 0 bytes and the connection will be shutdown by the server. The events being raised were really for user feedback, so I combined them into a new Event that better describes what it's for. This broke the interface "contract" presented by the IPServer class to the application, but as there were only two applications using this class, that wasn't a big deal. If this had been for external use or there had been a lot of applications using this interface, I would have added the new event and left the old ones there. The new code actually hides the Clients List and the ConnectionInformation class by making them private to the IPServer class as well as flattening the NameSpace. The original desired design was to inherit this class and override the ProcessMessage method for the business logic, hiding all the complexity of handling the TCP communications. There wasn't supposed to be external access to the client connections. I was close previously, but not quite there. This has the additional benefit of allowing multiple IPServers to exist in the same application without the possibility of them interfering with each other. Each IPServer override could then implement different business logic, simplifying the ProcessMessage override. In the environment I work in, this is useful as I have a server that collects information from multiple sources and updates multiple external applications. This particular application is the other one that uses this class. It uses the IPServer class and the MSMQ interface to collect data and it updates three different systems, providing an integrated view of the data in a near real-time environment. The vendor of the external system would like to do away with the MSMQ interface and go strictly TCP for updates. As for the recompile - I'm actually surprised this application worked at all during date changes. After the rebuild I did restore from tape and the restored executable also crashed. The WriteLog stack overflow problem has been in the code since it was ported from VC6 and VB6 to VB2005 several years ago. This bug doesn't exist in the VC6 version as I have another application server, writtin in VC++/MFC, running on the same server that doesn't exhibit this problem. I don't know if it exists in the VB6 version of the class and will probably never find out as all new development is in C# and VB 2008. In my environment, there are no extended execution TCP servers written in VB6. I used VC++/MFC for those as it was much easier for me to handle TCP messages in VC++/MFC than in VB6. Once again, thanks for all your feedback. Mike Ober. "Peter Duniho" <no.peted.spam(a)no.nwlink.spam.com> wrote in message news:eXhLzCJfKHA.5292(a)TK2MSFTNGP06.phx.gbl... > Michael Ober wrote: >> [...] >> I can duplicate this with a simple connect/disconnect. The problem >> occurs on the connection. > > According to your original problem statement, the problem occurs _after_ > there has been some activity, followed by a period of idle time, and then > a new connection. Until you've completely isolated the problem, it's not > possible to rule out potential dependencies on the client with respect to > the problem. > > If you _can_ in fact reproduce the problem with a simple connect, perhaps > after some specific period of being idle, even so...you need to provide > the code for that. If you expect people to reproduce the problem, you > can't expect them to write new code to do that. > > If you are satisfied with conjecture, then don't provide a > concise-but-complete code example. > >> [...] >> Let me see if I understand. Dump the two class level variables and pass >> the class (Me) through the Begin... method. Then pull the class out of >> the AsyncState property for access to the buffers. > > I'm not sure I understand the above. Generally, an i/o system will track > two kinds of data structures: "per-connection", and "per-operation". You > seem to have a per-connection structure now. That structure would not > contain any per-operation data (i.e. buffers). Instead, you'd have a > separate data structure for that data. > >> [...] >> I have made several changes based on your comments. I haven't touched >> the _bufOut or the _SendResults variables, but the _ReceiveResults >> variable was actually not used anywhere and is now gone. I've started >> the program inside the VS debugger. I've confirmed that it is accepting >> connections and returning results today and tomorrow morning I'll see >> what happens when I try to connect. > > Just to be clear: I have not identified any "smoking gun" in the code you > posted. There are a number of things that I feel are potential issues, at > least with respect to maintainability if not correctness. But none are > obviously causing problems, never mind the problem you describe. > > I remain curious about the answers to the questions I asked in my other > reply, with respect to your comments that the problem started occurring > after you had other problems with the computer and had recompiled the code > that had been working fine for nearly a year. > > In spite of all the "fishy" things about the code, it's entirely possible > that there's nothing wrong with the code that would lead to incorrect > behavior, and that the problem lies elsewhere. > > Pete
From: Peter Duniho on 14 Dec 2009 21:48 Michael Ober wrote: > Peter, > > I think I found it this morning. It actually wasn't in the IPServer > namespace at all. Just as I suggested might be the case. I'm glad you were able to find the problem. But I hope you can see, based on this experience, why a concise-but-complete code example is so important in terms of obtaining help. Without one, all anyone can do is make random guesses and suggestions, none of which may have anything at all to do with the problem. While comments you received to your question might have been helpful in a general sense, I doubt any of them were instrumental in leading you to a solution to the specific problem you were asking about. > [...] > I think I corrected all but the _SendResults and _bufOut issues that you > pointed out and I'm simply not sure how to fix that one, short of using > SyncLock. Nothing wrong with a SyncLock, IMHO. It's always most important to make sure the code works right, then worry about whether there's a better, more efficient solution. And .NET's Monitor class is reasonably efficient. > [...] > I also realized that creating a disconnect timer for each connection was > extremely inefficient so I recoded this feature to use a single timer in > the IPServer class and a decrement counter in each ClientInformation > object. If I were to code the timeout/disconnect logic from scratch, I might use a sorted event queue, sorted by the event time and with a single thread, and a call to Thread.Sleep() to delay until the next event. Add the various logic to deal with resetting timers, adding new ones with a future time earlier than the earliest current one in the queue, etc. and you're done. Fortunately, I believe that's basically what the System.Timers.Timer timer does. It's specifically designed for server-related timers (or at least, so the documentation says :) ), and I'm pretty sure one of its main features is the ability to track a large number of timed events as timers, without putting a significant load on the system. Note that as compared to this approach: > Since timeouts are always measured in minutes and the exact > timeout period isn't critical, this change too hard to do. The > decrement counter is set to the max inactivity time passed to the > IPServer.New method and decremented once a minute by a timertick handler > in the IPServer class itself. The Send and Receive methods of the > ClientInformation class reset this counter to the time out value. Advantage to your approach: reset of a timer is simply a new counter value. Disadvantage to your approach: on every tick of the timer, you have to inspect every connected client's data structure. One downside to the sorted list of future events is the cost of the insertion of a new element (either O(n) or O(log n), depending on the implementation, which I don't know), borne every time you add a timer or reset one. And of course, for large numbers of clients, this can be large too. Of course, if all the connections use the same timeout (which seems likely), than any reset timer always winds up at the very end of the sorted list, and the cost of doing that can be almost as cheap as resetting a counter, because you already know where the new position is in the list rather than having to search for it. Obviously it depends somewhat on the specifics. But based on what I think is likely about your implementation (specifically, every connection uses the same timeout), you may find it preferable to actually just create your own single-threaded event queue for the timing, taking advantage of the fact that you know every new event added (whether a new connection or one for which i/o operation happened and thus needs a reset timer) goes to the end of the list. I don't know for sure, but it's possible System.Timers.Timer includes an optimization to check for end-of-list insertions, and so could in fact perform basically as well as a custom implementation would. (I took a quick look in Reflector, but unfortunately, the meat of the implementation seems to be in native code, which doesn't show up in Reflector; I can, however, verify that System.Timers.Timer just uses System.Threading.Timer underneath, so whatever characteristics one has, the other is pretty much the same :) ). > Yes, > there can be a race condition here, but if it occurs, one of two things > will happen. Either the server will break down the connection, which > still complies with the server's contract with its clients, or the > connection timeout will be reset, which will simply keep the connection > alive longer. In either case, if the client disconnects, the EndRecieve > method of the socket will return 0 bytes and the connection will be > shutdown by the server. [...] Right. Timeouts always involve race conditions. The best you can do is just make sure you have a way of knowing who won, and then get that part right without corrupting your data. :) Pete
First
|
Prev
|
Next
|
Last
Pages: 1 2 3 4 Prev: Reflection and compiler inlining Next: Economical Auction Site |