From: Kevin McMurtrie on 9 Jun 2010 01:13

In article <l6udnc0kra5W-pvRnZ2dnUVZ_qadnZ2d(a)earthlink.com>,
 Patricia Shanahan <pats(a)acm.org> wrote:

> Kevin McMurtrie wrote:
> ...
> > To clarify a bit, this isn't hammering a shared resource. I'm talking
> > about 100 to 800 synchronizations on a shared object per second for a
> > duration of 10 to 1000 nanoseconds. Yes, nanoseconds. That shouldn't
> > cause a complete collapse of concurrency.
> >
> ...
>
> Have you considered other possibilities, such as memory thrashing? The
> resource does not seem heavily enough used for contention to be a big
> issue, but it is about the sort of access rate that is low enough to
> allow a page to be swapped out, but high enough for the time waiting for
> it to matter.
>
> Patricia

It happened again today during testing of a different server class on
the same OS and hardware. This time it was under a microscope. There
were 10 gigabytes of idle RAM, no DB contention, no tenured GC, no disk
contention, and the total CPU was around 25%. There was no gridlock
effect - it always involved one synchronized method that did not depend
on other resources to complete. Throughput dropped to ~250 calls per
second at a specific method for several seconds, then it recovered.
Then it happened again elsewhere, then recovered. After several minutes
the server was at top speed again. We then pushed traffic until its
1Gbps Ethernet link saturated, and there wasn't a trace of thread
contention ever returning.

--
I won't see Google Groups replies because I must filter them as spam
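A stall like the one described shows up clearly if you sample per-thread
blocked counts while it is happening. Below is a minimal sketch of that
kind of probe using java.lang.management; the class name, the five-second
interval, and the printed fields are illustrative choices, not anything
from the actual tooling used in the thread. In a server you would run the
loop on a daemon thread rather than as a separate main.

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    // Samples all threads every few seconds and prints the ones that
    // have spent time blocked on a monitor, so a pile-up on one lock
    // stands out. Contention monitoring is off by default and adds
    // overhead; enable it only while investigating.
    public class ContentionProbe {
        public static void main(String[] args) throws InterruptedException {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            if (mx.isThreadContentionMonitoringSupported()) {
                mx.setThreadContentionMonitoringEnabled(true);
            }
            while (true) {
                Thread.sleep(5000);
                for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
                    // getBlockedTime() returns -1 unless contention
                    // monitoring is enabled; times are cumulative.
                    if (info.getBlockedTime() > 0) {
                        System.out.printf("%s blocked %d times, %d ms, on %s%n",
                                info.getThreadName(), info.getBlockedCount(),
                                info.getBlockedTime(),
                                String.valueOf(info.getLockName()));
                    }
                }
            }
        }
    }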
From: Robert Klemme on 9 Jun 2010 01:43

On 09.06.2010 07:13, Kevin McMurtrie wrote:
> In article<l6udnc0kra5W-pvRnZ2dnUVZ_qadnZ2d(a)earthlink.com>,
>  Patricia Shanahan<pats(a)acm.org> wrote:
>
>> Kevin McMurtrie wrote:
>> ...
>>> To clarify a bit, this isn't hammering a shared resource. I'm talking
>>> about 100 to 800 synchronizations on a shared object per second for a
>>> duration of 10 to 1000 nanoseconds. Yes, nanoseconds. That shouldn't
>>> cause a complete collapse of concurrency.
>>>
>> ...
>>
>> Have you considered other possibilities, such as memory thrashing? The
>> resource does not seem heavily enough used for contention to be a big
>> issue, but it is about the sort of access rate that is low enough to
>> allow a page to be swapped out, but high enough for the time waiting for
>> it to matter.
>>
>> Patricia
>
> It happened again today during testing of a different server class on
> the same OS and hardware. This time it was under a microscope. There
> were 10 gigabytes of idle RAM, no DB contention, no tenured GC, no disk
> contention, and the total CPU was around 25%. There was no gridlock
> effect - it always involved one synchronized method that did not depend
> on other resources to complete. Throughput dropped to ~250 calls per
> second at a specific method for several seconds, then it recovered.
> Then it happened again elsewhere, then recovered. After several minutes
> the server was at top speed again. We then pushed traffic until its
> 1Gbps Ethernet link saturated, and there wasn't a trace of thread
> contention ever returning.

Did you scrutinize the GC log? That is something I would definitely
look into. Other than that, it's difficult to come up with concrete
information from such a general problem description.

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
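For reference, a JDK 6 HotSpot command line that produces a GC log
detailed enough for this kind of investigation looks something like the
following; the log path and jar name are just examples.
PrintGCApplicationStoppedTime is worth including because safepoint
pauses look exactly like lock contention from the application's point
of view.

    java -verbose:gc \
         -XX:+PrintGCDetails \
         -XX:+PrintGCTimeStamps \
         -XX:+PrintGCApplicationStoppedTime \
         -Xloggc:/var/log/myapp/gc.log \
         -jar server.jar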
From: Robert Klemme on 9 Jun 2010 01:46

On 09.06.2010 05:09, Mike Schilling wrote:
>
> "Robert Klemme" <shortcutter(a)googlemail.com> wrote in message
> news:877quaFr6gU1(a)mid.individual.net...
>> On 08.06.2010 05:39, Kevin McMurtrie wrote:
>>
>>> Fixing every single shared synchronized method in every 3rd party
>>> library could take a very, very long time.
>>
>> I have no idea where you take that from. Nobody suggested fixing third
>> party libraries - if anything, the suggestion was to use them properly.
>
> What if they use system properties promiscuously? Hypothetically:
>
> 1. My application receives XML messages.
> 2. I use a third-party library to deserialize the XML into Java objects.
> 3. The third-party library uses JAXP to find an XML parser.
> 4. JAXP always checks for a system property that points to the parser's
> class name.
>
> Even if the details are off (I don't know whether current versions of
> JAXP cache the class name), you get the idea.

In that case I would check whether the lib was used properly; if not,
the lib would indeed need fixing. Alternatively, you would have to
replace it with something else (or a newer version, but IIRC JAXP is
part of the JDK nowadays).

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
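When the lookup is in code you control, the usual fix for Mike's
hypothetical is to pay the JAXP factory lookup once and reuse the
result; per the JAXP lookup order, DocumentBuilderFactory.newInstance()
consults the javax.xml.parsers.DocumentBuilderFactory system property.
A minimal sketch, assuming the parser configuration never changes at
runtime (the class name CachedParser is made up for illustration);
since DocumentBuilder is not thread-safe, it keeps one per thread
rather than synchronizing on a shared instance:

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;

    public final class CachedParser {
        // newInstance() performs the system property/classpath lookup,
        // so do it exactly once instead of per message.
        private static final DocumentBuilderFactory FACTORY =
                DocumentBuilderFactory.newInstance();

        // One builder per thread; no shared lock on the hot path.
        private static final ThreadLocal<DocumentBuilder> BUILDER =
                new ThreadLocal<DocumentBuilder>() {
                    @Override protected DocumentBuilder initialValue() {
                        try {
                            // Factory configuration isn't thread-safe
                            // either; this runs once per thread.
                            synchronized (FACTORY) {
                                return FACTORY.newDocumentBuilder();
                            }
                        } catch (ParserConfigurationException e) {
                            throw new IllegalStateException(e);
                        }
                    }
                };

        public static DocumentBuilder builder() {
            return BUILDER.get();
        }

        private CachedParser() {}
    }

This only helps if the deserialization path is yours; a third-party
library that does its own lookup per message would still need to be
replaced or patched.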
From: Kevin McMurtrie on 9 Jun 2010 02:06

In article <86mc28Fn90U1(a)mid.individual.net>,
 Robert Klemme <shortcutter(a)googlemail.com> wrote:

> On 02.06.2010 07:45, Kevin McMurtrie wrote:
> > In article<4c048acd$0$22090$742ec2ed(a)news.sonic.net>,
> >  Kevin McMurtrie<mcmurtrie(a)pixelmemory.us> wrote:
> >
> >> I've been assisting in load testing some new high performance servers
> >> running Tomcat 6 and Java 1.6.0_20. It appears that the JVM or Linux
> >> is suspending threads for time-slicing in very unfortunate locations.
> >> For example, a thread might suspend in Hashtable.get(Object) after a
> >> call to getProperty(String) on the system properties. It's a
> >> synchronized global, so a few hundred threads might pile up until the
> >> lock holder resumes. Odds are that those hundreds of threads won't
> >> finish before another one stops to time slice again. The performance
> >> hit has a ton of hysteresis, so the server doesn't recover until it
> >> has a lower load than before the backlog started.
> >>
> >> The brute force fix is of course to eliminate calls to shared
> >> synchronized objects. All of the easy stuff has been done. Some
> >> operations aren't well suited to simple CAS. Bottlenecks that are
> >> part of well established Java APIs are time consuming to fix/avoid.
> >>
> >> Is there JVM or Linux tuning that will change the behavior of thread
> >> time slicing or preemption? I checked the JDK 6 options page but
> >> didn't find anything that appears to be applicable.
> >
> > To clarify a bit, this isn't hammering a shared resource. I'm talking
> > about 100 to 800 synchronizations on a shared object per second for a
> > duration of 10 to 1000 nanoseconds. Yes, nanoseconds. That shouldn't
> > cause a complete collapse of concurrency.
>
> It's the nature of locking issues. Up to a particular point it works
> pretty well, and then locking delays explode because of the positive
> feedback.
>
> If you have "a few hundred threads" accessing a single shared lock with
> a frequency of 800Hz then you have a design issue - whether you call it
> "hammering" or not. It's simply not scalable, and if it doesn't break
> now it will likely break with the next increase in load.
>
> > My older 4-core Mac Xeon can have 64 threads call getProperty(String)
> > on a shared Properties instance 2 million times each in only 21 real
> > seconds. That's one call every 164 ns. It's not as good as
> > ConcurrentHashMap (one per 0.30 ns) but it's no collapse.
>
> Well, then stick with the old CPU. :-) It's not uncommon that moving to
> newer hardware with increased processing resources uncovers issues like
> this.
>
> > Many of the basic Sun Java classes are synchronized. Eliminating all
> > shared synchronized objects without making a mess of 3rd party library
> > integration is no easy task.
>
> It would certainly help the discussion if you pointed out which exact
> classes and methods you are referring to. I would readily agree that
> Sun did a few things wrong initially in the std lib (Vector) which they
> partly fixed later. But I am not inclined to believe in a massive (i.e.
> affecting many areas) concurrency problem in the std lib.
>
> If they synchronize, they do it for good reasons - and you simply need
> to limit the number of threads that try to access a resource. A
> globally synchronized, frequently accessed resource in a system with
> several hundred threads is a design problem - but not necessarily in
> the implementation of the resource used but rather in the usage.
> > Next up is looking at the Linux scheduler version and the HotSpot
> > spinlock timeout. Maybe the two don't mesh, and a thread is very
> > likely to enter a semaphore right as its quantum runs out.
>
> Btw, as far as I can see you didn't yet disclose how you found out
> about the point where the thread is suspended. I'm still curious to
> learn how you found out. Might be a valuable addition to my toolbox.
>
> Kind regards
>
> robert

I have tools based on java.lang.management that will trace thread
contention. Thread dumps from QUIT signals show it too. The threads
aren't permanently stuck; they're just passing through 100000 times
slower than normal.

The problem with staying on the old system is that Oracle bought Sun
and some unpleasant changes are coming. MacOS X is only suited for
development machines.

Problem areas:

java.util.Properties - Removed from in-house code but still everywhere
else for everything. Used a lot by Sun and 3rd party code. Only
performs poorly on Linux.

org.springframework.context.support.ReloadableResourceBundleMessageSource
- Single-threaded methods down in the bowels of Spring. Only performs
poorly on Linux.

Log4J - Always sucks and needs to be replaced. In the meantime,
removing logging calls except when critical.

Pools, caches, and resource managers - In-house code that is expected
to run 100 - 300 times per second. Has no dependencies during
synchronization. Has been carefully tuned to be capable of millions of
calls per second on 2, 4, and 8 core hardware. They only stall on
high-end Linux boxes.

--
I won't see Google Groups replies because I must filter them as spam
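For the java.util.Properties case, one workaround for call sites you
control is a read-mostly snapshot, so hot paths never touch the global
Hashtable lock behind System.getProperty(). A minimal sketch, assuming
the relevant properties are effectively fixed after startup (the class
name FastProps is made up for illustration):

    import java.util.Map;
    import java.util.Properties;
    import java.util.concurrent.ConcurrentHashMap;

    // Hot paths read the ConcurrentHashMap, which takes no lock on
    // get(), instead of the globally synchronized Hashtable behind
    // System.getProperty(). Call refresh() if something legitimately
    // updates the properties at runtime.
    public final class FastProps {
        private static volatile Map<String, String> snapshot = snapshotNow();

        private static Map<String, String> snapshotNow() {
            Map<String, String> copy = new ConcurrentHashMap<String, String>();
            Properties props = System.getProperties();
            synchronized (props) {  // one coarse lock per refresh, not per read
                for (String name : props.stringPropertyNames()) {
                    String value = props.getProperty(name);
                    if (value != null) {  // ConcurrentHashMap rejects nulls
                        copy.put(name, value);
                    }
                }
            }
            return copy;
        }

        public static String get(String name) {
            return snapshot.get(name);
        }

        public static void refresh() {
            snapshot = snapshotNow();
        }

        private FastProps() {}
    }

It doesn't help third-party code that calls System.getProperty()
directly, which is the crux of the complaint above.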
From: Lew on 9 Jun 2010 02:46
Kevin McMurtrie wrote:
> The problem with staying on the old system is that Oracle bought Sun
> and some unpleasant changes are coming.

Oh?

--
Lew