From: DB on 15 Mar 2010 15:15 Every few days, our SQL 2008 server comes to a crawl and we have to reboot it. The CPU becomes pegged at 100%. I ran a query to roll-up my wait type counts. The top 3 wait types are SOS_SCHEDULER_YIELD, CXPACKET, and ASYNC_NETWORK_IO. Their wait time in seconds are 828426, 183026, 156414 respectively. Does anybody have any idea as to what is going on? This is killing me! Thanks, -- DB
From: Erland Sommarskog on 15 Mar 2010 18:23 DB (DB(a)discussions.microsoft.com) writes: > Every few days, our SQL 2008 server comes to a crawl and we have to > reboot it. The CPU becomes pegged at 100%. I ran a query to roll-up my > wait type counts. The top 3 wait types are SOS_SCHEDULER_YIELD, > CXPACKET, and ASYNC_NETWORK_IO. Their wait time in seconds are 828426, > 183026, 156414 respectively. Does anybody have any idea as to what is > going on? This is killing me! Of course we have no idea what is going since you have not performed very much diagnosis. But presumably, there is a query from hell driving your system over the edge. You mention three wait types. SOS_SCHEDULER_YIELD is when a task yields to let another task execute. This may indicate that processes are starved for CPU, but I believe it is common for this type to be at the top. CXPACKET pertains to parallelism; a CXPACKET wait is when a thread waits for other threads to complete. It is not unusual to see high CXPACKET waits, but it may indicate that parallelism is not working well for you. Maybe you have parallel queries where one thread gets all the work. ASYNC_NETWORK_IO has no relation to CPU load at all. Rather high ASYNC_NETWORK_IO waits may reduce the CPU load, because this wait state occurs when a client does not pick up data as fast as SQL Server feeds it. What you need to do is to run more detailed diagnostics when the server gets the fever. The most important is to find out which queries that are running on the server. On http://www.sommarskog.se/sqlutil/beta_lockinfo.html you find a stored procedure that gives the currently running statements, their query plans, and locks held. (The latter is likely to be of less interest right now.) This can help you to find the culprit. Beware that the query plans are only the estimated plans, so the plans may look good, when they in fact are a disaster. For further discussion, what type of server is this: OLTP? OLAP? A consolidated server with lots of applications? One a one-app server? Test? Development? Production? -- Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se Links for SQL Server Books Online: SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
From: Jeffrey Williams on 15 Mar 2010 21:37 Are you by chance running SQL Server x64? If so, have you set the max memory setting for SQL Server? "Erland Sommarskog" <esquel(a)sommarskog.se> wrote in message news:Xns9D3CEDE6B1610Yazorman(a)127.0.0.1... > DB (DB(a)discussions.microsoft.com) writes: >> Every few days, our SQL 2008 server comes to a crawl and we have to >> reboot it. The CPU becomes pegged at 100%. I ran a query to roll-up my >> wait type counts. The top 3 wait types are SOS_SCHEDULER_YIELD, >> CXPACKET, and ASYNC_NETWORK_IO. Their wait time in seconds are 828426, >> 183026, 156414 respectively. Does anybody have any idea as to what is >> going on? This is killing me! > > Of course we have no idea what is going since you have not performed > very much diagnosis. But presumably, there is a query from hell driving > your system over the edge. > > You mention three wait types. SOS_SCHEDULER_YIELD is when a task yields > to let another task execute. This may indicate that processes are starved > for CPU, but I believe it is common for this type to be at the top. > > CXPACKET pertains to parallelism; a CXPACKET wait is when a thread > waits for other threads to complete. It is not unusual to see high > CXPACKET waits, but it may indicate that parallelism is not working > well for you. Maybe you have parallel queries where one thread gets > all the work. > > ASYNC_NETWORK_IO has no relation to CPU load at all. Rather high > ASYNC_NETWORK_IO waits may reduce the CPU load, because this wait > state occurs when a client does not pick up data as fast as SQL Server > feeds it. > > What you need to do is to run more detailed diagnostics when the server > gets the fever. The most important is to find out which queries that > are running on the server. On > http://www.sommarskog.se/sqlutil/beta_lockinfo.html you find a stored > procedure that gives the currently running statements, their query > plans, and locks held. (The latter is likely to be of less interest > right now.) This can help you to find the culprit. Beware that the > query plans are only the estimated plans, so the plans may look good, > when they in fact are a disaster. > > For further discussion, what type of server is this: OLTP? OLAP? A > consolidated server with lots of applications? One a one-app server? > Test? Development? Production? > > -- > Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se > > Links for SQL Server Books Online: > SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx > SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx > SQL 2000: > http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx >
From: Uri Dimant on 16 Mar 2010 02:48 db ---This first thing to check if CPU is at 100% is to look for parallel queries: -- Tasks running in parallel (filtering out MARS requests below): select * from sys.dm_os_tasks as t where t.session_id in ( select t1.session_id from sys.dm_os_tasks as t1 group by t1.session_id having count(*) > 1 and min(t1.request_id) = max(t1.request_id)); -- Requests running in parallel: select * from sys.dm_exec_requests as r join ( select t1.session_id, min(t1.request_id) from sys.dm_os_tasks as t1 group by t1.session_id having count(*) > 1 and min(t1.request_id) = max(t1.request_id) ) as t(session_id, request_id) on r.session_id = t.session_id and r.request_id = t.request_id; "DB" <DB(a)discussions.microsoft.com> wrote in message news:885879B7-BB05-42D4-8B69-E3BE76E7CFD7(a)microsoft.com... > Every few days, our SQL 2008 server comes to a crawl and we have to reboot > it. The CPU becomes pegged at 100%. I ran a query to roll-up my wait > type > counts. The top 3 wait types are SOS_SCHEDULER_YIELD, CXPACKET, and > ASYNC_NETWORK_IO. Their wait time in seconds are 828426, 183026, 156414 > respectively. Does anybody have any idea as to what is going on? This is > killing me! > > Thanks, > > -- > DB
|
Pages: 1 Prev: Locating Invalid rows after restoring S2K db to S2K8 (DBCC err Next: message not appearing |