From: Joe Matise on 29 Jan 2010 12:31

Is it on a server? If so, I'd blame the server; it probably has usage issues (more people hitting it now). Or perhaps you have (nearly) run out of storage space and it's scrambling to find space for the last bit. Can you find the temporary work location and see how big the file is that's being created? That should give you an idea of how far along it is, at least.

I can't think of a SAS I/O reason, but I'm not an expert on that side either.

-Joe

On Fri, Jan 29, 2010 at 11:26 AM, Nordlund, Dan (DSHS/RDA) <NordlDJ(a)dshs.wa.gov> wrote:
> I have a 12 GB (SAS compressed) SAS file that I need to process and add
> some computed variables to it. The resulting file is 15 GB (SAS compressed,
> 40 million records). For a variety of reasons, I had to re-run the
> process. Both times the job finished in approximately 1/2 hour.
>
> Then I wanted to read just 3 variables (about 150 bytes per record) from
> the 15 GB file using a simple data step.
>
> data svclib.service_codes (compress=yes);
>   set svclib.svc_span_transactions(keep=service_code tie_breaker
>       source_system_id);
> run;
>
> The resulting file will be about 1 GB (SAS compressed). The job has been
> running for nearly 3 hours and hasn't finished yet.
>
> Does anyone have any ideas about what might be going on? It seems to me
> that reading 3 variables from the 15 GB file shouldn't take 6+ times longer
> than creating the file in the first place. I am about to contact my IT
> people to see about disk problems, I/O contention, etc., but wanted to verify
> that there are no SAS file I/O reasons for the above results. I would be
> happy to provide any other info that might be helpful.
>
> Puzzled near Seattle,
>
> Dan
>
> Daniel J. Nordlund
> Washington State Department of Social and Health Services
> Planning, Performance, and Accountability
> Research and Data Analysis Division
> Olympia, WA 98504-5204
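For anyone wanting to follow Joe's suggestion, the library locations can be printed from inside SAS. A minimal sketch using the standard PATHNAME function; svclib is the libref from Dan's step, so svclib's path is where the growing output dataset can be checked on disk:

    /* Minimal sketch: print the physical paths of the WORK library and of
       SVCLIB (the libref from Dan's step) so the size of the output file
       being written can be checked on disk. PATHNAME() is a standard SAS
       function. */
    %put NOTE: WORK library is at   %sysfunc(pathname(work));
    %put NOTE: SVCLIB library is at %sysfunc(pathname(svclib));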
From: NordlDJ on 29 Jan 2010 12:47

Yes, it is on a Windows server, and the files are on an attached high-speed storage area network. The usual explanation for slow processing that I see is in fact I/O contention with other large jobs that are running concurrently. But I have never experienced a slowdown of this magnitude.

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
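One way to pin down whether a step like this is I/O-bound is the FULLSTIMER system option, which makes the log report real time, CPU time, and memory for each step; a large gap between real time and CPU time points at waiting on disk or SAN contention rather than processing. A minimal sketch, reusing Dan's own step:

    /* Minimal sketch: FULLSTIMER adds real time, user/system CPU time, and
       memory to the log notes for each step. If real time dwarfs CPU time
       here, the step is waiting on I/O, not on computation. */
    options fullstimer;

    data svclib.service_codes (compress=yes);
      set svclib.svc_span_transactions(keep=service_code tie_breaker source_system_id);
    run;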
From: "Terjeson, Mark" on 29 Jan 2010 13:09 Hi Joe, You certainly could do the traditional process of elimination to help locate where the problem is for sure. i.e. Most PC's nowadays have multi-Gig harddrives. If I write 40M obs with 3vars to a local drive it completes in about 1m13s. Your incoming dataset buffer and your datastep processing may be more involved, but you certainly could do a test and repoint your program to output to your local drive. If, indeed, it runs faster, then you have determined that the slow-down is remote from your local box, which would confirm that your processing is ruled out of the equation of contributing factors. You could also reconfirm the SAN issues by writing a very stripped down test with the volume of records necessary and see if the with/without SAN shows the difference. Hope this is helpful. Mark Terjeson Investment Business Intelligence Investment Management & Research Russell Investments 253-439-2367 Russell Global Leaders in Multi-Manager Investing -----Original Message----- From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Nordlund, Dan (DSHS/RDA) Sent: Friday, January 29, 2010 9:48 AM To: SAS-L(a)LISTSERV.UGA.EDU Subject: Re: puzzling SAS I/O question Yes it is on a Windows Server and the files are on an attached high speed storage area network. The usual explanation for slow processing that I see is in fact I/O contention with other large jobs that are running concurrently. But I have never experienced this magnitude of slow down. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 From: Joe Matise [mailto:snoopy369(a)gmail.com] Sent: Friday, January 29, 2010 9:32 AM To: Nordlund, Dan (DSHS/RDA) Cc: SAS-L(a)listserv.uga.edu Subject: Re: puzzling SAS I/O question Is it on a server? If so I'd blame the server, probably has use issues (more people hitting it now). Or perhaps you have (nearly) run out of storage space and it's scrambling to find space for the last bit? Can you find the temporary work location and see how big the file is that's being created? Should give you an idea of how far it is, at least. I can't think of a sas i/o reason but I'm not an expert on that side either. -Joe On Fri, Jan 29, 2010 at 11:26 AM, Nordlund, Dan (DSHS/RDA) <NordlDJ(a)dshs.wa.gov> wrote: I have a 12 GB (SAS compressed) SAS file that I need to process and add some computed variables to it. The resulting file is 15 GB (SAS compressed, 40 million) records). For a variety of reasons, I had to re-run the process. Both times the job finished in approximately 1/2 hour. Then I wanted to read just 3 variables (about 150 bytes per record) from the 15 GB file using a simple data step. data svclib. service_codes (compress=yes); set svclib.svc_span_transactions(keep=service_code tie_breaker source_system_id); run; The resulting file will be about 1 GB (SAS compressed). The job has been running for nearly 3 hours and hasn't finished yet. Does anyone have any ideas about what might be going on? It seems to me that reading 3 variables from the 15 GB file shouldn't take 6+ times longer than creating the file in the first place. I am about to contact my IT people to see about disk problems, I/O contention, etc. but wanted to verify that there is no SAS file I/O reasons for the above results. I would be happy to provide any other info that might be helpful. Puzzled near Seattle, Dan Daniel J. 
Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204
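A stripped-down test along the lines Mark describes might look like the sketch below: generate 40 million observations with three variables, write them to a local drive, then swap in the SAN library and compare the real times in the log. The local path and the variable lengths are illustrative assumptions, not taken from Dan's data:

    /* Stripped-down timing sketch. Assumptions: C:\sastest is any local
       folder with free space; the variable lengths are placeholders, not
       Dan's actual attributes. Run once against localtst, then replace
       localtst with svclib to time the SAN, and compare the real times. */
    libname localtst 'C:\sastest';

    data localtst.io_test (compress=yes);
      length service_code $10 source_system_id $20 tie_breaker 8;
      do i = 1 to 40000000;
        service_code     = cats('SVC', put(mod(i, 1000), z4.));
        source_system_id = cats('SYS', put(mod(i, 10), z2.));
        tie_breaker      = i;
        output;
      end;
      drop i;
    run;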
From: Arthur Tabachneck on 29 Jan 2010 17:17

Dan,

In addition to what others have mentioned, I've run into similar problems using a Windows-based network to analyze large files. When I've confronted such problems, they haven't been limited to SAS; every program trying to do analyses across servers was affected. That is, in my cases, network activity itself was the culprit. I ended up moving all of our large files onto the same server where we keep SAS, and all of the problems went away.

FWIW,
Art
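Art's approach could be sketched roughly as below: copy the large file once onto storage local to the machine running SAS, then run the extraction against the local copy. The locallib path is a hypothetical placeholder; svclib is the existing SAN library from Dan's program:

    /* Minimal sketch of the "move the data next to SAS" approach.
       D:\sasdata is a hypothetical local path on the SAS server. */
    libname locallib 'D:\sasdata';

    proc copy in=svclib out=locallib;
      select svc_span_transactions;
    run;

    data locallib.service_codes (compress=yes);
      set locallib.svc_span_transactions(keep=service_code tie_breaker source_system_id);
    run;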
From: "Terjeson, Mark" on 29 Jan 2010 17:26
Hi,

To second Art's motion: when dealing with large files, I/O across the wire between boxes is typically the slowest link in the chain, as the old saying goes.

Mark