From: Bob on 18 May 2010 07:29

I need to scan a large number of web-resident files, primarily to get file size. IOW, a simple operation. Can anyone provide the benefit of their intuition on how to set the timeout, and how many retries to attempt?

Currently I have the WebRequest timeout set to 2 seconds, and if the request times out, I loop back and try again. So just 2 tries. Not sure if that's optimal.

I realize that this is arbitrary, but the files reside in various places on the net, so it's impossible to profile in advance. I'm sure someone else has done something similar, though, and may have a good feel for median values.

Another thing: I've often gotten WebResponse file sizes that are one byte different from the actual size of the file. Any idea what's up there?
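As a concrete illustration of the loop described above, a minimal sketch might look like the following; the SizeProbe/TryGetLength names are invented for the example, and the timeout and attempt count are just the values from this thread, not recommendations:

    using System;
    using System.Net;

    static class SizeProbe
    {
        // Returns the Content-Length reported by the server, or null if
        // every attempt times out or fails.
        public static long? TryGetLength(string url, int timeoutMs, int attempts)
        {
            for (int i = 0; i < attempts; i++)
            {
                try
                {
                    var req = (HttpWebRequest)WebRequest.Create(url);
                    req.Method = "HEAD";        // headers only, no body transfer
                    req.Timeout = timeoutMs;    // applies to GetResponse()
                    using (var resp = (HttpWebResponse)req.GetResponse())
                    {
                        return resp.ContentLength; // -1 if the server sent no Content-Length
                    }
                }
                catch (WebException)
                {
                    // timed out or otherwise failed; fall through and retry
                }
            }
            return null; // every attempt failed
        }
    }

    // Example: the 2-second timeout with one retry described above.
    // long? size = SizeProbe.TryGetLength("http://example.com/file.zip", 2000, 2);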
From: Arne Vajhøj on 18 May 2010 19:55

On 18-05-2010 07:29, Bob wrote:
> I need to scan a large number of web-resident files, primarily to get
> file size. IOW, a simple operation. Can anyone provide the benefit of
> their intuition on how to set the timeout, and how many retries to
> attempt?
>
> Currently I have the WebRequest timeout set to 2 seconds, and if the
> request times out, I loop back and try again. So just 2 tries. Not
> sure if that's optimal.
>
> I realize that this is arbitrary, but the files reside in various
> places on the net, so it's impossible to profile in advance. I'm sure
> someone else has done something similar, though, and may have a good
> feel for median values.

I assume that you send HEAD and not GET!?

A reasonably small timeout should be sufficient.

You should do it thread-based - possibly queuing work to the ThreadPool to maximize throughput.

> Another thing: I've often gotten WebResponse file sizes that are one
> byte different from the actual size of the file. Any idea what's up
> there?

Difficult to say without an example URL.

Arne
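A rough sketch of the ThreadPool approach suggested here, reusing the TryGetLength helper from the sketch above; the urls collection and the results dictionary are assumptions made for the example:

    using System.Collections.Concurrent;
    using System.Threading;

    // Queue one HEAD probe per URL; the pool throttles how many run at once.
    var results = new ConcurrentDictionary<string, long?>();
    foreach (string url in urls)
    {
        ThreadPool.QueueUserWorkItem(state =>
        {
            string u = (string)state;   // passed as state to avoid closure pitfalls
            results[u] = SizeProbe.TryGetLength(u, 2000, 2);
        }, url);
    }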
From: Bob on 18 May 2010 20:40

On Tue, 18 May 2010 19:55:42 -0400, Arne Vajhøj <arne(a)vajhoej.dk>
wrote:

> On 18-05-2010 07:29, Bob wrote:
>> I need to scan a large number of web-resident files, primarily to
>> get file size. IOW, a simple operation. Can anyone provide the
>> benefit of their intuition on how to set the timeout, and how many
>> retries to attempt?
>>
>> Currently I have the WebRequest timeout set to 2 seconds, and if the
>> request times out, I loop back and try again. So just 2 tries. Not
>> sure if that's optimal.
>>
>> I realize that this is arbitrary, but the files reside in various
>> places on the net, so it's impossible to profile in advance. I'm
>> sure someone else has done something similar, though, and may have a
>> good feel for median values.
>
> I assume that you send HEAD and not GET!?

Yes.

> A reasonably small timeout should be sufficient.

I've been using a 2-second timeout, then retrying once if it fails. Is that what you meant by 'small'?

The way I arrived at that: I noticed that a large timeout didn't succeed much more often than a shorter one; if it was going to fail, it would just fail. But a -very- small timeout was under the response time of many servers. There is a sort of median range that is probably optimal. I just don't have a good feel for what the median value might be.

Retries: I also found that some requests that would stall forever on the first try would succeed on the second try. Something about just reinitiating the request. Not sure if it's worth doing a third or not.

So the 2-second timeout was arbitrary, and I haven't had time to test over a huge number of servers. That's the main thing I'd like to find out from you web guys. (Hey, I'm a desktop programmer... I don't do this stuff often.)

> You should do it thread-based - possibly queuing work to the
> ThreadPool to maximize throughput.

I usually use a BackgroundWorker with a dialog and a progress bar. Watching the little bar move seems to provide some comfort during those delays. (-:>

>> Another thing: I've often gotten WebResponse file sizes that are one
>> byte different from the actual size of the file. Any idea what's up
>> there?
>
> Difficult to say without an example URL.
>
> Arne

I'll try to look for a few examples. I thought maybe that was a very common thing, given that it seems to be just one byte much of the time... just seemed 'coincidental'. Thanks, Arne.
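A sketch of the BackgroundWorker pattern mentioned here; urls, the form's progressBar1, and the SizeProbe helper from the first sketch are assumed to exist in the surrounding code:

    using System.ComponentModel;

    var worker = new BackgroundWorker { WorkerReportsProgress = true };
    worker.DoWork += (s, e) =>
    {
        for (int i = 0; i < urls.Count; i++)
        {
            SizeProbe.TryGetLength(urls[i], 2000, 2);          // one probe per file
            worker.ReportProgress((i + 1) * 100 / urls.Count); // percent complete
        }
    };
    worker.ProgressChanged += (s, e) => progressBar1.Value = e.ProgressPercentage;
    worker.RunWorkerAsync();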
From: Arne Vajhøj on 18 May 2010 21:50

On 18-05-2010 20:40, Bob wrote:
> On Tue, 18 May 2010 19:55:42 -0400, Arne Vajhøj <arne(a)vajhoej.dk>
> wrote:
>
> [snip]
>
>> A reasonably small timeout should be sufficient.
>
> I've been using a 2-second timeout, then retrying once if it fails.
> Is that what you meant by 'small'?
>
> The way I arrived at that: I noticed that a large timeout didn't
> succeed much more often than a shorter one; if it was going to fail,
> it would just fail. But a -very- small timeout was under the response
> time of many servers. There is a sort of median range that is
> probably optimal. I just don't have a good feel for what the median
> value might be.
>
> Retries: I also found that some requests that would stall forever on
> the first try would succeed on the second try. Something about just
> reinitiating the request. Not sure if it's worth doing a third or
> not.
>
> So the 2-second timeout was arbitrary, and I haven't had time to test
> over a huge number of servers. That's the main thing I'd like to find
> out from you web guys. (Hey, I'm a desktop programmer... I don't do
> this stuff often.)

2 seconds is a pretty huge timeout for HTTP.

>> You should do it thread-based - possibly queuing work to the
>> ThreadPool to maximize throughput.
>
> I usually use a BackgroundWorker with a dialog and a progress bar.
> Watching the little bar move seems to provide some comfort during
> those delays. (-:>

I think doing many in parallel would speed things up a lot.

And you can still use the progress bar.

>>> Another thing: I've often gotten WebResponse file sizes that are
>>> one byte different from the actual size of the file. Any idea
>>> what's up there?
>>
>> Difficult to say without an example URL.
>
> I'll try to look for a few examples. I thought maybe that was a very
> common thing, given that it seems to be just one byte much of the
> time... just seemed 'coincidental'.

OK.

Arne
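One way to combine the two ideas - parallel probes plus a progress bar - is sketched below; it assumes a WinForms form (for BeginInvoke) with urls and progressBar1 defined, plus the SizeProbe helper from earlier:

    using System;
    using System.Threading;

    int done = 0;
    int total = urls.Count;
    foreach (string url in urls)
    {
        ThreadPool.QueueUserWorkItem(state =>
        {
            SizeProbe.TryGetLength((string)state, 1000, 2);
            int n = Interlocked.Increment(ref done);   // thread-safe counter
            BeginInvoke((Action)(() => progressBar1.Value = n * 100 / total));
        }, url);
    }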
From: Bob on 20 May 2010 08:23

On Tue, 18 May 2010 21:50:52 -0400, Arne Vajhøj <arne(a)vajhoej.dk>
wrote:

> On 18-05-2010 20:40, Bob wrote:
>
> [snip]
>
>> I've been using a 2-second timeout, then retrying once if it fails.
>> Is that what you meant by 'small'?
>
> 2 seconds is a pretty huge timeout for HTTP.

Hi again, Arne. I've run some tests (time-consuming) on the file info retrieval function.

Reliability actually does stay pretty consistent when the timeout is dropped from 2 seconds to 1 second, as long as I do at least one retry on failure. At 1/2 sec I get a few errors, but at 1/4 sec the error rate goes up.

Doing at least one retry seems important. Otherwise, even with a 4-second timeout, I get a considerable number of errors. When I say "errors" above, I mean that the WebRequest times out. IOW, setting the WebRequest timeout to 4 seconds does not work as well as 1 second with a single retry. Interesting how that works, but it took a long while to do those tests.

> I think doing many in parallel would speed things up a lot.
>
> And you can still use the progress bar.

Now that you mention it, is there an easy way to determine the number of 'channels' that would be optimal? There's got to be a logical limit on connections.

>>>> Another thing: I've often gotten WebResponse file sizes that are
>>>> one byte different from the actual size of the file. Any idea
>>>> what's up there?
>>>
>>> Difficult to say without an example URL.
>>
>> I'll try to look for a few examples. I thought maybe that was a very
>> common thing, given that it seems to be just one byte much of the
>> time... just seemed 'coincidental'.
>
> OK.

Of course I haven't been able to get that to happen again since my last post.
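On the 'how many channels' question: assuming the .NET HTTP stack the thread has been using, one relevant knob is the per-host connection cap, which defaults to 2 for client applications (the old HTTP/1.1 guideline). Raising it only helps when many of the files live on the same server; the value below is purely illustrative:

    using System.Net;

    // Allow up to 10 concurrent connections per host instead of the default 2.
    ServicePointManager.DefaultConnectionLimit = 10;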