From: Alexandre Ferrieux on 2 Nov 2009 03:31

On Oct 31, 8:12 pm, "tom.rmadilo" <tom.rmad...(a)gmail.com> wrote:
> On Oct 31, 9:44 am, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com>
> wrote:
>
> > On Oct 30, 8:07 pm, "tom.rmadilo" <tom.rmad...(a)gmail.com> wrote:
> > > How do you do chunked reads and signal when to stop, remove the
> > > <cr><lf>, read the next chunk size, remove the <cr><lf> and start
> > > reading again?
>
> > With a state machine, fileevents, and nonblocking gets.
>
> HTTP data is binary, or maybe it is better to say it is opaque. The
> chunked transfer encoding is specifically byte oriented with a well
> defined structure. Nothing in the standards I have read indicates that
> you should treat the data as line oriented.

Don't expect an RFC to hold your hand as to "how you should treat
data" ;-)

What I'm saying is merely that chunked transfer is an alternated
text/binary syntax, and that implementing it as [gets;read] is the most
natural way. Moreover, it turns out to be _efficient_ thanks to input
buffering.

> I've also pointed out several times that since gets can fail, you have
> to handle error conditions: again this just adds new, unnecessary
> states to the machine.

Uh ??? You may point something out several times, if it still lacks
arguments...
Are you unfamiliar with error propagation in Tcl, or are you just
testing my resilience to random nonsense ?

-Alex

From: tom.rmadilo on 2 Nov 2009 17:32

On Nov 2, 12:31 am, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com>
wrote:
> On Oct 31, 8:12 pm, "tom.rmadilo" <tom.rmad...(a)gmail.com> wrote:
>
> > On Oct 31, 9:44 am, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com>
> > wrote:
>
> > > On Oct 30, 8:07 pm, "tom.rmadilo" <tom.rmad...(a)gmail.com> wrote:
> > > > How do you do chunked reads and signal when to stop, remove the
> > > > <cr><lf>, read the next chunk size, remove the <cr><lf> and start
> > > > reading again?
>
> > > With a state machine, fileevents, and nonblocking gets.
>
> > HTTP data is binary, or maybe it is better to say it is opaque. The
> > chunked transfer encoding is specifically byte oriented with a well
> > defined structure. Nothing in the standards I have read indicates that
> > you should treat the data as line oriented.
>
> Don't expect an RFC to hold your hand as to "how you should treat
> data" ;-)
>
> What I'm saying is merely that chunked transfer is an alternated
> text/binary syntax, and that implementing it as [gets;read] is the most
> natural way. Moreover, it turns out to be _efficient_ thanks to input
> buffering.

Why not show me? I've provided code and a framework which should make
it easy to plug in your chunked transfer implementation.

In addition, if you use my chunk-data to read the data, you only need
to replace this proc:

    proc ::htclient::htChunkSize { client } {

        variable clientChunkRemain
        variable hexCharMap

        set sock [getVar $client sock]
        set char [read $sock 1]
        set class $hexCharMap($char)

        log Debug "htChunkSize char $client = '$char' class =\
            $class size = $clientChunkRemain($client)"

        switch -exact -- $class {
            "HEXCHAR" {
                if {![getVar $client inCR]} {
                    set val $clientChunkRemain($client)
                    set cval [format %i 0x$char]
                    set val [expr {$val * 16 + $cval}]
                    set clientChunkRemain($client) $val
                } else {
                    htError $client msg "unexpected HEXCHAR in CR mode"
                }
            }
            "CR" {
                if {![getVar $client inCR]} {
                    setVar $client inCR 1
                } else {
                    htError $client msg "unexpected CR in Chunk Size"
                }
            }
            "LF" {
                if {[getVar $client inCR]} {
                    setVar $client inCR 0
                    if {!$clientChunkRemain($client)} {
                        setVar $client state done
                    } else {
                        log Debug "Chunk size $client = $clientChunkRemain($client)"
                        setVar $client state chunkdata
                    }
                } else {
                    htError $client msg "unexpected LF in Chunk Size"
                }
            }
            "EMPTY" {
            }
            "NONHEX" {
                htError $client lastchar $char msg "unexpected NONHEX in Chunk Size"
            }
        }
    }

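The byte-at-a-time approach in the proc above can be tried in isolation. What follows is only a minimal sketch, not the htclient code: it walks a string instead of reading a socket, and `charClass` and `parseChunkSize` are hypothetical names standing in for the `hexCharMap` lookup and the per-client state, whose real definitions are not shown in the thread.

```tcl
# Hypothetical classifier; the real ::htclient::hexCharMap is not shown
# in the thread, so this only mimics the class names used in the post.
proc charClass {char} {
    if {$char eq "\r"} { return CR }
    if {$char eq "\n"} { return LF }
    if {[string is xdigit -strict $char]} { return HEXCHAR }
    return NONHEX
}

# Parse one chunk-size line a character at a time and return the size.
# Errors are raised directly instead of going through htError.
proc parseChunkSize {text} {
    set size 0
    set inCR 0
    foreach char [split $text ""] {
        switch -exact -- [charClass $char] {
            HEXCHAR {
                if {$inCR} { error "unexpected HEXCHAR in CR mode" }
                # Shift the accumulated value one hex digit to the left.
                set size [expr {$size * 16 + [scan $char %x]}]
            }
            CR {
                if {$inCR} { error "unexpected CR in chunk size" }
                set inCR 1
            }
            LF {
                if {!$inCR} { error "unexpected LF in chunk size" }
                return $size
            }
            default { error "unexpected NONHEX in chunk size" }
        }
    }
    error "incomplete chunk size line"
}

puts [parseChunkSize "1a\r\n"]
```

With the input `1a\r\n` the accumulator goes 1, then 1 * 16 + 10, so the call prints 26. A final chunk line `0\r\n` yields 0, which is the condition the original proc uses to switch to the `done` state.
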
From: tom.rmadilo on 2 Nov 2009 19:19

On Nov 2, 12:31 am, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com>
wrote:
> On Oct 31, 8:12 pm, "tom.rmadilo" <tom.rmad...(a)gmail.com> wrote:
> > I've also pointed out several times that since gets can fail, you have
> > to handle error conditions: again this just adds new, unnecessary
> > states to the machine.
>
> Uh ??? You may point something out several times, if it still lacks
> arguments...
> Are you unfamiliar with error propagation in Tcl, or are you just
> testing my resilience to random nonsense ?

There are two kinds of errors:

1. errors you can/could predict, and
2. those you can't

The first type of error is the result of bugs or poor programming. It
means you could have detected ahead of time that the next operation
will fail given the current known information or state.

The second type of errors are the result of external conditions that
can't be known or predicted in advance, even if they happen often and
can be easily classified.

I'm not interested in the first type of errors, other than to suggest
removing them from code as quickly as possible.

The second type of errors have to be analyzed in terms of their effect
on the application. If an error results in the loss of a resource, the
damage is serious and the code which reports the error should be
isolated so that resource loss can be minimized, but then the error
should propagate up so that the application can decide what to do.
Otherwise, the error just needs to be reported up (allowed to happen).

Basically this is the Mafia Theory of Error Management: protect but
notify the boss.

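One way to read "protect but notify the boss" in Tcl terms is the catch-then-rethrow idiom: trap the failure long enough to release the resource, then let the original error continue upward. The sketch below is an illustration under assumptions, not code from either poster: `withChannel` and `failingReader` are hypothetical names, and the Tcl executable is opened only because it is a file that is sure to exist.

```tcl
# Hypothetical helper, not part of htclient: open a file, hand the channel
# to a command prefix, and always close the channel afterwards.
proc withChannel {path cmd} {
    set chan [open $path r]
    # Protect: trap any outcome of the callback so the channel cannot leak.
    catch {{*}$cmd $chan} result options
    close $chan
    # Notify: re-raise the original outcome, error info included.
    return -options $options $result
}

# Hypothetical callback that fails after it has been given the channel.
proc failingReader {chan} {
    set ::lastChan $chan
    error "read failed"
}

set status [catch {withChannel [info nameofexecutable] failingReader} message]
set leaked [expr {$::lastChan in [file channels]}]
puts "status=$status message=$message leaked=$leaked"
```

The caller still sees the error (`status` is 1 and `message` is `read failed`), while the channel is already closed by the time the error arrives, so `leaked` is 0. The idiom needs Tcl 8.5 or later for `{*}` and for the options dictionary returned by `catch`.
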
From: Alexandre Ferrieux on 3 Nov 2009 03:11

On Nov 2, 11:32 pm, "tom.rmadilo" <tom.rmad...(a)gmail.com> wrote:
>
> Why not show me? I've provided code and a framework which should make
> it easy to plug in your chunked transfer implementation.

Yup, here ya go.

    # assuming socket in non-blocking mode and fileevent readable
    # on function below

    proc ::htclient::htChunkSize { client } {
        variable clientChunkRemain
        set sock [getVar $client sock]
        if {[gets $sock line]<0} {
            if {![eof $sock]} return
            htError $client msg "unexpected EOF between chunks"
        }
        if {![regexp -nocase {^[0-9A-F]+\r$} $line]} {
            htError $client msg "illegal chunk size syntax"
        }
        scan $line %x clientChunkRemain($client)
        if {!$clientChunkRemain($client)} {
            setVar $client state done
        } else {
            log Debug "Chunk size $client = $clientChunkRemain($client)"
            setVar $client state chunkdata
        }
    }

-Alex

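To see the [gets;read] alternation run end to end outside the htclient framework, here is a minimal blocking sketch. It is an assumption-laden illustration rather than either poster's implementation: `decodeChunked` is a hypothetical name, chunk extensions are not handled, trailer lines are skipped rather than parsed, and the demo feeds the decoder through `chan pipe`, which requires Tcl 8.6.

```tcl
# Hypothetical helper: decode a chunked body from an open, blocking channel.
proc decodeChunked {chan} {
    # Binary translation keeps the bytes opaque and splits lines on \n only,
    # so [gets] returns the size line with its trailing \r still attached.
    fconfigure $chan -translation binary
    set body ""
    while {1} {
        if {[gets $chan line] < 0} {
            error "unexpected EOF between chunks"
        }
        if {![regexp -nocase {^[0-9A-F]+\r$} $line]} {
            error "illegal chunk size syntax"
        }
        scan $line %x size
        if {$size == 0} break
        # Binary part: exactly $size bytes of data.
        set data [read $chan $size]
        if {[string length $data] != $size} {
            error "unexpected EOF inside chunk data"
        }
        append body $data
        # Each chunk is terminated by CRLF before the next size line.
        if {[read $chan 2] ne "\r\n"} {
            error "missing CRLF after chunk data"
        }
    }
    # Skip any trailer lines up to the empty line that ends the body.
    while {[gets $chan line] >= 0 && $line ne "\r"} {}
    return $body
}

# Demo: push a two-chunk body through a pipe and decode it.
lassign [chan pipe] rd wr
fconfigure $wr -translation binary
puts -nonewline $wr "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
close $wr
set decoded [decodeChunked $rd]
close $rd
puts $decoded
```

The demo prints `Wikipedia`. In the nonblocking, fileevent-driven setting discussed in the thread, each `gets` or `read` that comes up short would instead return to the event loop and resume on the next readable event, which is what the `if {![eof $sock]} return` line in the post does.
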
From: tom.rmadilo on 3 Nov 2009 12:37

On Nov 3, 12:11 am, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com>
wrote:
> On Nov 2, 11:32 pm, "tom.rmadilo" <tom.rmad...(a)gmail.com> wrote:
>
> > Why not show me? I've provided code and a framework which should make
> > it easy to plug in your chunked transfer implementation.
>
> Yup, here ya go.
>
>     # assuming socket in non-blocking mode and fileevent readable
>     # on function below
>
>     proc ::htclient::htChunkSize { client } {
>         variable clientChunkRemain
>         set sock [getVar $client sock]
>         if {[gets $sock line]<0} {
>             if {![eof $sock]} return
>             htError $client msg "unexpected EOF between chunks"
>         }
>         if {![regexp -nocase {^[0-9A-F]+\r$} $line]} {
>             htError $client msg "illegal chunk size syntax"
>         }
>         scan $line %x clientChunkRemain($client)
>         if {!$clientChunkRemain($client)} {
>             setVar $client state done
>         } else {
>             log Debug "Chunk size $client = $clientChunkRemain($client)"
>             setVar $client state chunkdata
>         }
>     }

Okay, I setup a test case using your code:

http://www.junom.com/gitweb/gitweb.perl?p=htclient.git;a=commit;h=4c449

Depending on the url and the number of simultaneous downloads, your
version is sometimes consistently faster 1-5%, in other cases, the
original code is 1-5% faster. But the variability between tests is
much larger than the average difference (20-25%).

One thing is very consistent: both the old and new code are about 100%
faster (twice as fast) than ::http::geturl when grabbing a single copy
of a url. When grabbing 10 copies, both old and new code are about
200% faster (three times as fast).