From: Alexandre Ferrieux on 23 Feb 2010 16:13
On Feb 23, 7:08 pm, "tom.rmadilo" <tom.rmad...(a)gmail.com> wrote:
> On Feb 23, 9:07 am, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com>
> wrote:
>
> > On Feb 23, 5:33 pm, "Donal K. Fellows"
> > <donal.k.fell...(a)manchester.ac.uk> wrote:
>
> > > > I'm also confused as to how I accumulate input without causing a
> > > > potential blocking or other problems.
>
> > > IIRC, there was [chan pending] added to 8.5 to allow you to handle
> > > that sort of thing. Not an area I've experimented much in.
>
> > Not sure [chan pending] pushes the envelope of what was already
> > possible with non-blocking [gets] ;-)
>
> I've not experimented with non-blocking [gets], but I understand that
> you have to always check for failure and handle such situations.

Yes, using [eof] or [fblocked] after [gets] returns -1. Note that
checking for failures is a fact of life in programming ;-)

> You also [gets] chars, not octets. This doesn't match up with what signals
> a readable event.

I don't understand this statement. Again, nonblocking [gets] and
[fileevent] marry perfectly, as witnessed by Eric Hassold's idiomatic
form:

    proc ReadCB {fd} {
        while {1} {
            if {[gets $fd l]<0} {
                if {[eof $fd]} {
                    # End of input
                    catch {close $fd}
                    return
                }
                # Incomplete input. Wait for another fileevent to fire
                return
            }
            # Get complete line. normal processing
            ....
        }
    }

> One thing which confuses me is that [chan pending] will sometimes
> return zero after a readable event (50% of the time if the previous
> read drained the buffer), if you don't read at least one char, the
> buffer never fills up.

Re-read the manpage. [chan pending input] gives the number of bytes
_already_ buffered. When a fileevent fires, it can be of two kinds:
fd-level when there's no buffered byte and select was called upon, and
buffer-level when there are buffered bytes, and no select came into
play.

-Alex
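A minimal sketch of what "already buffered" means, assuming Tcl 8.6 (for
[chan pipe]) on a Unix-like system. The pipe and the variable names are
only there for illustration. Before any read, nothing has been pulled into
Tcl's channel buffer, so [chan pending input] should report 0; after a
non-blocking [gets] fails for lack of a newline, the bytes it pulled in
stay buffered and are counted:

```tcl
# Illustration only: a local pipe stands in for a socket (needs Tcl 8.6).
lassign [chan pipe] r w
fconfigure $r -blocking 0

# Send an incomplete line (no newline) and push it out to the OS.
puts -nonewline $w "partial line"
flush $w

# Nothing has been read into Tcl's input buffer yet.
set before [chan pending input $r]

# Non-blocking [gets] pulls in what is available, finds no newline, fails.
set n [gets $r line]
set blocked [fblocked $r]

# The bytes pulled in by the failed [gets] are now sitting in the buffer.
set after [chan pending input $r]

puts "before=$before gets=$n fblocked=$blocked after=$after"
close $w
close $r
```

This fits the observation that [chan pending] can return 0 right after a
readable event: an fd-level event only says the OS has data, not that Tcl
has buffered any of it yet.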
From: tom.rmadilo on 23 Feb 2010 17:16
On Feb 23, 1:13 pm, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com>
wrote:
> On Feb 23, 7:08 pm, "tom.rmadilo" <tom.rmad...(a)gmail.com> wrote:
>
> > On Feb 23, 9:07 am, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com>
> > wrote:
>
> > > On Feb 23, 5:33 pm, "Donal K. Fellows"
> > > <donal.k.fell...(a)manchester.ac.uk> wrote:
>
> > > > > I'm also confused as to how I accumulate input without causing a
> > > > > potential blocking or other problems.
>
> > > > IIRC, there was [chan pending] added to 8.5 to allow you to handle
> > > > that sort of thing. Not an area I've experimented much in.
>
> > > Not sure [chan pending] pushes the envelope of what was already
> > > possible with non-blocking [gets] ;-)
>
> > I've not experimented with non-blocking [gets], but I understand that
> > you have to always check for failure and handle such situations.
>
> Yes, using [eof] or [fblocked] after [gets] returns -1. Note that
> checking for failures is a fact of life in programming ;-)
>
> > You also [gets] chars, not octets. This doesn't match up with what signals
> > a readable event.
>
> I don't understand this statement. Again, nonblocking [gets] and
> [fileevent] marry perfectly, as witnessed by Eric Hassold's idiomatic
> form:
>
>     proc ReadCB {fd} {
>         while {1} {
>             if {[gets $fd l]<0} {
>                 if {[eof $fd]} {
>                     # End of input
>                     catch {close $fd}
>                     return
>                 }
>                 # Incomplete input. Wait for another fileevent to fire
>                 return
>             }
>             # Get complete line. normal processing
>             ....
>         }
>     }
>
> > One thing which confuses me is that [chan pending] will sometimes
> > return zero after a readable event (50% of the time if the previous
> > read drained the buffer), if you don't read at least one char, the
> > buffer never fills up.
>
> Re-read the manpage. [chan pending input] gives the number of bytes
> _already_ buffered. When a fileevent fires, it can be of two kinds:
> fd-level when there's no buffered byte and select was called upon, and
> buffer-level when there are buffered bytes, and no select came into
> play.

Given my example, I would say that this analysis is incomplete. If a
fileevent is triggered, and [chan pending] then returns "0", and no
data is read (and the callback returns), the buffer never fills up.
You get into an infinite loop. If it was just a select returning with
no bytes, eventually data would show up in the buffer and event would
fire based upon data in the buffer.

But assuming you are correct, I guess the only question is what do you
do when the input buffers are full and you have not found a newline?

From [chan pending]:

"(especially useful in a readable event callback to impose
application-specific limits on input line lengths to avoid a potential
denial-of-service attack where a hostile user crafts an extremely long
line that exceeds the available memory to buffer it)"

It is valid to say you don't care about this situation, but I do. The
truth is that you can't set application limits on input unless the
input actually gets into the application. In this case the data is
stuck in a buffer waiting for a newline.
From: Andreas Kupries on 24 Feb 2010 00:11
Alexandre Ferrieux <alexandre.ferrieux(a)gmail.com> writes:

>> So the first thing needed is a regexp which can distinguish between
>> correct and incorrect headers and parse correct headers into tokens.
>
> I suspect some kind of escaping hell in those dynamically built
> regexps...
> Why don't you start back from the constant, braced, commented regexp
> we've built together ?

Wondering, could the principles of shallow parsing
(http://wiki.tcl.tk/8720) apply here?

--
So long,
    Andreas Kupries <akupries(a)shaw.ca>
                    <http://www.purl.org/NET/akupries/>
    Developer @     <http://www.activestate.com/>
-------------------------------------------------------------------------------
From: tom.rmadilo on 24 Feb 2010 02:49
On Feb 23, 9:11 pm, Andreas Kupries <akupr...(a)shaw.ca> wrote:
> Alexandre Ferrieux <alexandre.ferri...(a)gmail.com> writes:
> >> So the first thing needed is a regexp which can distinguish between
> >> correct and incorrect headers and parse correct headers into tokens.
>
> > I suspect some kind of escaping hell in those dynamically built
> > regexps...
> > Why don't you start back from the constant, braced, commented regexp
> > we've built together ?
>
> Wondering, could the principles of shallow parsing
> (http://wiki.tcl.tk/8720) apply here?

Seems like this has a similar problem as my regexps: works okay on
valid input, but not so much on invalid. I made a little addition to
the xml ("xyz" inside a tag):

set xml {<html>
<head>
<title>XML Shallow Parsing with Regular Expressions</title>
<meta http-equiv="Pragma" content="no-cache"></meta>
<meta http-equiv="Expire" content="Mon, 04 Dec 1999 21:29:02 GMT"></meta>
<link rel="stylesheet" href="http://wiki.tcl.tk/wikit.css" "xyz" type="text/css"></link>
<base href="http://wiki.tcl.tk/">
</head>}

% ShallowParse $xml
<html> { } <head> { }
<title> {XML Shallow Parsing with Regular Expressions} </title> { }
{<meta http-equiv="Pragma" content="no-cache">} </meta> { }
{<meta http-equiv="Expire" content="Mon, 04 Dec 1999 21:29:02 GMT">} </meta> { }
{<link rel="stylesheet" href="http://wiki.tcl.tk/wikit.css" } {"xyz" type="text/css">} </link> { }
{<base href="http://wiki.tcl.tk/">} { }
</head>

For some reason, the link tag gets chopped into two and you have no
indication the xml is not valid.
From: Alexandre Ferrieux on 24 Feb 2010 02:59
On Feb 23, 11:16 pm, "tom.rmadilo" <tom.rmad...(a)gmail.com> wrote:
>
> > Re-read the manpage. [chan pending input] gives the number of bytes
> > _already_ buffered. When a fileevent fires, it can be of two kinds:
> > fd-level when there's no buffered byte and select was called upon, and
> > buffer-level when there are buffered bytes, and no select came into
> > play.
>
> Given my example, I would say that this analysis is incomplete. If a
> fileevent is triggered, and [chan pending] then returns "0", and no
> data is read (and the callback returns), the buffer never fills up.
> You get into an infinite loop.

??? What the heck are you talking about ???
If you've found a case where fileevent repeatedly fires a false alarm
(and not only sporadic ones as it has been reported with TLS), please
file a bug. Otherwise, stick to the idioms provided and use them.

> But assuming you are correct, I guess the only question is what do you
> do when the input buffers are full and you have not found a newline?

In that case [gets] returns -1, [fblocked] returns 1, and [eof] returns
0. At that time you can do your limit-check with [chan pending] if
you're in a DoS-paranoid mood. But whether you do it or not, you then
immediately return from your fileevent, waiting for new data to come.
No infinite loop. Season with timeouts if you wish.

-Alex
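The limit-check described above could be sketched as follows. This is one
possible shape, not code from the thread: ReadLimitedCB, HandleLine, the
maxPending parameter and the ::linesSeen variable are hypothetical names,
and closing the channel is just one choice of reaction to an over-long
line.

```tcl
# Hypothetical application hook; replace with real line processing.
proc HandleLine {fd line} {
    lappend ::linesSeen $line
}

# Readable callback: the [gets]/[eof] loop quoted earlier in the thread,
# plus an optional length check. maxPending is an illustrative parameter,
# not part of any Tcl API.
proc ReadLimitedCB {fd maxPending} {
    while {1} {
        if {[gets $fd line] < 0} {
            if {[eof $fd]} {
                # End of input
                catch {close $fd}
                return
            }
            # Incomplete line: [fblocked] is 1 and [eof] is 0 here.
            # Give up if too much is buffered without a newline.
            if {[chan pending input $fd] > $maxPending} {
                catch {close $fd}
                return
            }
            # Otherwise wait for another fileevent to fire.
            return
        }
        HandleLine $fd $line
    }
}
```

It would be registered on a non-blocking channel with something like
`fconfigure $sock -blocking 0` followed by
`fileevent $sock readable [list ReadLimitedCB $sock 8192]`, where 8192 is
an arbitrary figure picked for the example.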