From: davidlallen on
Folks,

I am using the nntp library in tcllib 1.12 for an application which
fetches millions of headers from a news server. It works; but
occasionally the link fails and a transaction stalls. Today this
causes the application to simply hang forever, since the nntp library
does not support any timeout. You can see below the original code
which simply uses "while not eof on socket". Some other libraries,
including the http library in tcl itself, provide a timeout. So I
have tried to modify the nntp code so that it will timeout, say after
60 seconds.

My problem is that the new code seems to work correctly on small
testcases but in the actual application it loses data, resulting in
the downloaded binary file failing to unrar. When there is no timeout
situation, the application using the original nntp library does not
have this problem. So, somehow my fix "appears" correct but is not
really.

Can anybody suggest a different approach by inspection of the changes
I have made? I am certainly not an expert on vwait and fileevent.

Original (see tcllib 1.12 nntp library):

proc ::nntp::fetch {name} {
upvar 0 ::nntp::${name}data data
set eol "\012"
if {![::nntp::okprint $name]} {
return ""
}
set sock $data(sock)
if {$data(binary)} {
set oldenc [fconfigure $sock -encoding]
fconfigure $sock -encoding binary
}
set result [list ]
while {![eof $sock]} {
gets $sock line
regsub -- {\015?\012$} $line $data(eol) line
if {[string match "." $line]} {
break
}
if { [string match "..*" $line] } {
lappend result [string range $line 1 end]
} else {
lappend result $line
}
}
if {$data(binary)} {
fconfigure $sock -encoding $oldenc
}
return $result
}

Modified:

proc ::nntp::fetch {name} {
upvar 0 ::nntp::${name}data data
if {![::nntp::okprint $name]} {
return ""
}
set sock $data(sock)
if {$data(binary)} {
set oldenc [fconfigure $sock -encoding]
fconfigure $sock -encoding binary
}
# davidlallen changes start here; see getline and timeout
below
set data(after) [after 60000 [list ::nntp::timeout $name]]
set data(result) [list ]
fconfigure $sock -blocking off
fileevent $sock readable [list ::nntp::getline $name]
vwait ::nntp::${name}done
fileevent $sock readable {}
catch {after cancel $data(after)}
fconfigure $sock -blocking on
# davidlallen changes end here
if {$data(binary)} {
fconfigure $sock -encoding $oldenc
}
return $data(result)
}

proc ::nntp::getline {name} {
upvar 0 ::nntp::${name}data data
set eol "\012"
set sock $data(sock)
gets $sock line
regsub -- {\015?\012$} $line $data(eol) line
if {[string match "." $line]} {
set ::nntp::${name}done 1
}
if { [string match "..*" $line] } {
lappend data(result) [string range $line 1 end]
} else {
lappend data(result) $line
}
}

proc ::nntp::timeout {name} {
upvar 0 ::nntp::${name}data data
set data(result) {}
set ::nntp::${name}done 1
}

From: Alexandre Ferrieux on
On Jun 21, 6:21 pm, davidlallen <d...(a)jendaveallen.com> wrote:
> Folks,
>
> I am using the nntp library in tcllib 1.12 for an application which
> fetches millions of headers from a news server.  It works; but
> occasionally the link fails and a transaction stalls.  Today this
> causes the application to simply hang forever, since the nntp library
> does not support any timeout.  You can see below the original code
> which simply uses "while not eof on socket".  Some other libraries,
> including the http library in tcl itself, provide a timeout.  So I
> have tried to modify the nntp code so that it will timeout, say after
> 60 seconds.
>
> My problem is that the new code seems to work correctly on small
> testcases but in the actual application it loses data, resulting in
> the downloaded binary file failing to unrar.  When there is no timeout
> situation, the application using the original nntp library does not
> have this problem.  So, somehow my fix "appears" correct but is not
> really.
>
> Can anybody suggest a different approach by inspection of the changes
> I have made?  I am certainly not an expert on vwait and fileevent.
>
> Original (see tcllib 1.12 nntp library):
>
>     proc ::nntp::fetch {name} {
>         upvar 0 ::nntp::${name}data data
>         set eol "\012"
>         if {![::nntp::okprint $name]} {
>             return ""
>         }
>         set sock $data(sock)
>         if {$data(binary)} {
>             set oldenc [fconfigure $sock -encoding]
>             fconfigure $sock -encoding binary
>         }
>         set result [list ]
>         while {![eof $sock]} {
>             gets $sock line
>             regsub -- {\015?\012$} $line $data(eol) line
>             if {[string match "." $line]} {
>                 break
>             }
>             if { [string match "..*" $line] } {
>                 lappend result [string range $line 1 end]
>             } else {
>                 lappend result $line
>             }
>         }
>         if {$data(binary)} {
>             fconfigure $sock -encoding $oldenc
>         }
>         return $result
>     }
>
> Modified:
>
>     proc ::nntp::fetch {name} {
>         upvar 0 ::nntp::${name}data data
>         if {![::nntp::okprint $name]} {
>             return ""
>         }
>         set sock $data(sock)
>         if {$data(binary)} {
>             set oldenc [fconfigure $sock -encoding]
>             fconfigure $sock -encoding binary
>         }
>         # davidlallen changes start here; see getline and timeout
> below
>         set data(after) [after 60000 [list ::nntp::timeout $name]]
>         set data(result) [list ]
>         fconfigure $sock -blocking off
>         fileevent $sock readable [list ::nntp::getline $name]
>         vwait ::nntp::${name}done
>         fileevent $sock readable {}
>         catch {after cancel $data(after)}
>         fconfigure $sock -blocking on
>         # davidlallen changes end here
>         if {$data(binary)} {
>             fconfigure $sock -encoding $oldenc
>         }
>         return $data(result)
>     }
>
>     proc ::nntp::getline {name} {
>         upvar 0 ::nntp::${name}data data
>         set eol "\012"
>         set sock $data(sock)
>         gets $sock line
>         regsub -- {\015?\012$} $line $data(eol) line
>         if {[string match "." $line]} {
>             set ::nntp::${name}done 1
>         }
>         if { [string match "..*" $line] } {
>             lappend data(result) [string range $line 1 end]
>         } else {
>             lappend data(result) $line
>         }
>     }
>
>     proc ::nntp::timeout {name} {
>         upvar 0 ::nntp::${name}data data
>         set data(result) {}
>         set ::nntp::${name}done 1
>     }

Once you've set the channel to nonblocking mode (which makes sense if
you want to avoid stalling), you get one extra possible outcome for
[gets], meaning "no complete line so far": returned value is -1 (like
for eof), but [fblocked] returns 1. In that case, it means the
fileevent has been woken up by new bytes, but none terminated a line,
hence you should resume waiting without taking the empty $line as new
data (lappending an empty string may be your problem here, assuming
something like [join $data(result) \n]).

Bottom line: man fblocked ;-)

-Alex
From: tom.rmadilo on
On Jun 21, 9:21 am, davidlallen <d...(a)jendaveallen.com> wrote:
> Folks,
>
> I am using the nntp library in tcllib 1.12 for an application which
> fetches millions of headers from a news server.  It works; but
> occasionally the link fails and a transaction stalls.  Today this
> causes the application to simply hang forever, since the nntp library
> does not support any timeout.  You can see below the original code
> which simply uses "while not eof on socket".  Some other libraries,
> including the http library in tcl itself, provide a timeout.  So I
> have tried to modify the nntp code so that it will timeout, say after
> 60 seconds.
>
> My problem is that the new code seems to work correctly on small
> testcases but in the actual application it loses data, resulting in
> the downloaded binary file failing to unrar.  When there is no timeout
> situation, the application using the original nntp library does not
> have this problem.  So, somehow my fix "appears" correct but is not
> really.
>
> Can anybody suggest a different approach by inspection of the changes
> I have made?  I am certainly not an expert on vwait and fileevent.
>
> Original (see tcllib 1.12 nntp library):
>
>     proc ::nntp::fetch {name} {
>         upvar 0 ::nntp::${name}data data
>         set eol "\012"
>         if {![::nntp::okprint $name]} {
>             return ""
>         }
>         set sock $data(sock)
>         if {$data(binary)} {
>             set oldenc [fconfigure $sock -encoding]
>             fconfigure $sock -encoding binary
>         }
>         set result [list ]
>         while {![eof $sock]} {
>             gets $sock line
>             regsub -- {\015?\012$} $line $data(eol) line
>             if {[string match "." $line]} {
>                 break
>             }
>             if { [string match "..*" $line] } {
>                 lappend result [string range $line 1 end]
>             } else {
>                 lappend result $line
>             }
>         }
>         if {$data(binary)} {
>             fconfigure $sock -encoding $oldenc
>         }
>         return $result
>     }
>
> Modified:
>
>     proc ::nntp::fetch {name} {
>         upvar 0 ::nntp::${name}data data
>         if {![::nntp::okprint $name]} {
>             return ""
>         }
>         set sock $data(sock)
>         if {$data(binary)} {
>             set oldenc [fconfigure $sock -encoding]
>             fconfigure $sock -encoding binary
>         }
>         # davidlallen changes start here; see getline and timeout
> below
>         set data(after) [after 60000 [list ::nntp::timeout $name]]
>         set data(result) [list ]
>         fconfigure $sock -blocking off
>         fileevent $sock readable [list ::nntp::getline $name]
>         vwait ::nntp::${name}done
>         fileevent $sock readable {}
>         catch {after cancel $data(after)}
>         fconfigure $sock -blocking on
>         # davidlallen changes end here
>         if {$data(binary)} {
>             fconfigure $sock -encoding $oldenc
>         }
>         return $data(result)
>     }
>
>     proc ::nntp::getline {name} {
>         upvar 0 ::nntp::${name}data data
>         set eol "\012"
>         set sock $data(sock)
>         gets $sock line
>         regsub -- {\015?\012$} $line $data(eol) line
>         if {[string match "." $line]} {
>             set ::nntp::${name}done 1
>         }
>         if { [string match "..*" $line] } {
>             lappend data(result) [string range $line 1 end]
>         } else {
>             lappend data(result) $line
>         }
>     }
>
>     proc ::nntp::timeout {name} {
>         upvar 0 ::nntp::${name}data data
>         set data(result) {}
>         set ::nntp::${name}done 1
>     }

Problem is that Tcl's API isn't complete. You can have non-blocking
xyz, but you can't try to [read] some number of bytes and either
timeout or return a number of bytes less than requested. You should be
able to try to read X bytes and give up after some number of
milliseconds. If you try to read a line with [gets], you should be
able to give up after some number of milliseconds, regardless of the
number of reads which failed to find an EOL.

It is also possible to manipulate the Tcl [gets] API into adding
additional lines. Just send \r wait a while then send \n. The actual
byte stream has remained the same, but permissive applications might
add an additional line. This could affect when headers end and message
body starts.

Many current exploits are based upon this permissive "can't we all
just get along" programming model. Unfortunately you cannot avoid this
with Tcl's [gets] API. But you can use [read] one byte at a time.

That is one rant. The second pertains to the above code: constant
switching between blocking and non-blocking. You cannot do this. As
soon as you configure a channel to blocking, your entire application
is a slave to that channel.

The tell here is why switch to blocking mode? The reason is that you
want to read a certain number of chars or bytes. If you had a timeout,
you could use blocking mode to do this (or return a short read)
without forever blocking the application (limited blocking).

You can't do this with Tcl's [gets] because it continues to read until
an eol is found.

The basic problem with this code is the incorrect assumption that you
can switch back and forth between blocking and non-blocking mode.
This is total nonsense. The nonsense is somewhat suggested and allowed
by the Tcl I/O API.

But never switch from non-blocking to blocking mode on a channel and
expect that your application will never freeze. Once one channel is in
blocking mode, your entire application may block "forever", unless you
have timeouts on your API, which don't exist in the Tcl API.
From: Alexandre Ferrieux on
On Jun 22, 2:10 am, "tom.rmadilo" <tom.rmad...(a)gmail.com> wrote:
> On Jun 21, 9:21 am, davidlallen <d...(a)jendaveallen.com> wrote:
>
>
>
>
>
> > Folks,
>
> > I am using the nntp library in tcllib 1.12 for an application which
> > fetches millions of headers from a news server.  It works; but
> > occasionally the link fails and a transaction stalls.  Today this
> > causes the application to simply hang forever, since the nntp library
> > does not support any timeout.  You can see below the original code
> > which simply uses "while not eof on socket".  Some other libraries,
> > including the http library in tcl itself, provide a timeout.  So I
> > have tried to modify the nntp code so that it will timeout, say after
> > 60 seconds.
>
> > My problem is that the new code seems to work correctly on small
> > testcases but in the actual application it loses data, resulting in
> > the downloaded binary file failing to unrar.  When there is no timeout
> > situation, the application using the original nntp library does not
> > have this problem.  So, somehow my fix "appears" correct but is not
> > really.
>
> > Can anybody suggest a different approach by inspection of the changes
> > I have made?  I am certainly not an expert on vwait and fileevent.
>
> > Original (see tcllib 1.12 nntp library):
>
> >     proc ::nntp::fetch {name} {
> >         upvar 0 ::nntp::${name}data data
> >         set eol "\012"
> >         if {![::nntp::okprint $name]} {
> >             return ""
> >         }
> >         set sock $data(sock)
> >         if {$data(binary)} {
> >             set oldenc [fconfigure $sock -encoding]
> >             fconfigure $sock -encoding binary
> >         }
> >         set result [list ]
> >         while {![eof $sock]} {
> >             gets $sock line
> >             regsub -- {\015?\012$} $line $data(eol) line
> >             if {[string match "." $line]} {
> >                 break
> >             }
> >             if { [string match "..*" $line] } {
> >                 lappend result [string range $line 1 end]
> >             } else {
> >                 lappend result $line
> >             }
> >         }
> >         if {$data(binary)} {
> >             fconfigure $sock -encoding $oldenc
> >         }
> >         return $result
> >     }
>
> > Modified:
>
> >     proc ::nntp::fetch {name} {
> >         upvar 0 ::nntp::${name}data data
> >         if {![::nntp::okprint $name]} {
> >             return ""
> >         }
> >         set sock $data(sock)
> >         if {$data(binary)} {
> >             set oldenc [fconfigure $sock -encoding]
> >             fconfigure $sock -encoding binary
> >         }
> >         # davidlallen changes start here; see getline and timeout
> > below
> >         set data(after) [after 60000 [list ::nntp::timeout $name]]
> >         set data(result) [list ]
> >         fconfigure $sock -blocking off
> >         fileevent $sock readable [list ::nntp::getline $name]
> >         vwait ::nntp::${name}done
> >         fileevent $sock readable {}
> >         catch {after cancel $data(after)}
> >         fconfigure $sock -blocking on
> >         # davidlallen changes end here
> >         if {$data(binary)} {
> >             fconfigure $sock -encoding $oldenc
> >         }
> >         return $data(result)
> >     }
>
> >     proc ::nntp::getline {name} {
> >         upvar 0 ::nntp::${name}data data
> >         set eol "\012"
> >         set sock $data(sock)
> >         gets $sock line
> >         regsub -- {\015?\012$} $line $data(eol) line
> >         if {[string match "." $line]} {
> >             set ::nntp::${name}done 1
> >         }
> >         if { [string match "..*" $line] } {
> >             lappend data(result) [string range $line 1 end]
> >         } else {
> >             lappend data(result) $line
> >         }
> >     }
>
> >     proc ::nntp::timeout {name} {
> >         upvar 0 ::nntp::${name}data data
> >         set data(result) {}
> >         set ::nntp::${name}done 1
> >     }
>
> Problem is that Tcl's API isn't complete. You can have non-blocking
> xyz, but you can't try to [read] some number of bytes and either
> timeout or return a number of bytes less than requested. You should be
> able to try to read X bytes and give up after some number of
> milliseconds. If you try to read a line with [gets], you should be
> able to give up after some number of milliseconds, regardless of the
> number of reads which failed to find an EOL.
>
> It is also possible to manipulate the Tcl [gets] API into adding
> additional lines. Just send \r wait a while then send \n. The actual
> byte stream has remained the same, but permissive applications might
> add an additional line. This could affect when headers end and message
> body starts.
>
> Many current exploits are based upon this permissive "can't we all
> just get along" programming model. Unfortunately you cannot avoid this
> with Tcl's [gets] API. But you can use [read] one byte at a time.

You did it again. Advising [read 1] because you just don't understand
the nonblocking-gets/fblocked/chan-pending idiom. Please stop
confusing people who are asking for help.

-Alex
From: davidlallen on
I did not intend to step into a previous hot debate. The reason my
code switches in and out of nonblocking mode is because the nntp.tcl
file has many other routines, not shown here. There *seem* to be two
different uses of "gets" from the channel; one is used many times to
get a single line of reply which is expected to be short. The second
one which I showed is used to get a large multiline block of data,
such as an entire usenet message or a range of multiple headers. By
volume, 99%+ of the data transfer uses the second one. My hope was to
make a more limited change and only switch to non-blocking mode for
the second one. Of course, if the shorter single line transaction is
happening when the server dies, my application will still hang; but I
hope that will become "much less likely".

I have modified the code to check fblocked as shown in "man fblocked",
but I have not had the opportunity to finish testing it. I will
update when I have a result.