From: RG on 11 Aug 2010 16:20

In article <i3uu74$uga$1(a)speranza.aioe.org>,
 Tim Harig <usernet(a)ilthio.net> wrote:

> On 2010-08-11, RG <rNOSPAMon(a)flownet.com> wrote:
> > In article <i3uo7t$6mk$1(a)speranza.aioe.org>,
> >  Tim Harig <usernet(a)ilthio.net> wrote:
> >
> >> On 2010-08-11, RG <rNOSPAMon(a)flownet.com> wrote:
> >> > I'm writing a system in a different language but want to use a Python
> >> > library. I know of lots of ways to do this (embed a Python interpreter,
> >> > fire up a python server) but by far the easiest to implement is to have
> >> > the main program spawn a Python interpreter and interact with it through
> >> > its stdin/stdout.
> >>
> >> Or, open python using a socket.
> >
> > You mean a TCP/IP socket? Or a unix domain socket? The former has
> > security issues, and the latter seems like a lot of work. Or is there
> > an easy way to do it that I don't know about?
>
> I was referring to unix domain sockets or more specifically stream
> pipes. I guess it depends what language you are using and what libraries
> you have access to. Under C, working with stream pipes is no more trivial
> than using pipe(). You can simply create the socket descriptors using
> socketpair(). Keep one of the descriptors for your process and pass the
> other to the python child process as both stdin and stdout.

Ah. That is in fact exactly what I am doing, and that is how I first
encountered this problem.

rg
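For readers following along, the socketpair() arrangement described in the quoted text can be sketched in Python itself. This is only an illustration under assumptions: it targets Python 3 on a POSIX system, the child program (CHILD_CODE) and the helper names spawn_python_on_socketpair and ask are made up for the example, and the child flushes after every line so that replies are not held back in a buffer.

```python
import socket
import subprocess
import sys

# Hypothetical child program: echo each input line back in upper case.
# It flushes after every line so the parent sees each reply at once.
CHILD_CODE = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def spawn_python_on_socketpair():
    """Start a Python child whose stdin and stdout are one end of a socketpair."""
    parent_sock, child_sock = socket.socketpair()
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=child_sock.fileno(),   # same descriptor for both directions
        stdout=child_sock.fileno(),
    )
    child_sock.close()  # the child keeps its own copies on fds 0 and 1
    return proc, parent_sock


def ask(sock, text):
    """Send one line to the child and read back one line of reply."""
    sock.sendall(text.encode("utf-8") + b"\n")
    reply = b""
    while not reply.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:  # child closed its end
            break
        reply += chunk
    return reply.decode("utf-8").rstrip("\n")
```

A caller would use ask(sock, "hello") for each request and, when finished, call sock.shutdown(socket.SHUT_WR) so the child sees end-of-file and exits.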
From: RG on 11 Aug 2010 17:07

In article <rNOSPAMon-3CC595.13205911082010(a)news.albasani.net>,
 RG <rNOSPAMon(a)flownet.com> wrote:

> In article <i3uu74$uga$1(a)speranza.aioe.org>,
>  Tim Harig <usernet(a)ilthio.net> wrote:
>
> > On 2010-08-11, RG <rNOSPAMon(a)flownet.com> wrote:
> > > In article <i3uo7t$6mk$1(a)speranza.aioe.org>,
> > >  Tim Harig <usernet(a)ilthio.net> wrote:
> > >
> > >> On 2010-08-11, RG <rNOSPAMon(a)flownet.com> wrote:
> > >> > I'm writing a system in a different language but want to use a Python
> > >> > library. I know of lots of ways to do this (embed a Python interpreter,
> > >> > fire up a python server) but by far the easiest to implement is to have
> > >> > the main program spawn a Python interpreter and interact with it through
> > >> > its stdin/stdout.
> > >>
> > >> Or, open python using a socket.
> > >
> > > You mean a TCP/IP socket? Or a unix domain socket? The former has
> > > security issues, and the latter seems like a lot of work. Or is there
> > > an easy way to do it that I don't know about?
> >
> > I was referring to unix domain sockets or more specifically stream
> > pipes. I guess it depends what language you are using and what libraries
> > you have access to. Under C, working with stream pipes is no more trivial
> > than using pipe(). You can simply create the socket descriptors using
> > socketpair(). Keep one of the descriptors for your process and pass the
> > other to the python child process as both stdin and stdout.
>
> Ah. That is in fact exactly what I am doing, and that is how I first
> encountered this problem.
>
> rg

And now I have encountered another problem:

-> print sys.stdin.encoding
<- None

But when I run from a terminal:

[ron(a)mickey:~]$ python
Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.encoding
'UTF-8'

I thought the value of sys.stdin.encoding was hard-coded into the Python
executable at compile time, but that's obviously wrong. So how does
Python get the value of sys.stdin.encoding?

rg
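The question is left open in this post. As far as I know, CPython chooses the encoding at interpreter startup, not at compile time: Python 2.6 takes it from the terminal's locale only when the stream is a tty (which is why a pipe or socket gives None), and the PYTHONIOENCODING environment variable, new in 2.6, overrides that choice. The sketch below assumes Python 3, where the attribute is always set; child_stdin_encoding is a made-up helper that asks a child interpreter, run on a pipe, what it decided.

```python
import codecs
import os
import subprocess
import sys


def child_stdin_encoding(io_encoding=None):
    """Return sys.stdin.encoding as reported by a child interpreter on a pipe.

    io_encoding, if given, is passed to the child via PYTHONIOENCODING.
    """
    env = dict(os.environ)
    env.pop("PYTHONIOENCODING", None)  # start from a known state
    if io_encoding is not None:
        env["PYTHONIOENCODING"] = io_encoding
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdin.encoding)"],
        input="",             # stdin is a pipe, not a terminal
        capture_output=True,
        text=True,
        env=env,
        check=True,
    )
    return result.stdout.strip()


def same_codec(name_a, name_b):
    """Compare two encoding names after resolving aliases."""
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name
```

For the situation in the post, setting PYTHONIOENCODING in the environment of the spawned interpreter is probably the least intrusive way to get a known encoding on a non-tty stdin.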
From: Cameron Simpson on 11 Aug 2010 18:12

On 11Aug2010 12:35, Tim Harig <usernet(a)ilthio.net> wrote:
| > The buffering is a performance choice. Every write requires a context
| > switch from userspace to kernel space, and availability of data in the
| > pipe will wake up a downstream process blocked trying to read.
| > It is far more efficient to do as few such copies as possible, [...]
|
| Right, I don't question the optimization. I question whether the
| intelligence that performs that optimization should be placed within cat or
| whether it should be placed within the shell. It seems to me that the
| shell has a better idea of how the command is being used and can therefore
| make a better decision about whether or not buffering is appropriate.

I would argue it's not much better placed, though it would be nice if
the control could be issued from there. But it can't.

Regarding the former, in this pipeline:

  cat some files... | python filter program | something else

how shall the shell know if the python filter (to take the OP's case)
wants its input line buffered (rare) or block buffered (usually ok)?

What might be useful would be a way to attach an attribute to a pipe
or other file descriptor indicating the desired buffering behaviour
that writers to the file descriptor should adopt.

Of course, the ugly sides to that are how many buffering regimes should
it be possible to express and how and when should the upstream (writing)
program decide to check? In a pipeline the pipes are made _before_ any of
the programs commence because the programs need to be attached to the
pipes (this is done before the programs themselves are dispatched).
So, _after_ dispatch the python-wanting-line-buffering issues an ioctl on
the pipe saying "I want line buffering". However, the upstream program
may well already have commenced operation before that happens. It may
even have run to completion before that happens! So, shall all upstream
programs be required to poll? How often? On every write? Shall they
receive a signal? What if they don't catch it? If the downstream program
_requires_ line buffering then the whole situation is racy and unreliable.

You can see that on reflection this isn't easy to resolve cleanly from
_outside_ the writing program.

To do it from inside requires all programs to sprout an option like GNU
cat's -u option.

Cheers,
--
Cameron Simpson <cs(a)zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

What progress we are making. In the Middle Ages they would have burned me.
Now they are content with burning my books. - Sigmund Freud
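The "do it from inside the writing program" option can be illustrated with a small filter that flushes after every line, so a downstream reader never waits on a partly filled block buffer. This is a sketch in Python 3, not anything taken from cat; copy_lines is a made-up name, and the in-memory streams at the end stand in for real pipes.

```python
import io


def copy_lines(src, dst):
    """Copy src to dst one line at a time, flushing after every line.

    Returns the number of lines copied.
    """
    count = 0
    for line in iter(src.readline, ""):
        dst.write(line)
        dst.flush()  # push each line downstream immediately
        count += 1
    return count


# Small demonstration with in-memory text streams standing in for pipes.
demo_out = io.StringIO()
lines_copied = copy_lines(io.StringIO("a\nb\n"), demo_out)
```

In a real filter the call would be copy_lines(sys.stdin, sys.stdout). The cost is the one named in the quoted text: one write system call per line instead of one per block.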
From: Nobody on 11 Aug 2010 20:42

On Wed, 11 Aug 2010 10:32:41 +0000, Tim Harig wrote:

>>> Usually you either
>>> need an option on the upstream program to tell it to line
>>> buffer explicitly
>>
>> once cat had an option -u doing exactly that but nowadays
>> -u seems to be ignored
>>
>> http://www.opengroup.org/onlinepubs/009695399/utilities/cat.html
>
> I have to wonder why cat knows or cares.

The issue relates to the standard C library. By convention[1], stdin and
stdout are line-buffered if the descriptor refers to a tty, and are
block-buffered otherwise. stderr is always unbuffered.

Any program which uses stdin and stdout without explicitly setting the
buffering or using fflush() will exhibit this behaviour.

[1] ANSI/ISO C is less specific; C99, 7.19.3p7:

    As initially opened, the standard error stream is not fully
    buffered; the standard input and standard output streams are
    fully buffered if and only if the stream can be determined not
    to refer to an interactive device.

POSIX says essentially the same thing:

<http://www.opengroup.org/onlinepubs/9699919799/functions/stdin.html>
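Python 3 does not use C stdio for sys.stdout, but its own I/O layer follows the same convention for stdout, which makes the rule easy to observe. The sketch below assumes Python 3; PROBE and probe_piped_stdout are names invented for the example. It runs a child interpreter with its stdout attached to a pipe and asks whether that stream is a tty and whether it is line-buffered.

```python
import subprocess
import sys

# Code run in the child: report whether stdout is a tty and line-buffered.
PROBE = "import sys; print(sys.stdout.isatty(), sys.stdout.line_buffering)"


def probe_piped_stdout():
    """Run PROBE in a child whose stdout is a pipe; return the two flags as strings."""
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        capture_output=True,  # child's stdout is a pipe, not a terminal
        text=True,
        check=True,
    )
    return result.stdout.split()
```

Running the same PROBE code directly in a terminal should report True for both values, matching the tty case of the convention.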
From: RG on 11 Aug 2010 21:49

In article <pan.2010.08.12.00.42.26.343000(a)nowhere.com>,
 Nobody <nobody(a)nowhere.com> wrote:

> On Wed, 11 Aug 2010 10:32:41 +0000, Tim Harig wrote:
>
> >>> Usually you either
> >>> need an option on the upstream program to tell it to line
> >>> buffer explicitly
> >>
> >> once cat had an option -u doing exactly that but nowadays
> >> -u seems to be ignored
> >>
> >> http://www.opengroup.org/onlinepubs/009695399/utilities/cat.html
> >
> > I have to wonder why cat knows or cares.
>
> The issue relates to the standard C library. By convention[1], stdin and
> stdout are line-buffered if the descriptor refers to a tty, and are
> block-buffered otherwise. stderr is always unbuffered.
>
> Any program which uses stdin and stdout without explicitly setting the
> buffering or using fflush() will exhibit this behaviour.
>
> [1] ANSI/ISO C is less specific; C99, 7.19.3p7:
>
>     As initially opened, the standard error stream is not fully
>     buffered; the standard input and standard output streams are
>     fully buffered if and only if the stream can be determined not
>     to refer to an interactive device.
>
> POSIX says essentially the same thing:
>
> <http://www.opengroup.org/onlinepubs/9699919799/functions/stdin.html>

This doesn't explain why "cat | cat" when run interactively outputs
line-by-line (which it does). STDIN to the first cat is a TTY, but the
second one isn't.

For that matter, you can also do this:

nc -l 1234 | cat

and then telnet localhost 1234 and type at it, and it still works
line-by-line.

rg
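One likely explanation, which the thread does not state at this point, is that common cat implementations copy data with read() and write() directly and do not go through stdio buffering at all. A read() on a pipe or tty returns as soon as any data is available, so each line typed upstream is passed along immediately. The sketch below shows that behaviour with a cat-like loop in Python 3; cat_fd is a made-up name and the pipes stand in for the two ends of a shell pipeline.

```python
import os


def cat_fd(in_fd, out_fd, chunk_size=65536):
    """Copy in_fd to out_fd with raw read/write calls, the way a simple cat might."""
    while True:
        chunk = os.read(in_fd, chunk_size)  # returns whatever is available
        if not chunk:                        # empty read means end-of-file
            break
        while chunk:                         # os.write may write only part
            written = os.write(out_fd, chunk)
            chunk = chunk[written:]


# A read on a pipe does not wait for chunk_size bytes to accumulate.
src_r, src_w = os.pipe()
os.write(src_w, b"line 1\n")
first = os.read(src_r, 65536)  # returns just the 7 bytes written so far
```

Programs that write through stdio (printf, or print in Python 2) behave differently because the data sits in a user-space buffer before any write() call happens.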