From: Antoon Pardon on
I have a number of tarfiles on a remote host that I need to process and
that I fetch via FTP. I wrote the following as a test to see how I would
best approach this. (I'm using Python 2.5.)

-------------------------------- ftptst1 --------------------------------
import tarfile as tar
from ftplib import FTP, error_perm as FTPError, error_temp as FTPProblem
from socket import error as SocketError

ftp = FTP(host, user, password)

def gettarfile(rfn):
    import tempfile
    tfl = tempfile.TemporaryFile()
    ftp.retrbinary("RETR %s" % (rfn,), tfl.write)
    tfl.seek(0)
    return tar.open(mode = "r:bz2", fileobj = tfl)

def process():
    for rfn in ("testfile.tbz", "nosuchfile"):
        try:
            tf = gettarfile(rfn)
            for tarinfo in tf:
                print tarinfo.name
            print
        except Exception:
            print "Something went wrong with '%s'" % rfn

process()
-------------------------------------------------------------------------

Executing this gives me this result:

testfile/
testfile/tstfl.0

Something went wrong with 'nosuchfile'


However, the tarfile can be too big to store locally. That is why I rewrote
the above as follows:

-------------------------------- ftptst2 --------------------------------
import tarfile as tar
from ftplib import FTP, error_perm as FTPError, error_temp as FTPProblem
from socket import error as SocketError

ftp = FTP("nestor", "apardon", "0nZM,F!m")

def connect(lfl, rfn):
    ftp.retrbinary("RETR %s" % (rfn,), lfl.write)
    lfl.close()

def gettarfile(rfn):
    import os, threading
    rfd, wfd = os.pipe()
    wfl = os.fdopen(wfd, "w")
    rfl = os.fdopen(rfd, "r")
    xfer = threading.Thread(target = connect, args = (wfl, rfn))
    xfer.setDaemon(True)
    xfer.start()
    return tar.open(mode = "r|bz2", fileobj = rfl)

def process():
    for rfn in ("testfile.tbz", "nosuchfile"):
        try:
            tf = gettarfile(rfn)
            for tarinfo in tf:
                print tarinfo.name
            print
        except Exception:
            print "Something went wrong with '%s'" % rfn

process()
-------------------------------------------------------------------------

Executing this new test gives this result:

testfile/
testfile/tstfl.0

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.5/threading.py", line 486, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.5/threading.py", line 446, in run
    self.__target(*self.__args, **self.__kwargs)
  File "ftptst2", line 10, in connect
    ftp.retrbinary("RETR %s" % (rfn,), lfl.write)
  File "/usr/lib/python2.5/ftplib.py", line 390, in retrbinary
    conn = self.transfercmd(cmd, rest)
  File "/usr/lib/python2.5/ftplib.py", line 356, in transfercmd
    return self.ntransfercmd(cmd, rest)[0]
  File "/usr/lib/python2.5/ftplib.py", line 327, in ntransfercmd
    resp = self.sendcmd(cmd)
  File "/usr/lib/python2.5/ftplib.py", line 241, in sendcmd
    return self.getresp()
  File "/usr/lib/python2.5/ftplib.py", line 216, in getresp
    raise error_perm, resp
error_perm: 550 nosuchfile: No such file or directory.


Now I totally understand what is happening: the exception is raised in
the xfer/connect thread, so the try/except in the main thread never sees
it. What is less clear is how best to fix ftptst2 so that it behaves like
ftptst1 in case of FTP problems. I know about the
PyThreadState_SetAsyncExc function; would it be an acceptable solution
here to catch the exception in the xfer/connect thread and re-raise it in
the main thread? Is there another way of combining ftplib and tarfile
that doesn't need threads but also doesn't need to store the remote file
locally?
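
To make the "catch in the thread, re-raise in the main thread" idea
concrete without PyThreadState_SetAsyncExc, here is a minimal sketch under
some assumptions. The names start_transfer, list_members and fetch are
hypothetical and not part of ftptst2; fetch stands in for a callable such
as lambda write: ftp.retrbinary("RETR %s" % rfn, write). The worker records
its exception before closing the pipe, so once the reader hits end-of-file
the main thread can tell whether the transfer failed. The sketch avoids
print statements and version-specific except syntax, but it is only an
illustration, not a tested fix for the FTP case.

```python
import os
import sys
import tarfile
import threading


def start_transfer(fetch):
    """Run fetch(write) in a thread that feeds a pipe.

    Returns (reader, state); state["error"] holds the worker's exception.
    'fetch' is a hypothetical stand-in for an FTP retrieval callable.
    """
    rfd, wfd = os.pipe()
    reader = os.fdopen(rfd, "rb")
    writer = os.fdopen(wfd, "wb")
    state = {"error": None}

    def worker():
        try:
            try:
                fetch(writer.write)
                writer.flush()
            except Exception:
                # Remember the exception instead of letting the thread
                # print a traceback and die unnoticed.
                state["error"] = sys.exc_info()[1]
        finally:
            # Always close the write end so the reader sees end-of-file.
            try:
                writer.close()
            except Exception:
                pass

    threading.Thread(target=worker).start()
    return reader, state


def list_members(fetch, mode="r|bz2"):
    """Return the member names of the streamed archive.

    Re-raises the worker's exception in the calling thread if the
    transfer failed.
    """
    reader, state = start_transfer(fetch)
    names = []
    try:
        try:
            tf = tarfile.open(mode=mode, fileobj=reader)
            for tarinfo in tf:
                names.append(tarinfo.name)
            # Drain to end-of-file so a late transfer failure is not missed.
            while reader.read(65536):
                pass
        except tarfile.TarError:
            # A failed transfer looks like an empty or truncated archive.
            # Only propagate the tar error if the worker recorded nothing.
            if state["error"] is None:
                raise
    finally:
        reader.close()
    if state["error"] is not None:
        raise state["error"]
    return names
```

With something like this, process() could keep its existing try/except
around a call such as list_members(lambda write: ftp.retrbinary("RETR %s"
% rfn, write)), assuming retrbinary raises error_perm inside the worker as
in the traceback above.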

Any other suggestions?

--
Antoon Pardon