From: The Frog on
Hi Everyone,

I am just wondering if anyone has had any experience or words of
wisdom to offer on approaching a seemingly simple network-type
application. Here is the scenario...

There is a website, actually an FTP site with (surprise surprise!)
files on it. Each week a new file is added. On my local network is a
shared folder that I need to keep updated with files from the FTP
site. So far so simple. Here's the issue: due to a proxy server
(simple authentication, where it requests a username and password to
access the internet) I can only access the FTP site via a web browser
(e.g. Internet Explorer, Firefox, etc.) and download the file(s) I
want manually. What is being done 'behind the scenes', so to speak, is
that the web browser is using HFTP to download the file (FTP over
HTTP). What I would like to do is write an application that can
download these files for me.

I have done a proof of concept using simple VBA to automate IE, to show
our IT department that it can be done (using regular expressions to
strip the filenames from the 'web page' and compare that with the list
of files in the network folder - what's on the FTP site but not on the
local network goes into a list and gets downloaded via IE automation).
This is clumsy, but it works. I would prefer to write a 'proper'
application, and do this in Java. I have no idea how to approach HFTP
and need help with this.
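
For reference, the core of what my VBA does, translated to Java, would be
roughly this (just a sketch - the href pattern and folder handling are
placeholders, and fetching the listing page through the proxy is exactly
the part I don't know how to do):

import java.io.File;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ListingDiff {
    /**
     * Given the HTML of the FTP listing page (however it gets fetched) and
     * the local shared folder, return the filenames that still need
     * downloading. The href pattern is a placeholder - adjust it to match
     * the real listing page.
     */
    static List<String> findMissing(String listingHtml, File localFolder) {
        Set<String> local = new HashSet<String>();
        String[] names = localFolder.list();
        if (names != null) {
            for (int i = 0; i < names.length; i++) {
                local.add(names[i]);
            }
        }
        List<String> missing = new ArrayList<String>();
        Matcher m = Pattern.compile("href=\"([^\"/]+)\"").matcher(listingHtml);
        while (m.find()) {
            String name = m.group(1);
            if (!local.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }
}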

Can anyone offer any advice on this? It would be much appreciated.

Cheers

The Frog
From: Roedy Green on
On Thu, 21 Jan 2010 00:50:03 -0800 (PST), The Frog
<mr.frog.to.you(a)googlemail.com> wrote, quoted or indirectly quoted
someone who said :

> What is being done 'behind the scenes' so to speak is
>that the web browser is using HFTP to download the file (FTP over
>HTTP). What I would like to do is to write an application that can
>download these files for me.

see http://mindprod.com/webstart/replicator.html
http://mindprod.com/webstart/replicatormanual.html

The Replicator maintains a local mirror of a group of files you post
to the website in zipped form. It automatically downloads only what
has recently changed, and unzips it.

It automatically compresses and uploads only the files that have
recently changed.

You can try it out to maintain a local mirror of my entire website.

--
Roedy Green Canadian Mind Products
http://mindprod.com
Responsible Development is the style of development I aspire to now. It can be summarized by answering the question, “How would I develop if it were my money?” I’m amazed how many theoretical arguments evaporate when faced with this question.
~ Kent Beck (born: 1961, age: 49), evangelist for extreme programming.
From: Tom Anderson on
On Thu, 21 Jan 2010, The Frog wrote:

> There is a website, actually an ftp site with (surprise surprise!) files
> on it. Each week a new file is added. On my local network is a shared
> folder that I need to keep updated with files from the FTP site. So far
> so simple. Here's the issue: Due to a proxy server (simple
> authentication where it requests a username and password to access the
> internet) I can only access the FTP site via a web browser (e.g. Internet
> Explorer, Firefox, etc...) and download the file(s) I want manually.
> What is being done 'behind the scenes' so to speak is that the web
> browser is using HFTP to download the file (FTP over HTTP). What I would
> like to do is to write an application that can download these files for
> me.

Have a read of sections 2.3 and 3 in here:

http://java.sun.com/javase/6/docs/technotes/guides/net/proxies.html

You can do 'HFTP' with a normal URLConnection. To summarise that link (all
classes not from java.lang are from java.net):

SocketAddress proxyAddr = new InetSocketAddress("proxy.frogcorp.com", 8080);
Proxy proxy = new Proxy(Proxy.Type.HTTP, proxyAddr);
URL url = new URL("ftp://frog:noteasy@ftp.the-pond.org");
URLConnection conn = url.openConnection(proxy);

You mention that your proxy server requires authentication. If it's doing
this with the HTTP authentication mechanism, then you can, I believe,
cooperate with that using Authenticator:

final String username = "proxyUser"; // placeholder - your proxy username
final String password = "proxyPass"; // placeholder - your proxy password
Authenticator.setDefault(new Authenticator() {
    protected PasswordAuthentication getPasswordAuthentication() {
        // should really check getRequestingHost() and the prompt/realm here
        // to make sure it's the proxy asking, not some other server
        return new PasswordAuthentication(username, password.toCharArray());
    }
});
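
Putting those pieces together, a complete download might look roughly like
this (just a sketch - the proxy host, FTP URL and target path are all
made-up placeholders):

import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

public class HftpDownload {
    public static void main(String[] args) throws Exception {
        // placeholder proxy and FTP details - substitute the real ones
        Proxy proxy = new Proxy(Proxy.Type.HTTP,
                new InetSocketAddress("proxy.frogcorp.com", 8080));
        URL url = new URL("ftp://ftp.the-pond.org/weekly/report.csv");
        URLConnection conn = url.openConnection(proxy);

        // copy the response body into the shared folder
        InputStream in = conn.getInputStream();
        OutputStream out = new FileOutputStream("\\\\server\\share\\report.csv");
        try {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } finally {
            out.close();
            in.close();
        }
    }
}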

I don't think there's a way to do per-connection authentication with
standard JDK classes. Apache HttpClient can do it, though, so that would
be a cleaner option if you're willing to use an extra library and write
slightly more long-winded code.

tom

--
You have now found yourself trapped in an incomprehensible maze.
From: The Frog on
Thank you both for the feedback. I am always amazed at the breadth and
depth of knowledge in this forum. Thank you both so much; I will try
to put it all into practice, and explore the tool too.

Much appreciated

The Frog
From: New Java 456 on
On Jan 22, 3:50 am, The Frog <mr.frog.to....(a)googlemail.com> wrote:
> Thank you both for the feedback. I am always amazed at the breadth and
> depth of knowledge in this forum. Thank you both so much; I will try
> to put it all into practice, and explore the tool too.
>
> Much appreciated
>
> The Frog

Not sure if my other message got filtered or what. Basically, the SAX
parser doesn't match on the end token of the XML but waits for end of
stream; so it has the exact symptoms you mention for HTTP when it is
streaming from the network using the Java bean decoder. Maybe you have
a similar issue. Of course you can troubleshoot by terminating,
disconnecting the wire, or otherwise closing the socket. If closing the
socket fixes it, then you know you have a protocol issue: your sending
code is not completing the protocol, or your reading code is not
finding the end of the packet properly. In the case of the
encoder/decoder for Java beans, I first had to parse the data manually
from the network by matching on the </java> tag and then put it into a
ByteArrayInputStream. Both Java 6 and Apache Xerces failed to match on
the stop token. It's been years since I did a compiler class, but I
think the stop token is one of the basic things of a parser. I'm sure
it is for XML.
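
For what it's worth, that workaround looked roughly like this (a sketch
from memory, not the exact code - the UTF-8 charset and line-based
buffering are assumptions):

import java.beans.XMLDecoder;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class BeanStreamReader {
    /**
     * Read an XMLEncoder document off a network stream, stopping at the
     * closing </java> tag instead of waiting for end-of-stream, then hand
     * a finite buffer to XMLDecoder.
     */
    static Object readBean(InputStream netIn) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(netIn, "UTF-8"));
        StringBuilder doc = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            doc.append(line).append('\n');
            if (line.contains("</java>")) {
                break; // end of the XMLEncoder document - stop reading here
            }
        }
        XMLDecoder decoder = new XMLDecoder(
                new ByteArrayInputStream(doc.toString().getBytes("UTF-8")));
        try {
            return decoder.readObject();
        } finally {
            decoder.close();
        }
    }
}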

In HTTP, it could be as simple as an extra \r being needed. Also, IBM
WebSphere (in 6.1, I think) started adding HTTP headers if you set
cookies. You can turn this off with a config param in web.xml. It is
bad because one major software vendor misinterpreted no-cache to mean
what no-store really means when they made their web browser, so the
IBM change exposes this bug in their implementation and causes PDFs
and such not to load properly (this can be varied by changing the
security setting).

Another possibility is that you are sending binary data but your size
setting (the Content-Length) is wrong. This will cause the loading
object in the browser to behave as you mention.
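
If the sending side is a servlet (the WebSphere case above), the safe
pattern is to buffer the bytes first so the declared length always matches
what actually goes out on the wire - a sketch, with the content type and
payload as placeholders:

import java.io.IOException;
import java.io.OutputStream;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class BinaryDownloadServlet extends HttpServlet {
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        byte[] payload = loadPayload(); // however the binary data is produced
        resp.setContentType("application/octet-stream"); // placeholder type
        // declare exactly as many bytes as we are about to write
        resp.setContentLength(payload.length);
        OutputStream out = resp.getOutputStream();
        out.write(payload);
        out.flush();
    }

    private byte[] loadPayload() {
        // placeholder: real code would read the file or build the data here
        return new byte[0];
    }
}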

HTH,
Tim