From: slugger3113 on 19 Apr 2010 11:51

Hi, I'm trying to get full/absolute URLs from relative links in HTML
documents. I've been trying to fudge this using File::Basename,
WWW::Mechanize, etc. but was wondering if there's a more ready-made
way to do this.

For example, if my main doc is:

http://www.abc.com/x/y/z/mydoc.html

and it contains a relative link to:

../../otherdir/yourdoc.html

how do I get the absolute URL to "yourdoc.html"? Using the above
modules I've been able to get:

http://www.abc.com/x/y/z/../../otherdir/yourdoc.html

when what I want is:

http://www.abc.com/x/otherdir/yourdoc.html

Of course I could try to parse all of the possible variations for
relative paths, but it's making my head hurt and I was wondering if
there's a module that could help with this. Any thoughts would be
appreciated.

thanks
Scott

From: Jürgen Exner on 19 Apr 2010 12:07

slugger3113 <sstark(a)hi-beam.net> wrote:
>http://www.abc.com/x/y/z/../../otherdir/yourdoc.html
>
>when what I want is:
>
>http://www.abc.com/x/otherdir/yourdoc.html

For file names there is a module that will compute the canonical path,
but I can't remember the name right now. And I don't know if it will
work with URLs, either.

jue

From: Steve C on 19 Apr 2010 12:11

slugger3113 wrote:
> Hi, I'm trying to get full/absolute URLs from relative links in HTML
> documents. I've been trying to fudge this using File::Basename,
> WWW::Mechanize, etc. but was wondering if there's a more ready-made
> way to do this.
>
> For example, if my main doc is:
>
> http://www.abc.com/x/y/z/mydoc.html
>
> and it contains a relative link to:
>
> ../../otherdir/yourdoc.html
>
> how do I get the absolute URL to "yourdoc.html"? Using the above
> modules I've been able to get:
>
> http://www.abc.com/x/y/z/../../otherdir/yourdoc.html
>
> when what I want is:
>
> http://www.abc.com/x/otherdir/yourdoc.html
>
> Of course I could try and parse all of the possible variations for
> relative paths, but it's making my head hurt and I was wondering if
> there's a module that could help with this. Any thoughts would be
> appreciated.

You also need to know if there is a base tag in the head section, since
that changes the meaning of a relative link.
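
To make the point about the base tag concrete, here is a minimal sketch of how it might be folded into the resolution step. It assumes the URI module from CPAN is installed and that the base href has already been extracted from the page by some HTML parser; resolve_link() and its arguments are hypothetical names used only for illustration, and the base href in the usage line is made up.

```perl
use strict;
use warnings;
use URI;    # CPAN module, not part of core Perl

# Hypothetical helper: resolve $link found in the document at $doc_url.
# $base_href is the href of the <base> tag, or undef if the page has none.
sub resolve_link {
    my ($link, $doc_url, $base_href) = @_;

    # A base href may itself be relative, so resolve it against the
    # document URL first; without one, the document URL is the base.
    my $base = defined $base_href
        ? URI->new_abs($base_href, $doc_url)
        : URI->new($doc_url);

    return URI->new_abs($link, $base)->as_string;
}

# No base tag: the link is resolved against the document's own URL.
print resolve_link('../../otherdir/yourdoc.html',
                   'http://www.abc.com/x/y/z/mydoc.html'), "\n";

# With a (made-up) base tag: the same kind of link now resolves elsewhere.
print resolve_link('../otherdir/yourdoc.html',
                   'http://www.abc.com/x/y/z/mydoc.html',
                   'http://www.abc.com/a/b/'), "\n";
```

The second call shows why the base tag matters: the document URL no longer takes part in the result once a base href is present.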

From: C.DeRykus on 19 Apr 2010 12:19

On Apr 19, 8:51 am, slugger3113 <sst...(a)hi-beam.net> wrote:
> Hi, I'm trying to get full/absolute URLs from relative links in HTML
> documents. I've been trying to fudge this using File::Basename,
> WWW::Mechanize, etc. but was wondering if there's a more ready-made
> way to do this.
>
> For example, if my main doc is:
>
> http://www.abc.com/x/y/z/mydoc.html
>
> and it contains a relative link to:
>
> ../../otherdir/yourdoc.html
>
> how do I get the absolute URL to "yourdoc.html"? Using the above
> modules I've been able to get:
>
> http://www.abc.com/x/y/z/../../otherdir/yourdoc.html
>
> when what I want is:
>
> http://www.abc.com/x/otherdir/yourdoc.html
>
> Of course I could try and parse all of the possible variations for
> relative paths, but it's making my head hurt and I was wondering if
> there's a module that could help with this. Any thoughts would be
> appreciated.
>

See: perldoc URI

e.g.,

    use URI;
    print URI->new_abs('../../otherdir/yourdoc.html',
                       'http://www.abc.com/x/y/z/');

--
Charles DeRykus
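
As a slightly fuller sketch of the URI->new_abs approach suggested above, the following resolves several kinds of href against the URL of the containing document. It assumes the URI module from CPAN is installed. Only the first href comes from the thread; the others are made up to show that sibling, root-relative, and already-absolute links go through the same call.

```perl
use strict;
use warnings;
use URI;    # CPAN module, not part of core Perl

# URL of the document the links were found in (from the original question).
my $base = 'http://www.abc.com/x/y/z/mydoc.html';

# Illustrative hrefs; only the first one appears in the thread.
my @links = (
    '../../otherdir/yourdoc.html',
    'sibling.html',
    '/top.html',
    'http://www.example.org/elsewhere.html',
);

# Resolve each href against the base and stringify the resulting URI object.
my @absolute = map { URI->new_abs($_, $base)->as_string } @links;

print "$_\n" for @absolute;
```

The base can be the full document URL, as here, or just its directory with a trailing slash, as in the one-liner above; both give the same result for this link.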
From: slugger3113 on 19 Apr 2010 12:24

On Apr 19, 11:07 am, Jürgen Exner <jurge...(a)hotmail.com> wrote:
> For file names there is a module that will compute the canonical path,
> but I can't remember the name right now. And I don't know if it will
> work with URLs, either.
>
> jue

Hm, it looks like File::Spec will do what I want:

    use File::Spec;
    my($dpath) = "/one/two/../three/four";
    my $cpath = File::Spec->canonpath( $dpath );
    print $cpath,$/;

result:

    /one/three/four

thanks for the tip on "canonical" (whatever that means)!

Scott
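
One caveat worth checking before relying on canonpath for this: the File::Spec::Unix documentation states that canonpath does a logical cleanup only and, by design, does not collapse x/../y sections, because the directory before the ".." might be a symlink. The result reported above may therefore depend on which platform back end File::Spec loads. The sketch below calls the Unix implementation explicitly so that it behaves the same on any OS; it is only meant as a quick check, not as a statement about the other back ends.

```perl
use strict;
use warnings;
use File::Spec::Unix;    # core module; named explicitly to pin Unix rules

# Same path as in the post above.
my $dpath = '/one/two/../three/four';

# Under Unix rules the ".." segment is left in place.
my $cpath = File::Spec::Unix->canonpath($dpath);

print "$cpath\n";
```

For URLs, resolving with URI->new_abs as suggested earlier in the thread avoids the question, since it follows URL rules rather than filesystem rules.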