From: Peter J. Holzer on 20 Mar 2010 07:35

On 2010-03-19 22:39, Steve <steve(a)staticg.com> wrote:
> On Mar 19, 3:30 pm, Ben Morrow <b...(a)morrow.me.uk> wrote:
>> Quoth Steve <st...(a)staticg.com>:
>> > I have no idea, but it's personal use.  I don't see what so bad about
>> > it, if I was using my web browser I'd be doing the same thing.
>>
>> That's not the point. If their TOS say 'no robots' then that means 'no
>> robots', not 'no robots unless it's for personal use and you can't see
>> why you shouldn't'. Apart from anything else, a lot of these sites make
>> money from ads, which you will completely bypass.
=======
>> > Craigslist is just an example.
>>
>> > That's aside the point though, I'm just doing it for fun/practice/
>> > learning.  Let's say we are using a different site then, perhaps one
>> > I'm going to make, it makes no difference to me.
>>
>> > So any way I can do this or...?
>>
>> I've already suggested using XML::LibXML. Others have pointed you to an
>> example of using HTML::Parser. Pick one and try it.
>>
>> Ben
>
> I realize this,

Please quote only the relevant parts of the posting you are responding
to and write your answer directly beneath the part you are referring to.

Nobody knows what "this" is that you realize. From your quoting it looks
like you realize that you should use XML::LibXML or HTML::Parser. But
from the content of your reply it seems more likely you realize that you
should abide by the terms of use of any site you use. If so you should
have inserted your response at the point I've marked with "=======" above.

And if you don't intend to respond to the part about the tools you
should use, don't quote it (and change the subject, since the topic is
now no longer "Perl HTML searching" but "TOS of web pages").

        hp