From: Miha on
Hi

Is it possible to limit the impact of web crawlers (Google in particular) on a
web site running on IIS 6/7, and if so, how?
Setting a custom crawl rate in Google's Webmaster Tools (Site configuration)
has not reduced the load, so I'm wondering whether there is a way to limit the
bandwidth available to all crawlers (identified by "bot" in the User-Agent
header) using bandwidth limiting on IIS.
I know this is possible on Apache, but I cannot find how to set it up on MS
IIS.
Any help or ideas would be appreciated.
Regards,
Miha


From: .._.. on
Use robots.txt for this. You can simply ban the search crawlers that don't
work for you; Baidu (the Chinese one) is useless for most sites,
and it's aggressive.

There are also directives you can use in robots.txt to tell them to slow down.

http://www.kavoir.com/2010/02/how-to-slow-down-the-frequency-googlebot-search-engine-crawler-visits-your-site.html

http://searchengineland.com/a-deeper-look-at-robotstxt-17573

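Google honors the Disallow rules in robots.txt, although Googlebot ignores the
Crawl-delay directive; its crawl rate is set in Webmaster Tools instead.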
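
As a rough illustration of the robots.txt approach, here is a small sketch that checks a rule set with Python's standard urllib.robotparser before deploying it. The file contents, the paths, the 10-second delay and the "ExampleBot" name are all made up for the example. Crawl-delay is a non-standard directive, so each crawler decides for itself whether to obey it; this only shows how the rules parse.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the paths and the delay value are illustrative only.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /search/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Baiduspider is banned from the whole site.
baidu_allowed = parser.can_fetch("Baiduspider", "/index.html")

# Any other crawler falls under the "*" group: allowed, except under /search/.
other_allowed = parser.can_fetch("ExampleBot", "/index.html")
search_allowed = parser.can_fetch("ExampleBot", "/search/results")

# Delay (in seconds) requested from crawlers in the "*" group.
delay = parser.crawl_delay("ExampleBot")

print(baidu_allowed, other_allowed, search_allowed, delay)
```

The file itself would be served from the site root as /robots.txt.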

For the crawlers that don't, I'd just ban them at the firewall based on IP
address. There are about five search engines that most people use, and the
rest of them are garbage. Just block the garbage ones.
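
The matching a firewall rule performs can be sketched in a few lines with Python's ipaddress module. This is only an illustration of the logic, not a replacement for the firewall: the networks below are reserved documentation ranges, not real crawler addresses, and is_blocked is a hypothetical helper name.

```python
import ipaddress

# Hypothetical blocklist using reserved documentation ranges (RFC 5737),
# not real crawler addresses.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


print(is_blocked("192.0.2.17"))      # inside the first /24
print(is_blocked("198.51.100.200"))  # outside the /25, which covers .0 to .127
```

Blocking whole ranges rather than single addresses matters because crawlers usually come from many addresses in the same network.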

You may also want to look at what is actually causing the issue; the crawling
could be revealing inefficient code or files that should be compressed better.
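
One way to find out which crawler (and which files) account for the load is to total the bytes sent per User-Agent from the IIS logs. The sketch below assumes the W3C extended log format with the sc-bytes field enabled; the sample lines, the "ExampleBot" agent and the bytes_by_user_agent helper are invented for illustration.

```python
from collections import Counter

# Made-up sample in W3C extended log format. Real IIS logs may list
# different fields, so the #Fields line is read rather than assumed.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 7.0
#Fields: date time cs-uri-stem cs(User-Agent) sc-status sc-bytes
2010-05-05 10:00:01 /index.html ExampleBot/1.0 200 5120
2010-05-05 10:00:02 /big.pdf ExampleBot/1.0 200 900000
2010-05-05 10:00:03 /index.html Mozilla/5.0+(Windows) 200 5120
"""


def bytes_by_user_agent(log_text: str) -> Counter:
    """Sum the sc-bytes column per cs(User-Agent) value."""
    totals = Counter()
    fields = []
    for line in log_text.splitlines():
        if line.startswith("#Fields:"):
            # Column names for the data lines that follow.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        # sc-bytes is only present if enabled in the IIS logging settings.
        if row.get("sc-bytes", "").isdigit():
            totals[row.get("cs(User-Agent)", "-")] += int(row["sc-bytes"])
    return totals


totals = bytes_by_user_agent(SAMPLE_LOG)
for agent, sent in totals.most_common():
    print(agent, sent)
```

Grouping by cs-uri-stem instead of the User-Agent would show which files are the heaviest and might be worth compressing.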
