jrdeahl
Joined: 27 Dec 2006   Posts: 50
Posted: Mon Nov 12, 2007 6:06 am    Post subject: Block or ban ip addresses/bots?
I have some IP addresses and bots hitting my server, sometimes with multiple connections. How do I stop this crap? Sometimes a Googlebot will have 4 or 5 connections from the same IP address, chewing up my bandwidth.
Thanks,
john
AbyssUnderground
Joined: 31 Dec 2004   Posts: 3855
Posted: Mon Nov 12, 2007 9:19 am    Post subject: Re: Block or ban ip addresses/bots?
jrdeahl wrote:
I have some IP addresses and bots hitting my server, sometimes with multiple connections. How do I stop this crap? Sometimes a Googlebot will have 4 or 5 connections from the same IP address, chewing up my bandwidth.
Thanks,
john
Googlebot won't eat your bandwidth. It fetches about one page every 3-5 seconds, acting much like a regular person browsing your site, so I wouldn't worry about it. Just think of it as another regular visitor. Multiple connections are normal: browsers often open more than one to download content faster, and you would make life difficult for a browsing user if you reduced this.
For example, if you limit clients to one connection and you offer file downloads, then as soon as someone starts downloading a file they can no longer browse your site, and they will leave thinking your server has gone down. Not good practice.
My advice is to leave it alone unless it becomes a real problem (like a DoS/DDoS attack), which is rare anyway.
_________________
Andy (AbyssUnderground) (previously The Inquisitor)
www.abyssunderground.co.uk
jrdeahl
Joined: 27 Dec 2006   Posts: 50
Posted: Mon Nov 12, 2007 4:31 pm    Post subject:
Last night a Googlebot had 14 simultaneous connections and was using 45K of bandwidth for an hour and a half.
Today I woke up to the same Googlebot IP with 6 connections and a Yahoo crawler with 3 connections. Together they were using a steady 41K.
Besides having over 3,000 files for download, I have a forum. I closed the forum because the damn bots indexed some pages that were not on the main menu. Got a lot of spammers, to the point of shutting it down.
Hate them damn bots.
I would prefer to be able to stop the bots.
Moxxnixx
Joined: 21 Jun 2003   Posts: 1226   Location: Florida
Posted: Mon Nov 12, 2007 6:16 pm    Post subject:
jrdeahl,
If you want to stop Googlebot completely, add a robots.txt file to your site's root directory.
Code: | User-agent: Googlebot
Disallow: / |
Be advised, your search engine ranking will eventually drop if you use this.
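A middle ground, if the goal is just to slow the crawling or keep bots out of one area rather than ban them entirely: the non-standard Crawl-delay directive was honored by Yahoo's Slurp (Googlebot ignores it, so Google's crawl rate has to be managed through their Webmaster Tools instead), and Disallow can target a single path. The /forum/ path below is just an example; substitute your own.
Code: | User-agent: Slurp
Crawl-delay: 10

User-agent: Googlebot
Disallow: /forum/ |
This leaves the download area crawlable but keeps Googlebot out of the forum and asks Yahoo to wait 10 seconds between requests.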
AbyssUnderground
Joined: 31 Dec 2004   Posts: 3855
Posted: Mon Nov 12, 2007 8:29 pm    Post subject:
Googlebot is there to promote your site. It does use a fair amount of bandwidth the first time it indexes your site, and forums will be indexed as well. If you don't want the forum indexed, use a robots.txt as said above.
They won't have been using a steady flow of bandwidth, because they only hit your site once every 3-5 seconds.
_________________
Andy (AbyssUnderground) (previously The Inquisitor)
www.abyssunderground.co.uk
jrdeahl
Joined: 27 Dec 2006   Posts: 50
Posted: Mon Nov 12, 2007 10:32 pm    Post subject:
What the heck are you talking about?
"As said above, use a robots.txt." That was not mentioned above!
pkSML
Joined: 29 May 2006   Posts: 955   Location: Michigan, USA
Posted: Mon Nov 12, 2007 10:34 pm    Post subject:
Here's a sure-fire way to keep bots out of just a certain area of your site.
robots.txt is generally followed, but URL rewriting will guarantee no crawling.
URL rewriting:
- Virtual path regex: ^/no_allow/(.*)
- Condition:
  - Variable: HTTP header: User-Agent
  - Operator: matches with
  - Regex: Googlebot
  - Case-sensitive: unchecked
- If rule matches: report an error to the client
- Status code: whatever you want, but maybe a 401 - Unauthorized
Create another rule, just changing the regex for the condition to Yahoo to block Yahoo's crawler.
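The steps above boil down to a simple check. Here is a rough Python sketch of that logic, purely for illustration; the should_block function is made up here and is not anything Abyss actually exposes:

```python
import re

def should_block(path, user_agent, bot_pattern="Googlebot"):
    """Sketch of the rewrite rule above: block bots in /no_allow/ only."""
    # The rule applies only to the protected virtual path.
    if not re.match(r"^/no_allow/(.*)", path):
        return False
    # Condition: the User-Agent header matches the bot regex,
    # case-insensitively (the "case-sensitive" box is unchecked).
    return re.search(bot_pattern, user_agent, re.IGNORECASE) is not None

# A request that matches would get the chosen error status (e.g. 401);
# everything else is served normally.
print(should_block("/no_allow/page.html", "Mozilla/5.0 (compatible; Googlebot/2.1)"))  # True
print(should_block("/downloads/file.zip", "Mozilla/5.0 (compatible; Googlebot/2.1)"))  # False
```

The second rule for Yahoo is the same check with a different bot_pattern.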
Several months ago, I had GoogleBot crawling Abyss on my desktop computer, which had access to my entire C: drive. Oops!
BTW, I'll put in a plug for my site. Please register your Abyss-powered websites at http://abyss-websites.com
None of the old sites are there anymore due to the complete re-design of the site. Thanks!
_________________
Stephen
Need a LitlURL?
http://CodeBin.yi.org
Moxxnixx
Joined: 21 Jun 2003   Posts: 1226   Location: Florida
Posted: Mon Nov 12, 2007 10:50 pm    Post subject:
jrdeahl wrote:
What the heck are you talking about? "As said above, use a robots.txt." That was not mentioned above!
Read the post "above" his.
jrdeahl
Joined: 27 Dec 2006   Posts: 50
Posted: Mon Nov 12, 2007 11:41 pm    Post subject:
That's weird, it wasn't showing up before.