Garbage in the log

 
briosky
Joined: 18 Jun 2005
Posts: 46
Location: Salt Lake City, UT

Posted: Wed Sep 21, 2005 5:58 am    Post subject: Garbage in the log

Every darn time that
"msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
visits my site, I find about 4 MB of trash in the access.log file.
Other than tweaking the robots.txt file to lock out the Redmond geniuses, any ideas, please?


-- the_wasatch_dude
_________________
Brionews - Everyday Freshnews http://brionews.com
abyssisthebest
Joined: 30 Jun 2005
Posts: 319
Location: Boston, UK

Posted: Wed Sep 21, 2005 7:24 am    Post subject:

Find out the IP address of the MSN bot and add it to the do-not-log list.
briosky
Joined: 18 Jun 2005
Posts: 46
Location: Salt Lake City, UT

Posted: Wed Sep 21, 2005 4:57 pm    Post subject:

Quote:
Find out the IP address of the MSN bot and add it to the do-not-log list.

abyssisthebest:
I can't do that.
The log file is processed every day by a stat analyzer Perl script (just an example here).
Excluding the bot from the log would make the analyzer miss the MSN visits.
No big deal, of course,
but I would rather find out why the trash appears and then look for a solution.
Still, I appreciate the help.
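
One way to start finding out why the trash appears is to pull the bot's entries out of access.log and look at the odd ones. The following is only a minimal sketch: the function name `suspicious_lines`, the 2048-character threshold, and the idea that the trash shows up as overlong or non-printable lines are all assumptions, not something confirmed in this thread.

```python
import string

# Characters considered normal in a text log line (assumption: ASCII logs).
PRINTABLE = set(string.printable)


def suspicious_lines(lines, agent="msnbot", max_len=2048):
    """Yield (line_number, length, reason) for abnormal lines mentioning `agent`.

    `max_len` is an arbitrary illustrative threshold, not a server setting.
    """
    for number, line in enumerate(lines, start=1):
        if agent.lower() not in line.lower():
            continue  # only inspect entries attributed to the bot
        text = line.rstrip("\r\n")
        if len(text) > max_len:
            yield number, len(text), "too long"
        elif any(ch not in PRINTABLE for ch in text):
            yield number, len(text), "non-printable characters"
```

To run it on a real log, open the file with a forgiving encoding such as `open("access.log", encoding="latin-1")` and loop over `suspicious_lines(log_file)`; the line numbers it reports show where to look in the file.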

TRUSTAbyss
Joined: 29 Oct 2003
Posts: 3752
Location: USA, GA

Posted: Wed Sep 21, 2005 6:28 pm    Post subject:

You will find that lots of bots visit your website. Every part of your log file is
important, and you shouldn't worry about what's in it. You should be glad
that it's in the file, because that tells you that logging is working.

The only things you should be worried about are the error requests.

Sincerely, TRUSTpunk
chance
Joined: 04 Jan 2003
Posts: 27
Location: everett, wa

Posted: Sat Oct 22, 2005 7:22 am    Post subject:

Take another look at AWStats and see how many referrals you get from the MSN bot. I found almost none, but the bot was eating up bandwidth every day. Since they weren't sending me any hits, I banned them (along with quite a few others who were garbaging up the logs).

robots.txt handles the well-mannered bots pretty well, but I also use the Sygate firewall, which has advanced rules to block individual IPs and IP blocks.
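
The same comparison (bandwidth used by a bot versus referrals received from its search engine) can be roughed out without a stats package. This is a sketch only, and it assumes the access log uses the common "combined" format, which has not been confirmed for this server. The names `traffic_by_agent` and `referrals_from` are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed combined log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def traffic_by_agent(lines):
    """Return {user_agent: [request_count, bytes_sent]} for parseable lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        size = match.group("size")
        entry = totals[match.group("agent")]
        entry[0] += 1
        entry[1] += 0 if size == "-" else int(size)
    return dict(totals)


def referrals_from(lines, domain="search.msn.com"):
    """Count requests whose referer field mentions `domain`."""
    count = 0
    for line in lines:
        match = LINE_RE.match(line)
        if match and domain in match.group("referer"):
            count += 1
    return count
```

If the byte total for the msnbot user agent is large while `referrals_from` returns close to zero, that is the situation described above, and banning the bot costs little.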
aprelium
Joined: 22 Mar 2002
Posts: 6800

Posted: Sat Oct 22, 2005 3:12 pm    Post subject:

A simple way to keep bots out of your server is to put a robots.txt file in the root of your site.

For example, if you want to forbid msnbot from crawling your site, the robots.txt file should contain the following:

Code:
User-agent: msnbot
Disallow: /


For more information, refer to http://www.robotstxt.org/.
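
Before uploading the file, the two rules above can be checked locally with Python's standard `urllib.robotparser` module. This is just a quick sanity check, not part of the server setup; the helper `is_allowed` and the user agent "ExampleBot" are hypothetical names used for illustration.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content quoted above.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, path="/"):
    """Return True if `user_agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# The full user-agent string from the first post is matched by its product name.
msn_allowed = is_allowed(ROBOTS_TXT, "msnbot/1.0 (+http://search.msn.com/msnbot.htm)")
# "ExampleBot" is a made-up crawler that the rules do not mention.
other_allowed = is_allowed(ROBOTS_TXT, "ExampleBot")
print(msn_allowed, other_allowed)  # prints: False True
```

Keep in mind that this only confirms what the rules say. A crawler that ignores robots.txt will not be stopped by it, which is where the firewall rules mentioned earlier in the thread come in.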
_________________
Support Team
Aprelium - http://www.aprelium.com
All times are GMT + 1 Hour