briosky -
Joined: 18 Jun 2005 Posts: 46 Location: Salt Lake City, UT
|
Posted: Wed Sep 21, 2005 5:58 am Post subject: Garbage in the log |
|
|
Every darn time that
"msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
visits my site, I find about 4 MB of trash in the access.log file.
Other than tweaking the robots.txt file to lock out the Redmond geniuses, any ideas, please?
-- the_wasatch_dude _________________ Brionews - Everyday Freshnews http://brionews.com |
|
|
|
abyssisthebest -
Joined: 30 Jun 2005 Posts: 319 Location: Boston, UK
|
Posted: Wed Sep 21, 2005 7:24 am Post subject: |
|
|
Find out the IP address of the MSN bot and add it to the do-not-log list. _________________ My online Portfolio |
|
|
|
briosky -
Joined: 18 Jun 2005 Posts: 46 Location: Salt Lake City, UT
|
Posted: Wed Sep 21, 2005 4:57 pm Post subject: |
|
|
Quote: | find out the ip of the msn bot and add it to the do not log list |
abyssisthebest:
I can't do that.
The log file is processed every day by a stats-analyzer Perl script (just one example),
and that script would then miss the MSN visits.
No big deal, of course,
but I would rather find out why the trash appears and then look for a solution.
I do appreciate the help, though.
_________________ Brionews - Everyday Freshnews http://brionews.com |
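One way to see where the bulk of the log comes from is to total the size of the log lines per user agent. The following is only a minimal sketch in Python. It assumes the access log uses the common "combined" format, where the user agent is the last double-quoted field on each line, and the function name `bytes_by_agent` is made up for illustration:

```python
import re
from collections import Counter

# Assumption: "combined" log format, user agent is the last quoted field.
AGENT_RE = re.compile(r'"([^"]*)"\s*$')

def bytes_by_agent(lines):
    """Total the size in bytes of the log lines written for each user agent."""
    totals = Counter()
    for line in lines:
        match = AGENT_RE.search(line)
        # Lines that do not end in a quoted field are grouped together.
        agent = match.group(1) if match else "(unparsed)"
        totals[agent] += len(line.encode("utf-8"))
    return totals
```

Passing an open log file to `bytes_by_agent` and printing `totals.most_common(10)` would show whether msnbot really accounts for the 4 MB, or whether the lines themselves are unusually long.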
|
|
|
TRUSTAbyss -
Joined: 29 Oct 2003 Posts: 3752 Location: USA, GA
|
Posted: Wed Sep 21, 2005 6:28 pm Post subject: |
|
|
You will find that lots of bots visit your website. Every part of your log file is
important, and you shouldn't worry about what's in it. You should be glad
that it's there, because that tells you logging is working.
The only things you should worry about are the error requests.
Sincerely, TRUSTpunk |
|
|
|
chance -
Joined: 04 Jan 2003 Posts: 27 Location: everett, wa
|
Posted: Sat Oct 22, 2005 7:22 am Post subject: |
|
|
Take another look at AWStats and see how many referrals you get from the MSN bot. I found next to none, but the bot was eating up bandwidth every day. Since it wasn't sending me any hits, I banned it (along with quite a few others that were garbaging up the logs).
robots.txt handles the well-mannered bots pretty well, but I also use the Sygate firewall, which has advanced rules to block IPs and IP blocks. |
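The idea of blocking single IPs and whole IP blocks can be illustrated with Python's standard `ipaddress` module. This is only a sketch of the matching logic, not Sygate's rule syntax. The networks listed are documentation-only placeholder ranges, not real msnbot addresses, and the name `is_blocked` is hypothetical:

```python
import ipaddress

# Hypothetical block list using documentation-only ranges (RFC 5737);
# replace these with the addresses actually seen in the log.
BLOCKED = [ipaddress.ip_network(n) for n in ("192.0.2.0/24", "198.51.100.7/32")]

def is_blocked(address):
    """Return True if the address falls inside any blocked network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in BLOCKED)
```

A `/24` entry covers a whole block of 256 addresses, while a `/32` entry bans a single IP.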
|
|
|
aprelium -
Joined: 22 Mar 2002 Posts: 6800
|
Posted: Sat Oct 22, 2005 3:12 pm Post subject: |
|
|
A simple way to keep bots off your server is to put a robots.txt file in the root of your site.
For example, if you want to forbid msnbot from crawling your site, the robots.txt file should contain the following:
Code: | User-agent: msnbot
Disallow: / |
For more information, refer to http://www.robotstxt.org/ . _________________ Support Team
Aprelium - http://www.aprelium.com |
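To check that these two lines do what is intended before uploading them, the rules can be run through Python's standard `urllib.robotparser`. This is a small sketch, and the helper name `allowed` is made up for illustration. Keep in mind that it only shows how a compliant crawler would read the file; nothing forces a bot to obey it.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above.
RULES = """\
User-agent: msnbot
Disallow: /
"""

def allowed(user_agent, path, rules=RULES):
    """Return True if a compliant crawler with this user agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, path)
```

With these rules, `allowed("msnbot", "/index.html")` is false, while a crawler that the file does not mention, such as `allowed("Googlebot", "/index.html")`, is still let through.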
|
|
|
|