AnimeSuki Forums

Old 2004-06-20, 09:52   Link #1
_Sin_
Member of the Year 2004!
 
Join Date: Apr 2004
Location: "And if thou doest not well, _Sin_ lieth at the door."- Genesis 4:7
Age: 39
Ignore/Add to Buddy list

I've got some questions about the forum that can probably be answered quite quickly. I want to know what effect it has if I add someone to my buddy list or my ignore list. For example, can I see the people on my buddy list even if they are in invisible mode? And do I stop getting PMs from people on my ignore list?

EDIT: I feel so stupid >_< It's all in the FAQ.
I guess the only question that remains is the last one at the bottom, and I'll look through the FAQ again to see if I can find any clues about that. EDIT end

Oh, and since I'm opening a thread for this anyway, I might as well ask this too, although it isn't very important: who or what are the Google Spiders on the Who's Online page? Are they guests who got directed here by Google or something?

Thanks

Last edited by _Sin_; 2004-06-20 at 10:03. Reason: Added link
Old 2004-06-20, 10:06   Link #2
Superchop
Lord Sesshoumaru
 
 
Join Date: Nov 2003
Location: "Post a Photo of Yourself!" Thread
Quote:
Originally Posted by _Sin_
Who or what are the Google Spiders on the Who's Online page? Are they guests who got directed here by Google or something?

Thanks
lol, I've always wondered about those things as well... I just thought that since no one ever asked about them, either no one had noticed them, or everyone except me knew what they were.
Old 2004-06-20, 10:16   Link #3
xris
Just call me Ojisan
 
 
Join Date: Jan 2003
Location: U.K. Hampshire
Quote:
Originally Posted by Superchop
lol, I've always wondered about those things as well... I just thought that since no one ever asked about them, either no one had noticed them, or everyone except me knew what they were.
This is how online search engines (such as Google) get their data indexed. The spiders are programs that trawl the internet (hence "spiders": they crawl around the web, the WWW) and build up the index for the search engines. They download a web page, index every word they find, and note the URL of the page. They also look for other URLs on the page so as to build a network of links (to find more pages to search). If you look at a site's log file to see who visits, spiders are common visitors.

Note: This is a simplified (and poor) explanation of how it sort of works; it's only meant to give a general overview. I've never noticed them here before, but I've seen the equivalent bots trawl through my sites (by inspecting the access log files).
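
To make that a bit more concrete, here is a minimal sketch in Python of the "index every word, note the URL, collect the other URLs" step. It is only an illustration and not how Google's spider actually works: the PageScanner class, the index_page function, and the example.com page are hypothetical names made up for the example, and nothing is fetched over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import re


class PageScanner(HTMLParser):
    """Collects the words and the outgoing links of one HTML page.

    Hypothetical helper for illustration; a real spider would also skip
    script/style content, respect robots.txt, and so on.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.words = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Remember every <a href="..."> as an absolute URL to visit later.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

    def handle_data(self, data):
        # Split the text into lowercase words.
        self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def index_page(index, url, html):
    """Record which words appear at `url`; return the links found on the page."""
    scanner = PageScanner(url)
    scanner.feed(html)
    for word in scanner.words:
        index.setdefault(word, set()).add(url)
    return scanner.links


# Made-up page content, standing in for a downloaded web page.
index = {}
page = '<html><body><p>Anime forum</p><a href="/faq.htm">FAQ</a></body></html>'
to_visit = index_page(index, "http://example.com/", page)
```

After this runs, index maps each word to the set of URLs it was seen on, and to_visit holds the links a spider would queue up next. Repeating index_page for every queued URL is what builds the "network of links".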
Old 2004-06-20, 10:19   Link #4
_Sin_
Member of the Year 2004!
 
Join Date: Apr 2004
Location: "And if thou doest not well, _Sin_ lieth at the door."- Genesis 4:7
Age: 39
Quote:
Originally Posted by xris
This is how online search engines (such as Google) get their data indexed. The spiders are programs that trawl the internet (hence "spiders": they crawl around the web, the WWW) and build up the index for the search engines. They download a web page, index every word they find, and note the URL of the page. They also look for other URLs on the page so as to build a network of links (to find more pages to search). If you look at a site's log file to see who visits, spiders are common visitors.
Thanks for the information
Old 2004-06-20, 10:26   Link #5
Superchop
Lord Sesshoumaru
 
 
Join Date: Nov 2003
Location: "Post a Photo of Yourself!" Thread
Xris - Ah, ok... the few times that I ever checked the Who's Online page, I always saw 2 or 3 of them browsing, but since nobody ever asked about it, I just shrugged it off and left it alone.
Old 2004-06-20, 10:31   Link #6
_Sin_
Member of the Year 2004!
 
Join Date: Apr 2004
Location: "And if thou doest not well, _Sin_ lieth at the door."- Genesis 4:7
Age: 39
Quote:
Originally Posted by Superchop
Xris - Ah, ok... the few times that I ever checked the Who's Online page, I always saw 2 or 3 of them browsing, but since nobody ever asked about it, I just shrugged it off and left it alone.
I'm pretty sure that I even saw one spider looking at the user profiles - good thing that our e-mail addresses are kind of encrypted.
And they even search the forums like the Admins/Mods - interesting.
Old 2004-06-20, 10:33   Link #7
xris
Just call me Ojisan
 
 
Join Date: Jan 2003
Location: U.K. Hampshire
Here are some examples of search engine bots from my access log (I only listed those requesting the robots.txt file, otherwise there are too many to list):

crawler14.googlebot.com - - [19/Jun/2004:22:30:04 +0100] "GET /robots.txt HTTP/1.0" 200 155 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

lj1141.inktomisearch.com - - [19/Jun/2004:17:42:58 +0100] "GET /robots.txt HTTP/1.0" 200 155 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"

webcachem08b.cache.pol.co.uk - - [19/Jun/2004:19:37:53 +0100] "GET /robots.txt HTTP/1.1" 200 155 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; MSIECrawler)"

x1crawler3-1-0.x-echo.com - - [19/Jun/2004:20:48:25 +0100] "GET /robots.txt HTTP/1.1" 200 155 "-" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 95) VoilaBot BETA 1.2 (http://www.voila.com/)"

wfp2.almaden.ibm.com - - [20/Jun/2004:01:55:35 +0100] "GET /robots.txt HTTP/1.0" 200 155 "-" "http://www.almaden.ibm.com/cs/crawler [c01]"

msnbot64104.search.msn.com - - [20/Jun/2004:03:04:02 +0100] "GET /robots.txt HTTP/1.0" 200 155 "-" "msnbot/0.11 (+http://search.msn.com/msnbot.htm)"
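
If you want to pull entries like these out of your own access log, something along these lines should do it. This is a rough sketch that assumes the log uses the same layout as the lines above (host, two dashes, timestamp, request, status, size, referer, user agent); the LOG_LINE pattern and the robots_txt_visitors function are hypothetical names, and the second sample line is invented to show a non-bot request being skipped.

```python
import re

# Layout assumed from the log lines above:
# host ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def robots_txt_visitors(lines):
    """Return (host, user agent) for every request that asked for /robots.txt."""
    visitors = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # not in the expected layout, ignore it
        # The request looks like: GET /robots.txt HTTP/1.0
        parts = match.group("request").split()
        if parts[1:2] == ["/robots.txt"]:
            visitors.append((match.group("host"), match.group("agent")))
    return visitors


sample = [
    # Taken from the log excerpt above.
    'crawler14.googlebot.com - - [19/Jun/2004:22:30:04 +0100] '
    '"GET /robots.txt HTTP/1.0" 200 155 "-" '
    '"Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    # Invented line: an ordinary browser request that should be skipped.
    'host.example.com - - [19/Jun/2004:22:31:00 +0100] '
    '"GET /index.htm HTTP/1.1" 200 1024 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0)"',
]
visitors = robots_txt_visitors(sample)
```

Filtering on the robots.txt request is just a convenient shortcut, since ordinary browsers rarely ask for that file; it will miss any bot that never requests it.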

Note that sites can have a file called robots.txt, which the admin can optionally create to 'help' the bot search the site. I assume the bot starts by accessing the robots.txt file and then proceeds with /

Here's part of a robots.txt file I have:
Code:
# Hello little robots

user-agent: *
disallow: /tiles
disallow: /status.htm
disallow: /preorder.htm
disallow: /updates.htm

It tells them not to search the directories / files listed (because I think they get updated too often to warrant repeated searches of them).
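
For anyone who wants to check what rules like these actually block, Python's standard library has a robots.txt parser. The sketch below feeds it the rules from above and asks whether a few paths may be fetched. The allowed helper, the "ExampleBot" agent name, and the /tiles/map.png and /index.htm paths are made up for the example; only the rules themselves come from my file.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above, as one string.
rules = """\
# Hello little robots

user-agent: *
disallow: /tiles
disallow: /status.htm
disallow: /preorder.htm
disallow: /updates.htm
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(path, agent="ExampleBot"):
    """True if a well-behaved bot with this (hypothetical) name may fetch path."""
    return parser.can_fetch(agent, path)


blocked_dir = allowed("/tiles/map.png")   # falls under "disallow: /tiles"
blocked_file = allowed("/status.htm")     # listed explicitly
open_page = allowed("/index.htm")         # not listed anywhere
```

Note that the rules are prefix matches, so "disallow: /tiles" covers everything whose path starts with /tiles. Also, robots.txt is only a request: a polite bot honours it, but nothing forces a badly behaved one to.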
Old 2004-06-23, 09:50   Link #8
[maven]
Junior Member
 
Join Date: Jul 2003
Age: 44
Quote:
Originally Posted by xris
Here's part of a robots.txt file I have
Code:
# Hello little robots

user-agent: *
disallow: /tiles
disallow: /status.htm
disallow: /preorder.htm
disallow: /updates.htm
It tells them not to search the directories / files listed (because I think they get updated too often to warrant repeated searches of them).
status.htm - updated 27th January 2002 3:15 PM BST
preorder.htm - updated 27th January 2002 3:15 PM GMT
updates.htm - updated 1st August 2003

Sorry. Couldn't resist...

All times are GMT -5.