Is there any FREE guide or lesson on how to write a robot?

wanhart subscriber Posts: 2
After reading "Will A Sitemap Get Google To Crawl My Site Faster?" (http://www.startupnation.com/forums/6814/1/1), I am adding a sitemap to my site as soon as I can. Right now, I want to optimize my website's ranking and indexing in Live.com (being the no. 2 website worldwide). At this point I am really concerned that my traffic from Live.com is still 0 (zero). I was reading Live.com's web ranking and indexing guidelines, and Live.com suggests developing a robots.txt. I've been to http://www.robotstxt.org/wc/robots.html and it did not provide help on how to write a robot. Many of their book recommendations were written back in the 90's. I was wondering if anybody knows a website that can give us a FREE complete guideline on how to write a robot.
Irwan.

Comments

  • SearchGuy subscriber Posts: 0
    MSN/Live.com has been known to over-index website pages, which may be why they suggest creating a robots file. As hostclick stated, it tells bots what to crawl on your site. I've used the robots file to purge some files from the index, but not to increase rankings; I'm not sure how that would work. In any case, I quickly checked your indexed pages in Live and your site looks fine. In Yahoo, however, only your homepage is indexed, so I would consider resubmitting. - Dan
  • malloc subscriber Posts: 7
    Irwan,
    Please check out http://www.robotstxt.org/wc/norobots.html. This is the standard for writing and consuming robots.txt. This page also provides several good examples.
    The robots.txt file is basically an instruction set that informs web crawlers what they should / shouldn't crawl on your site.
    There are some sites that provide free online wizards for this task. This is a good one: http://www.mcanerin.com/EN/search-engine/robots-txt.asp
    Enjoy,
    David
  • wanhart subscriber Posts: 2
    OK, thanks. I don't need a robot after all. But wait, I may need a robots.txt to tell search engines not to crawl my customer database?
  • Webline subscriber Posts: 13 Bronze Level Member
    Bots crawl directories; I don't believe there's any way they can get into a database.
  • JoeJustin subscriber Posts: 1
    wanhart, you do need a robots.txt! Most folks don't even know about this file. The reason you need it is that if you have a static HTML website, you normally want your index page to be the page of entry. This is the page everyone wants to have the highest Google PageRank. So here's why you need a robots.txt file. Every link you have on your index page, and probably on every other page of your site, will take away from your PageRank. So let's say that you have a website with 7 different pages other than your index page. That means you will have 7 links off of your index page, thus taking PageRank away from your site. What you want to do is figure out which other pages you want to have on Google besides your home page or index page. We can probably say that most of the time you don't really care if your About Us page, your Contact page or your FAQs page shows up in Google. So your robots.txt file will tell Google not to spider those pages. Using our example, we have now shaved off 3 of the 7 pages that hurt your PageRank: Contact Us, About Us and FAQs. If you dig a little deeper, you can probably take one or two more off your list as well. I hope this helps. You definitely need a robots.txt file!
  • Fred333 subscriber Posts: 0
    Thanks for the Live link. I was looking for that.
  • sddreamweavers subscriber Posts: 5
    I second David's post above. If you want to look at the robots.txt file of a heavily used site, check out Wikipedia's robots.txt file.
    As for the Google/Yahoo Sitemap Generator, try GSiteCrawler. This sucker will crawl a given website and create sitemaps for Google and Yahoo.
  • abdelrahman80 subscriber Posts: 2
    The ROBOTS meta tag is used to block search engines from indexing pages, but many web authors use it to tell search engines to index a page. Here is an example:
    <META NAME="robots" CONTENT="ALL">
    This tag is a waste of time. If a search engine finds your page and wants to index it, and hasn't been blocked from doing so, it will. And if it doesn't want to index a page, it won't. Telling the search engine to do so doesn't make a difference.
    Here is a special Google meta tag that you can use in a couple of ways. Here's one example:
    <META NAME="googlebot" CONTENT="nosnippet">
    This meta tag tells Google not to show a snippet, the piece of text it grabs from within a web page to use as the description, in its search results. Here is another example, which asks Google not to offer a cached copy of the page:
    <META NAME="googlebot" CONTENT="noarchive">
    Using the ROBOTS meta tag or the robots.txt file, you can tell the search engines to stay away. The meta tag looks like this:
    <META NAME="robots" CONTENT="noindex, nofollow">
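For anyone who wants to try out the two mechanisms discussed in this thread (the robots.txt file and the ROBOTS meta tag), here is a minimal sketch using only the Python standard library. The sample robots.txt, the example.com URLs, the /customers/ path and the helper names `is_allowed` and `meta_directives` are all made up for illustration; they are not taken from anyone's actual site.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the paths below are assumptions for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /customers/
Disallow: /contact.html
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def meta_directives(html: str) -> set:
    """Return the set of robots meta directives present in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "http://example.com/index.html"))
    print(is_allowed(ROBOTS_TXT, "http://example.com/customers/list.html"))
    print(meta_directives('<META NAME="robots" CONTENT="noindex, nofollow">'))
```

Keep in mind that robots.txt is only a request to well-behaved crawlers, not access control, so anything private (such as customer data) should also be protected on the server side.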