Robots directives play an important role in SEO. There are two ways to control how search engines treat pages and folders: one is the Robots META tag, and the other is the robots.txt file.
A web page creator can specify which pages should and should not be indexed by search engines by placing a Robots META tag in the head section of the page's HTML.
Here are some common Robots META tags:
<meta content="NOINDEX" name="ROBOTS"> - Do not index the content, but follow the links
<meta content="NOFOLLOW, INDEX" name="ROBOTS"> - Index the content, but do not follow the links
<meta content="NOINDEX,NOFOLLOW" name="ROBOTS"> - Do not index the content and do not follow the links
<meta content="INDEX,FOLLOW" name="ROBOTS"> - Index the content and follow the links
<meta content="NOARCHIVE" name="ROBOTS"> - Do not show a cached link for the page in search results
<meta content="NOODP" name="ROBOTS"> - Do not display the Open Directory Project (ODP) title and description for the page in search results
<meta content="NOYDIR" name="ROBOTS"> - Do not display the Yahoo Directory title and description for the page in search results
<meta content="NOSNIPPET" name="ROBOTS"> - Show only the title in search results, with no description or text snippet for the page
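To check which of these directives a page actually carries, a small script can read the Robots META tag out of the HTML. The following is only a minimal sketch using Python's standard html.parser module; the class name RobotsMetaParser, the function robots_directives and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values keep their case.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Split "NOINDEX, NOFOLLOW" into {"noindex", "nofollow"}.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of lowercase robots directives declared in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Sample page invented for this example.
page = '<html><head><meta content="NOINDEX,NOFOLLOW" name="ROBOTS"></head><body></body></html>'
directives = robots_directives(page)
print("indexable:", "noindex" not in directives)
print("follow links:", "nofollow" not in directives)
```

A page with no Robots META tag returns an empty set here, which corresponds to the default behaviour of indexing the content and following the links.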
In addition, a robots.txt file can be used to control user-agent access at the folder level. This file is placed in the root of each server, and its format is plain text, not HTML.
Through this file, a website owner or webmaster can allow access to web page content and disallow access to admin, cgi and any secured files that search engines should not index.
A typical robots.txt file looks like this:
User-agent: *
Allow: /
Disallow: /admin*
Allow: *content*
Disallow: /test/
Disallow: /paypal/
Disallow: /credit/
Disallow: /cgi/
These rules say that all robots may crawl the site except for paths beginning with /admin, that they may crawl paths containing "content", and that they should not crawl the test, paypal, credit and cgi folders.
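Rules like these can be tried out before they go live. Below is a rough sketch using Python's standard urllib.robotparser module. As far as I can tell, that parser matches plain path prefixes rather than wildcard patterns, so the sketch assumes a simplified rule set that keeps only the folder rules; the function name check_paths and the sample paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Simplified rules for this sketch: only the plain folder prefixes, no wildcards.
ROBOTS_TXT = """\
User-agent: *
Disallow: /test/
Disallow: /paypal/
Disallow: /credit/
Disallow: /cgi/
"""


def check_paths(robots_txt, paths, user_agent="*"):
    """Return {path: bool} saying whether user_agent may fetch each path."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(user_agent, path) for path in paths}


# Hypothetical paths used only to exercise the rules.
results = check_paths(
    ROBOTS_TXT,
    ["/content/index.html", "/test/page.html", "/paypal/checkout"],
)
for path, allowed in results.items():
    print(path, "->", "crawl" if allowed else "skip")
```

Search engines may interpret wildcards and rule order differently from this parser, so treat the result as a quick sanity check rather than a guarantee of how a given crawler behaves.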
I hope this post helps you learn more about robots and their role in SEO. You can explore further by checking Google's robots.txt file at http://www.google.com/robots.txt, and post a comment or tweet @jagadeeshmp if you need a better understanding :)
Tuesday, April 21, 2009
Everything about Meta Robots and robots.txt
About Author
Jagadeesh Mohan Kumar Nambiar, commonly known as Jag, is an SEO advisor who has been blogging since 2005 and has worked in the field of SEO for around 10 years. He enjoys helping and teaching, and recommends ethical SEO.
Jag is a moderator at SEWatch Forums and SERountable Forums, a #3 Top Pick Google Knol author and a Blekko Associate Editor.

Labels: Meta robots, Robots.txt, SEO, Seo Tips


6 comments:
I am still puzzled how the application of this technique. Whether it be fixed?. Thank you friend.
Robots.txt are great for specifying what web pages you want the search robots to index and what not to.
Wow, with your explanation, I think I already had a good grasp about meta robots. Nice!
good information about meta robots, to be read by any seo consultant
Well explanation about meta robots or robots.txt :)
I am a bit confused. Yes, I am a newbie to SEO. I am using Rapid Weaver to build my website. In the past, I used iWeb. However, it wasn't SEO friendly.
As for robot.txt, I didn't know there are some pages that need to be crawled and some should be avoided.
How would I know which page? My site is mostly static. The content doesn't change much. However, I am building a one page blog of all my photography events. Since I would have the most current information on this page, should i have the robots crawl this page?
Please help! thanks