Amplify Interactive is located in the Olympic Mills building in Portland, Oregon





Hello My Name is Robots Dot Text (Robots.txt)


Greetings humans!  My name is Robots Dot Text.  You can call me robots.txt (ah hah!  Clever, eh?).  As you can see, I am mainly part text document (hence my bland-looking figure; apologies for not being part Word document or Photoshop image), part “dot” (it is hard finding suits that fit a dot-like figure) and part robot.  I am quite a character!

Think of me as a liaison between your website and a search engine robot/spider.  I am like your website’s personal butler: I greet web robots at the door and give them directions as to where they may proceed within your site.

Usually, when you humans have guests over, you do not necessarily like others to go into your human bedroom (unless you are into that thing… not that there is anything wrong with that!). In the same way, you can instruct me as to which parts of your site the web robots should stay out of.  You do not want them to see a page for some reason?  I can ask them to keep out.

First and foremost, you may be asking yourself, “Why do I need a personal butler for my website?”  Well, then I may offer a retort of: “Why do aquatic vertebrate animals need water?”  Hah hah hah hah!  As a matter of best practice, every site owner needs a personal butler to greet web robots at the door and tell them not to spider specific pages.  The pages you should typically tell me to keep robots away from are your privacy policy, terms of use and contact form fill-out pages.  I would think having your “money” pages show in search results would be much better than your boring privacy policy, yes?

These web robot guests are quite the fickle bunch.  They only seem to respond to a particular greeting.  Think of it as a secret “bro-shake” that we do before they know they should comply with my requests.  It goes like this:

User-agent: *

That is my universal greeting to all web robots.  I can also greet individual web robots if you would like me to.  Here is my personal greeting to my ‘brother-from-a-search-engine-mother’ Googlebot:

User-agent: Googlebot

If I were to say this, only my web robot friend Googlebot would respond to the directions that follow.
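For the humans who like to check my work in code, here is a minimal Python sketch using the standard library's urllib.robotparser. The Disallow rule, the helper name can_visit and the bot name SomeOtherBot are made up for illustration; the point is only that a group addressed to Googlebot is not applied to anyone else.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: only the Googlebot group carries a Disallow line.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /upstairs/bedroom.html
"""


def can_visit(user_agent: str, path: str) -> bool:
    """Return True if the rules above let this user agent fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Googlebot is addressed by name, so the Disallow applies to it.
    print("Googlebot:", can_visit("Googlebot", "/upstairs/bedroom.html"))
    # No group matches this made-up bot, so nothing is forbidden to it.
    print("SomeOtherBot:", can_visit("SomeOtherBot", "/upstairs/bedroom.html"))
```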

If you were to say, “Hey Robots Dot Text, I do not want any web robot guests visiting (and consequently indexing) my bedroom” (that is, a single page on your site), I would then instruct the “User-agent: *” this way:

User-agent: *
Disallow: /upstairs/bedroom.html

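Web robots are totally cool with me telling them where they can visit in your website and where they cannot.  The well-behaved ones are quite proper, like me, and will not look down on you if you only want them to see a section of your site.  (Do bear in mind that my directions are a polite request, not a lock on the door; an ill-mannered robot can simply ignore me.)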
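If you would like to see how a polite robot might read the single-page rule, the sketch below feeds it to Python's urllib.robotparser. The domain example.com and the bot name AnyBot are placeholders, and man-cave.html is an invented neighbouring page used only to show that the rest of upstairs stays open.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow: /upstairs/bedroom.html",
])

# example.com and AnyBot are placeholders, not real guests.
bedroom_ok = parser.can_fetch("AnyBot", "https://example.com/upstairs/bedroom.html")
mancave_ok = parser.can_fetch("AnyBot", "https://example.com/upstairs/man-cave.html")

print("bedroom allowed:", bedroom_ok)    # the one page named in the rule
print("man-cave allowed:", mancave_ok)   # a different page in the same directory
```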

Let us say, for example, that you do not wish for your web robot guests to see your bedroom… but thinking about it, you do not want them to visit the entire upstairs, which includes your “man-cave” (hey, that is what you call it, not me!) and guest bathroom.  This is acceptable as well.  To tell your web robot guests not to visit (and subsequently index) upstairs, or an entire sub-directory of your site, I would instruct the “User-agent: *” this way:

User-agent: *
Disallow: /upstairs/

If I tell your web robot guests this, they know that visiting upstairs (i.e. your ‘upstairs’ sub-directory) is most definitely not allowed!
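The reason one line can cover a whole sub-directory is that a Disallow value is compared against the start of the requested path. Here is a deliberately simplified Python sketch of that idea. The function is_disallowed is a hypothetical helper, not part of any library, and it ignores Allow lines, wildcards and user-agent groups.

```python
from typing import Iterable


def is_disallowed(path: str, disallow_rules: Iterable[str]) -> bool:
    """Simplified check: a path is blocked if it starts with any rule.

    Hypothetical helper for illustration only. An empty rule blocks
    nothing, and Allow lines, wildcards and user-agent groups are ignored.
    """
    return any(rule and path.startswith(rule) for rule in disallow_rules)


if __name__ == "__main__":
    rules = ["/upstairs/"]
    for path in (
        "/upstairs/bedroom.html",
        "/upstairs/man-cave/index.html",
        "/downstairs/kitchen.html",
        "/upstairs",  # no trailing slash, so it does not start with "/upstairs/"
    ):
        print(path, "blocked" if is_disallowed(path, rules) else "allowed")
```

Note the last case: because the rule ends in a slash, the bare path /upstairs is not covered by it under this prefix comparison.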

I can also promptly tell all web robot guests who come over to turn away and come back at a better time.  This is best when you are performing any maintenance at your place (like a redesign).  Do not feel bad about turning away your web robot guests during this time; it is best for them to see your place when you have everything organized and looking sharp!  If you would like me to tell your web robot guests not to visit (and subsequently not index) any of the pages on your site, I would instruct the “User-agent: *” this way:

User-agent: *
Disallow: /
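Should you wish to write this maintenance notice from a script and double-check it before hanging it on the door, something like the following Python sketch could do. The helper build_robots_txt and the bot name AnyBot are hypothetical; the check itself uses the standard library's urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(user_agent, disallowed_paths):
    """Render one robots.txt group (hypothetical helper for illustration)."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallowed_paths]
    return "\n".join(lines) + "\n"


# "Disallow: /" matches every path, since every path starts with "/".
maintenance_rules = build_robots_txt("*", ["/"])

parser = RobotFileParser()
parser.parse(maintenance_rules.splitlines())

sample_paths = ("/", "/index.html", "/upstairs/bedroom.html")
blocked = [not parser.can_fetch("AnyBot", path) for path in sample_paths]

print(maintenance_rules, end="")
print("everything blocked:", all(blocked))
```

Once the redesign is finished, the same helper with an empty list of paths would produce a group with no Disallow lines, which turns no one away.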

Those are the basics of my butlerly-duty.  I am hoping that the word I created in the previous sentence makes sense for you since it is not a real word in the English language.  Entschuldigung! Hah hah hah hah!

Well, I am off to greet some more web robots.  I tell you: some of these guests come back almost daily!  It is tough work being me.  But it is quite rewarding!

3 Responses to “Hello My Name is Robots Dot Text (Robots.txt)”

  1. Jason Wright says:

    Interesting spin on robots.txt files. =D

  2. Rob K says:

    lovely spin in fact ;) . I take it robots.txt is the SEO friendly way preventing redirect/404 ‘file-not-found’ errors

  3. Christian says:

    Rob – Not quite. A robots.txt file tells spiders what pages you do not wish for them to index; redirect / 404 errors are handled with 301 redirects. I actually wrote a blog post covering 301 redirects previously – /2009/02/03/seo-best-practices-for-a-site-refresh/
