What is a robots.txt and why do I need one?

"A robots.txt is a small file that tells a search engine which pages to index and which pages to ignore."

Why is it called a robots.txt?

Every search engine has a spider or robot ('bot) that follows links to find and index pages. Google, Bing, Yandex and Baidu all have their own 'bots. They are constantly looking for new pages to add to their index, and they rate those pages so they can be ranked in search results. When you search, the results you get come from the search engine's index, not from a fresh trawl of the whole internet.

So the 'robots' part simply states that this file is for the robots to read. The .txt is there because it is a plain text file with no formatting.

What does it do?

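A robots.txt is written to tell the 'bots which pages or parts of a website they may visit and index, and which parts they should leave alone. A 'good' robot will follow these instructions. The file is a request rather than a lock, so a badly behaved 'bot can simply ignore it.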

A simple robots.txt should tell the search engine to skip the admin pages, privacy policy and other content you don't wish to be indexed. For new sites in development, or sites that just don't want to be indexed, the 'bots can be told to ignore everything.
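
As a rough illustration of the "ignore everything" case, the sketch below uses Python's standard urllib.robotparser module to read a two-line robots.txt and ask whether a 'bot may fetch a page. The example.com addresses, the may_fetch helper name and the choice of Python are assumptions made for illustration only.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every robot to stay away from the whole site.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a well-behaved robot would be allowed to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder address, not a real site under test.
    for page in ("https://example.com/", "https://example.com/blog/first-post"):
        print(page, may_fetch(BLOCK_EVERYTHING, "Googlebot", page))
```

With "Disallow: /" every address on the site starts with the blocked path, so each check should come back as not allowed.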

Why do I need a robots.txt?

It is good practice to have a robots.txt on your site so you have some control over which 'bots visit your site and which parts of it they index.

What does it look like?

A simple robots.txt looks like this.

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

This says that any robot (User-agent: *) should stay out of /wp-admin/ - that is the front door for editing the site (Disallow) - with one exception, the admin-ajax.php file (Allow). Everything else on the site may be indexed. You can also add funny messages: any line starting with # is a comment, and the search engines won't act on it. Check out Nike's for an example.
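
To make the Disallow and Allow interplay concrete, here is a minimal Python sketch of how a 'good' robot might read the three lines above. It assumes the "longest matching path wins, and Allow wins a tie" behaviour that Google documents and that RFC 9309 describes. The names parse_rules and is_allowed are hypothetical, and the sketch only handles the "User-agent: *" group with plain path prefixes (no wildcards).

```python
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
"""


def parse_rules(robots_txt):
    """Collect (path, is_allowed) rules from the 'User-agent: *' group."""
    rules = []
    in_star_group = False
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            in_star_group = value == "*"
        elif in_star_group and field in ("allow", "disallow") and value:
            rules.append((value, field == "allow"))
    return rules


def is_allowed(rules, path):
    """Longest matching rule wins; Allow wins a tie; no match means allowed."""
    best_length, allowed = 0, True
    for prefix, rule_allows in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_length
        tie_for_allow = len(prefix) == best_length and rule_allows
        if longer or tie_for_allow:
            best_length, allowed = len(prefix), rule_allows
    return allowed


if __name__ == "__main__":
    rules = parse_rules(ROBOTS_TXT)
    for path in ("/wp-admin/", "/wp-admin/admin-ajax.php", "/blog/hello-world/"):
        print(path, is_allowed(rules, path))
```

The matcher is written by hand rather than taken from a library so that the longest-match rule is visible; real search engine 'bots handle more cases (wildcards, several User-agent groups) than this sketch does.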

You can see what other sites have included in their robots.txt by adding /robots.txt to the end of the site's address in your browser.
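
A robots.txt always sits at the top level of a site, so its address can be worked out from any page on that site. The Python sketch below shows one way to do that. The helper names robots_url and fetch_robots and the example.com address are assumptions for illustration, and fetch_robots needs a network connection to run.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that hosts page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_robots(page_url: str, timeout: float = 10.0) -> str:
    """Download a site's robots.txt as text (needs a network connection)."""
    with urlopen(robots_url(page_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # example.com is a placeholder; swap in the site you are curious about.
    print(robots_url("https://www.example.com/shop/shoes?colour=red"))
```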




How can I create a robots.txt?

You can create a simple robots.txt in a plain text editor such as Notepad, or use Google's Robots.txt tool to create and check it. The file then needs to be uploaded to the top-level (root) folder of your site so it can be found at yoursite.com/robots.txt - ask your techies to fix this, or use a WordPress plugin to do the heavy lifting.
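
If you would rather generate the file than type it by hand, a few lines of Python will do. This is only a sketch: build_robots_txt and write_robots_txt are hypothetical helper names, and the folder you save into is assumed to be one you later upload to the root of your site.

```python
from pathlib import Path


def build_robots_txt(disallow, allow=(), user_agent="*"):
    """Return robots.txt text with one group of rules for user_agent."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallow]
    lines += [f"Allow: {path}" for path in allow]
    return "\n".join(lines) + "\n"


def write_robots_txt(folder, text):
    """Save text as robots.txt inside folder and return the file's path."""
    target = Path(folder) / "robots.txt"
    target.write_text(text, encoding="utf-8")
    return target


if __name__ == "__main__":
    text = build_robots_txt(["/wp-admin/"], allow=["/wp-admin/admin-ajax.php"])
    print(text, end="")
```

The file name must be exactly robots.txt, in lower case, or the 'bots will not find it.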

We check for the presence of a robots.txt in every SEO Audit report.