"robots.txt"
file can protect private content from appearing online, save
bandwidth, and lower load on your server. A missing "robots.txt"
file also generates additional errors in your apache log whenever
robots request one.
To pass this test you must create and properly install a robots.txt file.
For this, you can use any program that produces a text file, or you can use an online tool (Google Webmaster Tools has this feature).
Remember to use all lower case for the filename: robots.txt, not ROBOTS.TXT.
A simple robots.txt file looks like this:

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /pages/thankyou.html
This would block all search engine robots from visiting the "cgi-bin" and "images" directories and the page "http://www.yoursite.com/pages/thankyou.html".
TIPS:
- You need a separate Disallow line for every URL prefix you want to exclude
- You may not have blank lines within a record, because blank lines are used to delimit multiple records (see the example after these tips)
- Notice that before the Disallow command, you have the command User-agent: *. The User-agent: part specifies which robot you want to block. Major known crawlers are: Googlebot (Google), Googlebot-Image (Google Image Search), Baiduspider (Baidu), Bingbot (Bing)
- One important thing to know if you are creating your own robots.txt file is that although the wildcard (*) is used in the User-agent line (meaning "any robot"), it is not allowed in the Disallow line.
- Regular expressions are not supported in either the User-agent or Disallow lines
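Putting these tips together, a file with two records might look like the sketch below. Note the blank line separating the records, the separate Disallow line for each path, and the User-agent line naming a specific crawler; the paths are only examples:

User-agent: Googlebot-Image
Disallow: /images/

User-agent: *
Disallow: /cgi-bin/
Disallow: /pages/thankyou.html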
Once you have your robots.txt file, upload it to the top-level directory of your web server. After that, make sure you set the permissions on the file so that visitors (like search engines) can read it.
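One simple way to confirm the installation worked is to request the file the same way a robot would. Here is a minimal sketch in Python, assuming your site is www.yoursite.com as in the example above:

from urllib import request

# Fetch robots.txt exactly as a crawler would.
# www.yoursite.com is a placeholder; use your own domain.
with request.urlopen("http://www.yoursite.com/robots.txt") as resp:
    print(resp.status)           # 200 means the file exists and is readable
    print(resp.read().decode())  # the rules robots will actually see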