Wednesday

Setting up 301 Redirects

Change page URLs with 301 redirects

If you need to change the URL of a page as it is shown in search engine results, we recommend that you use a server-side 301 redirect. This is the best way to ensure that users and search engines are directed to the correct page. The 301 status code means that a page has permanently moved to a new location.
301 redirects are particularly useful in the following circumstances:
  1. You've moved your site to a new domain, and you want to make the transition as seamless as possible.
  2. People access your site through several different URLs. If, for example, your home page can be reached at http://example.com/home, http://home.example.com, and http://www.example.com, it's a good idea to pick one of those URLs as your preferred (canonical) destination and use 301 redirects to send traffic from the other URLs to it. You can also use Webmaster Tools to set your preferred domain.
  3. You're merging two websites and want to make sure that links to outdated URLs are redirected to the correct pages.
To implement a 301 redirect for websites that are hosted on servers running Apache, you'll need access to your server's .htaccess file. (If you're not sure about your access or your server software, check with your web host.) For more information, consult the Apache .htaccess Tutorial and the Apache URL Rewriting Guide. If your site is hosted on a server running other software, check with your host for more details.

(Source: https://support.google.com/webmasters/#)
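As a minimal sketch of what such redirects can look like in .htaccess (the page paths and domain names below are hypothetical placeholders):

# Permanently redirect a single moved page (uses Apache's mod_alias)
Redirect 301 /old-page.html http://www.example.com/new-page.html

# Permanently redirect every URL on an old domain to a new one (uses mod_rewrite)
RewriteEngine On
RewriteCond %{HTTP_HOST} ^old-domain\.com$ [NC]
RewriteRule ^(.*)$ http://new-domain.com/$1 [R=301,L]

A browser or crawler following either rule receives the 301 status code along with the new location, which is what tells search engines to transfer the old URL's standing to the new one.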



Creating a User-Friendly 404 Page for Our Site

A 404 page is what a user sees when they try to reach a non-existent page on your site (because they've clicked on a broken link, the page has been deleted, or they've mistyped a URL). A 404 page is called that because, in response to a request for a missing page, webservers send back an HTTP status code of 404 to indicate that the page was not found. While the standard 404 page can vary depending on your ISP, it usually doesn't provide the user with any useful information, and most users may just surf away from your site.

If you have access to your server, we recommend that you create a custom 404 page. A good custom 404 page will help people find the information they're looking for, as well as providing other helpful content and encouraging them to explore your site further.

(Note: This article covers guidelines for creating the content of your custom 404 page. For information on configuring your server to display your new 404 page, check your server or web host's documentation. You should still make sure that your webserver returns a 404 status code to users and spiders, so that search engines don't accidentally index your custom 404 page.)
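On Apache, for instance, pointing the server at a custom page is typically a one-line .htaccess directive (the file name is a hypothetical placeholder):

# Serve a custom page for missing URLs; Apache still returns the 404 status code
ErrorDocument 404 /custom-404.html

Note that the path should be local: giving ErrorDocument a full URL would make Apache redirect instead, so the browser would never receive the 404 status the note above calls for.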

Because a 404 page can also be a standard HTML page, you can customize it any way you want. Here are some suggestions for creating an effective 404 page that can help keep visitors on your site and help them find the information they're looking for:

  • Tell visitors clearly that the page they're looking for can't be found. Use language that is friendly and inviting.
  • Make sure your 404 page uses the same look and feel (including navigation) as the rest of your site.
  • Consider adding links to your most popular articles or posts, as well as a link to your site's home page.
  • Think about providing a way for users to report a broken link.
  • No matter how beautiful and useful your custom 404 page is, you probably don't want it to appear in Google search results. To prevent 404 pages from being indexed by Google and other search engines, make sure that your webserver returns an actual 404 HTTP status code when a missing page is requested.
  • Use the Enhance 404 widget to embed a search box on your custom 404 page and give users helpful options for finding the information they need.
  • Use the Change of Address tool to tell Google about your site's move.
(Source: https://support.google.com/webmasters/#)

Tuesday

What are alt attributes useful for?


The alt attribute is defined in a set of tags (namely, img and area, and optionally input and applet) to allow you to provide a text equivalent for the object.
A text equivalent brings the following benefits to your web site and its visitors in common situations:
  • nowadays, Web browsers are available on a very wide variety of platforms with very different capabilities; some cannot display images at all or can display only a restricted set of image types, and some can be configured not to load images. If your code has the alt attribute set on its images, most of these browsers will display the description you gave instead of the images
  • some of your visitors cannot see images, whether they are blind, color-blind, or have low vision; the alt attribute is of great help to those people, who can rely on it to get a good idea of what's on your page
  • search engine bots belong to both of the above categories: if you want your website to be indexed as well as it deserves, use the alt attribute to make sure that they won't miss important sections of your pages.

What should I put in my alt attribute?

The generic rule for the content of the alt attribute is: use text that fulfills the same function as the image.
Some more specific rules:
  • if the image is simply decorated text, put the text in the alt attribute
  • if the image is used to create bullets in a list, a horizontal line, or other similar decoration, it is fine to have an empty alt attribute (e.g., alt=""), but it is better to use things like list-style-image in CSS
  • if the image presents a lot of important information, try to summarize it in a short line for the alt attribute and add a longdesc link to a more detailed description (each case is illustrated in the sketch below)
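A brief sketch of these rules in HTML (all file names and text are hypothetical placeholders):

<!-- decorated text: put the text itself in the alt attribute -->
<img src="welcome-banner.png" alt="Welcome to our site">

<!-- purely decorative image: an empty alt attribute is acceptable -->
<img src="divider.png" alt="">

<!-- information-rich image: short summary plus a longdesc link -->
<img src="sales-chart.png" alt="Chart: sales doubled between 2012 and 2013"
     longdesc="sales-chart-description.html">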

(Source: http://www.w3.org/)

Friday

How to Fix URL Canonicalization

Search engines can treat URLs with and without "www" as two different websites. Test your site for potential URL canonicalization issues. Canonicalization describes how a site can use slightly different URLs for the same page (for example, if http://www.example.com and http://example.com display the same page but do not resolve to the same URL). If this happens, search engines may be unsure as to which URL is the correct one to index.

Solution:

To pass this test, consider using a 301 rewrite rule in your .htaccess file so that both addresses (http://example.com and http://www.example.com) resolve to the same URL.
- If you want to redirect http://www.example.com to http://example.com, you can use this:

RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]


- If you want to redirect http://example.com to http://www.example.com, you can use this:

RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

Note that you must put the above lines somewhere after the RewriteEngine On line in your .htaccess file.

Learn more about canonicalization issues at: http://www.mattcutts.com/blog/seo-advice-url-canonicalization/

Use and Advantages of Robots.txt With Example

"robots.txt" file can protect private content from appearing online, save bandwidth, and lower load on your server. A missing "robots.txt" file also generates additional errors in your apache log whenever robots request one.
In order to pass this test you must create and proper install a robots.txt file.
For this, you can use any program that produces a text file or you can use an
online tool (Google Webmaster Tools has this feature).
Remember to use all lower case for the filename: robots.txt, not ROBOTS.TXT.
A simple robots.txt file looks like this:

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /pages/thankyou.html
This would block all search engine robots from visiting the "cgi-bin" and "images" directories and the page "http://www.yoursite.com/pages/thankyou.html".

TIPS:
  • You need a separate Disallow line for every URL prefix you want to exclude
  • You may not have blank lines in a record, because blank lines are used to delimit multiple records (see the sketch after these tips)
  • Notice that before the Disallow line you have the line User-agent: *. The User-agent: part specifies which robot you want to block. Major known crawlers are: Googlebot (Google), Googlebot-Image (Google Image Search), Baiduspider (Baidu), Bingbot (Bing)
  • One important thing to know if you are creating your own robots.txt file is that although the wildcard (*) is used in the User-agent line (meaning "any robot"), it is not allowed in the Disallow line
  • Regular expressions are not supported in either the User-agent or Disallow lines

Once you have your robots.txt file, upload it to the top-level directory of your web server. After that, make sure you set the permissions on the file so that visitors (like search engines) can read it.
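As a small sketch of how records work (the directory names are hypothetical), a bot-specific record and a catch-all record are separated by a blank line:

# applies only to Google's image crawler
User-agent: Googlebot-Image
Disallow: /images/

# applies to every other robot
User-agent: *
Disallow: /cgi-bin/

A robot obeys the record whose User-agent line matches it most specifically, so Googlebot-Image follows the first record while all other crawlers follow the second.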

Google’s Updated Link Scheme Guidelines

Any links intended to manipulate PageRank or a site's ranking in Google search results may be considered part of a link scheme and a violation of Google’s Webmaster Guidelines. This includes any behavior that manipulates links to your site or outgoing links from your site.

The following are examples of link schemes which can negatively impact a site's ranking in search results:
  • Buying or selling links that pass PageRank. This includes exchanging money for links, or posts that contain links; exchanging goods or services for links; or sending someone a “free” product in exchange for them writing about it and including a link
  • Excessive link exchanges ("Link to me and I'll link to you") or partner pages exclusively for the sake of cross-linking
  • Large-scale article marketing or guest posting campaigns with keyword-rich anchor text links
  • Using automated programs or services to create links to your site
Additionally, creating links that weren’t editorially placed or vouched for by the site’s owner on a page, otherwise known as unnatural links, can be considered a violation of our guidelines. Here are a few common examples of unnatural links that may violate our guidelines:
  • Text advertisements that pass PageRank
  • Advertorials or native advertising where payment is received for articles that include links that pass PageRank
  • Links with optimized anchor text in articles or press releases distributed on other sites. For example:  
There are many wedding rings on the market. If you want to have a wedding, you will have to pick the best ring. You will also need to buy flowers and a wedding dress. (In this example, phrases like "wedding rings," "best ring," "flowers," and "wedding dress" would be keyword-rich links pointing back to the promoter's site.)

  • Low-quality directory or bookmark site links
  • Keyword-rich, hidden or low-quality links embedded in widgets that are distributed across various sites, for example:
        Visitors to this page: 1,472
        car insurance
  • Widely distributed links in the footers or templates of various sites
  • Forum comments with optimized links in the post or signature, for example:
        Thanks, that’s great info!
        - Paul
        paul’s pizza san diego pizza best pizza san diego
Note that PPC (pay-per-click) advertising links that don’t pass PageRank to the buyer of the ad do not violate our guidelines. You can prevent PageRank from passing in several ways, such as:
  • Adding a rel="nofollow" attribute to the <a> tag
  • Redirecting the links to an intermediate page that is blocked from search engines with a robots.txt file
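As a quick sketch of the first option (the URL and anchor text are hypothetical):

<!-- a paid link marked so that it passes no PageRank -->
<a href="http://www.example.com/product" rel="nofollow">sponsored link</a>

For the second option, the ad links point at a local intermediate page (say, a hypothetical /out/ directory) that robots.txt blocks with a Disallow: /out/ line before it redirects visitors to the advertiser.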
The best way to get other sites to create high-quality, relevant links to yours is to create unique, relevant content that can naturally gain popularity in the Internet community. Creating good content pays off: Links are usually editorial votes given by choice, and the more useful content you have, the greater the chances someone else will find that content valuable to their readers and link to it.

If you see a site that is participating in link schemes intended to manipulate PageRank, let us know. We'll use your information to improve our algorithmic detection of such links.




Monday

How to Use the Robots Meta Tag to Block Access to Your Site

The noindex meta standard is useful if you don't have root access to your server, as it allows you to control access to your site on a page-by-page basis.

Multiple content values
We recommend that you place all content values in one meta tag. This keeps the meta tags easy to read and reduces the chance for conflicts. For instance:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

If the page contains multiple meta tags of the same type, we will aggregate the content values. For instance, we will interpret

<META NAME="ROBOTS" CONTENT="NOINDEX">
<META NAME="ROBOTS" CONTENT="NOFOLLOW">

the same way as:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

If content values conflict, we will use the most restrictive. So, if the page has these meta tags:

<META NAME="ROBOTS" CONTENT="NOINDEX">
<META NAME="ROBOTS" CONTENT="INDEX">

We will obey the NOINDEX value.

Unnecessary content values
By default, Googlebot will index a page and follow the links on it. So there's no need to tag pages with content values of INDEX or FOLLOW.

Directing a robots meta tag specifically at Googlebot
To provide instruction for all search engines, set the meta name to "ROBOTS". To provide instruction for only Googlebot, set the meta name to "GOOGLEBOT". If you want to provide different instructions for different search engines (for instance, if you want one search engine to index a page, but not another), it's best to use a specific meta tag for each search engine rather than use a generic robots meta tag combined with a specific one. You can find a list of bots at robotstxt.org.
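As a small sketch, to keep a page out of Google's index while leaving other engines free to index it, a Googlebot-specific tag is enough:

<META NAME="GOOGLEBOT" CONTENT="NOINDEX">

Other crawlers ignore the GOOGLEBOT meta name, so only Googlebot obeys this NOINDEX.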

Casing and spacing
Googlebot understands any combination of lowercase and uppercase. So each of these meta tags is interpreted in exactly the same way:

<meta name="ROBOTS" content="NOODP">
<meta name="robots" content="noodp">
<meta name="Robots" content="NoOdp">

If you have multiple content values, you must place a comma between them, but it doesn't matter if you also include spaces. So the following meta tags are interpreted the same way:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW">

If you use both a robots.txt file and robots meta tags
If the robots.txt and meta tag instructions for a page conflict, Googlebot follows the most restrictive. More specifically:
If you block a page with robots.txt, Googlebot will never crawl the page and will never read any meta tags on the page.
If you allow a page with robots.txt but block it from being indexed using a meta tag, Googlebot will access the page, read the meta tag, and subsequently not index it.
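For instance (a minimal, hypothetical sketch of the second case), a robots.txt record that leaves the page crawlable:

User-agent: *
Disallow:

combined with this tag in the page's HTML:

<META NAME="ROBOTS" CONTENT="NOINDEX">

lets Googlebot crawl the page and read the tag, so the page stays out of the index.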

Valid meta robots content values
Googlebot interprets the following robots meta tag values:
NOINDEX - prevents the page from being included in the index.
NOFOLLOW - prevents Googlebot from following any links on the page. (Note that this is different from the link-level NOFOLLOW attribute, which prevents Googlebot from following an individual link.)
NOARCHIVE - prevents a cached copy of this page from being available in the search results.
NOSNIPPET - prevents a description from appearing below the page in the search results, and also prevents caching of the page.
NOODP - blocks the Open Directory Project description of the page from being used in the description that appears below the page in the search results.
NONE - equivalent to "NOINDEX, NOFOLLOW".

A word about content value "NONE"
As defined by robotstxt.org, the following directive means NOINDEX, NOFOLLOW.

<META NAME="ROBOTS" CONTENT="NONE">

However, some webmasters use this tag to indicate no robots restrictions and inadvertently block all search engines from their content.
For more information about the robots meta tag, visit: https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag


(Source: http://googlewebmastercentral.blogspot.in)