How to Identify and Remedy Duplicate Content Issues on Your Website
Duplicate content occurs when search engines find more than one version of the same page to index. In such cases, search engines cannot tell which version to show for a query. To deliver a high-quality user experience, websites should avoid duplicating content.
What Causes Duplicate Content?
There are three main reasons behind the duplication of content.
1- Printer Friendly
A website often develops a duplicate content problem when a page also has a separate printer-friendly version. For instance, the same content can be reachable at two URLs.
http://www.abccompany.com/casestudy2
http://www.abccompany.com/printer/casestudy2
2- Session IDs
To deliver a better user experience, a website usually keeps track of a user’s sessions. This enables personalisation, so users get recommendations based on their interactions with the website. For instance, an e-commerce website can recognise that a user likes to shop for sports equipment and may recommend their favourite sports gear.
Sometimes, a session ID gets attached to the URL of a page. As a consequence, a duplicate version of the page may exist.
http://www.abccompany.com/casestudy2
http://www.abccompany.com/casestudy2?sessionid=45625
3- Parameters in URL
Sometimes, a URL carries parameters for tracking, such as IDs for analytics or campaigns. At other times, a website’s Content Management System may add its own custom parameters. For instance,
http://www.abccompany.com/casestudy2
http://www.abccompany.com/casestudy2?source=organic
http://www.abccompany.com/casestudy2?campaignid=4124
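To see why these URLs count as duplicates, it can help to strip the session and tracking parameters and check whether the variants collapse into one address. The following is a minimal Python sketch, not a complete solution: the function name normalize_url and the IGNORED_PARAMS set are illustrative assumptions, and the parameter names are simply the ones used in the example URLs above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these parameter names come from the example URLs above.
# Extend the set with whatever your own site or CMS appends.
IGNORED_PARAMS = {"sessionid", "source", "campaignid"}

def normalize_url(url: str) -> str:
    """Return the URL with session and tracking parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    # Rebuild the URL without the ignored parameters and without a fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

urls = [
    "http://www.abccompany.com/casestudy2",
    "http://www.abccompany.com/casestudy2?sessionid=45625",
    "http://www.abccompany.com/casestudy2?source=organic",
    "http://www.abccompany.com/casestudy2?campaignid=4124",
]

# All four variants collapse to a single underlying page.
unique_pages = {normalize_url(u) for u in urls}
print(unique_pages)  # {'http://www.abccompany.com/casestudy2'}
```

Parameters that genuinely change the content, such as a page number, are kept because they are not in the ignored set.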
Identify Duplicate Content
To identify duplicate content, we recommend two tools: Screaming Frog and Google Webmaster Tools (now called Google Search Console).
Screaming Frog
Go to http://www.screamingfrog.co.uk/seo-spider/ and download the Screaming Frog SEO Spider web crawler. The free version can crawl up to 500 URLs. The software comes with various features and makes it easier to find duplicate content.
Page Titles
To identify duplicate page titles, go to the “Page Titles” tab and apply the “Duplicate” filter. You can do the same for meta descriptions in the “Meta Description” tab.
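If you have crawl data outside the tool, the same duplicate check can be approximated in a few lines of Python. This is only a sketch under assumptions: the pages list, its "url" and "title" keys, the /contact URL, and the find_duplicates helper are all hypothetical and do not reflect Screaming Frog's export format.

```python
from collections import defaultdict

def find_duplicates(pages, field):
    """Group page URLs by the given field and keep groups with more than one URL."""
    groups = defaultdict(list)
    for page in pages:
        # Compare case-insensitively and ignore surrounding whitespace.
        value = page[field].strip().lower()
        groups[value].append(page["url"])
    return {value: urls for value, urls in groups.items() if len(urls) > 1}

# Hypothetical crawl data for illustration only.
pages = [
    {"url": "http://www.abccompany.com/casestudy2", "title": "Case Study 2"},
    {"url": "http://www.abccompany.com/printer/casestudy2", "title": "Case Study 2"},
    {"url": "http://www.abccompany.com/contact", "title": "Contact Us"},
]

duplicates = find_duplicates(pages, "title")
for title, urls in duplicates.items():
    print(title, urls)
```

Passing a different field name, for example a hypothetical "description" key, would run the same check on meta descriptions.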
URLs
Go to the URL tab and click “Duplicate” to see if there is more than a single version of a URL.
Remedy Duplicate Content
Bear in mind that the presence of duplicate content can severely affect the Search Engine Optimisation (SEO) of your website.
1- Canonical Tag
You can add a canonical tag to the code of your pages. This tag tells search engines which version of a page you want returned for a search query. It is placed in the head section of the page, and it is one of the best techniques to use when more than one version of a page exists.
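As an illustration, a canonical tag is a link element with rel="canonical" whose href points to the preferred URL. The small Python sketch below builds that element for the example page used earlier. The helper name canonical_tag is an assumption made for this example, and how the tag gets into your templates depends on your own CMS.

```python
from html import escape

def canonical_tag(preferred_url: str) -> str:
    """Build the <link> element that marks preferred_url as the canonical version."""
    # Escape the URL so characters such as & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'

# Every variant of the page (printer version, session ID, tracking parameters)
# would carry the same tag in its head section, pointing at the preferred URL.
print(canonical_tag("http://www.abccompany.com/casestudy2"))
# <link rel="canonical" href="http://www.abccompany.com/casestudy2">
```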
2- Meta Tags
By using meta tags, you can direct search engines not to index a specific page.
<html>
<head>
<title>Use Metatags</title>
<meta name="robots" content="noindex, nofollow">
</head>
</html>
Meta tags are useful when a website admin intends to make a page public, but does not want it to be indexed. For example, the terms and conditions web-page is usually not indexed.
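To confirm that a page actually carries the directive, you can parse its HTML and look for a robots meta tag. The sketch below uses only Python's standard library. The class RobotsMetaParser and the function is_noindex are illustrative names, and the check looks only at the HTML itself, not at HTTP headers.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases tag and attribute names for us.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for directive in attributes.get("content", "").split(","):
                self.directives.add(directive.strip().lower())

def is_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives

page = """<html>
<head>
<title>Use Metatags</title>
<meta name="robots" content="noindex, nofollow">
</head>
</html>"""
print(is_noindex(page))  # True
```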