What You Don’t Know About Duplicate Content Can Kill You

How to deal with duplicate content right now

All of us create and duplicate our own content unintentionally. Content can be fully or partially scraped by others. Duplicate content can cause your pages to not rank well on search engines, be removed from search results and even lead to legal complications.

Here’s my advice on how to identify and deal with duplicate content.

What is duplicate content?

Google Webmasters (now Google Search Console) defines duplicate content as “substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin.”

Search engines do a great job of showing the best possible content in response to a search query and of identifying the original content source as well. Adhering to the Webmaster Guidelines will make it easier for search engines to understand which page holds the original content and help it rank accordingly. Having too much duplicate content on your site will lead to a loss in rankings and organic traffic. Having other sites duplicate your content could get your content eliminated from search results, especially if the other sites have greater authority and a higher number of links pointing to them, and they have provided no link attribution indicating yours is the original content.

Needless to say, when dealing with Hispanic SEO, it is quite easy to generate duplicate content in two different languages or by geo-targeting. I have covered some of these issues in my International SEO article.

The first thing to understand is that there are different types of duplicate content: deliberate or malicious, and non-malicious.

Non-malicious duplicate content can happen on discussion forums that generate separate desktop and mobile pages, in online stores’ product descriptions that are repeated on several distinct pages, in comment pagination, when no preferred domain has been defined, and even when offering printer-only versions of your website pages. This type of duplicate content is not penalized by Google, although I highly recommend avoiding it as much as possible. What you don’t consider malicious, search engines may.

Malicious or deliberate duplicated content occurs when a website owner attempts to manipulate search results to rank better or increase traffic. This is penalized by Google with removal from search results.

[Image: Google Panda and duplicate content]

The most malicious duplicate content of them all: Content scraping

Let’s say you are researching a topic you want to write about and happen upon a wonderfully written piece of content. Who hasn’t? What to do? What to do? As long as you do not scrape the article and you do provide proper attribution, you may cite it. Providing proper attribution will allow you to avoid being accused of plagiarism. Plagiarism is passing someone else’s work off as your own. This is different from copyright infringement, which is using someone else’s work protected by copyright law without permission, and which exposes you to being taken to civil or criminal court. Is it really worth it?

Even better yet, ask the content owner for permission. This is very easy to do. Usually there’s an email, a contact form, or a Twitter handle where you can simply write: “Hey! I love your content. Can I properly cite it? What type of attribution would be acceptable to you?” What do you think can come out of this? A resounding yes and the likelihood that they will share your content on their own network. Pretty nifty, huh?

How can I properly cite somebody else’s content and why?

You may think that writing the source alone provides clear attribution, but look at your article again. Does it read as if it’s yours and at the very end, there’s a footnote with the source? Then the attribution is not clear.

Another benefit of proper attribution is the respect earned from your own readers who will see you as an honest content developer. A third benefit is the appreciation of the person you have cited. Who knows? They may invite you to guest blog one day.

Are you afraid that your readers will exit your site to read the source instead? Then make your content even more engaging. Provide more value. When was the last time you were reading a great article that cited somebody else with a link and you clicked on it? Probably not recently. The number of people who will leave your site because of a citation/attribution is minimal, and it is important to remember that people will leave your site eventually, anyway.

My recommendation for proper attribution of content is to enclose the text in quotes, indicate within the paragraph who said it and add a link to the source. There is no need to link to the source every time you mention them, but you should mention them each time you are citing something they said. That way, it is clear you are not misappropriating content and passing it off as yours.

[Image: How to provide proper attribution to content from others]

If you decide to reword content but still share someone else’s original concept, add the name of the source to the paragraph and link it to the source page from there. Here’s an example:

[Image: Another example of attribution]

Still worried that people might leave your site or that you will lose Google juice if you adhere to these practices? Let me assure you, you will get the exact opposite.

Penalization for duplicate content may not be too severe right now, but wait a couple of years and try to save your site from another “Panda.” As for people leaving your site, think about yourself: do you click on every link a page offers or do you keep on reading the article? If you are still afraid, improve your content, provide greater value to your reader and get rid of your fears.

If you decide to copy the whole article (and I strongly advise that you do not, especially if it’s one of my articles), I highly recommend citing the source at the very beginning and at the end. And, for your own protection as well as respect to the person that wrote the article, canonicalize the URL by pointing to the original URL. Then you can offer the content to your readers while letting search engines know which is the original article. Don’t know what “canonicalize” means? Don’t worry. Keep reading.

<link rel="canonical" href="" />

If you have a WordPress site and have the Yoast SEO plugin installed, add the URL to the canonical field.

Has somebody scraped my articles?

There are several ways to find out if your content has been scraped. A Google search is usually my very first check, and you can create a Google Alert out of that search to be notified when somebody infringes your copyright or scrapes your site. Another great tool is Copyscape. Their free version will allow you to identify if your content has been duplicated on other sites.
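If you would rather compare one of your pages against a suspect copy yourself, here is a minimal Python sketch. It is not how Google or Copyscape work internally; it simply assumes that "appreciably similar" can be approximated by counting shared five-word runs (often called shingles). The function names, the sample sentences and the shingle size of 5 are my own choices for illustration.

```python
import re


def shingles(text: str, size: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 5) -> float:
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "Duplicate content can cause your pages to not rank well on search engines"
scraped = "Duplicate content can cause your pages to not rank well on search engines"
unrelated = "The appropriate main keyword to search for in Mexico is tenis Nike"

print(similarity(original, scraped))    # 1.0, the texts are identical
print(similarity(original, unrelated))  # 0.0, no shared five-word runs
```

A score close to 1.0 suggests a near copy worth a closer look. Where to draw the line is a judgment call, so treat any threshold as an assumption rather than a rule.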

[Image: Notice of infringing content removed by Google]

Here’s a result for one of my articles, copied in its entirety by the first site. Yes, I requested that they take it down. I sent them an email and copied their hosting company. Did they? No, and a month passed by. What are your rights then? Submit a request reporting them to Google under the DMCA (Digital Millennium Copyright Act). Does it work? Check it out on your own. 😉 A short time after I reported them, the page stopped showing up in search, and there’s a very nice footnote from Google about it.

The notice also gets posted to the Chilling Effects database. Chilling Effects (since renamed Lumen) is a project of the Berkman Center for Internet & Society at Harvard University that collects notices of copyright infringement from the web.

Have I given you a great reason not to infringe copyright and to quote, provide proper citations and use lots of “according to…” and loving links instead? Good. Nothing that is not yours should read as if it was yours.

Karma’s a Bitch

How does scraping somebody else’s content affect my life? Simple: infringing copyright is against the law and you may end up in court. Have you scraped content and would like to find out if Google has filtered you out? Add “&filter=0” at the end of the search results URL.
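If you run this check often, the URL can be put together in code. This is only a sketch: the `filter=0` parameter comes from the tip above, while the function name and the sample query are made up for illustration, and Google may change how it treats the parameter at any time.

```python
from urllib.parse import urlencode


def unfiltered_search_url(query: str) -> str:
    """Build a Google search URL with the filter=0 parameter appended."""
    return "https://www.google.com/search?" + urlencode({"q": query, "filter": "0"})


# Wrapping the phrase in quotes asks for an exact-match search.
url = unfiltered_search_url('"duplicate content can kill you"')
print(url)
```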

[Image: Content scraping SEO penalties]

Non-malicious duplicate content and how to address it

We all have duplicate content… duplicated by ourselves!!! Some of these types of duplicate content are very common. One of the most common is the printer-friendly page. You know, those pages that pop up without any page formatting so people can print them? If not properly addressed, these are exact copies of the original content. Always make sure that your printer-friendly page does NOT get indexed by search engines.
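One common way to keep a page out of the index is a robots meta tag with the `noindex` directive in the page's head. The Python sketch below checks whether a page carries it. The class name, function name and sample HTML are hypothetical, and the check only covers the meta tag, not an `X-Robots-Tag` HTTP header or robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, follow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def is_noindex(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


print_page = '<html><head><meta name="robots" content="noindex, follow"></head><body>Printable copy</body></html>'
regular_page = "<html><head><title>Article</title></head><body>Article</body></html>"

print(is_noindex(print_page))    # True
print(is_noindex(regular_page))  # False
```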

Do you understand URLs?

A rose by any other name is a rose. This is a great analogy for understanding URLs, or Uniform Resource Locators. The URL is the true address of a page and it’s where search engines can find the content you have so painstakingly developed.

In real life, every home has an address that’s unique to that home. We can append modifiers to it, like “the last one on the block” or “the one with the blue door” but people know how to resolve these directions and end up on the same address. The problem is that search engines need a bit more direction than that. Let me show you some examples of URLs that we know are the same but search engines understand them as different: and and and and and and and

Session IDs, URL parameters, printer-friendly versions of pages and even a trailing slash at the end of an address are interpreted as different URLs by search engines if proper directions have not been given. To complicate matters more, think about those pages that can be displayed under a couple of categories when the category is part of the URL. For example, an article that can be found under both social media and SEO. This is not an exhaustive list; there are many more situations where this type of content duplication occurs.
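To make this concrete, here is a small Python sketch that collapses several variants of one address into a single form, which is roughly the work you are asking a search engine to do when you give it no directions. The example.com URLs, the list of ignored parameters and the choices to force https and drop "www." are all assumptions for illustration; your own site may need different rules.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that do not change what the page shows.
IGNORED_PARAMS = {"sessionid", "sid", "print", "utm_source", "utm_medium", "utm_campaign"}


def normalize(url: str) -> str:
    """Reduce a URL to one preferred form (assumed: https, no www, no trailing slash)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")].rstrip("/") or "/"
    # Keep only parameters that change the content, in a stable order.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    return urlunsplit(("https", host, path, urlencode(kept), ""))


variants = [
    "http://example.com/blog/post",
    "https://www.example.com/blog/post/",
    "https://example.com/blog/post?sessionid=abc123",
    "https://example.com/blog/post/?print=1",
    "HTTPS://EXAMPLE.COM/blog/post/index.html",
]

for variant in variants:
    print(variant, "->", normalize(variant))
```

All five variants come out as `https://example.com/blog/post`, while a parameter that is not on the ignore list (say, a page number) is kept because it changes the content.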

Canonicali…. what???

Here comes the concept of canonicalization. A tongue twister on its own (try to conjugate the verb really fast!), it ends up being forgotten by developers and SEOs alike. It is not an easy concept unless you have some technical knowledge, but I’ll try my best.

Canonicalizing a URL is the equivalent of adding a signal for search engines that states: no matter what the address looks like, if this is the content displayed, then this is the address of the original, indexable version.

The good news for those on large platforms like WordPress, Shopify, Drupal or Joomla is that there are plugins and apps that can help with it. Otherwise, you need to add the canonical tag to the head section of the page. This indicates to search engines which is the original version of the page.

<link rel="canonical" href="" />

Canonicalization is key to learning how to deal with duplicate content. A word of caution: do not use canonicalization when a redirect is needed, and keep in mind that search engines may choose to ignore your canonical tag.
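Once the tag is in place, it is worth confirming that each page really declares the address you expect. Here is a minimal Python sketch using only the standard library. The class and function names and the example.com URL are hypothetical, and the sketch reads an HTML string that you would fetch yourself.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the page, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


page = (
    "<html><head>"
    '<link rel="canonical" href="https://example.com/blog/post" />'
    "</head><body>Printer-friendly copy</body></html>"
)

print(find_canonical(page))                      # https://example.com/blog/post
print(find_canonical("<html><head></head></html>"))  # None
```

Run it against every variant of a page (printer-friendly version, category paths, tracking parameters): they should all report the same canonical address.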

[Image: What are canonical URL tags?]

And the duplicate content saga continues

URLs are not the only way of unintentionally creating duplicate content. Repeating paragraphs all over your site to emphasize a concept is a great way to tell the search engine that you don’t know which page is the most relevant for it.

Another great way of generating duplicate content is adding your bio to all of your article footers, even though so many website owners feel proud to see their bio there. Do you think it’s a good idea to disseminate the bio that is on your site to everybody that requests your bio? Absolutely not. I make it a point to create a different bio for publishing on other sites. Some of them a bit more alike than others, but definitely different than the one I publish on my site. This is why it really upsets me when somebody scrapes my website bio to add to the sites where I collaborate. Yes, it’s a shortcut and you may think nothing of it. But if somebody is collaborating with you for free, shouldn’t you just ask for them to also provide you with the bio they want published?

Now, let’s tackle content syndication. We all want to see our content shared all over the web. Hey! Let’s plaster it everywhere, what do we care? NOT! When you syndicate your content you are creating copies of it, exact duplicates. Mmmm.. which one is the original one that should be indexed by search engines? I wonder. Back to canonicalization? But how can you control other people pointing at your URL? Do they even know how to do that? Maybe you can ask for their article to carry a no-index tag. And maybe you should only syndicate a particular, different version of your article. Add a slight spin to it and syndicate.

Duplicate titles, descriptions and snippets are another great way to generate duplicate content. Think about it for a minute. If I show you two articles with the same title and description, which one will you choose? They must be the same, correct? But search engines weigh other factors, like the URLs, when determining whether two pages are one and the same, and may end up treating a page as its own duplicate.
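If you can export a list of your URLs with their titles (from a crawl or your CMS), spotting repeated titles takes a few lines of Python. This is a sketch under that assumption; the function name, the example.com URLs and the sample titles are made up, and the same grouping works for meta descriptions.

```python
from collections import defaultdict


def duplicate_titles(pages: dict[str, str]) -> dict[str, list[str]]:
    """Group URLs by title (ignoring case and extra spaces) and keep the repeats."""
    by_title = defaultdict(list)
    for url, title in pages.items():
        normalized = " ".join(title.lower().split())
        by_title[normalized].append(url)
    return {title: sorted(urls) for title, urls in by_title.items() if len(urls) > 1}


# Hypothetical crawl export: URL -> page title.
pages = {
    "https://example.com/seo/duplicate-content": "How to Deal With Duplicate Content",
    "https://example.com/social-media/duplicate-content": "How to deal with duplicate content ",
    "https://example.com/international-seo": "5 International SEO Tips You Can Steal",
}

for title, urls in duplicate_titles(pages).items():
    print(title, "->", urls)
```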

If you have a Google Search Console account, you can identify most of these pieces of duplicate content under Search Appearance >> HTML improvements.

What Duplicate Content Boils Down To

Going back to the physical address analogy, there could be many ways for someone to indicate how to get to the same house, but search engines are not people and they will think each unique description is a different house altogether.

Generating confusion for search engines is not where you want to be. First, because search engines will display only one of your “many pages” as they are identical in content and the search engine has a hard time determining which one is the most relevant. Second, people may link to the different URLs and this reduces the authority of your page.

I suggest you begin by addressing a list of duplicate content with the implementation of canonical tags, using 301 re-directs when needed, linking back to the original content, utilizing Google’s URL parameters tool and Bing’s Ignore Parameters Tool, improving your URL structure and avoiding the creation of duplicate content whenever possible.

I hope this has been a helpful little guide on how to address duplicate content. It is by no means exhaustive. There are many other amazing SEOs that have written about it in much more depth. But feel free to ask questions in the comments section below. I’ll do my best to address them.

5 International SEO Tips You Can Steal – ISEO

SEO can get a little confusing. After all, there’s SEO, local SEO, International SEO and on-page and off-page SEO – do you know the difference? Well, don’t worry: this article will give you a snapshot of the basics of International SEO (ISEO) and help you leverage it to generate more traffic and leads and drive sales.

If you are ready to tackle more than one country in your sales efforts, then it’s time to consider developing a strong International SEO strategy and execution.

[Image: 5 international SEO tips you can steal by Target Latino – Original Photography Courtesy Ryan McGuire]

Assessing your current international SEO status

Begin by assessing the level of effort needed as you may already be driving organic traffic from the countries you are intending to sell to. Many companies decide to start their sales efforts to Latin American countries when they notice an increase in traffic and sales requests from them. Others begin with globalization as a goal as they want to capture their share of international traffic and sales.

Check your traffic analytics

First, take a look at your international traffic and find out where it’s coming from and what language your visitors use.

Google Analytics can provide you with a wealth of information. You can identify the language and the geographic location of your website visitors by accessing the Language and Geo reports under Audience.

While you are in Google Analytics, you might as well check traffic volume by country and, under Acquisition, by type of traffic: referral, organic search, direct and social. We can safely assume that you are not running any search engine marketing (SEM) or Pay-Per-Click (PPC) campaigns yet, but if you are, then add this to your international SEO assessment as well.

Here you can see the estimated traffic from a website which receives the majority of its traffic from Mexico, Argentina, Spain and the United States:

[Image: International SEO traffic assessment – Alexa rankings for Spanish Website]

(Source: Alexa Report and rankings)

This website is very well positioned in the countries listed, but we still need to figure out whether these are the countries we want to do business with, what the current traffic volume to the website is and, above all, what the size of their online market is.

Discover the Market’s Search Size

The next step in this assessment is the analysis of the potential search market. That is, how many people are searching for these products or services on average every month.

Keyword research

You can use Google AdWords and segment by country and language to see how many  searches your products and services are getting each month:

[Image: Average number of monthly searches for Zapatos Nike in Mexico]

Most people would translate “Nike Shoes” as “zapatos Nike” and assume the size of this market is too small in Mexico, as there are only 1,600 average monthly searches for the branded term “zapatos Nike.”

[Image: Google Translate for Nike Shoes to Spanish brings us only the most common translation]

This would be a huge mistake for those attempting to gauge the size of the Mexican market for these shoes, as the appropriate main keyword to search for in Mexico is “tenis Nike,” a term that has an average of 74,000 searches per month. And this is just the tip of the 900K monthly searches for sports apparel in Mexico.

Needless to say, International SEO planning and execution demands an understanding of language and culture coupled with literary flair for increased engagement and technical knowledge for SEO implementation.

[Image: International SEO planning and execution demands an understanding of language and culture]

Knowing the size of the total market will help you estimate your current market share of the online search market for your particular product or service.

International SEO: 5 Tips to Get you Started

Ok. You liked what you saw and you are ready to begin planning for International SEO. There are several factors to consider before you dive right into the everyday SEO activities.

1. Country or Language Targeting?

This is a defining factor for your international SEO or ISEO implementation. It’s best to think it through now and not wait until you have to re-engineer your whole site.

When you deal with a multilingual site (or even one with a single additional language with country nuances), there’s the question of developing a strategy around a language or a country.

Both strategies have advantages and disadvantages.

Targeting by Language

On the pro side, the overall cost of implementation is lower because you can keep your local hosting, you need no country-specific top-level domains (ccTLDs), and you spend significantly less on language SEO and content developers.

The disadvantages to targeting by language are somewhat related. First, for search engine optimization, ranking becomes more difficult – not impossible – as Local SEO has taught us that search algorithms are generally built around geographical proximity to the searcher, and at a global scale this is no different. Second, as we saw in the Nike Shoes example above, dialects and local language usage and preferences need to be taken into consideration to be able to rank even in countries that share the same language.

Targeting by Country

The advantages to targeting by country for International SEO begin with the biggest disadvantage of targeting by language. Search engine optimization can be implemented to its fullest simply by following Google’s recommendations, and with in-country hosting coupled with ccTLDs it becomes easier to outrank a smaller number of in-country competitors. This also allows us to perform better keyword targeting and keyword alignment, provide local signals to search engines and increase the trust factor amongst our visitors (although this last factor needs to be further evaluated in Latin America, as people may trust a non-localized TLD more).

Of course, if properly implemented, you can leverage the ranking success of a solid country-specific International SEO execution to grow in other countries much faster.

This sounds too good to be true, right? Well, here are the cons. If you are not careful enough in the implementation, you may end up duplicating content. And if you work on SEO, you know the impact of duplicate content when it comes to Google or any other search engine. For those of you who would like to learn more, I recommend this Moz article on Duplicate Content.

We will expand on the use of language targeting and country targeting in a future article on hreflang implementation.

The other disadvantage of country targeting for International SEO is the cost. Not only will you eventually need hosting local to the countries and several ccTLDs, but you will also need to maintain several websites and develop SEO-optimized content for each country/language.

There are many more factors to take into account when deciding on one versus the other, but the ones mentioned above are of paramount importance.

If you are not extremely serious about going global, you may cut corners by targeting by language.

“Targeting by country is true localization, competition is reduced, and crucial trust is gained. Yeah, it will cost you a lot more in the short term, but in the long term you may just win a market that thinks your brand is homegrown.” – by Michael Bonfils at Search Engine Watch

The truth is, the more targeted you become, the better you can address the needs and cultural singularities of your audience.

2. Developing your URL structure

Now that you have made a decision between country targeting and language targeting, it’s time to define your website’s URL structure.

Country targeting

Businesses who choose to go with a country specific URL structure should ideally have their top-level domain (TLD) specific to their country (Country Code Top Level Domains or ccTLD).

For example: for the Mexico online property, and; for their Brazilian online property.

These can be implemented on 100% separate sites and hosting servers or, less ideally, in sub-folders or sub-domains where the ccTLDs then point to.

Language targeting

There are also three choices here and their order of preference is the exact same, although you can definitely get away with organizing in sub-folders for language targeting. As a matter of fact, even a WordPress multi-site is a group of sub-folders or sub-domains – usually depending on your hosting WordPress installation settings – that you can use for language targeting.

Here’s another one of our clients that was implemented with language targeting on WordPress and different TLDs: hosts the English version of the site, and; hosts the content in Spanish for the U.S. and Latin America.

And here is the Apple website, that uses the domain with sub directories for various countries and regions in a mix of country and language targeting: (LA for Latin America, includes Argentina, Paraguay, Costa Rica, Dominican Republic and others)

This allows them to roll out certain products by country, language and region. Notice how the Apple Watch is available in Mexico but not the rest of Latin America.

[Image: Apple Latin America website language region targeting and Mexico country targeting example]

3. Use of the rel="alternate" hreflang="x" annotations

It’s very important to let Google know that there are pages that correspond to each other – meaning, they have the same content – but are localized for a different language and/or country. You don’t want Google to see you as duplicating content willy-nilly. We will expand on the use of hreflang in a future post.
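Until then, here is a rough Python sketch of what those annotations look like when generated for one set of corresponding pages. The example.com URLs, the language-country codes and the function name are hypothetical, and `x-default` is the value used to mark a fallback page for visitors who match none of the listed versions.

```python
from html import escape
from typing import Optional


def hreflang_tags(alternates: dict[str, str], default: Optional[str] = None) -> list[str]:
    """Build one <link rel="alternate"> tag per language/country version."""
    tags = [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in alternates.items()
    ]
    if default is not None:
        tags.append(f'<link rel="alternate" hreflang="x-default" href="{escape(default)}" />')
    return tags


# Hypothetical versions of the same page, keyed by language-country code.
versions = {
    "en-us": "https://example.com/us/",
    "es-mx": "https://example.com/mx/",
    "pt-br": "https://example.com/br/",
}

tags = hreflang_tags(versions, default="https://example.com/")
print("\n".join(tags))
```

The same full set of tags goes in the head section of every version, including a tag that points to the page itself, so that the versions reference each other.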

4. Localization Signals to Consider

Both local SEO marketers and global SEO marketers should care about localization. Understanding local culture, colloquialisms, and use of language are key when developing optimized content for these locally or internationally targeted websites.

Of paramount consideration, then, are the different signals you can provide with in-country addresses, currency, international phone numbers, etc. You may also set up Google Webmasters to geotarget your target countries (beware, you can only set one country at a time) and dedicate efforts to in-country link earning – emphasis on earning. 😉

Here are the results of a strong International SEO strategy and execution with language targeting – with a twist – for Latin America.

[Image: SEMrush report for http:// showing organic traffic from top 20 organic search results from 2014 to April 2015]

5. Other Things to Consider for International SEO

As a brand or a business owner the focus is on running the business, not spending months mastering SEO. As a result, many businesses tend to outsource their SEO efforts to digital agencies who may not always be the best option.

When focusing on optimizing for ISEO, some brands will select Language Service Providers (LSPs) that will sometimes offer ISEO services.

The ideal combination for International SEO is a digital agency with an understanding of culture that specializes in linguistics.

This is because:

It will have a number of linguists based in your ‘target locations’. They will understand your business and know how to deliver your message to generate the best impact locally.

Most digital agencies wouldn’t be able to spot these translation subtleties, because they don’t have the staff to do it. LSPs have linguists and are able to turn out highly engaging pieces of content, but they lack the technical expertise SEO demands nowadays.

An agency that specializes in both has linguists who specialize in SEO and know which words are more commonly used in their respective countries. They can help you find keyword niches that you’d otherwise never have known about. This type of SEO agency will also be better equipped to help you “earn” natural backlinks to your websites and assist you with International SEO strategy development and implementation.

I hope this has given you a deeper insight into International SEO and when your business should use it. If you have any questions, feel free to contact us today for more information.

Read more about our International SEO services.
