Content is the lifeblood of internet traffic. Be it commercial, informational or navigational surfing, every searcher is only online to find and consume content.
What makes content visible is the extent to which it satisfies the underlying search intent behind sets of recurring queries, and this is why great content is the key to all SEO success.
The SEO problems arising from content theft stem from the way search engines manage the relative visibility of websites in their index. The engines prioritize the match between documents and queries, and content thieves can exploit this: by duplicating information that should be unique to your website, they eat into your visibility.
The losses from such duplication are far-reaching, but for the sake of simplicity, I have classified them into five major groupings:
1. Lost backlink opportunities
Backlinks are critical to SEO success because of Google’s PageRank algorithm. Using PageRank, Google gauges the authority of a website based on the number and quality of the backlinks it has gained. Since backlinks are earned through content, a prolific content theft operation could cause backlinks you deserve to be ascribed to other websites. This can slow down the accumulation of PageRank and the growth of your site’s domain authority. In this case, your competition would be gaining ranking power from your own efforts.
2. Diminished keyword relevance
Google processes over 40,000 search queries per second, which works out to about 3.5 billion per day. All of those queries come with permutations of text that are mapped to different search intents. Your content is your tool for capturing keyword relevance based on the kind of topics you cover, but with content theft, you end up having to split this keyword relevance with a host of different websites, some of which may have a higher domain rating than yours. When this happens, instead of ranking for 1,000 keywords, you may end up being relevant for only 650. This is not a good situation to be in, but this is what content thieves can do to a website.
3. Lost potential for organic and referral traffic
This is a consequence of the two points listed above. Organic traffic is the traffic you get from search engines, while referral traffic is the visits you get from backlinks. If you lose keyword relevance to content scrapers, you will also lose search engine visibility and your organic traffic will drop. In the same vein, if you lose backlinks to valuable content that you’ve worked hard to create, you will also lose out on the visits you should have gotten through those links. All in all, the number of web visits will decline steadily until you do something about the plagiarism campaigns being launched against your site.
4. Risk of algorithmic penalties
In 2011, Google launched an algorithm (the Panda update) that enabled it to penalize websites with thin content, autogenerated posts and plagiarized work. While this is a positive development, it also comes with certain risks for websites whose content is widely duplicated on higher authority domains. If your site is relatively new or has just started publishing around a topic, then depending on your site health and crawl budget utilization, scenarios could arise where Googlebot is unable to index your content first even though you are the original owner. This can cause your site to be marked as a plagiarizer by the Panda algorithm, leading to ranking problems for all other types of content published on your domain.
5. Slower aggregation of positive user engagement signals
In 2007, Google was assigned a patent called “Modifying search rankings based on implicit user feedback”. The user feedback signals Google can use to modify rankings include click-through rate (number of clicks divided by number of impressions on a search results page), bounce rate, time on page, scroll depth, pages per session, direct visits and bookmarks.
If you are losing keyword positions, organic traffic and referrals, you will lose a slice of visitors who would not bounce, who would scroll far down the page, who would spend time on your site, who would bookmark your pages and become direct visitors. This means that all of the positive engagement signals they could have contributed to your rankings would be perpetually lost to those who are stealing your content. This is why you must take action against content thieves.
How content theft occurs
Now that you know the implications of content duplication across domains, you may be curious about how to stop it. The truth is that you cannot stop theft with a one-size-fits-all approach. Rather, your content protection strategy must be tailored to the different tactics that can be used to pillage your intellectual property. These content duplication tactics fall into three main types:
- Manual theft
- Automatic theft
- Indirect theft
Manual content theft is done by right-clicking and copying content on a page, downloading videos and saving images. This type of content theft can be just as hurtful as the others, but its slow pace diminishes the scale of impact because of the time gap between when the content is produced, when it is copied and when the infringing actors are able to republish it. Manual theft is the most common type, but it is also the easiest to mitigate. For example, you can disable right-clicking on your front end using plugins like Right click disabled for WordPress or WP Content Copy Protection & No Right Click. Bear in mind that this only deters casual copying, since the content can still be taken from the page source.
Automatic content theft
This is usually the most hurtful because it enables your posts to be duplicated as soon as they are published. This is incredibly risky because of the likelihood that the plagiarizer’s site is more crawlable than yours. In this scenario, the search engines may index the plagiarized version before the original on your website. If this continues to occur, Google’s Panda algorithm might penalize your site, and you will wake up to a sudden loss of traffic. Such advanced content duplication is carried out using scrapers that mine a target website for useful information. To prevent content scrapers from stealing your content, you might consider blocking the bots by user-agent in your .htaccess file or blocking their IP addresses altogether. To identify the scraper bots by their user-agents or IP addresses, you may need to conduct a log file analysis on your server.
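To make the log file analysis more concrete, here is a minimal Python sketch of one way to go about it. It assumes your server writes the common "combined" access log format, and the function names, the sample log lines, the ExampleScraper user-agent and the request threshold are all made up for illustration. Treat it as a starting point rather than a definitive detection method, since a high request count alone does not prove that a client is a scraper.

```python
import re
from collections import Counter

# Matches the "combined" access log format (an assumption about your server):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_clients(lines):
    """Count requests per (IP address, user-agent) pair, skipping unparsable lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[(match.group("ip"), match.group("agent"))] += 1
    return counts

def flag_heavy_clients(lines, min_requests=100):
    """Return (ip, agent, requests) for clients at or above the threshold, busiest first."""
    return [
        (ip, agent, n)
        for (ip, agent), n in count_clients(lines).most_common()
        if n >= min_requests
    ]

# Hypothetical sample data; real input would be the lines of your access log.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /post-1 HTTP/1.1" 200 2326 "-" "ExampleScraper/1.0"',
    '203.0.113.7 - - [10/Oct/2023:13:55:37 +0000] "GET /post-2 HTTP/1.1" 200 1987 "-" "ExampleScraper/1.0"',
    '203.0.113.7 - - [10/Oct/2023:13:55:38 +0000] "GET /post-3 HTTP/1.1" 200 2104 "-" "ExampleScraper/1.0"',
    '198.51.100.4 - - [10/Oct/2023:13:56:02 +0000] "GET /post-1 HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    for ip, agent, n in flag_heavy_clients(SAMPLE_LOG, min_requests=3):
        print(ip, agent, n)
```

Before blocking anything the script flags, check that the client is not a legitimate crawler such as Googlebot, because blocking a search engine would hurt your visibility far more than the scraper does.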
Indirect content theft
This usually occurs through your RSS feeds. A full-text feed allows a recipient site to access your content and automate its republication. In some instances, AI-powered content spinners like Chimp Rewriter or X-spinner are used to rewrite the stolen copy so that it appears more unique. To prevent this from continuing to happen, you could switch to a summary RSS feed. This keeps the full text of your posts out of the feed, so the plagiarizer only receives an excerpt.
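Most publishing platforms expose the summary feed as a setting, so you will rarely need code for it. Still, the following Python sketch may help show what actually changes between a full-text and a summary feed. It assumes an RSS 2.0 feed whose items carry a plain-text description plus a full-text content:encoded element; the function names, the sample feed and the word limit are illustrative assumptions, not part of any particular platform.

```python
import xml.etree.ElementTree as ET

# Namespace commonly used for the full-text <content:encoded> element.
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"

def excerpt(text, max_words=40):
    """Return the first max_words words of text, marking any truncation."""
    words = text.split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + " [...]"

def to_summary_feed(rss_xml, max_words=40):
    """Drop full-text elements and shorten each item description to an excerpt."""
    ET.register_namespace("content", CONTENT_NS)
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        # Remove the full article body so it is no longer syndicated.
        for full_text in item.findall(f"{{{CONTENT_NS}}}encoded"):
            item.remove(full_text)
        description = item.find("description")
        if description is not None and description.text:
            # Assumes plain text; HTML descriptions would need tag-aware truncation.
            description.text = excerpt(description.text, max_words)
    return ET.tostring(root, encoding="unicode")

# Hypothetical feed used only to demonstrate the transformation.
SAMPLE_FEED = """<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example blog</title>
    <item>
      <title>Post one</title>
      <description>one two three four five six seven eight</description>
      <content:encoded>Full article body that should not be syndicated.</content:encoded>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    print(to_summary_feed(SAMPLE_FEED, max_words=3))
```

With a summary feed, a site that republishes your feed automatically only gets the excerpt, and readers have to visit your website for the full post.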