When people talk about duplicate website content, they mean the same content appearing on more than one page of your website, or content on your website that can also be found on other sites on the web. This includes, but is not limited to:
In 2015, RavenTools did an anonymous study of 888,710 websites their software crawled that year and found that, on average, 29% of all pages were considered duplicate content.
How does this happen?
Variations of a URL can arise in a couple of ways: when you use any kind of click-tracking software, or when your site assigns a unique session ID to each user.
In both cases, extra characters are appended to the end of the original URL. Take www.website.com/product as an example. Once you add click-tracking software or a search function to your site, you’ll notice the URL will look something like this: www.website.com/product?cat=unitedstates&state=idaho. Both URLs serve the exact same content, page title, meta description, etc.
You can avoid these types of duplicate content issues by not allowing URL parameters to be added. Ask your website developer whether this is something your CMS (content management system) can handle.
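To make the idea concrete, here is a minimal Python sketch of how a developer might collapse URL variations back to one address. The parameter names in `IGNORED_PARAMS` are assumptions for illustration only (`cat` and `state` come from the example above, the rest are common tracking and session names); your own site would need its own list.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters that change the URL but not the page content.
IGNORED_PARAMS = {"cat", "state", "sessionid", "utm_source", "utm_medium", "utm_campaign"}


def strip_ignored_params(url: str) -> str:
    """Return the URL without the parameters listed in IGNORED_PARAMS."""
    parts = urlsplit(url)
    # Keep only the query parameters that actually select different content.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    # The fragment (#...) is dropped too, since it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(strip_ignored_params("https://www.website.com/product?cat=unitedstates&state=idaho"))
```

Any parameter not on the ignore list (a hypothetical `page=2`, for instance) is kept, so pages that really are different stay separate.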
HTTP vs. HTTPS (or www vs. non-www)
Fairly self-explanatory: https://www.website.com, http://www.website.com, www.website.com and website.com are all the same home page, with the same content. You can remedy this kind of duplicate content in Google Webmaster (Search Console) by setting your preferred domain.
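If your developer also wants to handle this on the site itself, the logic is roughly the following. This is only a sketch: the preferred scheme and host below are assumptions, and the function simply maps each of the four variations above onto one preferred address.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration: pick whichever version you actually prefer.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.website.com"
HOST_VARIANTS = {"website.com", "www.website.com"}


def preferred_url(url: str) -> str:
    """Map any scheme/host variation of our site onto the preferred version."""
    # Bare inputs such as "website.com" have no scheme, so add one for parsing.
    candidate = url if "://" in url else "http://" + url
    parts = urlsplit(candidate)
    if (parts.hostname or "") not in HOST_VARIANTS:
        return url  # not one of our hosts, leave it untouched
    path = parts.path or "/"
    # Any explicit port is dropped along with the old scheme and host.
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, parts.query, parts.fragment))


for variant in ("https://www.website.com", "http://www.website.com",
                "www.website.com", "website.com"):
    print(variant, "->", preferred_url(variant))
```

A site would typically send a permanent (301) redirect whenever the requested address differs from the result of `preferred_url`, so visitors and search engines end up on a single version.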
What About Social Media Pages?
If you have Facebook, Twitter, Google+ and Instagram business pages, good for you! If you copied and pasted the content on your about page to all of these platforms, demerits for you.
While there may not be a "penalty" for this, adding the same content to all of these platforms is never a good idea. Each platform speaks to a different audience, and your biographical content should be crafted for each audience and what it cares about.
Plagiarism is wrong, immoral, illegal, and just plain lazy.
Don’t do it. Just don’t.
However, there is also accidental plagiarism that can affect your search engine rankings. Writing about a hot topic, or a topic that many people have already written about, increases the likelihood that you will unknowingly plagiarize others' content. The best way to avoid this mistake is to run your content through a plagiarism checker like Grammarly. My rule is to stick to an unoriginal text score of 5% or lower. (If you’re wondering, this blog post scored 1%.)
Syndicating content refers to letting third-party websites publish your content. While this is a great tactic for building your site’s credibility with search engines, doing it improperly puts you at risk of diluting the search engine results so that neither version gets found.
If you choose to have your content syndicated, make sure to ask the publishing website to use "noindex" or "nofollow" tags, or to use cross-domain canonical tags that point back to your original content’s URL. This lets Google and the other search engines know what to index and where to give credit for the original content. That credit gives the original more authority, and therefore a better chance of ranking high in the search engine results pages.
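For reference, these tags are short lines of HTML that go in the `<head>` of the syndicated copy. The Python sketch below just builds them as strings so you can see what to ask for; the function names and the example URL are hypothetical, not part of any particular CMS.

```python
from html import escape


def canonical_link_tag(original_url: str) -> str:
    """Build a cross-domain canonical tag pointing at the original article."""
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


def robots_meta_tag(index: bool = False, follow: bool = False) -> str:
    """Build a robots meta tag; the defaults produce "noindex, nofollow"."""
    directives = ["index" if index else "noindex", "follow" if follow else "nofollow"]
    content = ", ".join(directives)
    return f'<meta name="robots" content="{content}">'


# Hypothetical URL of the original post on your own site.
print(canonical_link_tag("https://www.website.com/blog/original-post"))
print(robots_meta_tag())
```

The publishing website would use one approach or the other: either the canonical tag pointing back to you, or the robots tag that keeps its copy out of the index.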
Never forget that in the age of web marketing, content is one of your most valuable currencies. Spend the time to avoid the pitfalls of duplicate content.