Data Feeds and Duplicate Content
Hi Jill,
I have a question for you. I know you are busy, so I’ll try to keep it to the point.
A lot of my clients pull data/products from semi-public databases to
populate their websites, similar to how real estate agents show
listings of homes on their sites. I’ve been ensuring that each client
has unique, valuable, and professionally edited content wherever
possible on the rest of the site, but I’m afraid that if I make the
portions of the site that use the semi-public data accessible to the
search engines, they will find duplicate information on other sites
and my client site(s) will not be indexed.
So, based on that “fear,” I have blocked the robots (as much as can be
done) from the pages that have these data feeds and the corresponding
details, so that those pages don’t get indexed.
So my question is...
Should I go to the extra effort of writing unique content for each
product, or would I just be spinning my wheels because the engines
would detect the similarity of the pages anyway?
Thank you for your time and I have always enjoyed your articles.
Regards,
Alexander
Jill's Response
Hi Alexander,
So many people have a misunderstanding of the whole duplicate content issue.
It’s fine to allow the search engines to index that content. The
search engines won't drop or refuse to index an entire site just
because some of its pages have information that also appears on other
sites. That's a common scenario that they know how to deal with
appropriately (for the most part).
The worst that will happen is that the search engines simply won’t
index those particular duplicated pages, or, if they do index them,
that the pages won't show up in the search results for their
optimized keyword phrases. However, if you block the search engines
from that content via robots.txt, they definitely won’t index it.
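To see which pages a robots.txt rule actually shuts out, a quick check with Python's standard urllib.robotparser module can help. This is only a sketch: the robots.txt contents, the /listings/ path, and the example.com URLs are made up for illustration and are not taken from Alexander's sites.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks a feed-driven section of a site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /listings/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs: one feed page under the blocked path, one regular page.
feed_page_ok = is_crawlable(ROBOTS_TXT, "https://example.com/listings/123")
about_page_ok = is_crawlable(ROBOTS_TXT, "https://example.com/about")
print(feed_page_ok, about_page_ok)
```

Removing the Disallow line (or narrowing it) and re-running the check is a simple way to confirm that the feed pages are open to the crawlers again.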
What I would recommend is wrapping your own unique content around the
database-pulled content. In other words, you’d add some copy before
the listings (or whatever happens to be in the data feed), and perhaps
after the feed info as well. This would provide you with the best
chance of having those pages indexed and possibly showing up in the
rankings for the keyword phrases for which you choose to optimize.
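As a rough illustration of that wrapping idea, here is a minimal Python sketch of a page template that puts hand-written copy before and after the feed items. The function name build_listing_page, the HTML layout, and the sample text are all assumptions made for the example; a real site would use whatever templating system it already has.

```python
from html import escape


def build_listing_page(intro: str, feed_items: list[str], outro: str = "") -> str:
    """Wrap feed-supplied items in unique copy written for this page."""
    parts = [f"<p>{escape(intro)}</p>", "<ul>"]
    # The feed items are the part that other sites are likely to share.
    parts += [f"  <li>{escape(item)}</li>" for item in feed_items]
    parts.append("</ul>")
    if outro:
        parts.append(f"<p>{escape(outro)}</p>")
    return "\n".join(parts)


# Hypothetical sample data standing in for a real data feed.
page = build_listing_page(
    intro="Our agents have toured every home on this page in person.",
    feed_items=["3-bed house on Elm Street", "2-bed condo on Main Street"],
    outro="Call our office to arrange a viewing of any listing above.",
)
print(page)
```

The intro and outro are where the keyword phrases you are optimizing for would naturally go, since that text is the part no other site has.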
Hope this helps!
Jill