What to do about Site Cloning?

What's the difference between content cloning and content syndication? Is it in the intent? Is it in the process? Or is it something else entirely? Personally, I see this breaking down into three main areas:

First, there are quite a few sites out there which re-syndicate content from blogs and other sites. Some – such as Planet MicroISV from Baruch Even – are a great service to the community. The site monitors a huge swath of the software development community and shares it with others. I make a point of visiting it on a regular basis and appreciate it.

Next, there are splogs (Ugh… I hate that name.). These sites simply take the content of others, rebrand it as their own, and attempt to get you interested in clicking on their AdSense ads. I have a number of keyword monitors on Technorati and these sites regularly make it into the list. I'm not sure what Google et al. are doing to fight this sort of garbage, but they're smart people, so they'll come up with something. Regardless, the line between these and Planet MicroISV is purely one of intent, not a technical one at all.

Finally, there is a third category. These are supposedly legitimate people and companies which take another site's content and claim it as their own. Sometimes they adjust the colors, layout, etc.; sometimes they don't. Sites like this concern me. If you believe – like I do – that Google et al. will figure out a way to raise the bar, these sites can cause fundamental problems for your/my/everyone's organizations, because whatever penalizes the copy may not be able to tell it apart from the original. The obvious next step is to start comparing the dates of files, sites, etc., but the short-term effect could be disastrous. For many organizations, if their site takes a hit in terms of rankings or – worse yet – simply disappears, their businesses would fold in no time.

So as web developers helping customers upgrade and update their sites, what do we do? Is there a way to combat this? Do we simply contact the offenders' ISPs to get them shut down? How do we demonstrate which site actually owns the content?
