Why would you complain to Google? It's doing what it's supposed to do: index the web. Should we expect Google to know where the data was originally published? I do, because besides the number of inbound links it's an important metric in determining PageRank. But that doesn't mean Google shouldn't index the copy. It might simply be intended as a mirror of the first site.

In a recent court case in the Netherlands, a company A filed a complaint against a website because it ran a story about another company, B, that went bankrupt, and mentioned the plaintiff in an unrelated story on the same page. If you searched for the plaintiff's company name and "bankruptcy", Google would show you a summary that made it look as if company A had gone under. Does that make the website responsible? Is it Google's fault? The judge decided the website should take responsibility and fix it. I think you're wasting your time if you're searching Google to find out whether your company has gone bankrupt.

In the case of spamxyz.org: report it as spam. In the case of the Dutch court case: use your common sense. In the case of newspapers crying about summaries in the search results: use robots.txt. But people should stop pointing at Google to fix all the problems on the internet. Google could do a lot better, but there are plenty of scenarios in which you don't want Google to decide on its own whether it should show a site in its search results or not.
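For what it's worth, here is a rough sketch of what the robots.txt route could look like for a newspaper, checked with Python's standard `urllib.robotparser`. The `/news/` path, the example.com URLs and the `can_crawl` helper are made up for illustration; a real site would pick its own rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of /news/, allow everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /news/

User-agent: *
Disallow:
"""


def can_crawl(agent: str, url: str) -> bool:
    """Return True if the rules above let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("Googlebot", "https://example.com/news/story.html"),
        ("Googlebot", "https://example.com/about.html"),
        ("OtherBot", "https://example.com/news/story.html"),
    ]:
        print(agent, url, can_crawl(agent, url))
```

Keep in mind this only tells well-behaved crawlers what they may fetch; it is a request, not an access control.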

PS: regarding Wikipedia data: http://en.wikipedia.org/wiki/Wikipedia_database