Spam, spam, risk factors and spam at Google…
The New York Times had a long but interesting piece on Friday on how J.C. Penney (JCP) has apparently been gaming Google’s search results. And, wouldn’t you know it, Google (GOOG) tweaked the language it used to describe that very problem in the risk factors in the 10-K it filed the same day.
The change came in a revised risk factor that, for all its references to varieties of spam, reminded us more than a little of the infamous Monty Python spam skit (or listen to an excerpt). The new language not only offers a glimpse into the company’s cat-and-mouse game with those who would subvert Google’s search results for fun and profit, but also seems to suggest that the company is more worried than before about holding its own. (That pretty much fits with the experience of anyone who’s tried to search for information about a common product but isn’t interested in buying it online.)
As recently as its 10-Q on October 29, the company warned that:
“Index spammers could harm the integrity of our web search results, which could damage our reputation and cause our users to be dissatisfied with our products and services.
There is an ongoing and increasing effort by ‘index spammers’ to develop ways to manipulate our web search results. For example, because our web search technology ranks a web page’s relevance based in part on the importance of the websites that link to it, people have attempted to link a group of websites together to manipulate web search results. We take this problem very seriously because providing relevant information to users is critical to our success. If our efforts to combat these and other types of index spamming are unsuccessful, our reputation for delivering relevant information could be diminished. This could result in a decline in user traffic, which would damage our business.”
Now, Google offers a broader warning against “Web spam and content farms” and continues on this more somber note:
“Although English-language web spam in our search results is less than half of what it was five years ago, and web spam in most other languages is even lower than in English, we have nonetheless seen an increase in web spam in recent months. As part of our efforts to combat web spam, we recently launched new indexing technology that makes it harder for spam-like, less useful web content to rank highly. We have also improved our ability to detect hacked websites, which were a major source of web spam in 2010. We face new challenges from low-quality and irrelevant content websites, including ‘content farms,’ which are websites that generate large quantities of low-quality content to help them improve their search rankings. In 2010, we launched two algorithmic changes focused on low-quality websites.”
That pretty much matches what the New York Times describes in its J.C. Penney piece, and what others have observed too; the new indexing algorithm has gotten some attention as well. (And yet… a quick search for articles about the new indexing algorithm turns up some of what looks like the kind of spam they’re trying to avoid.)
The rest of Google’s new risk factor is mostly the same boilerplate about potential damage to the company’s business, plus this tidbit:
“In addition, as we continue to take actions to improve our search quality and reduce low-quality content, this may in the short run reduce our AdSense revenues, since some of these websites are AdSense partners.”
Fun stuff. For what it’s worth, it doesn’t look like the Google filing goes into the company’s grievances against Microsoft (MSFT) over its Bing search results.