Even if a knowledge worker were to enter a perfectly formed query into Google’s search box, there is still a high probability that he would not find what he was looking for. In the past, providing the best and most accurate search results, winnowing the wheat from the chaff, was largely a matter of filtering out out-of-date or irrelevant information; today that is no longer the case.
In the Web’s earliest days, once marketers figured out how to game the search engines of the time with keywords that had nothing to do with the content on the associated Web page, search became fairly useless. Indeed, it was the arrival of Google on the scene in 1998 that made Web search useful again. Search engines, Google included, continue to be overwhelmed by various forms of Webspam; as soon as Google and others vanquish one form of spam, however, another rears its ugly head.
Paul Kedrosky, founder of GrokSoup (one of the first hosted blog platforms), a CNBC analyst, and currently the publisher of a financial blog, tried to use Google to help him select a new dishwasher in December of 2009. His conclusion, dispatched as a tweet: “To a first approximation, the entire web is spam when it comes to appliance reviews.”
In “Dishwashers, and How Google Eats Its Own Tail,” an article published on Infectious Greed on December 13, 2009, Kedrosky explains how Google “has become a snake that too readily consumes its own keyword tail” as marketers exploit Google’s technology to rise to the top of the search results page, regardless of whether they deserve to be there.
As a result of this attack of sorts, Web search’s usefulness is once again in question as low-quality, unreliable, and frequently plagiarized content continues to supplant legitimate entries in search results.
It appears that getting the right answer or response from Google has never been harder, and even Google is taking notice of the problem. On January 21, 2011, Matt Cutts, the head of Google’s Webspam team, wrote in the Official Google Blog that there had been “a spate of stories” [examples include "On the Increasing Uselessness of Google," "Why We Desperately Need a New (and Better) Google," and "Content Farms: Why Media, Blogs & Google Should Be Worried"] concerning Google’s less-than-stellar search quality.
Google is fighting the war on Webspam on three fronts, and this is no mean feat. Today, Webspam comes from three main sources: 1.) individual Web pages stuffed with repetitive, spammy words and phrases; 2.) hacked Web sites; and 3.) content farms (sites that either produce shallow content of their own or copy other sites’ content). The topic of Webspam could fill a book in itself, but it is worthy of this brief mention because it takes Information Overload to a whole new level, making it even harder for knowledge workers to find what they are looking for.
If we were to repeat our study of the failure rate for searches, Webspam would most likely result in an increase in the number of searches that actually failed but were not recognized as failures by the searcher. This is a dangerous trend if it continues unabated.
The problem has even given rise to a new type of search engine. Blekko, which bills itself as “a better way to search the web,” uses slashtags to cut out spam sites, and my tests of the search engine show that it returns significantly less spam in its results.
Google has heard “the feedback from the web loud and clear” and believes that the company “can and should do better.” It announced plans (via Cutts’ post) to combat all three types of Webspam, including making changes to its algorithm and giving users a means of providing “explicit feedback” on spammy sites.
Jonathan B. Spira is CEO and Chief Analyst at Basex.