» Archive for the 'Search' Category

Poor craftsmen blame their tools

Thursday, August 25th, 2011 by Cody Burke

Search 101?

Blaming search tools for failures in search is common, and more often than not without merit.  It often seems as if Google has failed us when we search for content that we know exists, yet the search results taunt us by refusing to reveal what we are looking for.

The truth, however, is that we are simply incredibly bad at using search tools.  Two items came to my attention this week that demonstrate our ineptitude at finding what we are looking for.  The first was a statistic derived from research by Dan Russell, a search anthropologist at Google, who was interviewed by Alexis Madrigal for the Atlantic.  Russell studies how computer users perform searches, and found that 90% of computer users do not know how to use CTRL + F (PC) or Command + F (Mac) to search through a document or Web page.

The time that is surely wasted by not using this simple technique is staggering.  (Just to make sure we all are out of the 90%, this shortcut to the Find function allows you to search locally for keywords in a text document, PDF, e-mail, or Web page.  Try it out and join the 10%.)

The second piece of the puzzle comes from reports on the findings of the ERIAL (Ethnographic Research in Illinois Academic Libraries) project.  The project was conducted over two years and employed both anthropologists and library staff members to examine student research habits and attitudes towards libraries and librarians.

The findings from the study will begin to be published this fall, but Steve Kolowich, writing for Inside Higher Ed, notes that the researchers found that students were even worse at search than they had anticipated.  In his article, “What Students Don’t Know,” Kolowich discusses some of the findings, including the following:

- Only 7 out of 30 students observed at Illinois Wesleyan conducted what a librarian would consider a reasonably well-executed search.  (Presumably this encompasses such criteria as the use of appropriate search engines, Boolean logic, and authoritative sources.)

- Students mentioned Google more than twice as many times as other available databases in interviews, yet had no understanding of Google’s search logic and were not capable of formulating searches that would return good results.

- Of the students who did turn to non-Google scholarly databases for research, 50% ended up using a database that would not have been recommended by a librarian for their search needs.

- Only 3 of 30 students that were observed formulating search queries used additional keywords to narrow search queries.

- The researchers observed that students would end up with either too many or too few search results, and would often simply change their research topic to one that was easier to find information on.

It is staggeringly clear that there is a need to teach basic search techniques to students, who, without instruction, will likely go on to become part of the 90% of computer users not using simple page search.  We have created powerful search tools and technology (compared to what previous generations had available) that are capable of so much, yet we are clearly neglecting to teach proper usage and even a basic understanding of search mechanics.  Even much-needed advances in search technology will not help if we cannot use the tools we have now to their full potential.

Cody Burke is a senior analyst at Basex.  He can be reached at cburke@basex.com

Google +1: Does Search Need to be Social?

Thursday, April 7th, 2011 by Cody Burke

The searcher can see that one user has given the first search result a +1, and can click on the +1 button themselves to recommend any of the search results.

If you have a public Google profile, “Google +1” is a new feature that can be enabled on your account.  Essentially a clone of Facebook’s “Like” button, +1 is an attempt by Google to harness social search, an elusive concept that purports to improve one’s search results by factoring in the opinions and preferences of one’s social contacts.

Google +1 allows a user to recommend Web pages by clicking on the +1 button next to the site after conducting a search.  There are further plans to have Web sites include the +1 button on actual sites, much like Facebook’s increasingly ubiquitous Like button.  If the feature is enabled and the user is logged into his Google account, search results will show relevant pages that the user’s contacts have given the +1 approval to.

Google’s motivation for providing this service is somewhat self-serving: by gathering even more information on user behavior and preferences, the company will be able to further refine its targeted advertising.  In theory, there should also be a benefit for the user, as more relevant search results are elevated as other users, specifically one’s contacts, recommend them.  Additionally, the new feature may help to clamp down on Web spam in search results, by pushing relevant and useful results higher in the search results.

Improving search is critical because unsuccessful searches represent a huge and costly problem: ca. 50% of all searches end in failure, and of those searches that the knowledge worker believes to have succeeded, a further 50% fail because they result in the knowledge worker unknowingly using incorrect or out-of-date information.  Searching also takes up ca. 10% of the knowledge worker’s day, representing a significant consumption of time.

The ability of +1 to improve search results is not yet clear, and may be severely limited by the relatively small number of people who have public Google accounts and who choose to participate.  Facebook has had success with its “Like” feature but the company also has a huge user base (600 million as of January 2011), which has allowed it to export the “Like” metaphor around the Web.

It is important to note that there are virtually no privacy controls with +1.  Sites that a user gives approval to are viewable by anyone, because the feature requires a user to have a public Google profile.  This means users must keep in mind that there is a record of their +1 sites, which is accessible via a tab in their public Google profile.  Even Facebook, for all its failings, has privacy settings to prevent a completely unknown person from seeing a history of a given user’s “Likes.”

In what appears to be a complete lack of foresight on the part of Google, the +1 feature was announced on the same day that the company settled its privacy complaint with the FTC over its Buzz service, which drew outrage for automatically connecting people based on a user’s Gmail contact list.  While +1 is different in the sense that users must have opted in to a public Google profile, some lingering privacy concerns remain.

Similar to Google’s previous lackluster entries into social software that include Buzz, Lively, Wave, and Orkut, +1 is a gamble for Google.  We as consumers and knowledge workers hope that it can improve Google’s search results and decrease the search failure rate, but if past social experiments by Google are any indication, it may be a challenge to get +1 to catch on.

Cody Burke is a senior analyst at Basex.

Watson Puts the Current State of Search in Jeopardy

Thursday, February 17th, 2011 by Jonathan Spira

Does anyone have a map?

Is the robot apocalypse closer than originally thought?  After watching IBM’s Watson computer crush Jeopardy’s all-star contestants this week, one must surely wonder.

More to the point, what are the various practical applications of Watson now that its dominance, at least as a quiz-show contestant on “Jeopardy!”, is unquestioned?

Watson was designed by IBM researchers, who suggested the idea of a “Jeopardy!” challenge because they considered the game show to be an excellent test of a computer’s ability to understand information expressed using natural language.

Watson could neither see nor hear, nor was it actually on the stage; an avatar that resembled a giant iPad represented it there.  Watson received all of its information electronically in the form of text at the same point at which the human contestants saw the answer.

Not surprisingly, the computer power behind Watson took up most of a large room and, due to the heat generated by the IBM servers, required extensive cooling as well.

Competing on “Jeopardy!” is a huge challenge, even for the most intelligent and well-read individual.  The provided answers are sometimes vague, make use of clever word play, and require extensive knowledge of a vast range of topics, from history and current events to art and pop culture.  Entering the clues into a standard search engine would result in complete failure.

Far from being a simple search tool, Watson is designed to use natural language processing and thousands of algorithms to understand the phrasing, grammar, and intent of a question, and then find the answer that has the highest probability of being correct.  Just as human players must rely on their own knowledge, Watson uses a content library of thousands of documents, reference materials, encyclopedias, books, and plays and is not connected to the Internet or other online databases.

Let’s look at a few of the questions that Watson had difficulty answering.

On Monday, the first day of the match, Watson was incorrect in the category “Olympic Oddities”, when the provided answer was “It was the anatomical oddity of U.S. gymnast George Eyser who won a gold medal on the parallel bars in 1904.”

Watson responded “leg”, which was ruled incorrect because Watson did not specify that it was the lack of Eyser’s leg that was the oddity.  David Ferrucci, the manager of the Watson project at IBM Research, explained that Watson might not have understood “oddity,” and “The computer wouldn’t know that a missing leg is odder than anything else.”  He also noted that given enough time to absorb more material, Watson would probably make that connection.

Another interesting incorrect response (in the form of a question) occurred Tuesday night.

In the category “U.S. Cities”, the provided answer was “Its largest airport was named for a World War II hero; its second largest, for a World War II battle.”  Watson incorrectly wrote “What is Toronto”, while both humans correctly wrote “What is Chicago,” referencing O’Hare and Midway airports.  The problem for Watson, according to Ferrucci, was that Watson had learned that the categories in the show often have weak relationships to the questions, and thus Watson placed only a weak emphasis on “U.S. Cities”.  In addition, he noted that there are cities in the U.S. named Toronto, and that Toronto has an American League baseball team, all snippets of information that may have led Watson astray.  Watson only gave “Toronto” a 32% chance of being correct, and only bet $947, a tiny sum that did not expose it to the risk of losing its lead in the game.

Despite a few other missteps and miscues, Watson dominated the matches, displaying an amazing ability to understand what was being asked and to respond quickly and correctly.

How did Watson reach a final response?  Watson considered thousands of possibilities and ranked them.  The top three were displayed, with the probability it assigned to each.  That would be a welcome addition to a Google search: an honest appraisal of how likely each result is to be correct or, even better, how useful it is.
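The ranking step described above can be illustrated with a minimal Python sketch.  This is not IBM’s implementation; the `Candidate` type, the `top_candidates` function, and the confidence scores below are all hypothetical, made up purely to show what “rank the possibilities and display the top three with their probabilities” looks like in practice.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    answer: str
    confidence: float  # estimated probability of being correct, 0.0 to 1.0


def top_candidates(candidates: list[Candidate], n: int = 3) -> list[Candidate]:
    """Return the n candidates with the highest confidence, best first."""
    return sorted(candidates, key=lambda c: c.confidence, reverse=True)[:n]


# Made-up scores for illustration only; a real system would compute these.
scored = [
    Candidate("answer A", 0.12),
    Candidate("answer B", 0.71),
    Candidate("answer C", 0.05),
    Candidate("answer D", 0.09),
]

ranked = top_candidates(scored)
for c in ranked:
    # Show each result with its confidence as a percentage.
    print(f"{c.answer}: {c.confidence:.0%}")
```

A search results page built this way would show the searcher not just an ordering but how much weight each result deserves.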

With Watson, at least on “Jeopardy!”, we moved one step closer to the type of search envisaged in science fiction (“Computer.  Does the king of the Watusis drive an automobile?”).  We won’t have the computer from Star Trek here tomorrow, or even next year, but the technology developed for Watson does bring us that much closer to developing search tools that will resolve search’s Achilles heel, the fact that it delivers results as opposed to answers.

Jonathan B. Spira is CEO and Chief Analyst at Basex.  Cody Burke is a senior analyst at Basex.

Search and the Quest for the Perfect Dishwasher

Thursday, January 27th, 2011 by Jonathan Spira

Search. Sit. Stay. I get my commands mixed up occasionally.

Even if a knowledge worker were to enter a perfectly-formed query into Google’s search box, there is still a high degree of probability that he would not find what he was looking for.  While, in the past, providing the best, most accurate search results and winnowing the wheat from the chaff had more to do with not providing out-of-date and/or non-relevant information than anything else, today that is no longer the case.

In the Web’s earliest days, once marketers figured out how to game the search engines of the time with keywords that had nothing to do with the content on the associated Web page, search became fairly useless.  Indeed, it was the arrival of Google on the scene in 1998 that made Web search useful again.  Search engines including Google continue to be overwhelmed by various forms of Webspam; as soon as Google and others vanquish one form of spam, however, another rears its ugly head.

Paul Kedrosky, founder of GrokSoup, one of the first hosted blog platforms, an analyst on CNBC and currently publisher of a financial blog, tried to use Google to help him select a new dishwasher in December of 2009.  His conclusion (which was dispatched as a tweet): “To a first approximation, the entire web is spam when it comes to appliance reviews.”

In “Dishwashers, and How Google Eats Its Own Tail,” an article published on his blog, Infectious Greed, on December 13, 2009, Kedrosky explains how Google “has become a snake that too readily consumes its own keyword tail” as marketers exploit Google’s technology to rise to the top of the search results page, regardless of whether they deserve to be there or not.

As a result of this attack of sorts, Web search’s usefulness is once again in question as low-quality, unreliable, and frequently plagiarized content continues to supplant legitimate entries in search results.

It appears that getting the right answer or response from Google has never been harder, and even Google is taking notice of the problem.  On January 21, 2011, writing in the Official Google Blog, Matt Cutts, the head of Google’s Webspam team, wrote that there had been “a spate of stories” [examples include "On the Increasing Uselessness of Google," "Why We Desperately Need a New (and Better) Google,"  and  "Content Farms: Why Media, Blogs & Google Should Be Worried"] concerning Google’s less than stellar search quality.

Google is fighting the war on Webspam on three fronts and this is no mean feat.  Today, Webspam comes from three main sources: 1.) individual Web pages with repetitive spammy words and phrases, 2.) hacked Web sites, and 3.) content farms (sites that either produce shallow content of their own or copy other sites’ content).  The topic of Webspam could fill a book unto itself but it is worthy of this brief mention because it brings Information Overload to a whole new level in terms of making it harder for knowledge workers to find what they are looking for.

If we were to repeat our study of the failure rate for searches, Webspam would most likely result in an increase in the number of searches that actually failed but were not recognized as a failure by the searcher.  This is a dangerous trend if it continues unabated.

The problem has even given rise to a new type of search engine.  Blekko, “a better way to search the web,” uses slashtags to cut out spam sites and my tests of the search engine show that it provides significantly less spam in its results.

Google has heard “the feedback from the web loud and clear” and believes that the company “can and should do better.”  It announced plans (via Cutts’ post) to combat all three types of Webspam, including making changes to its algorithm and giving users a means of providing “explicit feedback” on spammy sites.

Jonathan B. Spira is CEO and Chief Analyst at Basex.

Google Instant: New Interface, Same Old Results

Thursday, September 9th, 2010 by Jonathan Spira

Google made a big splash today with the launch of Google Instant, a somewhat glitzy update to the current Google search interface.  The question that needs to be asked, however, is whether the new functionality actually makes for a better search tool.

Add water. Search.

Speaking at the launch, Sergey Brin commented without a trace of irony that “[W]e want to make Google the third half of your brain.”  Marissa Mayer, the company’s vice president for search products and user experience, noted that “[T]here’s even a psychic element to it.”

Google Instant starts to show search results as one types.  In some respects, Google Instant appears to know what to type before one types it (although this in itself is not a new feature; Google has had a “type ahead” suggestion option for some time).

As one starts to type, results appear.  For example, in my test, using the search query “BMW 335d review,” when I typed the letter “b,” “bank of america” and “best buy” appeared as choices and search results for Bank of America appeared below.  Once I added the letter “m” the system seemed to anticipate I was going to type “bmw” and results for BMW appeared.

In some cases, the correct or best answer or content appears before one finishes typing.  Indeed, by the time I had finished typing “bmw 335d,” the word “review” appeared in grey and the appropriate results were immediately below the search bar.

The order of word entry in a multi-word search query does not seem to influence the final result but the interim results displayed as the entry is formulated will vary, of course.

An initial review of Google Instant does indicate that the new feature improves the user experience; however, while it may make searching faster, Google Instant does little if anything to provide better search results.  What it does do is provide “search-before-you-type”, by conducting searches and displaying results based on the most likely word combinations, determined as one types.  This improves the user’s ability to refine a search without hitting enter, so one can see if a mistake in wording occurred, or if one needs to add terms to narrow the search parameters.

Google has had the “type ahead” feature that completes the search query for quite a while now, and the basic mechanics that dictate suggested words appear unchanged.  The interface has been tweaked, so instead of simply having the suggestions appear below the query in a drop down menu, the most likely one appears in light grey text, completing the query.  Users can select that text if it fits their search, or use the arrow keys to select the suggestions from the drop down list that changes as the word is being typed.

The real upgrade to the user interface is the display of instant search results that appear under the search box.  These suggestions can serve to guide the knowledge worker into making a more precise search query (i.e. formulate the search query better) and narrow in on the exact result that is sought.

According to Google, the average search query takes nine seconds to type and Google Instant can cut that down by two to five seconds.  Put differently, if each of the 65 million knowledge workers in the U.S. uses Google twenty times per day and saves 3.5 seconds per search (the midpoint of Google’s range), each would save 70 seconds per day, which adds up to 1,263,889 man-hours per day across the workforce.
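For readers who want to check the arithmetic, here is a short Python sketch.  The 65 million workers and twenty searches per day come from the text; the 3.5 seconds saved per search is an assumption (the midpoint of Google’s two-to-five-second range) that happens to reproduce the 70-second figure.  The function name `hours_saved_per_day` is ours.

```python
KNOWLEDGE_WORKERS = 65_000_000   # U.S. knowledge workers (from the text)
SEARCHES_PER_DAY = 20            # searches per worker per day (from the text)
SECONDS_PER_HOUR = 3600


def hours_saved_per_day(seconds_saved_per_search: float) -> float:
    """Total hours saved per day across all knowledge workers."""
    seconds_per_worker = SEARCHES_PER_DAY * seconds_saved_per_search
    return KNOWLEDGE_WORKERS * seconds_per_worker / SECONDS_PER_HOUR


# Assumption: 3.5 s is the midpoint of Google's two-to-five-second range.
seconds_per_worker = SEARCHES_PER_DAY * 3.5
midpoint_hours = hours_saved_per_day(3.5)

print(seconds_per_worker)                # seconds saved per worker per day
print(round(midpoint_hours))             # hours saved per day, midpoint case
print(round(hours_saved_per_day(2.0)))   # low end of Google's range
print(round(hours_saved_per_day(5.0)))   # high end of Google's range
```

The low and high cases bracket the midpoint figure, so the total is best read as an estimate within a fairly wide range rather than a precise count.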

Despite the potential time savings, the main problems with search are not really addressed in this update.  One of the fundamental problems with search is that a query returns results and not answers.  Indeed in many cases, the results page only contributes to the problem of Information Overload as the knowledge worker still has to sift through the results to find what may or may not be the “right” answer.

Although a user can use the instant results to narrow down a search more quickly by not having to hit enter and refresh the results manually, the underlying search process has not changed, so our concerns regarding Google’s search engine remain the same.

Ultimately, the new feature is a cosmetic upgrade that will likely be very popular and will make the search experience appear to be better and less painful.  Behind the scenes, however, the same old problems remain.  The way search tools function today, even with Google Instant on the scene, exacerbates the problem of Information Overload rather than alleviating it.  Google Instant is a useful evolution of the user interface, but it does not go any deeper than that.

Jonathan B. Spira is CEO and Chief Analyst at Basex.

In the briefing room: Comintelli Knowledge XChanger

Thursday, June 24th, 2010 by Cody Burke

The battle to find the right piece of content at the right moment is a never ending quest for the knowledge worker.

Calling all cars...

While most companies have organized their various internal content stores and many have contracted for authoritative external content from sources such as Factiva and LexisNexis, this is only half the battle.

All of this progress notwithstanding, a knowledge worker often has to search through multiple systems to find exactly what he is looking for.  Frequently, he may not end up with the best and most up-to-date content because the individual searches produced results different from those an aggregated search would have presented.

Comintelli, a Swedish company founded in 1999, addresses this challenge with its Knowledge XChanger offering.  The solution aggregates content from both internal and external sources and then classifies, organizes, and presents relevant items to knowledge workers.  The content is packaged and delivered to work groups in a role-based and customized format so that only the most relevant information is presented.  Additionally, users select topics and enter search terms to further drill down on an area and refine the result set.

Knowledge XChanger allows knowledge workers to publish information through an easy-to-use browser-based interface or via e-mail.  In addition, the system supports commenting, voting, and chat around content.

Users can personalize how they receive information by using automatic e-mail alerts and/or via a customized start page.

When the user does perform a search, he is tapping into content that has been drawn from vetted and authoritative sources, which could include internal sites or select external sources such as news sites as well as from content providers such as Factiva.

A particularly valuable feature in Knowledge XChanger is the ability to find experts on a given topic.  The system uses Knowledge Points, a customizable feature that assigns points to users based on activities, to determine expertise.  For instance, a user may receive points for every time he reads an article, searches on a term, or comments on content.  Users can search for individuals who have expertise in a given area.
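To make the point-based approach concrete, here is a minimal Python sketch of how activity points might translate into an expertise ranking.  It is not Comintelli’s implementation: the point values, the event format, and the function names `score_expertise` and `find_experts` are all hypothetical, since the product’s actual weights are customizable and not given here.

```python
from collections import defaultdict

# Hypothetical point values per activity (assumed for illustration).
POINTS = {"read": 1, "search": 1, "comment": 3}


def score_expertise(events):
    """Sum points per topic and user from (user, topic, activity) tuples."""
    scores = defaultdict(lambda: defaultdict(int))
    for user, topic, activity in events:
        # Unknown activities earn no points.
        scores[topic][user] += POINTS.get(activity, 0)
    return scores


def find_experts(scores, topic, n=3):
    """Return up to n (user, points) pairs for a topic, highest first."""
    by_user = scores.get(topic, {})
    return sorted(by_user.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Made-up activity log.
events = [
    ("ana", "pricing", "read"),
    ("ana", "pricing", "comment"),
    ("ben", "pricing", "search"),
    ("ben", "logistics", "comment"),
]

scores = score_expertise(events)
pricing_experts = find_experts(scores, "pricing")
print(pricing_experts)
```

One caveat with any scheme of this kind is that it measures activity rather than expertise directly, so the choice of weights matters a great deal.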

Tools such as Knowledge XChanger are key components on the road to the development of true Collaborative Business Environments.  In addition, by aggregating and delivering timely and relevant role-based content to the knowledge worker, the system tackles several aspects of Information Overload relating to search and information management.

Finally, by supporting expertise location through the system’s ability to associate individuals in an organization with topics in which they have knowledge and interest, Comintelli has taken a big step in improving knowledge sharing and collaboration by connecting knowledge workers to each other and jump-starting the collaboration process.

Cody Burke is a senior analyst at Basex.

Searching for Needles in Haystacks: How our brain sabotages our searches

Thursday, January 28th, 2010 by Cody Burke

In a recent study funded by the U.S. Department of Homeland Security (DHS) and reported in LiveScience, researchers found that subjects’ expectations of finding something had a direct effect on their success rates for finding the items in question.

Found it yet?

In the study, subjects looked at X-ray scans of checked baggage and tried to identify the presence of guns and knives.  In the first trial, a gun or knife was present in 50% of the bags, and subjects only missed the weapons 7% of the time.  In the second trial, the guns and knives were in only 2% of the bags, and the subjects missed the weapons 30% of the time.  In short, when something is harder to find, our accuracy in identifying it drops significantly.

This is a trick our brain is playing on us as it becomes bored when we do not find what we are looking for and stops paying attention, meaning we then miss things when they do appear.

While the implications for airline security are obvious and somewhat chilling, the implications for the enterprise are also worth examining.  Knowledge workers spend ca. 15% of their day searching for content.  Applying the lessons learned in the DHS study, we can assume that if a search query returns fewer correct results in relation to incorrect results, the knowledge worker’s accuracy in picking out the relevant items will decline.

Conversely, just as in the DHS study, if the correct to incorrect ratio is better, meaning there is a higher number of correct results, then the knowledge worker is much more likely to find more of them.

For knowledge-based organizations and providers of software to these groups, the lessons from this study are clear: search tools must be improved to provide better ratios of relevant, useful results.  Today’s search tools focus on returning large sets of results and the answers to a search query may very well lie somewhere within these.  However, the low signal-to-noise ratio virtually ensures low accuracy even if one were to comb through every last result.

Search results need to be highly contextual and limited in volume to ensure accuracy and provide a favorable ratio of correct to incorrect results.  This keeps the knowledge worker engaged and not feeling that he is looking for a needle in a haystack; this, in turn, increases the probability of identifying the needed content.

Cody Burke is a senior analyst at Basex.

Search: How to Find What You Are Looking For (or 5 Tips for Better Search)

Thursday, December 17th, 2009 by Jonathan Spira

50% of all searches fail in a manner that the person doing the search recognizes as a failure. 


What is it that you are looking for, my dear?

A far more significant problem is that 50% of the searches believed to have succeeded failed, but the person doing the search simply doesn’t realize it.  As a result, that person uses information that is at best out of date but more often incorrect or just not the right data.  When the “bad” information is then used in a document or communication, there is a cascading effect that further propagates the incorrect information.
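Taken together, the two 50% figures imply that only a quarter of searches truly succeed.  The short Python sketch below works through that arithmetic; it assumes the second 50% applies to the searches that appear to have succeeded, which is how the figures are stated above.  The variable names are ours.

```python
recognized_failure = 0.50      # searches the searcher knows have failed
unrecognized_share = 0.50      # share of apparent successes that actually failed

apparent_success = 1 - recognized_failure                 # looks like it worked
hidden_failure = apparent_success * unrecognized_share    # failed, but unnoticed
true_success = apparent_success - hidden_failure          # actually worked
total_failure = recognized_failure + hidden_failure       # all failures combined

print(f"Hidden failures: {hidden_failure:.0%}")
print(f"True successes:  {true_success:.0%}")
print(f"Total failures:  {total_failure:.0%}")
```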

In an age where Information Overload costs the U.S. economy ca. $900 billion per annum, finding the right information has become far more critical.

To increase the odds that you will find what you are looking for, we’ve prepared five simple search tips that should result in better and more accurate results, regardless of where you are searching.

1.)    Boolean logic
Search engines typically use a form with a search box into which one types the search query.  To control the search results, use Boolean logic by typing AND or OR.  Many search engines including Google default to AND when processing search queries with two or more words.  To exclude words, use NOT (java NOT coffee, java -coffee).  For increased relevance, use NEAR (restaurants NEAR midtown Manhattan).

2.)    Options
Most search engines include options (on Google, these are found by clicking on Advanced Search).  Use options to narrow down the field you are searching.  Examples include file format (.ppt, .doc, .pdf, etc.) or Web site (basex.com).

3.)    Search tools
When it comes to search, one size does not fit all.  Use a variety of search tools beyond Google.  Try search visualization tools such as Cluuz and KartOO on the Web and KVisu for behind the firewall.

4.)    Meta search engines
A meta search engine runs several searches simultaneously.  Tools that may be helpful include Clusty and Dogpile.

5.)    Archived (out-of-date) materials or nonexistent Web sites
The Wayback Machine on the Internet Archive is useful for both older versions of Web pages and sites that have disappeared over time.
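The Boolean operators in the first tip can be illustrated with a small Python sketch.  This is only a toy model of AND, OR, NOT, and NEAR applied to a handful of made-up documents; it does not reflect how Google or any other search engine implements them, and the function names `tokenize`, `matches`, and `near` are our own.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(text, all_of=(), any_of=(), none_of=()):
    """Apply AND (all_of), OR (any_of), and NOT (none_of) to one document."""
    words = set(tokenize(text))
    if any(term not in words for term in all_of):              # AND
        return False
    if any_of and not any(term in words for term in any_of):   # OR
        return False
    if any(term in words for term in none_of):                 # NOT
        return False
    return True


def near(text, first, second, max_gap=3):
    """True if the two words occur within max_gap positions of each other."""
    tokens = tokenize(text)
    pos_first = [i for i, t in enumerate(tokens) if t == first]
    pos_second = [i for i, t in enumerate(tokens) if t == second]
    return any(abs(i - j) <= max_gap for i in pos_first for j in pos_second)


# Made-up documents for illustration.
docs = [
    "Java is a programming language",
    "Java coffee is grown in Indonesia",
    "Restaurants in midtown Manhattan",
]

# Equivalent of the query: java NOT coffee
java_not_coffee = [d for d in docs if matches(d, all_of=["java"], none_of=["coffee"])]
print(java_not_coffee)

# Equivalent of the query: restaurants NEAR midtown
print(near(docs[2], "restaurants", "midtown"))
```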

Jonathan B. Spira is CEO and Chief Analyst at Basex.

In the briefing room: Simplexo

Thursday, November 19th, 2009 by Cody Burke

Knowledge workers spend a good part of their day in search of information; therefore it is no surprise that having the correct search tools is of paramount importance.

Simplexo search simultaneously addresses structured and unstructured data.

The limitations that exist in many search tools, combined with poor search techniques, lead to the frequent use of outdated information, the recreation of content that exists but cannot be found, and the waste of significant amounts of time.

Failed searches are a very visible symptom of Information Overload.  It is generally acknowledged that 50% of searches fail outright but few realize that 50% of the searches that people believe to have succeeded actually failed too, in that they presented stale, incorrect, or simply second-best information.  This last figure is far more insidious because the knowledge workers are unaware of the searches’ failure and blithely proceed to use the incorrect information in their work.

Training and proper search techniques can make a huge difference in improving search results but equally important are the tools the knowledge worker uses.  In most cases, information is stored in separate silos, and search tools need to be able to reach across those boundaries.  A search query that reaches across multiple information stores at once provides far more complete and relevant results than multiple separate searches.

Simplexo is one company that is addressing these issues with its Simplexo Enterprise search offering.  The company accesses the native indexing capabilities of databases and existing software such as SharePoint and Outlook to index unstructured data.  Indexing takes place in real-time and runs continuously, which helps ensure that the most current information is presented in search results.  Data is de-duplicated to remove extra copies of information, reducing the overall amount of content that must be indexed.

Simplexo uses a dual index approach that looks at both structured data in real-time and unstructured data during processor idle time.  The system examines and retrieves unstructured data from sources such as Web pages, e-mail, text files, PDF files, and Open Office files, as well as data from structured sources such as databases, business applications, CRM applications, and information portals.  The wide net that Simplexo casts when searching has the potential to improve knowledge worker efficiency. The ability to look in multiple silos and repositories with a single search query is extremely helpful in ensuring that information that is buried in a far flung repository or inbox folder is taken into consideration.

In addition to indexing, Simplexo supports native integration into platforms such as browsers, Office, Outlook, AutoCAD, and Lotus.  This integration is key to enabling the knowledge worker to remain in one work environment.  Simplexo also supports mobile devices via Simplexo Mobile for iPhones and Windows Mobile 6 devices, with plans to add BlackBerry support in the future.

Enterprise search solutions such as Simplexo take a realistic view of the challenges faced by knowledge workers when searching for information.  They recognize that relevant information can reside in any repository, be structured or unstructured, and is in need of continuous indexing to remain up-to-date.

Cody Burke is a senior analyst at Basex.

Information Overload – It Isn’t Just Too Much E-mail

Thursday, August 20th, 2009 by Jonathan Spira

One might assume that pinpointing the sources of Information Overload is relatively black and white, i.e., it’s just too much e-mail.  In reality, nothing could be further from the truth.

The problem of Information Overload is multifaceted and impacts each and every organization, whether top executives and managers are aware of it or not.  In addition to e-mail, Information Overload stems from the proliferation of content, growing use of social networking tools, unnecessary interruptions in the workplace, failed searches, new technologies that compete for the worker’s attention, and improved and ubiquitous connectivity (making workers available anytime regardless of their location).  Information Overload is harmful to employees in a variety of ways as it lowers comprehension and concentration levels and adversely impacts work-life balance.  Because almost no one is immune to its effects, the cost from an organizational point of view is considerable: hundreds of thousands of hours are lost at a typical organization, representing as much as 25% of the work day.

So what else besides e-mail overload is at issue here?  Here’s a quick rundown.

- Content
We have created billions of pictures, documents, videos, podcasts, blog posts, and tweets, yet if these remain unmanaged it will be impossible for anyone to make sense of this content because we have no mechanism to separate the important from the mundane.  Going forward, we face a monumental paradox.  On the one hand, we have to ensure that what is important is somehow preserved.  If we don’t preserve it, we are doing a disservice to generations to come; they won’t be able to learn from our mistakes as well as from the great breakthroughs and discoveries that have occurred.  On the other hand, we are creating so much information, which may or may not be important, that we routinely keep everything.  If we continue along this path, which we will most certainly do, there is no question that we will require far superior filtering tools to manage that information.

- Social Networking
For better or worse, millions of people use a variety of social networking tools to inform their friends – and the world at large – about their activities, thoughts, and observations, ranging down to the mundane and the absurd.  Not only are people busily engaged in creating such content but each individual’s output may ultimately be received by dozens if not thousands of friends, acquaintances, or curious bystanders.  Just do the math.

- Interruptions
We’ve covered this topic many times (http://www.basexblog.com/?s=unnecessary+interruptions), but our prime target is unnecessary interruptions and the recovery time each one causes (the time it takes the worker to get back to where he was), which is typically 10 to 20 times the duration of the interruption itself.  It only takes a few such interruptions for a knowledge worker to lose an hour of his day.

- Searches
Fifty percent of all searches fail, and the person searching knows about the failure.  What isn’t generally recognized, and what our research shows, is that 50% of the searches that appear to have succeeded have in fact failed without the person doing the search realizing it.  As a result, that person uses information that is perhaps out of date or incorrect or just not the right data.  This has a cascading effect that further propagates the incorrect information.

- New technologies
We crave shiny new technology toys, those devices that beep and flash for our attention, as well as shiny new software.  Each noise they emit takes us away from other work and propels us further down Distraction Road.  It’s a wonder we get any work done at all.  Even tools that have become part of the knowledge workers’ standard toolkit can be misused.  Examples here include e-mail (overuse of the reply-to-all function, gratuitous thank you notes, etc.) and instant messaging (sending an instant message to someone to see if he has received an e-mail).
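The figures quoted under Interruptions and Searches above can be worked through in a few lines of Python.  This is a back-of-the-envelope sketch, not a Basex model: the one-minute interruption length is an assumption chosen for illustration, and the function names are hypothetical.  The recovery multiplier of 10 to 20 and the two 50% failure rates come from the text.

```python
import math


def interruptions_to_lose(target_minutes, interruption_minutes, recovery_multiplier):
    """Number of interruptions needed to consume target_minutes of working time.

    Each interruption costs its own duration plus the recovery time,
    where recovery time = recovery_multiplier * interruption duration.
    """
    cost_per_interruption = interruption_minutes * (1 + recovery_multiplier)
    return math.ceil(target_minutes / cost_per_interruption)


def true_search_success_rate(visible_failure_rate, hidden_failure_rate):
    """Share of all searches that really succeed.

    visible_failure_rate: searches that fail and are known to have failed.
    hidden_failure_rate: share of the apparent successes that actually failed.
    """
    return (1 - visible_failure_rate) * (1 - hidden_failure_rate)


# Assumption: each interruption lasts one minute.
fewest = interruptions_to_lose(60, 1, 20)  # recovery at 20x the interruption
most = interruptions_to_lose(60, 1, 10)    # recovery at 10x the interruption
rate = true_search_success_rate(0.5, 0.5)

print(f"Interruptions needed to lose an hour: {fewest} to {most}")
print(f"Searches that truly succeed: {rate:.0%}")
```

Under that assumption, three to six one-minute interruptions are enough to cost an hour, and only a quarter of all searches return information that is actually right.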

Jonathan B. Spira is CEO and Chief Analyst at Basex.