The volume of news we create, and by news I mean news reporting, not newsworthy events, is mind-boggling. As someone who studies this as a form of Information Overload, I never cease to be amazed by the sheer quantity of reporting, and it is this quantity that makes it difficult to search for and find items of interest.
Starting with early news programs such as the Camel News Caravan with John Cameron Swayze and See It Now with Edward R. Murrow, and moving into the current 24-hour news cycle that started with the advent of CNN, a good part of the population has come to depend on television to keep current.
This is all well and good, but unlike newspapers, which are easily searchable online today and were previously somewhat searchable on microfiche and microfilm, TV news has offered no easy way to go back and find spots on a particular topic.
Brewster Kahle, the founder of the Internet Archive, wants to change that. For the past few years, the archive has been digitizing news broadcasts from twenty outlets and over 1,000 programs. This means every last second of every CNN broadcast as well as all sixty minutes of 60 Minutes.
Currently the archive has collected over 350,000 news programs that are fully text searchable, thanks to the closed-caption text that accompanies the videos.
To use the system, I clicked through to TV News on the Internet Archive’s website. There I found a dedicated page with a tag cloud showing recently accessed topics. One can search all sources or narrow the search down by network (e.g. CBS, NBC, ABC, CNN) or program name (e.g. BBC World News, ABC World News Tonight, Meet the Press, Nightline).
I searched on (fasten your seatbelts here) “Information Overload.” The system found 248 results and displayed 98 of them in a horizontal row, each with a video above and the relevant text excerpt below. The video was almost always cued to start at the right place, i.e. where the person was starting to mention “Information Overload,” and I was generally satisfied with the choice.
The interface itself is easy to use and features a handy timeline for moving quickly to a specific point in time, for example, to look at a spike in coverage in December 2011.
The Internet Archive has already digitized almost every Web page known to man as well as millions of books, much in the style of the ancient Library of Alexandria. Thankfully, the archive has not resorted to stopping travelers at Egypt’s borders to confiscate their books so they could be copied for the library (anecdotally, the travelers sometimes received the copies back, not the originals). All that the Internet Archive has really needed is a supply of very large hard drives and a mission to preserve our history as told to us by newscasters.
Jonathan B. Spira is CEO and Chief Analyst at Basex and author of Overload! How Too Much Information Is Hazardous To Your Organization.