Archive for the 'Information Records Management' Category

In the briefing room: Alfresco 3.2 Records Management

Thursday, February 18th, 2010 by Cody Burke

The sheer volume of content that is generated by an organization in this day and age is nothing short of staggering.

Now where is that file?

Even more daunting is the task of managing that information for compliance and record keeping. The records that an organization must keep clearly qualify as content, and in today’s volatile economy and regulatory climate, records management functionality in an Enterprise Content Management system is not simply a nice-to-have capability, but a necessity.

To meet the need for managing the lifecycle of content and information, Alfresco recently announced version 3.2 of its ECM solution. Notable additions in the release are a records management module, Alfresco RM, and some advanced e-mail archiving functionality.

The Records Management module is certified to the Department of Defense (DoD) 5015.2 standard and is thus far the only open source solution to achieve this. The RM module enables administrators to set up storage policies that manage the retention of data so that records that are no longer needed for compliance may be deleted or archived. The module also defines rules for moving content, so that the most current versions of records are kept in easily accessible storage locations, such as faster drives, while archived material is stored on slower drives. This not only keeps records organized, but also speeds up access to current content.
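
To make the idea of a retention and tiering policy concrete, here is a minimal Python sketch. It is not Alfresco's API or configuration format; the names `RetentionPolicy` and `decide` and the one-year and seven-year thresholds are hypothetical, chosen only to illustrate how an age-based rule might sort records into fast storage, slow storage, or disposal.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical policy; thresholds are illustrative, not Alfresco defaults."""
    archive_after: timedelta  # move to slower storage once a record is this old
    retain_for: timedelta     # record may be disposed of after this period


def decide(last_modified: date, today: date, policy: RetentionPolicy) -> str:
    """Return 'keep_fast', 'archive_slow', or 'dispose' for a record."""
    age = today - last_modified
    if age >= policy.retain_for:
        return "dispose"
    if age >= policy.archive_after:
        return "archive_slow"
    return "keep_fast"


# Assumed thresholds: archive after one year, dispose after roughly seven.
policy = RetentionPolicy(archive_after=timedelta(days=365),
                         retain_for=timedelta(days=7 * 365))
today = date(2010, 2, 18)
for modified in (date(2009, 12, 1), date(2007, 6, 30), date(2001, 1, 15)):
    print(modified, decide(modified, today, policy))
```

With these assumed thresholds, the three sample records come out as keep_fast, archive_slow, and dispose, respectively. A real deployment would drive such decisions from record categories and regulatory schedules rather than from age alone.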

As an integrated component of Alfresco ECM, the RM module uses the same single repository as the rest of the suite. The module also features support for complex transfers, role-based permissions, legal holds, and saved searches to speed up searching for content.

The new e-mail archiving functionality in Alfresco 3.2 leverages new IMAP support, which allows users to access content via an e-mail folder in any IMAP client. Through the folder, content can be added into the central repository by drag-and-drop. The new functionality also supports the ability to configure attachment handling, such as pulling out attachments from e-mail and archiving them, or keeping them embedded in the e-mail.

The ability to manage records and integrate tightly with e-mail clients for archiving adds to Alfresco ECM’s appeal as a solution for organizations that need to manage content and ensure compliance. Managers seeking an integrated approach would do well to consider this solution.

Cody Burke is a senior analyst at Basex.

Information Overload To Swamp The National Archives

Friday, January 2nd, 2009 by Cody Burke

As we prepare for the transition from the Bush administration to the incoming Obama administration, the National Archives, charged with preserving and indexing presidential records, has launched an emergency plan to deal with the looming deluge of electronic records. The incoming data will comprise around 100 terabytes of content, which is 50 times the amount the Clinton administration left behind.

Much of the content is in the form of e-mail, which has exploded in use over the last eight years and is expected to account for 20 to 40 terabytes. This number can only be expected to grow in future administrations. There is no doubt that Obama and his team, judging by his admitted addiction to his BlackBerry, will generate even more data that will have to be archived and indexed.

The effort to incorporate the records has been complicated by foot-dragging by the current administration, which has been reluctant to provide details about the size and format of the records it will be turning over. The format issue is particularly relevant: many of the records are expected to be in unfamiliar formats, and without preparation the National Archives will struggle to process them. Of specific concern are the large quantities of digital photographs and the White House’s “records management system” that provides the index for the text-based records.

In addition to the size of the records, there are concerns that the proprietary formats of the records will become obsolete. This has already happened at NASA: there are millions of files on 8″ and 5 1/4″ floppy diskettes, various obsolete tape cartridges, and NASA’s earliest photographs of the earth that are mostly inaccessible with today’s technology.

Despite a new $144 million computer system, concerns abound about the ability of the National Archives to absorb the content. In the words of Paul Brachfeld, the archives’ inspector general, “Just because you ingest the data does not mean that people can locate, identify, recover and use the records they need.”

While only a fraction of the information turned over to the National Archives will be of any interest to researchers and the public, by law it must all be archived, and without a plan in place to index and store the data, it may be rendered utterly useless. As the amount of electronic content increases, this may prove to be only the tip of the iceberg.

Cody Burke is a senior analyst at Basex.

The Myth of the Paperless Office

Friday, February 15th, 2008 by Jonathan Spira

Pundits have been proclaiming the imminent arrival of the paperless office since the 1970s.  So far, they’ve been wrong.  If anything, we print more today than we did back then.

Yet some still believe in it; those who go looking for the paperless office may be engaged in a Sisyphean task.

The New York Times ran a story on February 10 by Hannah Fairfield entitled “Pushing Paper Out the Door.”  It speaks of paper-reducing technologies in homes and offices, citing families who scan their bills and opt for on-line statements.  To Fairfield I say “not so fast.”

Indeed Fairfield quotes Brewster Kahle, the founder of the Internet Archive: “Paper is no longer the master copy; the digital version is. Paper has been dealt a complete deathblow.  When was the last time you saw a telephone book?”

To respond directly to Kahle: last week.

My friend and fellow analyst Amy Wohl famously commented, around 1978, that she thought that the paperless office was “about as useful as the paperless toilet.”

But the article had a bigger problem.  While it provided an excellent look at the move to digitization, it completely ignored the elephant in the room, namely a compatibility conundrum that has been with us since the first computer was turned on over 60 years ago.

Books, pamphlets, and broadsides printed hundreds of years ago are accessible without any special equipment.  For that matter, the Rosetta Stone (dating from 196 B.C.E.) was also readable upon its discovery in 1799.

Contrast this with millions of files on 8″ or even 5 1/4″ floppy diskettes, various obsolete tape cartridges, and NASA’s earliest photographs of the earth – all mostly inaccessible with today’s technology.

One might presume that the technology revolution of the late twentieth century had increased our ability to preserve our history and cultural artifacts.  In actuality, we have failed.

Moving all of our papers to digital form without a plan to ensure accessibility not only 5 years from now but 50 years and 100 years and beyond is not making information MORE accessible but risking that it will become LESS accessible.

Jonathan B. Spira is CEO and Chief Analyst at Basex.

Electronically Stored Information – or What to Do With Those 2.5 Billion E-mails

Friday, December 8th, 2006 by Jonathan Spira

Newly amended Rules of Civil Procedure concerning electronically stored information (ESI) took effect in all federal courts on December 1, 2006.  This affects a large number of constituencies, ranging from CIOs to records managers to lawyers and judges.

These rules came about because 95% of records are created and stored electronically and, as a result, all discovery is now e-discovery.  It will be no surprise to regular readers that the volume of ESI is exponentially greater than the volume of stored paper records (which we will refrain from calling SPRs).  E-mail is perhaps the main culprit.  A single knowledge worker can easily generate 25,000 e-mails per year; a company with 100,000 employees could find itself with 2.5 billion e-mails in its archives.

The mass of e-mail is one of several reasons for the new rules.  Another is that electronic information is dynamic; a lot of it changes without operator intervention.  In addition, ESI is much more difficult to destroy when compared with SPRs.  (oops! sorry)

Perhaps of greater significance is that electronically stored information may need to be retrieved or restored – and therein lies the rub.

But there’s more.  (Disclaimer: I am only going to be able to scratch the surface here – an in-depth analysis of the new rules and their impact on knowledge economy companies would require far more space than allotted here.)

New Rule 26(f) requires parties in proceedings to discuss “any issues relating to preserving discoverable information.”  The need to preserve evidence has to be balanced, according to an Advisory Committee Note, with the need to continue routine activities, because halting the operation of a computer system could put a company out of business today.

Rule 34 adds ESI as a separate category of information, stipulating that the Rule covers all types of information, regardless of how or where it is stored.  The Advisory Note makes it clear that this is not meant to create a “routine right of direct access to a party’s electronic information system, although this might be justified in some circumstances.”

But for me the most interesting part is found in Rule 26(b)(2)(B), covering inaccessible data.  As noted many times in this space, what we create today may not be accessible at some point in the future.  The Rule addresses ESI that is considered inaccessible because of the undue burden and expense of retrieving it, if retrieval is even possible.  A party is not required to provide discovery of ESI from sources that it identifies as not reasonably accessible (due to the undue burden and cost that would be incurred) unless the requesting party can demonstrate “good cause.”

Unfortunately, the Rule does not define the term “inaccessible,” which may turn out to be a good thing.  (It also fails to define “sources,” which we can infer to be places where data (or information) is stored.)  What is “inaccessible” today might – with a change in technology – very well become accessible tomorrow; moreover, what IS indeed accessible today might not be accessible tomorrow.  Sources that might be considered inaccessible at a given point might include backup media and deleted data.  One thing is clear: if a document or data point is regularly accessible in the normal course of business, a party may not make it “inaccessible” to avoid producing it.

I’ll conclude with a look at the safe harbor provision for ESI loss.  New Rule 37(f) prohibits a court from imposing sanctions for failing to provide ESI lost as a result of normal system operations.  This means that, since IT systems may routinely modify, overwrite, and delete information, companies need not worry about such lost ESI.  The only time a company must intervene is if litigation is reasonably anticipated, at which time a litigation hold would serve notice to all records custodians that the normal operation of the system must be suspended.  Keep in mind that, in the knowledge economy, everyone is his own records custodian.
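
For readers who run the systems in question, the interaction between routine deletion and a litigation hold can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of any real product or of what the Rules require in a given case: the `RecordStore` class, its methods, and the idea of scoping a hold by custodian are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RecordStore:
    """Toy in-memory store; every name here is hypothetical."""
    records: dict = field(default_factory=dict)        # record id -> custodian
    held_custodians: set = field(default_factory=set)  # custodians under a hold

    def place_hold(self, custodian: str) -> None:
        """Suspend routine deletion for everything this custodian owns."""
        self.held_custodians.add(custodian)

    def routine_purge(self, expired_ids):
        """Delete expired records unless their custodian is under a hold.

        Returns the ids that were actually deleted."""
        deleted = []
        for record_id in expired_ids:
            custodian = self.records.get(record_id)
            if custodian is None or custodian in self.held_custodians:
                continue  # unknown id, or deletion suspended by the hold
            del self.records[record_id]
            deleted.append(record_id)
        return deleted


store = RecordStore({"m1": "alice", "m2": "bob", "m3": "alice"})
store.place_hold("alice")                  # litigation reasonably anticipated
print(store.routine_purge(["m1", "m2"]))   # ['m2']
```

In this sketch the normal purge job keeps running for everyone else, while anything belonging to a custodian under hold is left untouched until the hold is lifted.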

This general overview is not meant as a substitute for the advice of a qualified attorney; only an attorney can provide you with the legal advice you need to ensure that you are complying with the new rules.

Jonathan B. Spira is CEO and Chief Analyst at Basex.