Archive for the ‘Repositories’ Category

Certification & Outsourcing

March 24, 2006

The last hour of the course was spent discussing the Trusted Digital Repositories (PDF) certification document and the related matter of outsourcing digital preservation to another institution.

Two of the tutors were part of the group that drafted the TDR document and another of the tutors was involved in the peer review process. They gave some background to the document, which grew out of a recommendation from a 1996 Task Force on Archiving of Digital Information. They explained the basic difference between auditing a repository and certifying a repository, essentially saying that an audit is an internal or external process evaluating you against ideals. An audit is about continual improvement and does not necessarily have a pass/fail outcome. Certification, on the other hand, does result in a pass or fail, usually at the end of an audit-like process. Certification may have elements of both processes, such as the mandatory elements of ISO 9000.

The authors of the TDR document were tasked with creating a standard certification process or framework that can be implemented across domains or types of digital repositories.

The tutor who has been involved in the peer review process criticised the draft document on the following grounds:

  • It sets the bar too high for many archives
  • The drafting of the document was not led by archivists
  • The authors claim that they are an international body, but in fact they are mostly N. American with some Europeans but no-one east of the Netherlands
  • The OAIS standard is flexible but the TDR certification, which is based on OAIS, is very specific about what the archive should achieve

However, before we throw the document out on the above criticisms, it is worth remembering that it’s only a draft and that it still offers a very useful tool for internal audit and self-improvement. As one of the tutors listed all the reasons why we should care about certification, I realised that none of them specifically apply to AI but are really aimed at public institutions who have obligations to funding bodies, regulatory bodies, service purchasers, and external depositors. Of course, certification would still give a number of IRP staff a warm glow inside, smug with the satisfaction of international recognition of our work and the service we provide to our organisation, but I think the real value in the TDR document is that it provides practical guidance on how we can improve. Also, having audited our digital preservation archives using the TDR document, we may come to the conclusion that we’re just not up to or interested in developing an archive for long-term preservation and that we might want to outsource some or all of the work to another institution. Preferably one that is certified!

Outsourcing was touched upon at the very end of the hour and we were offered some step-by-step guidelines on approaching the possibility of outsourcing. Here they are, straight from the PPT slides:

We can’t all be experts in everything
We need to carry out some tasks we are not well-equipped to do
We may have resource reasons (money but no space)
We may have policy reasons
It may be cheaper

Understand your needs
Define your problem before you look for a solution
Otherwise you will buy the answer to someone else’s question
Specify mechanisms to monitor and measure performance
Look at the DPC document on outsourcing

Outsourcing digital preservation requirements may be an attractive option – particularly for smaller organisations
But these are also the most vulnerable in terms of what they should expect to ask for and receive
Having a system of certified repositories can help to provide assurance
The checklist of requirements can help organisations find a good match between what they think they asked for and what they receive.

That’s it really. I should add that the University of London Computing Centre, which is a ten-minute walk from AI, will almost certainly be certified as a TDR because one of the authors of the report runs the preservation programme there. And, yes, they take on consultancy work and are willing to discuss any outsourcing we might decide we want to do with them.

That’s the end of this blog. I hope you’ve enjoyed reading it and found at least some parts of it thought-provoking. As I said early in the week, I wanted to do it because Chris sent me on the course on the condition that I gave a presentation to interested staff when I returned. I’m happy to do that, although a presentation is probably not the way to go about it. Hopefully this blog has provided some background reading (and light entertainment) for us to discuss in the near future.

See you on Monday.


NDAD – The National Digital Archive of Datasets.

March 22, 2006

The last session of the day before getting on with our class project about the Internet Archive was on NDAD.

It’s a service that the ULCC perform for TNA by preserving UK government databases and records no longer in use. Sounds dull, but of course these databases include the National Inventory of Woodland and Trees, a survey of British Bats, statistics on how many accidents there are in the home, crime statistics and the names and assets of victims of Nazi persecution who were compensated by the UK government. These databases are all transferred to NDAD, who migrate them into sustainable formats, document them with good metadata and make them available to search online. It sounds like a great place to work if you’re interested in the history of computing, as some of these databases were the largest of their kind at the time and represent significant moments in that history. They also have to deal with all the legacy software and hardware issues, data analysis, system design and digital conversion as well as the development of emulators, data recovery and so on. Hacker heaven, and it’s only a ten-minute walk from AI.

We assessed NDAD against the OAIS standard and it does pretty well. Since TNA essentially do the selection of the databases, NDAD have no negotiation in the Ingest stage, but together TNA and NDAD are a functioning OAIS archive. And it’s only a ten minute walk from AI! It feels a bit like saying you live only ten minutes from Buckingham Palace.

Institutional Repositories

March 22, 2006

Maybe I should think up snappier title headings to these blogs. Believe me, occasionally I’m sitting in the class wondering how the hell I got here. Though I should say that the quality of the training programme so far has been very high and I’m finding it very engaging. The tutors are decent, down-to-earth people with practical advice. In other news, the stats for this blog suggest that most of ITP and IRP looked at it yesterday. Tomorrow’s stats should be interesting… 😉

Basically, this was a discussion on DSpace and the OCLC. Fedora was mentioned but only briefly. That’s OK, because Fiona, Damon and I attended a conference on Fedora last year. Damon’s an expert so ask him all the questions about Fedora… The implementation of a ‘trusted repository’ is central to digital archiving and the two main course documents are the OAIS standard and the follow-up document, Trusted Digital Repositories. The TDR document basically goes through all the attributes and responsibilities that an OAIS-compliant repository should have. The report defines a TDR as:

A trusted digital repository is one whose mission is to provide reliable, long-term access to managed digital resources to its designated community, now and in the future.

It’s a useful document for testing how well your institution is doing.

DSpace is a repository system that’s been developed at MIT. It’s very popular (OK, so that’s a relative term…) in the USA and some UK institutions use it too. From what I could see, it provides a customisable ‘repository out of the box’ and shares some functionality with a Content Management System.

DSpace has three preservation service levels, providing functional preservation through ‘supported’ (1) and ‘recognised’ (2) file formats and bit-level (3) preservation. I don’t think it is ‘OAIS compliant’ but clearly it follows the basic OAIS functional model of Ingest of Submission Information Packages, creation of Archival Information Packages and the creation of Dissemination Information Packages. The example we were shown worked very well for the submission and archiving of a document by an academic writer. From the Fedora conference we attended, I’d got the impression DSpace was a bit crap, but there’s some competition between the two systems so that shouldn’t be surprising. Fedora is a different animal really as it provides a suite of repository services which developers are expected to work with while DSpace is useable out of the box by people without programming skills.
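For anyone who finds the package flow easier to follow as code, here is a minimal sketch of the Ingest, archive and disseminate steps described above, together with the three preservation levels. It is not DSpace's or Fedora's actual API: the class names, the `ingest` and `disseminate` functions and the two format lists are all made up for illustration, and the checksum step is simply one plausible thing a repository might record when it builds an archival package.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical format lists, purely for illustration; a real repository
# maintains its own format registry.
SUPPORTED = {"pdf", "tiff", "xml"}
RECOGNISED = {"doc", "psd"}


@dataclass
class SIP:
    """Submission Information Package: what the depositor hands over."""
    filename: str
    content: bytes
    depositor: str


@dataclass
class AIP:
    """Archival Information Package: what the repository stores."""
    filename: str
    content: bytes
    depositor: str
    checksum: str
    preservation_level: str


@dataclass
class DIP:
    """Dissemination Information Package: what a user receives."""
    filename: str
    content: bytes
    preservation_level: str


def preservation_level(filename: str) -> str:
    """Pick a service level from the file extension (assumed rule)."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext in SUPPORTED:
        return "supported"
    if ext in RECOGNISED:
        return "recognised"
    return "bit-level"


def ingest(sip: SIP) -> AIP:
    """Turn a submission into an archival package with a fixity checksum."""
    checksum = hashlib.sha256(sip.content).hexdigest()
    return AIP(sip.filename, sip.content, sip.depositor,
               checksum, preservation_level(sip.filename))


def disseminate(aip: AIP) -> DIP:
    """Check the stored bytes are unchanged, then build a package for access."""
    if hashlib.sha256(aip.content).hexdigest() != aip.checksum:
        raise ValueError("fixity check failed")
    return DIP(aip.filename, aip.content, aip.preservation_level)
```

Under these assumptions, an academic's `thesis.pdf` would be ingested at the ‘supported’ level, while a file in a format the repository has never heard of would only get bit-level preservation.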

OCLC, the Online Computer Library Center, provides a repository service for other institutions so is really an outsourcing solution. It is accessible over the web with OAIS-like functionality but, of course, still requires that you prepare your collection for Ingest.