Archive for the ‘Access’ Category


March 23, 2006

The last class of the day, before we got on with our Internet Archive project (which has been very instructive), was about providing access to archival collections. Good, common-sense advice was dished out, which we all pretty much knew but were pleased to hear again:

We preserve because we expect access.
We must be able to derive one from the other.
We don’t have to let one dictate the other.
Access needs may drive decisions at ingest e.g. on metadata.
There are many ways to provide access.

We were advised to ‘preserve enough to tell the story’ and that good preservation refers to ‘preserving meaningful information through time’. Quotes like that can come in handy sometimes.

Of course, depending on the archive’s remit, there are various reasons why we might have to provide access (FOI) or restrict access (DPA). Fortunately only the latter applies to us right now. There was also a brief discussion about redacting certain information before providing access, something that we do in a way with MAV’s products and transcripts for security reasons before they are put on the public database. I don’t know if we do it to documents, too. Do we?

Finally, the OAIS standard clearly has a strong element dedicated to ensuring access. It’s based around knowing your ‘designated community’, which may be as broad as ‘people who can read English and use the internet’ (The National Archives) or as narrow as ‘my friends and family’ (imagine an online photo service like Flickr). As usual, some of the best advice was about planning ahead, being proactive about offering ways to access material, separating the preservation infrastructure from the access infrastructure and collaborating with other repositories.



March 23, 2006

This was a post-lunch crash course in Intellectual Property Rights, Copyright, Digital Rights Management, the Freedom of Information Act, the Data Protection Act and Legal Deposit. No depth, but some useful highlights and another interesting case study from the ADS concerning the misuse of images and how they dealt with it.

Basically, the ADS received a collection of data, including images, from the excavation of Christ Church, Spitalfields. The images in this very interesting collection show the remains of bodies buried in the 18th-century crypt. These images were found by a web site for necrophilia enthusiasts, and some images were copied from the ADS site and republished on the sex-with-dead fan site. The site also provided a link through to the ADS for enthusiasts to grab more images for themselves.

This was their mistake, because the ADS noticed an unexpected spike in the use of its website and traced it back to the link from the other website. It was the first time the ADS had had to deal with the misuse of its digital collections, and it sought advice from the JISC legal team. They were advised to follow a six-point plan spread over 70 days. The first step was simply to contact the website, tell them they had broken the licence they agreed to on the ADS website, and ask that they take the images down or else face further legal action. And they did take them down. End of story.
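For anyone wondering how a spike like that gets traced, here is a minimal sketch in Python. It assumes the web server log has already been parsed into simple records with a referrer field. The function name, the record layout and the sample URLs are all made up for illustration; this is not how the ADS actually did it.

```python
from collections import Counter


def top_referrers(log_entries, n=3):
    """Return the n most common referrers as (referrer, count) pairs.

    Entries with a missing or empty referrer are ignored.
    """
    counts = Counter(
        entry["referrer"] for entry in log_entries if entry.get("referrer")
    )
    return counts.most_common(n)


# Hypothetical sample data: three hits arrive via the same external link.
sample_log = [
    {"path": "/images/1.jpg", "referrer": "http://example.org/links"},
    {"path": "/images/2.jpg", "referrer": "http://example.org/links"},
    {"path": "/images/3.jpg", "referrer": "http://example.org/links"},
    {"path": "/index.html", "referrer": "http://example.com/search"},
    {"path": "/about.html", "referrer": None},
]

print(top_referrers(sample_log, 1))
```

A referrer that suddenly tops this list is the sort of thing worth a closer look.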

This was a satisfactory result for the ADS because, despite the misuse of the images, it would have been a long and difficult legal process had the web site not taken them down.

We’ve been thinking about such things for ADAM and intend to introduce a ‘handshake’ agreement prior to the download of ADAM images. Having seen how the ADS handle this ‘contract’ with its users, I’m now inclined to just have users agree to a licence when they first enter an ADAM session rather than each time they click to download. Legally it would appear to cover us. Our present system is based on authentication into the AI Intranet and then trusting that the AI staff member will respect the terms and conditions that are displayed with each image, but we think we can do better than this with little inconvenience to users. There will also be more changes to the way ADAM handles rights management and licence agreements.
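To make the difference concrete, here is a rough sketch in Python of a licence agreed once per session rather than at every download. The `Session` class, the function names and the exception are invented for illustration and have nothing to do with how ADAM is actually built.

```python
from dataclasses import dataclass, field


class LicenceNotAccepted(Exception):
    """Raised when a download is attempted before the licence is agreed."""


@dataclass
class Session:
    user: str
    licence_accepted: bool = False
    downloads: list = field(default_factory=list)


def accept_licence(session):
    # Recorded once, when the user first enters the session.
    session.licence_accepted = True


def download_image(session, image_id):
    # Every download checks the session flag instead of prompting again.
    if not session.licence_accepted:
        raise LicenceNotAccepted(f"{session.user} has not agreed to the licence")
    session.downloads.append(image_id)
    return image_id


session = Session(user="staff-member")
accept_licence(session)
download_image(session, "img-1")
download_image(session, "img-2")
print(session.downloads)
```

The user clicks through the agreement once, and the check still runs on every download without bothering them again.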

The main piece of advice that the ADS gave from this example was that archives should not wait for the abuse of their content before forming a response but rather formulate a strategy for dealing with a potential incident so we can react quickly, methodically and legally. Wayne, Claire and Tim will know more about whether we’ve had to deal with this already. I’m not aware of such a strategy being in place though. In late May, an IPR expert from the Open University will be giving a one-day workshop on IPR issues for AI staff, something we intend to run each year. Having spent just an hour touching on such issues, I feel a day’s course would be well spent ensuring IS staff are informed of the risks and responsibilities involved in this area of our work. Not least because the European Copyright Directive, which applies to the UK, now makes breaking copyright protection a criminal offence rather than a civil offence, so theoretically someone could go to jail, whereas it used to be that the individual/organisation would be fined based on the ‘loss’ (financial, of reputation, of relationships, etc.) to the rights owner.

A whole hour discussing file formats!

March 22, 2006

I departed from earth this afternoon. I’m not sure where I went but this session on file formats and then a further session on digital records management took me places I never thought I’d go.

The title of this class was ‘File Formats: Matters to Consider’, and I found it fascinating.

First, we were shown where file formats fit in the hierarchy of the IT system:

Semantic Layer
Actions Layer
Format Layer (Alright!)
Filesystem Layer
Media Layer

Then an anecdote about how some file formats and their creating applications are better used for some tasks and not for others. The tutor knew someone who wrote a novel in Excel because he didn’t have any other software to hand and I guess curiosity didn’t get the better of him either.

We did a quick exercise in what features to look for in a file format for preservation purposes. Not too difficult:

Open, documented, widely used and therefore supported, interoperable over different Operating Systems, lossless/no compression, metadata support, etc. etc.
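That checklist is easy to turn into a quick scoring exercise. The sketch below is in Python; the criteria names come from the list above, but the `FormatProfile` class, the equal weighting and the two example formats are my own assumptions, not anything handed out in class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatProfile:
    name: str
    open_spec: bool
    documented: bool
    widely_used: bool
    cross_platform: bool
    lossless: bool
    metadata_support: bool


# The criteria from the class exercise, each weighted equally (an assumption).
CRITERIA = (
    "open_spec",
    "documented",
    "widely_used",
    "cross_platform",
    "lossless",
    "metadata_support",
)


def preservation_score(profile):
    """Count how many of the criteria the format meets."""
    return sum(1 for criterion in CRITERIA if getattr(profile, criterion))


def unmet_criteria(profile):
    """List the criteria the format fails, in checklist order."""
    return [c for c in CRITERIA if not getattr(profile, c)]


# Hypothetical formats, not assessments of any real one.
format_a = FormatProfile("format-a", True, True, True, True, True, True)
format_b = FormatProfile("format-b", False, True, True, False, True, False)

print(preservation_score(format_a), unmet_criteria(format_b))
```

In practice you would weight some criteria above others, but even a flat tally shows quickly which formats deserve a closer look.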

Another anecdote was that ten years ago, two men wrote a book detailing over 3000 graphic file formats. As the number of formats grew, it was revised and issued on a CD-ROM. Now it’s updated on the web. I’m sure Tim would love it.

I’ll state this here: ADAM handles two graphic file formats for a reason. They are both open, documented, widely used, well supported, interoperable and have metadata support. The list of supported graphic file formats may double or triple over time, but 3000+ demonstrates what an industry digital archives are having to deal with.

If you want guidance on file formats (and who doesn’t?), then look no further than these fine institutions:

FCLA Digital Archive
Harvard University formats registry
ERPAnet file formats
Library of Congress (my favourite).

We finished up by looking at the conversion of file formats, something which presents problems when you want to preserve the original integrity of the file’s content but in a more suitable or non-obsolete file format.

I could go on about file formats but let’s face it, we’ve both had enough for one day. Let’s talk at length in the ‘breakout’ area when I get back, OK?

Institutional Repositories

March 22, 2006

Maybe I should think up snappier title headings to these blogs. Believe me, occasionally I’m sitting in the class wondering how the hell I got here. Though I should say that the quality of the training programme so far has been very high and I’m finding it very engaging. The tutors are decent, down-to-earth people with practical advice. In other news, the stats for this blog suggest that most of ITP and IRP looked at it yesterday. Tomorrow’s stats should be interesting… 😉

Basically, this was a discussion on DSpace and the OCLC. Fedora was mentioned but only briefly. That’s OK, because Fiona, Damon and I attended a conference on Fedora last year. Damon’s an expert so ask him all the questions about Fedora… The implementation of a ‘trusted repository’ is central to digital archiving and the two main course documents are the OAIS standard and the follow-up document, Trusted Digital Repositories. The TDR document basically goes through all the attributes and responsibilities that an OAIS-compliant repository should have. The report defines a TDR as:

A trusted digital repository is one whose mission is to provide reliable, long-term access to managed digital resources to its designated community, now and in the future.

It’s a useful document for testing how well your institution is doing.

DSpace is a repository system that’s been developed at MIT. It’s very popular (OK, so that’s a relative term…) in the USA and some UK institutions use it too. From what I could see, it provides a customisable ‘repository out of the box’ and shares some functionality with a Content Management System.

DSpace has three preservation service levels: functional preservation for ‘supported’ (1) and ‘recognised’ (2) file formats, and bit-level (3) preservation for everything else. I don’t think it is ‘OAIS compliant’ but clearly it follows the basic OAIS functional model: ingest of Submission Information Packages, creation of Archival Information Packages and creation of Dissemination Information Packages. The example we were shown worked very well for the submission and archiving of a document by an academic writer. From the Fedora conference we attended, I’d got the impression DSpace was a bit crap, but there’s some competition between the two systems so that shouldn’t be surprising. Fedora is a different animal really, as it provides a suite of repository services which developers are expected to work with, while DSpace is usable out of the box by people without programming skills.
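As I understood it, the three service levels boil down to a lookup from file format to a preservation promise. Here is a rough sketch of that idea in Python. It does not use DSpace’s real API or its format registry; the enum, the registry contents and the function names are all hypothetical.

```python
from enum import Enum


class ServiceLevel(Enum):
    SUPPORTED = 1   # functional preservation
    RECOGNISED = 2  # functional preservation
    BIT_LEVEL = 3   # the bits are kept intact, nothing more is promised


# Hypothetical registry: which level a repository assigns to each format.
FORMAT_REGISTRY = {
    "example-open-format": ServiceLevel.SUPPORTED,
    "example-common-format": ServiceLevel.RECOGNISED,
}


def service_level(format_name):
    """Formats missing from the registry fall back to bit-level preservation."""
    return FORMAT_REGISTRY.get(format_name, ServiceLevel.BIT_LEVEL)


def promises_functional_preservation(level):
    return level in (ServiceLevel.SUPPORTED, ServiceLevel.RECOGNISED)


for name in ("example-open-format", "something-obscure"):
    level = service_level(name)
    print(name, level.name, promises_functional_preservation(level))
```

The fallback is the useful bit: a format nobody has registered still gets its bits looked after, just with no promise that it will open in the future.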

OCLC, the Online Computer Library Center, provides a repository service for other institutions, so it is really an outsourcing solution. It is accessible over the web with OAIS-like functionality but, of course, still requires that you prepare your collection for Ingest.