Friday, October 1, 2010

Preparing and planning a large archives digitisation project

Archives digitisation is currently underway in our Imaging Studio, with two full-time members and two part-time members of Library staff dedicated to preparing and digitising the items. We will talk more specifically about the work being carried out on these materials on this blog in the near future, but first we present an introduction to the setup and planning of the project.

Once the theme was chosen (Modern Genetics and its Foundations) and the relevant collections identified (see previous blog post), we realised that we had quite a large job on our hands. The scope of the project was bigger than anything we had done before: 620 boxes of material, each containing around 800 pages, add up to around half a million pages to be digitised.
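As a quick check on that figure, the arithmetic can be sketched in a few lines of Python. The function and variable names are purely illustrative, and the inputs are the rounded estimates quoted above rather than exact counts.

```python
def estimate_total_pages(boxes: int, pages_per_box: int) -> int:
    """Return the estimated page count for a number of boxes of similar size."""
    return boxes * pages_per_box


# Rounded estimates from the post: 620 boxes of around 800 pages each.
total_pages = estimate_total_pages(boxes=620, pages_per_box=800)
print(total_pages)  # 496000, i.e. around half a million pages
```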

Based on a series of tests, we estimated the project would take two years to complete, starting with the preparation of the material in advance, with photography coming into play a few months later. Two full-time staff would be focused on imaging the material, with two part-time members of staff preparing, tracking and assessing the items.

There is a range of logistical issues to bear in mind when planning and starting up a project of this nature. The boxes are stored in the basement stores, and have to be retrieved for a period of some months while the material is being worked on. We divided the collections into batches of a size that could be imaged in a period of 4-6 weeks, and we retrieve and return each batch as a unit, tracking all movements on a spreadsheet. Among other things, the tracking spreadsheet also records the location of each box in the batch, notes from the preparer for the archivists, photographers and/or conservation staff, and the percentage of items in each box that can be OCR’d.
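To illustrate the kind of batching and tracking described above, here is a minimal Python sketch. It is not our actual spreadsheet: the field names, the box identifiers and the figure of 30 boxes per batch are all assumptions made for the example, since the real batch size depends on how much can be imaged in 4-6 weeks.

```python
from dataclasses import dataclass


@dataclass
class BoxRecord:
    """One row of a hypothetical tracking sheet (field names are assumptions)."""
    box_id: str
    batch_number: int
    location: str = "basement store"
    preparer_notes: str = ""
    ocr_percent: float = 0.0  # share of items in the box that can be OCR'd


def make_batches(box_ids: list[str], boxes_per_batch: int) -> list[list[str]]:
    """Split the boxes into consecutive batches that move as a unit."""
    if boxes_per_batch < 1:
        raise ValueError("boxes_per_batch must be at least 1")
    return [
        box_ids[start:start + boxes_per_batch]
        for start in range(0, len(box_ids), boxes_per_batch)
    ]


# Hypothetical identifiers for the 620 boxes in the project.
box_ids = [f"BOX-{n:03d}" for n in range(1, 621)]

# 30 boxes per batch is an assumed figure, not one taken from the project.
batches = make_batches(box_ids, boxes_per_batch=30)

# Build one tracking record per box, numbering batches from 1.
records = [
    BoxRecord(box_id=box_id, batch_number=batch_number)
    for batch_number, batch in enumerate(batches, start=1)
    for box_id in batch
]

# When a batch is retrieved, every box in it changes location together.
for record in records:
    if record.batch_number == 1:
        record.location = "Imaging Studio"
```

Keeping the batch number on every box record means a whole batch can be checked out or returned with a single filter, which mirrors retrieving and returning each batch as a unit.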

We put a notice on our website of the entire schedule of archives to be digitised, so readers could see at a glance what would be unavailable and when. The catalogue records are also amended to show where material cannot be reserved. Each time a batch is retrieved, checked out, checked in, or dates altered, this has an impact on the website and two different cataloguing systems (the Archives and Manuscripts Catalogue, and the Library Catalogue), so communication with the departments responsible for retrieval and metadata was key.

The preparation staff were trained in advance by the conservation team so they could carry out basic stabilisation and first-aid work on the materials if required for digitisation. The photographers ran multiple tests on different equipment and with different cameras to ensure the workflow was efficient, appropriate to the formats and anticipated end use of the material, and able to accommodate proper QA. Preparation and imaging take place in the Imaging Studio, ensuring that all staff are in close proximity and able to communicate easily with each other. The Imaging Studio was refitted with desks, shelving and equipment to make sure all the boxes in process at any one time could be accommodated.

A further planning issue was determining how to assess and record different levels of sensitivity of the information contained in the archives. We are currently developing a policy for access to archives that takes account of online display, and this has informed the workflow for assessment.

This project required liaison between several different departments and stakeholders in the Library in order to set up a suitable workflow. In future, we hope that workflow issues will be streamlined further by procuring a Workflow Tracking System that will serve to centralise tracking and monitoring of all digitisation projects. We anticipate that this pilot project will enable us in future to plan effectively for much larger digitisation projects as we work towards the digitisation of all suitable material held in the Wellcome Library.
