Introduction

The manuscript database was produced by digitising about 33,000 A4 pages of the condition survey form and about 35,000 colour slides of the manuscript bindings. It is available online through an experimental interface, but its use requires permission from the Monastery.

Scanning

A document scanner was used to scan all A4 pages of the survey form. We chose the Microtek ArtixScan 2010 for its speed, TWAIN compliance and relatively low price. Output quality was not a priority in this choice: the digitised A4 pages were only an intermediate step from paper to digital data, so basic quality would have been enough. In practice the scanner comfortably exceeded our quality requirements. We scanned at 300 dpi in grayscale, since colour pencils were not used during the survey.

Database development

The database was designed to follow the structure of the paper form. The large number of fields on the form (about 1000) was a challenge to map onto the relational model. To simplify later querying, we avoided abstract definitions in the database and increased the number of tables instead. This made linking the tables together more complex, so we adopted strict naming conventions that reveal the cross-links to the user without the need to inspect each table's relationships. Since the completion of this work we have investigated more efficient ways of storing complex binding descriptions as explained here. More information about the structure of the database can be found here and here.
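
The idea of table names that reveal their own cross-links can be illustrated with a minimal sketch. The convention used here is an assumption made for illustration only, not the project's actual rule: a child table is named `<parent>_<part>` and carries a foreign-key column `<parent>_id`. All table names, column names and helper functions are hypothetical.

```python
import sqlite3

# Hypothetical convention (not taken from the project): a child table is
# named <parent>_<part> and has a foreign-key column called <parent>_id.
SCHEMA = """
CREATE TABLE manuscript (id INTEGER PRIMARY KEY, shelfmark TEXT);
CREATE TABLE manuscript_sewing (
    id INTEGER PRIMARY KEY,
    manuscript_id INTEGER REFERENCES manuscript(id),
    broken INTEGER
);
CREATE TABLE manuscript_sewing_station (
    id INTEGER PRIMARY KEY,
    manuscript_sewing_id INTEGER REFERENCES manuscript_sewing(id),
    position REAL
);
"""


def parent_of(table):
    """Return the parent table implied by the name, or None for a root table."""
    head, sep, _ = table.rpartition("_")
    return head if sep else None


def join_path(table):
    """List the tables from the root down to `table`, using names alone."""
    path = [table]
    while (parent := parent_of(path[-1])) is not None:
        path.append(parent)
    return path[::-1]


# Build the example schema in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
```

Under this assumed convention, `join_path("manuscript_sewing_station")` yields the chain of tables to join without consulting the schema. The sketch relies on part names containing no underscores of their own.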

Inputting

An in-house utility was used to read the A4 forms semi-automatically and insert the relevant information into the database. The utility was optimised for speed: it went through every field on the form and checked whether the field contained any information. If it did, the utility suggested an entry and waited for confirmation from the user; otherwise it moved on to the next field. More information about the inputting process can be found here and some background information on the efficient design of the paper form for later scanning can be found here.
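
The field-by-field loop described above might look roughly like the following sketch. It is not the in-house utility itself: the `Field` type, the dark-pixel test in `has_marks`, its thresholds and the `confirm` callback are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    pixels: list  # grayscale values of the field's region, 0 = black, 255 = white


def has_marks(field, dark_below=128, min_fraction=0.02):
    """Guess whether a field was filled in from its share of dark pixels.

    The thresholds are illustrative assumptions, not measured values.
    """
    if not field.pixels:
        return False
    dark = sum(1 for p in field.pixels if p < dark_below)
    return dark / len(field.pixels) >= min_fraction


def process_form(fields, confirm):
    """Skip blank fields; for marked ones, suggest an entry and ask the user."""
    record = {}
    for field in fields:
        if not has_marks(field):
            continue  # nothing written here, move to the next field
        suggestion = True  # simplest case: treat the mark as a ticked box
        record[field.name] = confirm(field.name, suggestion)
    return record
```

Here `confirm` stands in for the user's confirmation step: it receives the field name and the suggested value and returns the value to store. Passing `lambda name, suggestion: suggestion` accepts every suggestion, which is convenient for testing the loop without user interaction.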