St. Catherine's boxing: spotting wrong dimensions

I spent a few hours today reviving the St. Catherine's boxing project, and especially the Java code, which has been dormant since this publication while we waited for the library renovation to be completed.
Petros Koufopoulos, Stuart Welch, Andrew Honey, Nicholas Pickwoad and I have had a few meetings over the past few months, trying to finalise the exact dimensions of the boxes in relation to the dimensions of the books, and also the exact dimensions of the cupboards in the main library.

We are now reasonably confident that these dimensions are optimised, so I started looking at the data from the St. Catherine's database to extract the book dimensions. This was straightforward. I then fired up my NetBeans "booksorter" project and, without much trouble, a solution for sorting and stacking the boxes was ready. I did notice some strange-looking books in the resulting chart, so I started looking more carefully at the measurements.

These measurements were first noted on paper, which was then digitised. This left two points where human error could creep in:

  • when a measurement was written down on the paper
  • when a measurement was typed into the computer

Some measurements were missing a digit or had a zero added to them, resulting in strange book dimensions. These were easily spotted as soon as I sorted by height, width and depth. Since all the survey forms have been digitised, it was easy for me to go back and double-check.

Sorted!

Not quite.

Years ago a cedar tree was being felled and we received a request for the likely dimensions of book boards, so that the tree could be cut into pieces of the right size. I remembered that the aspect ratio of the boards at that time had little variation, and I thought I would check this against the book height and width data as well.

I quickly calculated the height/width ratios in a spreadsheet and plotted them in a chart (Figure 1). Sure enough, the aspect ratio was relatively similar across the books (between 1 and 1.6), but I could spot suspicious outliers. It would have been extremely difficult to identify these by looking at the measurements alone. I got the shelfmarks of the books, checked their dimensions and indeed they were wrong. In one case the height had been swapped with the width. In another a figure was missing. After correcting the outliers, the new chart (Figure 2) looked a lot more reassuring. The border points in the new chart have been checked and they are indeed strangely shaped books.
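
The same check could also be done in code rather than in a spreadsheet. What follows is only a minimal sketch in Java, not part of the booksorter project: the class name, the Book fields, the shelfmarks and the measurements are all made up for illustration, and millimetres are an assumed unit. The 1 to 1.6 band is the range seen in the chart; it is a first filter, not a rule, since some genuinely odd-shaped books fall near or outside it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the booksorter project.
class AspectRatioCheck {

    // One surveyed book. Field names and units (millimetres) are assumptions.
    static final class Book {
        final String shelfmark;
        final double height;
        final double width;

        Book(String shelfmark, double height, double width) {
            this.shelfmark = shelfmark;
            this.height = height;
            this.width = width;
        }

        double ratio() {
            return height / width;
        }
    }

    // Returns the books whose height/width ratio lies outside [minRatio, maxRatio].
    static List<Book> findSuspicious(List<Book> books, double minRatio, double maxRatio) {
        List<Book> suspicious = new ArrayList<>();
        for (Book book : books) {
            double ratio = book.ratio();
            // A zero width gives an infinite or NaN ratio; flag those as well.
            if (Double.isNaN(ratio) || Double.isInfinite(ratio)
                    || ratio < minRatio || ratio > maxRatio) {
                suspicious.add(book);
            }
        }
        return suspicious;
    }

    public static void main(String[] args) {
        // Made-up shelfmarks and measurements, for illustration only.
        List<Book> books = new ArrayList<>();
        books.add(new Book("Example 1", 250, 180)); // plausible proportions
        books.add(new Book("Example 2", 180, 250)); // height and width swapped
        books.add(new Book("Example 3", 210, 15));  // width missing a digit
        books.add(new Book("Example 4", 300, 200)); // plausible proportions

        for (Book book : findSuspicious(books, 1.0, 1.6)) {
            System.out.printf("%s: ratio %.2f, check the survey form%n",
                    book.shelfmark, book.ratio());
        }
    }
}
```

With the made-up data above, the swapped and the truncated measurements are the two that get flagged. Anything flagged still has to be checked against the survey form by hand, because a ratio outside the band may simply be an unusual book.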

Can we be sure that no other errors have been made? Of course not, but at least we have eliminated the most substantial ones. Watch this space for more updates on the boxing project.