Showing posts with label cataloguing. Show all posts

Thursday, June 15, 2017

Preserve the Mess

Many years ago, I attended a Digital Humanities conference, Toronto 1989 I think it was, and heard a paper by Jocelyn Small about using digital tools to manage large datasets.  She was talking about images, but her ideas applied to any data.

One of her key slogans was, "Preserve the Mess."  This approach is now completely normalized by Google search, Google Mail, etc., and we all take it for granted.  But it's worth remembering that this was a major conceptual breakthrough.

Before this approach, everyone thought that the way to find stuff was to use subject indexes.  And subject indexing is expensive, difficult, subjective and structurally imperfect.  What subject headings would you use for the Mahābhārata, for example? I think most people would agree that it is difficult to impossible to arrive at a simple statement of the subject matter of the Mbh that is actually worth having.  Of course, we can all play nothing-buttery, "the Mbh is nothing but a family quarrel," but that's not a serious approach to the problem.  If we pervade the epic with our keywords and subject index terms, we are trying to make the text more accurate than it is, and our exercise is culture-bound and subjective.

"Preserving the mess" means that we leave the data alone.  Rather, we put the intelligence and power into our tools for accessing the data.  We use fuzzy-matching, pattern recognition, machine learning, but all applied to the raw data which is not itself manipulated or changed.

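To make the idea concrete, here is a minimal sketch of putting the intelligence in the access tool rather than in the data. It uses Python's standard `difflib` module for fuzzy matching. The sample records, the function names and the 0.6 threshold are all my own illustrative assumptions, not anything taken from Small's paper.

```python
from difflib import SequenceMatcher

# Hypothetical raw catalogue entries, kept exactly as transcribed,
# inconsistent spellings and all. Nothing here is normalized or tagged.
RAW_RECORDS = [
    "Mahabharata, Adiparvan, incomplete",
    "Mahābhārata Ādiparvan",
    "Mahabharat adi parva (fragment)",
    "Carakasamhita, Sutrasthana",
]


def similarity(query, text):
    """Return a similarity score between 0 and 1, ignoring letter case."""
    return SequenceMatcher(None, query.lower(), text.lower()).ratio()


def fuzzy_search(query, records, threshold=0.6):
    """Return the records that resemble the query, best match first.

    The records are only read, never rewritten: the tolerance for
    messy spelling lives in the search, not in the data.
    The threshold is an arbitrary illustrative value.
    """
    scored = [(similarity(query, record), record) for record in records]
    hits = [pair for pair in scored if pair[0] >= threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in hits]


if __name__ == "__main__":
    for record in fuzzy_search("Mahabharata Adiparvan", RAW_RECORDS):
        print(record)
```

A search for "Mahabharata Adiparvan" finds the three differently spelled Mahābhārata entries and skips the Caraka one, while the stored records stay exactly as they were entered.
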
A published version of Small's ideas appeared in 1991:

As she says, p. 52,
Thus Principle Number One is Aristotelian: "Do not make your datum more accurate than it is." This principle may be rephrased as, "Preserve the Mess."

Tuesday, December 17, 2013

Tools for cataloguing Sanskrit manuscripts, no.1



In the post-office today I saw this piece of board that's used as a size-template to quickly assess which envelope to choose.  This is a formalized version of the same tool that I used for the many years that I spent cataloguing and packing Sanskrit manuscripts at the Wellcome Library in London.  I made a piece of board with three main size-outlines, for MSS of α, β, γ sizes.  Anything larger than γ counted as δ.  Palm-leaf MSS were all ε.

It was nice to see the same tool being used for a similar job, in an Austrian post-office!
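
For anyone who wants to record the size class in a database rather than on a piece of board, the same rule can be sketched in a few lines of Python. The outline dimensions below are invented placeholders (this post does not give the real measurements of the template), and the function and variable names are hypothetical too.

```python
# Hypothetical outline dimensions in centimetres (long side, short side).
# These are placeholders only; the real template's measurements are not
# given in the post.
SIZE_OUTLINES = [
    ("α", 20.0, 10.0),
    ("β", 30.0, 15.0),
    ("γ", 40.0, 20.0),
]


def size_class(width_cm, height_cm, palm_leaf=False):
    """Return the size class for a manuscript, mimicking the board template."""
    # Palm-leaf manuscripts are all ε, whatever their dimensions.
    if palm_leaf:
        return "ε"
    # Orientation does not matter when laying a manuscript on the board.
    long_side = max(width_cm, height_cm)
    short_side = min(width_cm, height_cm)
    # Try the outlines from smallest to largest and take the first that fits.
    for label, max_long, max_short in SIZE_OUTLINES:
        if long_side <= max_long and short_side <= max_short:
            return label
    # Anything larger than γ counts as δ.
    return "δ"


if __name__ == "__main__":
    print(size_class(25.0, 12.0))
    print(size_class(10.0, 5.0, palm_leaf=True))
```

The board is still quicker in practice, since you just lay the manuscript on it, but a function like this keeps the rule explicit if the measurements have already been typed in.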