Saturday, April 30, 2016

anecdotally approximating the needs of digital humanities work

(The following is akin to the prologue of a future paper in the digital humanities, a motivator for my research and work.)

Biographically speaking, I am the embodiment of a digital humanities researcher, as I aspire to spend about half of my time in the digital world as a software engineer, and half of my time in the humanities, mostly working on problems of socio-economic history and its impact on religious thought. Moving in both of these worlds makes the disconnect between their tools and capabilities noticeable in a way that I find instructive for the effort of tooling the digital humanities.

Assume I open my email inbox in the morning and find an email from my fellow software engineer Pace, who has difficulties with a piece of software that I am responsible for. For historical reasons, software engineers have termed these problems "bugs", and the message that brings such an issue to my attention is termed a "bug report". In her bug report, Pace will describe what she did, what the expected outcome was, and what happened instead. Software engineers have a standard method for dealing with these issues: we boil the "bug" down to a small program that verifies that the given inputs produce the expected outcomes---we call that a "unit test"---and then tweak and modify the existing software until the unit test passes, that is, the program no longer exhibits the erroneous behavior. In doing so, good software projects draw upon their existing unit tests---long-running projects will have tens of thousands of these. Passing all unit tests gives us confidence that eliminating one bug did not introduce any other issues: Primum non nocere.
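To make this concrete, here is a minimal sketch in Python of what such a unit test might look like. Everything in it is invented for illustration: the function parse_page_range, and the assumption that Pace's report said a single page like "7" crashed it while ranges like "12-15" worked. It is not taken from any real project.

```python
import unittest


def parse_page_range(text):
    """Hypothetical function under repair: turn "12-15" into [12, 13, 14, 15]."""
    first, _, last = text.partition("-")
    # The fix (assumed bug): a single page such as "7" has no "-" part,
    # so `last` is empty and we fall back to `first`.
    last = last or first
    return list(range(int(first), int(last) + 1))


class PageRangeTests(unittest.TestCase):
    def test_range_still_works(self):
        # An existing test: guards behavior that worked before the fix.
        self.assertEqual(parse_page_range("12-15"), [12, 13, 14, 15])

    def test_single_page_from_bug_report(self):
        # The new test, boiled down from the (hypothetical) bug report.
        self.assertEqual(parse_page_range("7"), [7])
```

Saved as a file and run with `python -m unittest`, both tests are checked together: the new one shows that the reported bug is gone, and the old one shows that the fix did not break what already worked.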

Assume that in the afternoon, when I check my other inbox, I find an email from fellow digital humanities researcher Paige. Paige just found a problem with one of the transcribed sources that I shared with her from the archival work that I did for my dissertation. The problem may be very small. Perhaps we can even sort out what the difficulty was and rectify it---the source may now be digitized and accessible on the web. But I have no quick and safe way to incorporate that correction into my dissertation, though it exists as a set of LaTeX files that are themselves easy to change. I have no representation of the argument that my dissertation is making. I have no way of verifying that the effect of incorporating Paige's correction is local and will not affect other claims that I make.

Of course, that situation is not very different from the one that I found myself in while writing the dissertation to begin with. It seems statistically implausible that there are no flawed steps in an argument spanning a dozen chapters, a couple of hundred pages, and tens of thousands of words. No manager would hire a developer who wrote a program of that size and had checked its correctness only in their head.

Don't get me wrong: I love lemmatized POS-tagged corpora of major writers as much as the next guy does. But the main product of the humanities is arguments, and crucially counter-arguments, narratives that go against what the public already believes to be the case. Should we not devote the same care to them as software developers devote to their code?

