Can We Map Space & Place in Historical Narratives?

by Jesse W. Torgerson

Prefatory note: This is the prosaic introduction to what will be an ongoing series of posts tagged as “Narrative and Geography.” Subsequent posts concern the question of mapping ‘space’ vs. ‘place’; how we set up our database; and what we consider ‘geography’ in the Chronography. The …

Notes on the margins: how to extract them using image segmentation, Google Vision API, and R

One of my biggest discoveries of the past year was the trove of documents available online through the Internet Archive: a wide variety of books from the 19th and early 20th centuries, scanned, converted into PDF, and even into plain text (after Optical Character Recognition, or OCR, was run on them). With the text available as a txt file, it would seem easy to apply various text mining tools to extract information. This ease is deceptive: the technology used to recognize the text gets in the way. This summer I worked on extracting the text printed in the margins of John of Gaunt’s Register, as part of Gary Shaw’s project on the travel of bishops in medieval England. Below is a summary of the problems I discovered and the solutions I applied.
