Connecticut Digital Humanities Conference: Travelers Lab Presentations

By Vasilia Yordanova

On February 22, students and professors collaborating on the CDER project and other digital humanities research projects at Wesleyan presented their work at the 2025 Connecticut Digital Humanities Conference at Central Connecticut State University. Will Markowitz, Arushi Khare, Akram Elkouraichi, and Professor Torgerson spoke about their progress on the CDER project. They began by explaining its goal: accumulating data from many sources in one digital platform, both to make relationships between data easier to access and to facilitate interaction between academic disciplines. With Professor Torgerson’s guidance, students research sources, incorporate them into the platform, and compile data there.

Will explained the NodeGoat platform, which the CDER project uses to accumulate and centralize data in an accessible digital format. Then, Akram discussed linked data and his work on modeling the Istanbul walls. Will spoke about relational data and examining relationships between different kinds of data, including geographical relationships (as the project is space-based). He demonstrated how these relationships appear visually in NodeGoat. Arushi described the work of linking seals from the Dumbarton Oaks collection in Georgetown to an open database of people living in Constantinople and to official buildings where offices would have been stationed. The goal is to link information extracted from sources back to those sources in NodeGoat. 

Arla Hoxha and Zaray Dewan spoke about their work on the Chronicles project, and students and professors working on the Life of Milarepa and Chinese language theaters in North America also presented on their recent work at the conference.

Chronicles Methodology

By: Diana Tran

Fall 2024

Methodology: 

Event Types Categorization 

Tagging the chronicle entry involves splitting it into different events. Events are specific instances within the text that have a clear start and end. X happened. Then Y happened, and so on. The methodology of tagging events was created by Daniel Feldman and Arla Hoxha. To reiterate, an event is a specific moment within a chronicle entry based on the text. To organize events and transform these events from qualitative data to quantitative data, the sub-object ‘event type’ was created. 

Each event should carry only one event type. The event types created by Daniel Feldman were:

  • Astronomical phenomenon
  • Natural Disaster
  • Death of aristocracy
  • Death of religious leader
  • General Assembly
  • Religious
  • Political event
  • Military Campaign

A discussion of the event types revealed problems with the methodology of the labeling. For example, religious leaders were oftentimes aristocracy, thus labeling for death of religious leader came with the implicit labeling of death of aristocracy – a double labeling that is unnecessary and added fluff to the data once mapped out. A general assembly would be considered as a specific political event, yet it was labeled separately. The largest issue came to be the political / religious event types that implied an event may only be religious or political. The separation of state and church is a modern concept that, when applied to medieval times, applies a modern bias to the data. The issue of the split of religious and political event mimics the issue that comes with labeling the death of aristocracy and death of religious leader – one follows the other, making the manufactured split superficial and inaccurate. 

A proposed change in the methodology to address these concerns would be to scrap the labels political event, religious, and general assembly. It would also combine death of aristocracy and death of religious leader into a general label of death. The labels of astronomical phenomenon and natural disaster would also be combined into the new label phenomenon, which would encapsulate both. The new labels, in accordance with the parameter of only one word describing each event type, are tentatively named:

  • Mission (External)
  • Petition (Internal)
  • Campaign
  • Phenomenon
  • Birth/Death
  • Office (previously titled Succession)
  • Dispute
  • Celebration
  • Commentary
  • Construction
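
The relabeling described above can be sketched as a lookup table. This is only an illustration, not the lab's actual tooling: the names `OLD_TO_NEW` and `migrate_label` are hypothetical, and the scrapped labels map to `None` on the assumption that those events have to be re-read by a tagger rather than converted automatically.

```python
from typing import Optional

# Old label -> new label, following the merges described above.
# None marks a scrapped label: the event needs manual re-tagging
# (an assumption of this sketch, not a stated lab procedure).
OLD_TO_NEW: dict[str, Optional[str]] = {
    "Astronomical phenomenon": "Phenomenon",
    "Natural Disaster": "Phenomenon",
    "Death of aristocracy": "Birth/Death",
    "Death of religious leader": "Birth/Death",
    "Military Campaign": "Campaign",  # domestic fighting may still belong under Dispute
    "General Assembly": None,
    "Religious": None,
    "Political event": None,
}


def migrate_label(old_label: str) -> Optional[str]:
    """Return the new event type, or None if a tagger must re-read the event."""
    return OLD_TO_NEW[old_label]
```

A table like this only covers the mechanical part of the change. New labels such as Celebration, Commentary, and Construction have no old counterpart and would still be assigned by reading the entry.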

Mission is a replacement for ‘political’, specifically for interactions between emissaries of two sovereign nations, e.g. legates of the Church or ambassadors of Emperor Basil being received by Louis the German. A way to distinguish between Mission and Petition is to ask whether the two or more emissaries interacting are from the same nation or from different nations. If they are from different nations, the label Mission should be applied. The concept of foreign diplomacy is another way to explain the parameters of this label. ‘General assembly’ is a keyword typically indicating that the meeting deals with foreign embassies as well as domestic ones.

Petition is the other replacement for ‘political’, which should be applied to interactions between individuals of the same nation, e.g. Bulgars going to Michael (Boris I), their old king, or ‘King X held meeting’. The concept of domestic diplomacy is a way to explain the parameters of this label. An assembly or a meeting without the adjective ‘general’ is typically a sign that the interaction is domestic rather than foreign.

Campaign is the replacement for Military Campaign. Campaign is meant to label events that involve battles and wars between two established states. These two states should not be connected to each other via vassalage at the start of the battle – those should be labeled ‘Dispute.’ A question to ask when conceptualizing events that should be labeled ‘Campaign’ is whether the fighting involved is foreign or domestic. If it is foreign, then ‘campaign’ should be the event type label used. Though it may not be clear without background information, a quick Google search to assess the relationship between two peoples or states during a specific time could be useful in discerning the difference if needed. 

Dispute should be used to label domestic conflicts within empires or between a sovereign state and its vassals. Conflicts with the church should be labeled as ‘dispute’ as the church is an international entity rather than a people or state concentrated within one area. Dispute can additionally be used to label events that deal with disputes that do not have a battle component to them. The verb ‘rebel’ is a typical indicator of when this label should be used, as well as nouns like ‘insolence’.  Of course, the surrounding context is also important. 
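
The foreign-or-domestic questions behind Mission, Petition, Campaign, and Dispute can be written out as two small functions. This is a minimal sketch of the rules as stated above; the function and parameter names are hypothetical, and in practice each answer comes from the tagger's reading of the text and its context, not from code.

```python
def diplomacy_label(same_nation: bool) -> str:
    """Mission for emissaries of different nations, Petition for the same nation."""
    return "Petition" if same_nation else "Mission"


def conflict_label(same_state: bool, vassal_tie: bool, involves_church: bool = False) -> str:
    """Dispute for domestic, vassal, or church conflicts; Campaign otherwise.

    vassal_tie means the two sides were connected by vassalage at the
    start of the fighting.
    """
    if same_state or vassal_tie or involves_church:
        return "Dispute"
    return "Campaign"
```

For instance, `conflict_label(same_state=False, vassal_tie=True)` returns `"Dispute"`, matching the rule that fighting between a sovereign state and its vassal is not a Campaign.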

Phenomenon is a combination of the previous labels of natural disasters and astronomical phenomenon. Discussion was held to determine that separating the two seemed redundant especially as the labeling method would be similar – rather than the subject – verb order, it would instead be the verbal noun or noun of what is happening. Instead of ‘floods began’ with the emphasis of the verb being ‘began’, it would be ‘flooding’. Another example would be with the phrase: ‘there were earthquakes.’ Instead of ‘Earth quaked’, the labeling would be ‘earthquake.’  Rather than ‘sun eclipsed’, the labeling would be ‘sun/solar eclipse.’ However, if there is a verb that goes well with the phenomenon occurring, exceptions can be made, such as with ‘comet appeared.’ It is based on the surrounding contexts. 

Birth/Death is rather self-explanatory: this label should be used in the case of a birth or death. It applies to all. The identity of the one who has died or been born, whether aristocracy, an individual who held religious office, or a commoner, does not affect whether an event receives this label.

Office/Succession is a label that should be used in the case of a change in office whether death related or not. Office/Succession does not need to only be for kingship, but also ascending from kingship to emperorship or bishop to archbishop [for example]. The label should only be used when the change of office is the focus of the event, however. If a person is mentioned to be the heir or successor of X position, that event should not get the label unless the emphasis is on the change of office. As an example:

Despite this event mentioning the concept of succession, the focus of this event is actually the rebellion of the Sorbs and Siusli; thus, this event should be labeled as ‘Dispute’ for its event type.

Celebration is a label that should be applied to any holiday, religious [Lent, Christmas] or not, as well as any feasting due to a successful campaign. Because many holidays are religious in nature, it can be considered the new ‘religious’, much as diplomacy (Mission and Petition) could be considered the replacement for ‘political event’. Many events labeled ‘celebration’ will likely be holidays; however, this should not dissuade the tagger from tagging festivities that are not religious or specific to holidays. ‘Celebrate’ is a common verb used within events that should be tagged as Celebration.

Commentary is a label that should be applied when the writing of the text shifts to refer to the reader or to the author themself. A fuller name for this label would be commentary by author, which is also self-explanatory. Keywords that indicate where this label should be used are “I” and “we.” Another indication that this tag should be used is when the author reflects on the actions of an individual within the text and expresses their opinion.

Construction is a label that refers to the construction, reconstruction, or destruction of a building, wall, or anything that can be built (such as a castle). It should especially be used in cases of places that are named; however, unnamed places that are built or destroyed can also have this label applied to them.

With this new methodology in event type labeling, the misrepresentations created by the political event / religious labels are diminished by erasing the modern lens previously placed upon the labeling of event types. The focus is put on what happens within an event rather than how an event is conceptualized by the modern tagger. An emphasis is put instead on the observations of the main verb and surrounding context. Additionally, with the old methodology, the deaths of the common people, which are sometimes mentioned as the focus of an event, did not have an accurate label for event type. The issue of aristocracy/religious death is also avoided by having two categories that encompass death for all as well as the succession of either. The rationale behind combining ‘natural disaster’ and ‘astronomical phenomenon’ is that both are natural phenomena, thus separating them serves no purpose. The events labeled ‘natural disaster’ and ‘astronomical phenomenon’ are near identical in description and presentation within the chronicle as is:

The issues that come with this new methodology concern successful rebellions, where one has to be cautious about the shift that follows. For example, within 898 AF3, other political entities interfere. The question is then whether the event in which this is mentioned is still a dispute (a domestic conflict), or whether the interference of another political entity turns what was a dispute into a campaign.

There is also trickiness with the ‘Death/Birth’ (which was once only Death)  and ‘Office’ labels, as the previous logic behind office was that births are important to succession thus, births should be labeled under Office. However, the discussion revealed a possible modern bias with assuming that every birth leads to or affects succession, thus, birth became attached to ‘Death’. The question then comes to be if deaths and births should be labeled under the same label, or if another label should be created in lieu of combining. 

Other labels that may be worth adding are building/construction to represent the construction of buildings or walls, as historically speaking, monuments represent political power and may become significant in the upcoming years, as well as ‘author’s note’ for when the author(s) of Annals of Fulda interject their opinions  into the chronicle, such as in 874 AF2, AF3.

This updated methodology aims to address some of the concerns discussed about the previous methodology, though it comes with its own set of problems. It is still in the process of being revised to become more aligned with the goal of Chronicles. Further discussions and research will be conducted over the summer in order to refine these categories with the goal of removing modern bias within these labels while also creating enough labels to encompass the types of events that happen within these chronicles. This is imperative to creating a methodology to quantify qualitative data, which is the goal of the Travelers Lab.

Fall 2024 — Quarter 1 Meetings

It is Fall 2024 and the Travelers Lab is BACK in full swing.

On October 10th Alex Williams presented on the CDER project as part of the Quantitative Analysis Center’s lunch series, Data Insights: Student Speaker Series, with the talk “Turning Messy Historical Datasets into a Relational Database: the Constantinople Project.” Constantinopolitana: A Database of East Rome (CDER) is a second-generation Travelers Lab project to create a space-based digital encyclopedia of the capital of the medieval East Roman Empire, Constantinople. In preparation for an NEH Grant Proposal, Alex Williams is reuniting two idiosyncratic databases on the nearly hundred thousand named historical persons from this era with one of the main sources for those persons. Those sources are indexed in a (now-digital) catalogue of the tens of thousands of lead seals they used to certify the traces of these individuals’ historical documentary work in the (literal) Byzantine bureaucracy. Williams shared her methodology and initial results, and discussed plans for additional linked data and the possibilities now open for future research.

On October 18th, surrounded by the aromas of delicious sandwiches (for those who could make it in-person), the Travelers Lab heard from Truman Burden and Gabriella (Gabi) DeKoven. These two new lab members presented along with Prof. Gary Shaw on their new project, Business Trips in and from Late Medieval England. Using (and deducing) records of trips from the not-so-famous (e.g. King John’s houndskeep and game master; Eleanor of Aquitaine’s jester; the Wardens of Merton College, Oxford) to the famous (e.g., Margery Kempe or Christina of Markyate), this group is starting a project to track regular trips in (especially) the 13th and 14th centuries. While the methodology and research questions are only just starting to come into focus, the potential contributions of this project for grounding a Database of Medieval Movement are already quite motivating. This project (with contributions from Emmett Gardner) will be the subject of a formal presentation at the New England Historical Association 2024 annual conference at Suffolk University (Boston, MA) on October 26th: “The Medieval Business Trip: Mapping Mobility, Making Society.”

Keep an eye on this space for more formal updates from both of these projects!

CDER Project update: Mapping Byzantine Seals and People

By Alex Williams

At the time of the last blog post (May 2024) about the Constantinopolitana: A Database of East Rome (CDER) project, we celebrated the creation of a prototype database, consisting of multiple types of objects and artifacts from across Constantinople. Each object type had about 20-100 data points (to read more about this process, check out our last blog post!). Of course, much of this data is integrated into CDER directly from preexisting datasets, many of which have thousands of entries.

With permission from the owners, we want to figure out a methodology to add these items into CDER to help find wider connections than one institution would be able to create or develop on their own.  This involves combining large datasets, a more technical task compared to the prototype, as there is lots of data cleaning and preparation involved.

This summer, I worked on combining existing data on Byzantine lead seals with existing prosopographic data (data describing individuals alive during the Byzantine empire). 

Datasets and Process

Byzantine lead seals were used to seal letters sent across the empire. These seals contain information about the sender such as their name, title, and occupation, allowing insights about people and the bureaucracy. Imperial titles, known as dignities, demonstrated a seal owner’s place in the imperial hierarchy. An owner’s office, or occupation, also appears on a seal.

For this stage of the project, we worked with roughly 16,000 seals from the Dumbarton Oaks (DO) Byzantine Seal collection, in collaboration with Jonathan Shea. This collection has already been digitized. A crucial part of integrating this data into our model was making the seals data into relational data, meaning that some parts of the data (in this case dignities and offices) get tables separate from the seals table.
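
To make "relational" concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the project's storage (CDER lives in nodegoat), the table and column names are hypothetical, and the two sample rows are placeholders rather than real seals. It also simplifies by giving each seal at most one dignity and one office.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dignities (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE offices   (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE seals (
    id         INTEGER PRIMARY KEY,
    owner_name TEXT,
    dignity_id INTEGER REFERENCES dignities(id),
    office_id  INTEGER REFERENCES offices(id)
);
""")

# Each dignity and office is stored once; seals point at it by id.
conn.execute("INSERT INTO dignities (id, name) VALUES (1, 'patrikios')")
conn.execute("INSERT INTO offices (id, name) VALUES (1, 'strategos')")
conn.executemany(
    "INSERT INTO seals (owner_name, dignity_id, office_id) VALUES (?, ?, ?)",
    [("Placeholder Owner A", 1, 1), ("Placeholder Owner B", 1, None)],
)

# Joining the tables reassembles the flat view of each seal.
rows = conn.execute("""
    SELECT s.owner_name, d.name, o.name
    FROM seals s
    LEFT JOIN dignities d ON d.id = s.dignity_id
    LEFT JOIN offices   o ON o.id = s.office_id
    ORDER BY s.id
""").fetchall()
```

Because the dignity name is stored in one place, correcting its spelling or attaching notes to it updates every seal that references it.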

We chose to work on data regarding people and seals in conjunction because of the large, preexisting, and already partially connected datasets. Byzantine prosopographic data has a wide overlap with seals, some of which is already documented. This is because the prosopographies use seals as evidence for the existence of people. For example, we might know new information about a person from a seal, or that might be the only record that a person existed. With this project, our goal was not to create insights or establish new ‘readings’ of the seals to create connections with people, but rather to digitize and store existing connections in a database, in addition to creating a model for incorporating other types of data into CDER. 

For the data regarding people, I worked with the Prosopography of the Byzantine World (PBW) database covering 1025-1180 AD. I also did some experimenting (hopefully more to come) with the Prosopography of the Middle Byzantine Period (PMBZ) covering 641-1025 AD.

There is already some data overlap between the DO seals collection and the PBW. A lot of my time was spent cleaning up the data and making it consistent, and understandable to our model, which means that each row has to contain the same columns. Each column must contain its information in similar formats across rows. There were also lots of text analysis strategies used to extract information from how it was originally formatted in a sentence, phrase, paragraph, or description. We also learned methods for importing into Nodegoat (the digital humanities software we are using to store the data), as well as methods for making connections technically. 
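
As a rough illustration of the "same columns in every row" requirement, the sketch below forces each raw record into a fixed set of columns and tidies whitespace. The column names and the `normalize_row` helper are hypothetical and do not reflect the actual CDER import schema.

```python
# Hypothetical target columns; the real import schema may differ.
COLUMNS = ["seal_id", "owner_name", "dignity", "office"]


def normalize_row(raw: dict) -> dict:
    """Return a row with exactly the target columns, in a consistent format."""
    row = {}
    for col in COLUMNS:
        value = raw.get(col)  # a missing column becomes None
        if isinstance(value, str):
            # Collapse runs of whitespace; an empty string becomes None.
            value = " ".join(value.split()) or None
        row[col] = value
    return row
```

Calling `normalize_row({"seal_id": 1, "owner_name": "  Owner   A "})` yields a row with all four columns, `owner_name` cleaned to `"Owner A"`, and the missing fields set to `None`.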

Visualizations

Below are some graphs and descriptions of what is possible with this data as a database or dataset, and which point towards the next steps for analysis.

I want to preface that these graphs do not give a perfect, or even good, representation of what’s going on in history, or even in the seals dataset. First, the number of seals is not necessarily representative of the actual number of offices or dignities of a certain type, or the number of people in these roles. Some offices might have been sending letters more often (so there would be an overrepresentation of that office on seals). In addition, visualization is more complicated because of the way seals are dated. We know some of the seals are from a certain set of years, and many of them can be dated to a single century. But for some seals, we are not sure of the exact century, and so they would be dated to two or more centuries. In the graphs below, I wanted to avoid double counting the seals, so the date used is the first date in the estimate. For example, both a seal that we know is from the eighth century as well as one we know is either in the eighth or ninth century would be represented in these graphs as part of the eighth century.
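
The counting rule described above (use the first date in the estimate so that no seal is counted twice) can be sketched as follows. The sample estimates are made up for illustration, and the sketch assumes "first" means the earliest century in the range.

```python
from collections import Counter


def first_century(estimate: list[int]) -> int:
    """Pick the earliest century from a seal's date estimate."""
    return min(estimate)


# Made-up estimates: a seal dated to the 8th century, one dated to the
# 8th or 9th, one to the 10th, and one to the 10th or 11th.
estimates = [[8], [8, 9], [10], [10, 11]]

# Every seal lands in exactly one century, so totals are not inflated.
counts = Counter(first_century(e) for e in estimates)
```

Here `counts[8]` is 2: the seal dated "eighth or ninth century" is counted with the eighth, as in the graphs. The cost of this choice is that later centuries are undercounted wherever dating is uncertain.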

Figure 1

Figure 1 depicts the number of seals over time, with colors representing the offices of the seal’s owner. Figure 2 shows the seal count for the top six overall dignities over different centuries.

Figure 2

 

Figure 3

Figure 3 focuses on the dignity ‘Patrikios’, which was high ranking in the 8th – 10th centuries, losing importance through the 11th (Shea, 2020; Kazhdan, 1991). It shows the different offices on seals with the dignity Patrikios.

A historian can use these visualizations to understand how to develop their inquiry. For example, a historian interested in the hierarchy of a specific office such as the dioiketes could use a visualization such as Fig. 3 to understand which other offices ranked similarly over time. 

Figure 5

Figure 5 shows a section of a network graph for the dignity Patrikios. The largest other nodes are the dignities ‘imperial protospatharios’ and ‘anthypatos’, as well as the office ‘strategos’, the name for a military general. You can also see tight clusters of seals together in the graph: these clusters are made up of parallel and related seals, some of which are identical. Some broader clusters might be hints to explore certain connections further. The connection with people (white dots in Fig. 5) could allow historians to better date seals, if there are other sources about a specific person, or to understand an individual’s trajectory through offices and dignities with more context.
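
The idea behind a network view like this can be sketched in plain Python: each seal is linked to the dignities and offices it carries, and the "largest" nodes are the ones with the most links. The edge list below is invented for illustration and is not drawn from the DO data.

```python
from collections import defaultdict

# Invented edges: (seal, dignity or office named on that seal).
edges = [
    ("seal-1", "Patrikios"), ("seal-1", "strategos"),
    ("seal-2", "Patrikios"), ("seal-2", "dioiketes"),
    ("seal-3", "Patrikios"), ("seal-3", "strategos"),
]

# Build an undirected adjacency map.
neighbors: dict[str, set[str]] = defaultdict(set)
for seal, attribute in edges:
    neighbors[seal].add(attribute)
    neighbors[attribute].add(seal)

# A node's degree is the number of distinct nodes it touches.
degree = {node: len(linked) for node, linked in neighbors.items()}

# Offices that share at least one seal with the dignity Patrikios.
co_occurring = {
    attr
    for seal in neighbors["Patrikios"]
    for attr in neighbors[seal]
    if attr != "Patrikios"
}
```

In this toy data `degree["Patrikios"]` is 3 and `degree["strategos"]` is 2, which is the sense in which some nodes appear larger in the graph.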

Next Steps

There still are some technical steps to finish up this part of the project, consisting of changing data types and storage methods to store values more efficiently. We are also planning on connecting the seals data to the PMBZ. These connections are slightly more complicated as the PMBZ does not contain direct references to the new format of seals, so bibliographic information on each seal involved has to be extracted, normalized, and then matched to bibliographic information in the DO collection. For the lab as a whole, this semester we are going to work on building structure for other objects (statues), as well as continuing our focus on buildings and incorporating location and geographical information into the data. 
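
The extract, normalize, and match sequence for bibliographic references might look roughly like the sketch below. It assumes the references have already been extracted as plain strings; the helper names and the sample citation strings are hypothetical, and real PMBZ and DO references would likely need more forgiving matching than exact equality after normalization.

```python
import re
import unicodedata


def normalize_ref(ref: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", ref)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def match_refs(pmbz_refs: dict[str, str], do_refs: dict[str, str]) -> dict[str, str]:
    """Map a PMBZ person id to a DO seal id when their references agree."""
    do_index = {normalize_ref(ref): seal_id for seal_id, ref in do_refs.items()}
    matches = {}
    for person_id, ref in pmbz_refs.items():
        key = normalize_ref(ref)
        if key in do_index:
            matches[person_id] = do_index[key]
    return matches


# Placeholder identifiers and citation strings, for illustration only.
pmbz = {"person-1": "Catalogue, Vol. 1, no. 12", "person-2": "Catalogue, Vol. 2, no. 5"}
do = {"seal-A": "catalogue vol 1 no 12"}
matched = match_refs(pmbz, do)
```

Unmatched references (here `person-2`) are simply left out, which keeps them available for manual review.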

Note: This blog post focuses more on the conceptual aspects of the project. If you are interested in any technical details, reach out to apwilliams@wesleyan.edu. 

Constantinopolitana: A Database of East Rome

Constantinopolitana: A Database of East Rome is a relational database of everything known about Constantinople. The objects, people, buildings, and events are spatially and temporally linked together to create what is essentially a digital map of Constantinople using the research environment nodegoat. To conceptualize potential ways to link the data, the categories places, people, literature, manuscripts, statues, and seals were created. The places category was based on a previous iteration of this project, Constantinople as Palimpsest (CPal), which was originally created using ArcGIS before being transferred to nodegoat [see below].

https://travelerslab.research.wesleyan.edu/files/2024/05/blogmap.png

Places

The places group is concerned with the geography, structures, and cisterns of Constantinople. We were tasked with creating a geographic visualization of the Great Palace of Constantinople, the Hagia Sophia, and other notable landmarks from medieval Constantinople overlaid atop modern Istanbul. We utilized various sources in order to best approximate sizes and locations, and were able to piece together a map on which the other aspects of the project can be positioned. Then, through the GeoJSON format, we were able to map the coordinates, which could then be uploaded to the database. By the end of the semester, we were able to initiate the first linkage between the other aspects of the project by connecting to the statues database, thereby allowing for a spatial visualization of the statues’ locations. The greatest challenge in the project arose from the limited information regarding the exact position of structures that no longer exist.
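
For readers unfamiliar with GeoJSON, the sketch below builds a small feature collection with Python's standard `json` module. It uses a single point with approximate coordinates for the Hagia Sophia as an illustration; the actual CDER features are presumably building outlines rather than points, and the `point_feature` helper is hypothetical.

```python
import json


def point_feature(name: str, lon: float, lat: float) -> dict:
    """Build a GeoJSON Feature; GeoJSON orders coordinates as [longitude, latitude]."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name},
    }


collection = {
    "type": "FeatureCollection",
    # Approximate location, for illustration only.
    "features": [point_feature("Hagia Sophia", 28.98, 41.01)],
}

geojson_text = json.dumps(collection, indent=2)
```

The longitude-first ordering is a common source of misplaced features when coordinates are copied from sources that list latitude first.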

https://travelerslab.research.wesleyan.edu/files/2024/05/blogpost1.png

We used “Buildings.Spaces” as the main object in nodegoat, which captures the information through name, notes, references, and within buildings. By linking it to the sub-object in Statues.Statues, we were able to make the relationship between statues and buildings.

https://travelerslab.research.wesleyan.edu/files/2024/05/blogpost2-1024x628.png

Over the course of this semester, we plan to expand and refine the existing visualization and, eventually, to begin connecting it to the other databases that constitute the project. This expansion will be done in partnership with, and under the oversight of, Professor Alice McMichael from Michigan State University.

by Ruishi Wang and Will Markowitz

People

The “People” object in the database is made up of references to other prosopographical databases covering Constantinople. We are not doing our own prosopographical research for this prototype, as the goal is to combine existing data. To start, we used data from the PMBZ (Prosopography of the Middle Byzantine Period Online). We began with a list of people received from Ivan Maric, a Byzantine Postdoctoral Research Fellow from the Seeger Center for Hellenic Studies. Those people will be used to build the events that Maric studies into the database. However, to build connections between the different categories, we also needed to find and connect people that other objects referenced. For example, a Manuscripts object might have a copyist, and a Literature object might have an author. To figure out how we were going to relate different objects in the database, we needed people from each category. Unfortunately the PMBZ only covers the years 641-1025 AD, so we also used four other databases: the PBW (Prosopography of the Byzantine World), Pinakes, VIAF (The Virtual International Authority File), and the DO (Dumbarton Oaks) Byzantine Seals Collection. Cross-referencing different scholarly mentions of these people and entering them into the database (there are 90 entries as of January 2024) stretched into winter break. The next step for the People category is to add a system of occupation, and at that point it will be pretty much finished for the purposes of the prototype stage.

https://travelerslab.research.wesleyan.edu/files/2024/05/blogpeople-259x300.png

by Alex Williams

Literature

For literature, the first step was to identify an initial group of documents that would be the “test run.” This initial group was based on the manuscript project’s “Dated Lake Manuscripts” spreadsheet. I took the Pinakes URL for each manuscript and created a new spreadsheet that would become the initial repository for the data from Pinakes. In Pinakes (an online database of pre-16th century Greek texts), the “content” section at the bottom of each manuscript’s page was copied into the aforementioned spreadsheet. The Pinakes website is entirely in French, so the automatic Google Translate feature in Google Chrome was utilized. Once all data from Pinakes was found, the works were searched for in the Thesaurus Linguae Graecae, “TLG” (a digital collection of all surviving Greek literature from antiquity to the present). This led to challenges, as the names of both the works and the authors were largely not standardized, and some of the titles of the works in Pinakes were not specific enough to determine the match in the TLG. This spreadsheet was then sent to the outside researcher working on this project, Rue Taylor, for assistance in figuring out the inconsistencies. While waiting for Rue, I uploaded the spreadsheet to nodegoat, creating a single data model, “Lit.Literature,” that included the title in the original language (either Greek or Latin), the title in English, the author’s name according to Pinakes/TLG, the standardized author’s name (usually found through searching the Pinakes/TLG name on Google), the TLG reference number, and the URLs from both Pinakes and TLG. Not all of these were filled out for each work, however. I think the biggest challenge was the beginning of the project, when I was still working out what made literature and manuscripts different projects. I eventually decided to just get the information and determine the best way to present it on nodegoat and to not focus as much on the abstract concepts, which made the project a lot easier to work with.

https://travelerslab.research.wesleyan.edu/files/2024/05/literature-768x458.png

by Olivia Keyes

Manuscripts

For manuscripts, we were creating a dataset to organize three data types. The first is the descriptive information regarding the manuscript itself, specifically the date, call number, and primary contents: in short, the existence of the manuscript itself. These manuscripts are identified in their sources by a Lake number and collected in the Pinakes database, where each can be found through a Pinakes URI. The second data type is the location, often the institution, that these manuscripts belong to. It includes the latitude and longitude as well as the date of creation of these locations, usually religious buildings, which helps us identify the period that a manuscript belongs to. Finally, the manuscript copyist makes up a set of information, identified by their Scribe Name, that is, the name attached to the manuscript which they have copied. These three elements describe the intellectual links between people, institutions, period, and geography. The next step to be taken in this area is to integrate “copyists” as a dataset with “people,” as well as “manuscript places” with “buildings” in general, in order to give a more holistic view of the connections that manuscripts make between people, period, and place.

https://travelerslab.research.wesleyan.edu/files/2024/05/manuscripts.png

by Charlotte Seal

Statues

CDER has been looking to create a repository of important statues from Constantinople. In NodeGoat, we have created data models that express relationships between statues and their specific locations within the city. We have been using Sarah Bassett’s The Urban Image of Late Antique Constantinople for entering descriptions about the statues. These include commentary from primary sources and from Bassett herself, who often refers to other scholars in her analysis. Our data model takes note of this difference between primary and scholarly sources in Bassett’s text. The statues are located in various Constantinople landmarks – such as the Hippodrome or Chalke Gate. This has allowed us to seamlessly relate our work to another project that specifically deals with historic spaces in Constantinople. 

https://travelerslab.research.wesleyan.edu/files/2024/05/statues.png

by Zaray Dewan

Seals

We organized the CDER dataset to include most of the elements in the already existing Byzantine seals database at Dumbarton Oaks, organized by Professor Jonathan Shea, who has also been advising this portion of the project. We started our object dataset by including most of the DO database’s information, but as the seals’ connection to the “People” and “Places” objects became evident we focused on the descriptions that linked the seals to those categories. Most seals carry inscriptions and images featuring people holding offices of various kinds, as they were often used as a means of authentication (of documents, identity, etc.). We began inputting the seals by filtering those which came from offices located within the Great Palace (as suggested by Prof. Shea) and then cross-referencing the people depicted to the PMBZ (Prosopography of the Middle Byzantine Period Online) database, as the Dumbarton Oaks database has started to do this with some seals but not the majority. To continue organizing the dataset, the eventual category of “Offices” will be useful in pinpointing more exact locations of offices within the Great Palace and beyond, and we will be expanding to a catalog of coins as well.

https://travelerslab.research.wesleyan.edu/files/2024/05/seals.png

by Arushi Khare

Data Insights: A Student Speaker Series

On April 2, 2024, the Travelers Lab had the honor of presenting to esteemed professors and members of the College of Letters (COL) and QAC departments. Our presentation centered on summarizing the CDER project, providing an opportunity to showcase our research achievements and outline our future objectives.

Olivia began with an overview of the project and its previous iterations, and discussed the issues and challenges with the ways that the data was initially presented, focusing on the geographical inaccuracies and the lack of readability for a wide audience. Will then explained what Nodegoat is: a relational database that allows for temporal and spatial connections between objects.

To show how this works for the CDER project, a screen recording of last semester’s data model was played. Moving on to the data itself, Zaray pointed out that this project is, at this point in time, creating data from documents and other databases that aren’t necessarily quantifiable in a table. We are also approaching the data input process from a historical perspective, focusing on what historians would be interested in from each category of data. Ruishi then spoke about how we see connections between the data within Nodegoat, and showed the screenshot below of the connection visualizer in Nodegoat.

https://travelerslab.research.wesleyan.edu/files/2024/05/qac.png

To conclude the CDER presentation, Alex explained the interactions between history and data collection, noting that one of the goals of this project is to create a database that can serve as a model for data usage in a historical context and presentation. Daniel Feldman, part of the Chronicles: Text to Data project team alongside Arla Hoxha, Yinka Vaughn, and Diana Tran, then provided a comprehensive update on their ongoing project. He discussed their use of ArcGIS for analyzing historical events by extracting information from textual references such as events, individuals, and chronicles.

By Olivia Keyes
Class of 2025

So you want to map a chronicle in Nodegoat?

By Arla Hoxha

Welcome; this is a guide on how to do it, based on my experience with a 9th-century Carolingian chronicle, Annals of Fulda, during Summer 2023, which will also serve as a practical reference.

Assuming you have already completed step 0, that is, cleaning up and uploading your transcript into the new Project, we can get started. Set up your own object types (using the Nodegoat guides) or simply add those used in AH.FULDA (go to Management and add JWT.person, JWT.pleiades, etc.). Using existing objects would be best, to save time and avoid confusion.

Once your objects and transcript are set up, you can start tagging your text with names of people and places, as well as religious festivities, if applicable. If you need to tag something else, communicate with your supervisor and agree on a new object.

On jwt.person:

This one is pretty straightforward. Every time you see a name in your transcript, tag it. I like to tag by creating new entries as I go (adding a new jwt.person entry when I see a new name). Another way to do it would be to add jwt.person entries beforehand and then tag with existing objects. There is no best way to do this; it depends on what you like most.

On jwt.pleiades:

To set it up: Go to Pleiades (or some other website you are pulling data from) and download the most up-to-date location CSV file. This file (I used one with about 40k entries!) will most likely need lots of cleaning up (through Excel, R, etc.) so it is easier for Nodegoat to process it. Once you have your clean file, you are ready to import (Model-Import-Import csv file). Go to Model-Import-Import Template, then map your CSV file to a data model (an object, new or existing). Make sure all the columns of the file correspond to the correct element of your object and run the template. Voila! Ready for use.
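As a rough illustration of the clean-up step, here is a minimal Python sketch. It is not the script used for Fulda. The column names (id, title, reprLat, reprLong) and the sample rows are assumptions made for illustration, so check them against the header of the file you actually download.

```python
import csv
import io

def clean_locations(rows):
    """Keep rows that have an id, a title, and numeric coordinates; drop repeated ids."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        place_id = (row.get("id") or "").strip()
        title = (row.get("title") or "").strip()
        try:
            lat = float(row.get("reprLat") or "")
            lon = float(row.get("reprLong") or "")
        except ValueError:
            continue  # missing or malformed coordinates: skip the row
        if not place_id or not title or place_id in seen_ids:
            continue  # incomplete row or duplicate id: skip the row
        seen_ids.add(place_id)
        cleaned.append(
            {"id": place_id, "title": title, "latitude": lat, "longitude": lon}
        )
    return cleaned

# Invented sample rows standing in for the downloaded file (assumed column names).
sample = io.StringIO(
    "id,title,reprLat,reprLong\n"
    "p1,Place A,50.5,9.7\n"
    "p1,Place A,50.5,9.7\n"
    "p2,Place B,,\n"
    "p3, Place C ,50.0,8.3\n"
)
cleaned = clean_locations(csv.DictReader(sample))

# Write the slimmed-down table that would then be mapped in the import template.
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=["id", "title", "latitude", "longitude"])
writer.writeheader()
writer.writerows(cleaned)
print(output.getvalue())
```

With a real download you would open the file with open(path, newline="", encoding="utf-8") instead of the in-memory sample. Keeping only the columns your object needs makes the column-to-element mapping in the template much shorter.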

Tagging: Works exactly like person tagging. However, Pleiades is a scholarly portal of ancient locations, which raises two issues: 1) locations will often appear only under their Latin name, which you need to look up to find the location you need in Pleiades, and 2) sometimes Pleiades will not have the location you are looking for. In the first case, track down the Latin name and leave a short note/link to explain where you found it. In the second, use the already set up object with GeoNames locations (this is still not fully functional; we’re working on it!) or use a general/broader location from Pleiades.

Every once in a while I found that during clean-up I had deleted a location that then showed up in the text, which I had to add manually to my jwt.pleiades. Keep a Pleiades search tab open, just in case.
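The order of fallbacks described above (Pleiades first, then GeoNames, then a broader Pleiades location) can be written down as a small lookup function. This is only a sketch of the decision order, not something Nodegoat runs. The function name, the dictionary layout, and the placeholder ids are all hypothetical.

```python
def resolve_place(name, pleiades, geonames, broader_region=None):
    """Return (source, record) for a place name, trying each fallback in order."""
    key = name.strip().lower()
    if key in pleiades:
        return ("pleiades", pleiades[key])
    if key in geonames:
        return ("geonames", geonames[key])
    if broader_region is not None:
        region_key = broader_region.strip().lower()
        if region_key in pleiades:
            return ("pleiades-broader", pleiades[region_key])
    return ("untagged", None)  # leave the place tag empty

# Hypothetical lookup tables keyed by lower-cased name; the ids are placeholders.
pleiades = {"bavaria": {"id": "hypothetical-1"}}
geonames = {"village x": {"id": "hypothetical-2"}}

print(resolve_place("Village X", pleiades, geonames))
print(resolve_place("Palace Y", pleiades, geonames, broader_region="Bavaria"))
print(resolve_place("Palace Y", pleiades, geonames))
```

Returning the source alongside the record makes it easy to see later which tags came from a fallback and may deserve a second look.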

Once smaller elements are tagged, you can start tagging events. What is an event? It’s hard to say, but here’s a semi-coherent way of deciding what to tag together. 

An event should have:

Time: open, a single date, or no date

Place: single or multiple if unspecified

Agent: single, although there can be multiple

Narrative: single

With event tagging you are turning a chronicle entry into bite-sized pieces of single events. An event does not need to be one action, although it can be. Bite-sized does not mean every sentence has to be one event (although sometimes it is). Most events are a single narrative happening on a specified date (or over an unspecified time frame that captures the whole event) and place. Place is probably the last thing to focus on because a single event can sometimes take place over several locations. An event can contain one or more characters and one or more actions. Naming is difficult, but choose to be descriptive over detailed. Think ‘how can I say what’s going on using the fewest words?’ Sometimes centering the title of the event entry on a word in the text can make the task easier as well. Deciding how to tag dates was difficult, but we agreed on writing down the exact time in the descriptor if it is given by the text (even if not in date form, e.g., spring, November, mid-August) and using a sequence in the date sub-object.
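If it helps to see these criteria as a data structure, here is one possible sketch in Python. The field names are my own labels for this illustration and are not Nodegoat's; the eight-word limit on titles is an arbitrary stand-in for "descriptive over detailed", and the example values are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Event:
    title: str                         # short and descriptive
    sequence: str                      # narrative order, e.g. "871 01"
    agents: List[str]                  # one or more people
    places: List[str] = field(default_factory=list)  # empty if the text names none
    exact_time: Optional[str] = None   # e.g. "spring", only if the text gives it

    def problems(self) -> List[str]:
        """Return a list of reasons this event may need another look."""
        issues = []
        if not self.title.strip():
            issues.append("missing title")
        if len(self.title.split()) > 8:  # arbitrary threshold, assumption
            issues.append("title may be too detailed")
        if not self.agents:
            issues.append("no agent tagged")
        return issues

# Invented example values, not taken from the chronicle.
ok_event = Event(title="Army crosses the river", sequence="871 01",
                 agents=["Person A"], exact_time="spring")
thin_event = Event(title="Assembly held", sequence="871 02", agents=[])

print(ok_event.problems())
print(thin_event.problems())
```

Places default to an empty list because an event with no stated location is still a valid event, while an event with no agent is flagged for review.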

Reconciliation is useful but not flawless. Create a database of objects for it to pull from before using it. I already updated the places reconciliation to draw from Pleiades. Its biggest flaw, I think, is that it cannot detect and generate objects for you. You need to already have the objects you are reconciling and Nodegoat can tag them for you. Or you can go back and manually tag them yourself. Initially, I recommend manual tagging, so you have a library to pull from.
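To make that limitation concrete, here is a toy name matcher in Python. It is not how Nodegoat's Reconciliation works internally (I am only imitating the idea of matching text against objects you already have), and the sample sentence is invented.

```python
import re

def find_known_names(text, known_names):
    """Return sorted (start, end, name) tuples for known names found in text."""
    matches = []
    # Try longer names first so "Louis the German" wins over "Louis".
    for name in sorted(known_names, key=len, reverse=True):
        pattern = r"\b" + re.escape(name) + r"\b"
        for m in re.finditer(pattern, text):
            overlaps = any(s < m.end() and m.start() < e for s, e, _ in matches)
            if not overlaps:
                matches.append((m.start(), m.end(), name))
    return sorted(matches)

# Invented sample sentence; "Fulda" is deliberately absent from the known names.
text = "Louis met Charles at Fulda. Louis the German left."
known_people = ["Louis", "Louis the German", "Charles"]

tags = find_known_names(text, known_people)
print(tags)
```

"Fulda" is never tagged here because nothing in the list corresponds to it, which is the same reason it pays to build your library of objects by hand first.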

The Fulda project is now completed and it can be used as a reference for your own work. When in doubt, go back to the model or this guide. This is, of course, an incomplete guide; for detailed how-tos, check Nodegoat’s Guides and Documentation pages. Different projects require different accommodations, but I hope this will be helpful to anyone getting started with text mapping in Nodegoat.

PS/ Important notes:

  1. When there is no place specified, leave the places tag empty.
  2. Tag dates as a sequence (e.g., 871 01, 871 02, etc.) based on the order in which they appear in the text, not on actual chronology.
  3. Fill the date tag in the description with the exact time (if provided by the text).
  4. Religious celebrations are not tagged as separate events (but usually as part of the previous one) because they have their own tags (jwt.religious) and are usually integrated into the other event in some way.
  5. When the location specified in the text cannot be found anywhere, tag it with a broader location (if you can’t find x village or y palace in Bavaria, tag them as Bavaria).
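Note 2 is simple enough to show in a few lines of Python. This is only a sketch of the labelling convention; the function name and the event titles are made up for the example.

```python
def sequence_labels(year, titles_in_text_order):
    """Pair each event title with a 'year nn' label in order of appearance."""
    return [
        (f"{year} {position:02d}", title)
        for position, title in enumerate(titles_in_text_order, start=1)
    ]

# Invented titles, listed in the order they occur in the entry for 871.
labels = sequence_labels(871, ["First event", "Second event", "Third event"])
for label, title in labels:
    print(label, title)
```

The label depends only on position in the text, so an event narrated last gets the highest number even if it happened first historically.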

The Event-Based Narrative in the Annals of Fulda: Results

The Fulda project, a quantitative-qualitative analysis of the chronicle Annals of Fulda through the platform Nodegoat, resulted in a fully-fledged database of chronicle entries, people, places, and events. The model used to map events is a novelty in the field of chronicle studies, and one we hope will continue to be replicated and improved upon. We hope our database will aid scholars reflecting on this time period or thinking about questions of narrative and the anatomy of the chronicle: why are they put together in a certain way? Enriching the model and data and refining our processes are next for our team at Traveler’s Lab. We followed in the footsteps of previous projects in the Lab, and even though we started Fulda from zero, many of our goals were realised during this summer, some of which are outlined in this article.

The obvious, and maybe the most important, achievement was the database itself, with the chronicle fully uploaded to Nodegoat. Anyone with access to the database can find, categorize and visualize elements of the chronicle, such as the people or places in it. The chronicle entries are tagged by which chronicle and manuscript they belong to and the text is fully mapped with object tags. This makes it easy to analyze the chronicle based on the elements that interest a researcher. Through the Nodegoat configuration, it is possible to see how all the data is linked: what events take place where, who is involved, how many times a name is referenced in a chronicle entry, comparisons between multiple entries or events, and more.

A great feature of the database is the link to ancient locations through Pleiades. All events have a location tag which allows us to visualize the events in a geographical map. Interconnectivity is one of the best things about this model because not only do we have data on different places and events but we know what happened where and who was involved.

Creating a model for determining events that allows us to follow the logic of the narrative was an important achievement this summer. The process involved much trial and error and remains a work in progress, but we were able to refine the bulk of the process, as explained in the last article. In creating the event objects for the database we made sure to use the text’s language and to date the events based on the sequence of the narrative rather than historical time (although a descriptor was provided for this, in case a specific date was present in the text). All this was done to shift the focus towards the narrative and follow the logic of the chronicle and what the text deems important, as opposed to our reading of it. We hope this model will inspire and allow for more thorough analysis that leaves less room for misinterpretation.

As a future expansion of the project, we hope to use the comparative tools provided by Nodegoat and the construction of the model to run comparisons between different manuscripts as well as the English and Latin versions of the text. Onboarding a scholar of Latin to work with the team is an aspiration which would further enrich the Fulda project.

As stated before, we hope to expand our model to include data from other Carolingian chronicles, such as The Royal Frankish Annals. We hope to inspire other scholars to use quantitative methods, especially those that centre the narrative, in their research of chronicles from all time periods. 

Although much progress was made during this summer, there is always room for improvement. In the upcoming semester, we anticipate fixing our problem with accessing locations not covered by the Pleiades ancient locations database through an API. We also hope to find ways to automate as much of the text-tagging process as possible. Some of this has already been done through the Reconciliation system in Nodegoat, but we wish to refine the process further. The use of AI shows great promise in this regard, as we discovered during an experimental session using OpenAI to perform text tagging of people. Implementing and integrating this process into the Nodegoat object tagging is one of our goals for the future.

The Event-Based Narrative in the Annals of Fulda: Methodology

Introduction

In line with other Traveler’s Lab projects, this undertaking was the beginning of a long exploration of using quantitative methods in the study of medieval chronicles by following the logic of the text through its narration, rather than that of chronology. This project, drawing from the 9th-century Carolingian chronicle, the Annals of Fulda, served as an experimental model that will inspire similar practices in the way we study chronicles. Describing the work of a whole summer, this article will focus on the methods used to study the Annals of Fulda, including the constructed models we hope will have a wider impact.

Methodology

The whole text of the Annals of Fulda was parsed, scanned and uploaded to Nodegoat, a web platform that allows for data modeling and contextualization through spatial and temporal elements. Nodegoat allowed us to create our own objects to map our data (from the text), such as Person (the historical people who were part of the event) and Places (the geographical area where the event was happening). The text was systematically mapped with tags of Person, Places and Event objects. A new object added to Fulda was the Religious tag, which is used to map religious celebrations, such as Easter or Christmas, that occur throughout the text. Although we had not used the platform before, starting to map Fulda was made easier by following the example of Fulda’s sister project, The Royal Frankish Annals, modeled by Daniel Feldman. Many of the objects were therefore already set up, and only needed to be furnished with the new data. In order to have both projects share the same object database, we created the Chronicle object to differentiate between them as well as between different manuscripts of Fulda.

The team had already started thinking about new ways to express events so that they help us better understand the narrative. The way the project defines an event is different from how we might ordinarily think of one. For instance, an event is not only a battle or a coronation, or an ‘important’ happening; anything can be an event. In fact, everything is. Every couple of sentences focusing on a specific narrative (following certain guidelines for time and place) was mapped as an event.

Determining what constitutes an event and creating the event dataset was a challenging experience and a process we are still refining. With the intention of fully capturing the text of the chronicle, we started developing a model where every sentence would be an event, but soon realized that this would not fully capture the scope of the narrative. We then opted for a more narrative-focused definition of the event, where events terminate depending on a change of temporal identifiers as well as agents in the narrative. To avoid bypassing the text (as the short titles do not allow for detail) we decided to add a ‘Passage’ descriptor, where the text of the particular event is disclosed.
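The boundary rule can be sketched in Python under some loud assumptions: that each sentence has already been annotated by hand with a time marker and its agents, that a sentence without a marker inherits the previous one, and that a change in either the marker or the agents starts a new event. In practice the decision was made by a human reader, so this is an illustration of the rule rather than the procedure we followed. All names and sample values are invented.

```python
def split_into_events(sentences):
    """Group annotated sentences into events by time marker and agents."""
    events = []
    current_time = None
    for sentence in sentences:
        # Assumption: no explicit marker means the previous one still applies.
        time = sentence.get("time") or current_time
        agents = frozenset(sentence["agents"])
        same_event = (
            events
            and events[-1]["time"] == time
            and events[-1]["agents"] == agents
        )
        if same_event:
            events[-1]["passage"].append(sentence["text"])
        else:
            events.append({"time": time, "agents": agents,
                           "passage": [sentence["text"]]})
        current_time = time
    return events

# Invented annotations standing in for a hand-tagged chronicle entry.
annotated = [
    {"text": "Sentence A.", "time": "spring", "agents": ["Person X"]},
    {"text": "Sentence B.", "time": None, "agents": ["Person X"]},
    {"text": "Sentence C.", "time": None, "agents": ["Person Y"]},
    {"text": "Sentence D.", "time": "mid-August", "agents": ["Person Y"]},
]

events = split_into_events(annotated)
for event in events:
    print(event["time"], sorted(event["agents"]), " ".join(event["passage"]))
```

The passage list plays the role of the ‘Passage’ descriptor: the full text stays attached to the event even when the title is short.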

The event object was the most important yet most difficult to develop. We went through a long trial and error process figuring out what descriptors to attach to the object, in a way that was useful but not redundant. The event object is now linked to the chronicle entry (the text of the chronicle by year) and to the person and places objects, and has a sub-object denoting time.

The places object is connected to Pleiades, a database of ancient locations (along with their longitude, latitude, Pleiades id) which we imported to Nodegoat. The location identifiers in places allow us to visualize the ancient locations where the mapped events happened. 

Dating the events was another issue, since only some of them have a time identifier. We decided that instead of following a chronological logic, by using estimates and dates the text provided to date events, we would follow a narrative logic, by not ‘dating’ the events per se. Instead they would be connected to each other sequentially, as dictated by the narrative determined by the chronicler; sometimes narrative and historical time are not interchangeable. To preserve the information the text provides we added a descriptor for ‘exact dates’, to be used when the text provides one.

Now that we have created a database of objects, Nodegoat’s Reconciliation feature allows us to map objects such as Places and Person to the remaining chronicle entries. Although not a flawless process, Reconciliation allows for semi-speedy execution of an otherwise laborious task. We are still working on ways to automate the process of text-tagging and potentially extend it to other objects, such as events.

Travelers Lab Presents at the Fall 2020 New England Historical Association Meetings

At the Fall 2020 meeting of the New England Historical Association, held virtually, members of the lab provided the papers for a panel called “Traveling in the Middle Ages: Using Digital Methods and Spatial Analysis for Historical Research.” Chaired by Ella Howard of the Wentworth Institute of Technology in Boston, it was notable for featuring a co-written paper by Wesleyan and Lab alumnus Connor Cobb ’18 as well as three other lab members. The papers were “Women at the Common Law: Travel and Gender in Thirteenth-Century English Courts,” by Gary Shaw and Connor Cobb, Wesleyan University; “The Camino de Santiago: Student Researchers and Creating a Database for Spatial Analysis,” by Sean Perrone of Saint Anselm College, who organized the panel and is in fact the President of the New England Historical Association; and “Medieval Travel as a Big Data Problem,” by Adam Franklin-Lyons, who is now notably based at Emerson College in Boston. All told, the session exemplified the lab’s collaborative character, the goal of integrating students (and former students) into our work, and our interest in combining innovation in pedagogy, interest in the theoretical and methodological challenges of such work, and an ongoing commitment to classical historical scholarship.