Comparing Chronicles at the IMC Leeds 2025: The Annals of Fulda—Events Over Years as a New History of the Text?

By Jesse W. Torgerson

On behalf of the Comparing Chronicles Team: Churchill Couch, Zaray Dewan, Arla Hoxha, Diana Tran, Tess Usher

The Comparing Chronicles Project took some significant methodological and theoretical steps in 2024-25. The project began in 2023 as an investigation into (1) frameworks of historical time; and, (2) what historians could do with only partially accurate data. By Summer of 2024 the project had turned to nothing less than developing an alternative to the Historical-Critical Method of studying a text. The following narrates a stage on this journey, and a bit of how we got here.

The Comparing Chronicles project was invited by The Flow Project to participate in a panel at the July 2025 International Medieval Congress at Leeds University.

The Flow Project (led by Tobias Hodel at University of Bern and Silke Schwandt at Bielefeld Universität) pursues “standardized digital workflows based on existing technology, making it easier for researchers to work with historical sources digitally.” This simple statement contains a significant advance in the Digital History (and Theory) landscape. “Digital workflows” are already a part of historical methods, but without concerted efforts to collaborate with each other, scholars find themselves each inventing different but functional versions of the wheel (or, to give a more relevant example, parallel means of extracting machine-readable text from handwritten sources).

Panel 544, Digital Data Flows, promised different means of processing medieval documents, and allowed us to present the work done in our Comparing Chronicles project as a unique example of what difference a standardized digital workflow can make for the comparative study of Early Medieval Chronicles, using the example of three different versions of the Annales Fuldenses (Annals of Fulda).

The following is an abbreviated version of the remarks made at this panel.

Our Comparing Chronicles Project has been constructed using the web-based relational database tool Nodegoat. We have used the Nodegoat Go license subscription at Wesleyan University Library to create a fully collaborative research environment.

As our team builds this database together we are constantly re-thinking our processes (i.e., our Digital Data Flows) and as a result continually updating our methodology, which takes the form of a new structure to our relational database.

The most significant shift in the past year has been from a TEI-based methodology to one that distinguishes different elements of the text and its structure as distinct but related datasets (or, in Nodegoat’s terms, “Objects”).

Each of the boxes in the above image indicates a distinct dataset. Chronicle is the title of the work (in the next stage of our project it will be a specific manuscript). The Chronicle Entry is the text under each year in that chronicle, in both Latin and (for reference) English, providing the text as it appears in each work (or, eventually: transcribed manuscript).

The next two datasets represent two levels of our distinctive analytical contribution to study of the text. Passage is the division of each annual entry’s text into distinct narrative sections, and the delineation of those narrative sections by the order (“Passage Number”) in which they occur in the annual entry. Thus the “name” of a passage might be AF 2 887 07, where AF 2 means “Annals of Fulda v.2,” 887 is the Chronicle Entry in which the passage occurs, and 07 denotes this as the seventh distinct narrative unit in the entry. We tag the text of the Passage for persons and places (the only use we make of TEI or text-tagging).
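The naming convention just described can be sketched as a small helper (a hypothetical illustration of the scheme, not part of our Nodegoat workflow; the function name is ours):

```python
def passage_name(version: int, year: int, passage_number: int) -> str:
    """Build a passage identifier such as 'AF 2 887 07'.

    'AF <version>' names the version of the Annals of Fulda, <year> is
    the Chronicle Entry under which the passage occurs, and the final
    field is the zero-padded order of the passage within that entry.
    """
    return f"AF {version} {year} {passage_number:02d}"
```

So `passage_name(2, 887, 7)` yields `AF 2 887 07`, the seventh narrative unit under the year 887 in the second version of the Annals of Fulda.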

Event is where our analysis fully enters the picture. Here we give each Passage a label, in Latin (and translated into English for non-Latin-literate users). The Event name is based on the grammatical phrase which we have isolated as the focus of its narrative. The Event name uses the actual Latin of the text (whenever possible) to designate the central event of a passage. For instance, Celum Apertum is the event name for the 14th passage under 887 in the third version of the Annals of Fulda (or, in our shorthand: AF 3 887 14):

Et mirum in modum, usque dum honorifice Augensi ecclesia sepelitur, celum apertum multis cernentibus visum est, ut aperte monstraretur, qui spretus terrenae dignitatis ab hominibus exuitur, Deo dignus caelestis patriae vernula mereretur feliciter haberi.

If it is helpful, Event can also be thought of as Episode, a name we have considered using instead. Finally, we give each Event one (or, in extreme cases of ambiguity, two) Event Type labels.

The following extended paragraph explains our use of the Event Type (if you are not concerned about it, feel free to skip down), which is not in fact central to our analysis or project. The important point is that these types are helpful to our analyses, but they are not what we are analyzing about the chronicle texts. They are imposed analytical categories, which is why we have made them a distinct part of the database (it is possible to study the text without using these types as an analytic). At the same time we have taken great care, going through many different versions in extensive internal debates before settling on a list of eleven. We have found this list to be sufficient for capturing the different sorts of events which the Annals of Fulda uses to fill out its annual entries. These are: Campaign, Birth/Death, Office/Succession, Meeting, Assembly/Council, Travel/Embassy, Dispute, Celebration, Construction, Commentary, Phenomenon (note: this is what we assigned to AF 3 887 14, above). It needs to be understood that the Event Type is also NOT the goal of the analysis. These distinctions allow us to identify potential similarities between texts or between entries. In the future they will allow us to make some overall statements about entries (i.e., an entire year’s entry), or about trends or emphases in different texts as a whole. But ultimately they are simply an analytical tool to understand the text, rather than being core to our argument about the text. All of this follows a central tenet of our database structure: to keep our analytical work (in the Passage, Event, and Event Type datasets or objects) separate from the digital text itself (in the Chronicle and Chronicle Entry datasets).

Welcome back. The key to our analytical interests—and so our intervention into the text and the history of the text—is the combination of Passage and Event.

Our data creation, and our analysis thereof, is based on the following premises. Chronicles (chronographic, chronological, and annalistic texts) are rarely attributed to a single author, and even when they are, they are rarely the product of (a) a single moment and/or (b) a single person’s unique assemblage of information. Chronicles are much more often (1) collective, collaborative enterprises; (2) written over long periods, re-written, both, and more; and (3) compiled from excerpted, rewritten, partially rewritten, and/or orally-transmitted pieces of information. As a result, what interests us as historians is not a textual-originalist (i.e., the Historical-Critical Method) approach which seeks to reproduce the text as it emanated from the quill of its author-scribe. Rather, what interests us are the social conditions of the production of the text, whatever those happen to be. This has led to our analytical interest in two things:

  • any and all informational overlaps between extant versions (i.e., down to the variant manuscripts) of any chronicle text, since each of those overlaps indicate to us shared knowledge
  • shared knowledge in turn indicates the social networks through which that knowledge was shared

Our relational database has shifted its structure over the past two years to give us greater access to these two aspects of the evidence we find the text contains. What we have learned from observing the Flow Project is to identify where digital tools are and are not useful for this study. Thus far they are useful in two areas: (1) tagging persons and places in each Passage through Named Entity Recognition (Nodegoat’s built-in internal process for this is “reconciliation”); and, (2) analysis and visualization of the “informational overlaps” and “social networks” noted above. In the near future we will add (3) Handwritten Text Recognition to our workflow when we turn to using the actual surviving manuscripts for our Latin text rather than the printed critical editions (which we continue to use as we build our database model). Identifying our unique ‘digital flow’ emphasizes how much our work remains tied (by the necessity of our own standards of exactitude) to careful reading practices, double- and triple-checking all of our data as we proceed.

At the time of the IMC Leeds presentation we were able to offer the following.

Having conducted an initial analysis of three different versions of the Annals of Fulda (AF 1, AF 2, AF 3 in our representation), we were able to display the overlap of the entries in each of them (i.e., where the entire text for a year was the same) in the following image.

AF 1 shares all of its entries with both AF 2 and AF 3, which in turn share a number of their entries with each other but then possess their own distinct entries. This is simply a visualization of the text history as manifest in the MGH critical edition.

What our Comparing Chronicles methodology does to the text is visible in the next image. This is a comparison between the entries for 882 in AF3 (to the left) and AF2 (to the right), which are each represented by the large blue nodes in the bottom corners.

The orange nodes clustered in descending numerical order on the left and right sides each indicate a discrete Event (i.e., narrative episode) in each text. We have activated the persons (bottom, pink) and the Event Types (top, orange) but de-activated the places (blue, floating in the middle) of each version to make it possible to see the encoded relations.

The key element of interest, above, for our purposes is the six Events (purple) where the visualization displays the labels (in English here, rather than in Latin) of the events which we noted as the same event. That is, these six (“Louis III died,” “Comet Appeared,” “Army Returned,” “Northmen burnt Koblenz,” “Northmen burnt Trier,” and “Bishop Wala attacked Northmen”) we read as the same core episode even though the text describing or narrating each was different in each text. In numerical terms, AF 2 has 18 distinct Events, AF 3 has 20 distinct Events, and they share 6.

According to the Historical-Critical Method, these are completely divergent textual traditions. According to our Comparative Method, this single year’s entry in each text possesses evidence of a 23-25% overlap in the knowledge network between these two textual communities, as recorded in their respective annals. This is already an exciting and promising result, just from this small test case.
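The arithmetic behind that percentage range can be sketched as follows. This is a hypothetical helper, and it assumes that the distinct-Event counts above exclude the six shared Events, so that each entry’s total is its distinct Events plus the shared ones:

```python
def overlap_share(unique_a: int, unique_b: int, shared: int) -> tuple[float, float]:
    """Fraction of each entry's total Events that both entries share.

    Each total is taken to be that entry's unique Events plus the
    Events it shares with the other entry (an assumption about how
    'distinct' is counted in the text above).
    """
    total_a = unique_a + shared
    total_b = unique_b + shared
    return shared / total_a, shared / total_b

# AF 2: 18 unique Events, AF 3: 20 unique Events, 6 shared
# -> 6/24 = 0.25 for AF 2 and 6/26 = 0.2307... for AF 3
```

Under that reading, the shared Events make up 25% of the AF 2 entry and about 23% of the AF 3 entry, i.e. the 23-25% range reported above.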

In this upcoming semester we will be applying our most updated methodology to an encoding of a combined Latin-English database of the three versions of the Annals of Fulda, while also applying this same methodology to the Annals of St. Bertin. This work will give us our final methodological prototype, making it possible to visualize what the evidence from a surviving text of a single knowledge network looks like—even when there are three versions of the text spread out amongst the sister monasteries of the monastic house of Fulda—as compared to a distinct knowledge network, that of St. Bertin.

However, our final visualization gives a hint of what is in store. This is a detail from the social network of the Entries and Events in both the Annals of Fulda and the Annales Bertiniani. While these networks are clearly distinct, the Events (in purple) which hang like spiderwebs between the two clouds indicate events which are the same episode. Our methodology already allows us to see that even though these are distinct textual traditions, there are meaningful links in the historical knowledge which they shared and recorded in common.

Annales Fuldenses, Humanist Library of Sélestat MS 11

(Year 855)

The Traveler’s Lab at the Valle Gianni Field School

By Kathryn L. Jasper

Since its inception the Valle Gianni Field School (or Northwest Bolsena Archaeological Project – NBAP) has found new ways to collaborate with undergraduate students in the archaeological excavation of an imperial Roman villa. In the past three years that collaboration has included Wesleyan students trained in the Travelers’ Lab. The Valle Gianni excavation expands the historical narrative of central Italy during and after the Roman period by focusing on a region that has received little attention from archaeologists and historians. The published report of the initial seasons is available from Fasti Online. The excavation is funded by an ARCS research grant from Illinois State University.

Valle Gianni site on Lake Bolsena
Base Aerial Image (Ryan Lange, 2025)

What is the Valle Gianni Excavation?

The archaeological site of Valle Gianni is currently defined by two primary features—the partially exposed remains of a monumental fountain from the Roman imperial period, and some recessed vats associated with wine production. When working on site students experience every part of the excavation process (survey, mapping, excavation, ceramic analysis, etc.). Students thereby actively contribute to recovering the now-lost relationship between the landscape of this region and the centuries of human actors who lived, farmed, and built residences there.

Direct Student Involvement

One of the most exciting and innovative aspects of this project is the direct involvement of undergraduate students in not only the excavation work but in data creation and analysis. The interdisciplinary research-teaching model of the Valle Gianni Field School unites humanities and STEM fields by bringing together the diverse expertise of faculty instructors from four different departments. For students this means an opportunity to not just think about or see the advantages and challenges of interdisciplinary research, but to actually participate in them in real time.

Participating students have the chance to play a role in research outputs, either as co-authors or independently. The Valle Gianni Field School seeks in particular to encourage undergraduates who might not otherwise consider pursuing graduate-level research to develop research expertise and experience. Ultimately, our approach offers a new model for mentorship at Illinois State University, and beyond, through the participation of other schools in the project such as Travelers’ Lab research students from Wesleyan University.

The Role of Wesleyan University Traveler’s Lab Students

The Valle Gianni project requires a database platform at once capable of storing diverse types of data and of visualizing that data in specific ways for analysis. This is a complicated undertaking because of the interdisciplinary nature of the venture, and the resulting range of media and data types which the project generates and accumulates.

To this end, we adopted Nodegoat (see: www.nodegoat.net) to manage our data. Nodegoat gives scholars the freedom to generate and analyze datasets based on their own specific, bespoke data models. Moreover, Nodegoat allows relational modes of analysis, which combine both spatial and chronological data. These features make it ideal as a platform for archaeological data.

Wesleyan students who trained in the Travelers’ Lab have pioneered the integration of Nodegoat with the Valle Gianni project goals. In 2023 two students—Sarah Brown and Sofia Gallegos—worked together to develop the first relational database model for the Valle Gianni data. Then the following year (in summer 2024) Sarah Brown traveled to Valle Gianni and participated in the Summer Field School’s excavation in order to implement that database live for the first time into the excavation workflow, and to incorporate the new data generated that year.

Most recently, in summer 2025 another Travelers’ Lab student from Wesleyan—Alex Williams—spent four weeks on site. Alex developed an advanced understanding of the archaeological process and how data collection functions on the ground.

Advancing Valle Gianni with Relational Databases in Nodegoat

Specifically, in 2025 Alex Williams addressed herself to integrating the two major types of data which the project generates: data describing physical objects, and digital-born data. Digital-born data include geospatial data (vector and raster), digitized photos—such as photos pertaining to photogrammetry used to produce 3-D digital models—and digitized copies of analogue records generated during the survey, excavation, and subsequent analyses (e.g., site notebooks and excavation diaries), satellite images, and semi-automated data recorded in already-structured machine-readable formats (e.g., spreadsheets and databases).

Excavation photographs and a digital topography drawn by Simone Moretti Gianni

Other found physical objects include stone, tesserae, ceramics, tile, metal objects, slag and metal derivatives, but also soil samples, botanical remains, faunal remains (including osteological remains and shells), and glass. See some representative excavation photographs below.

Representative object finds. L-R: glass fragments, bronze coin, iron nail, mosaic tesserae, loom weight

Alex Williams’ careful attention resulted in successfully reconciling manual recording practices with the semi-automated generation of digital databases, and in this tangible way her work significantly improved the Valle Gianni workflow. The contribution of the Travelers’ Lab students to the Valle Gianni data model can be seen in the comparison, schematized below, of the initial data model through 2024 (black outline) and the comprehensively redesigned, rationalized, and simplified model of 2025 (blue outline) as compiled by Alex Williams.

We regard the accessibility of our data as a critical component of our project. The new developments and components of our workflow have added new facets to the curriculum. These ensure that our data is available, accessible, and as easy to use and understandable (“readable”) as possible. Nodegoat actively helps make this possible. The platform will allow us to transfer data to a public-facing website with intuitive user functions.

The development of our work and processes continues. We continue to refine our databases and their relationality, which in turn opens up new opportunities for new student researchers as well as new types of student research: database design is now a part of the program curriculum.

We invite any students interested in helping with this aspect of the project (data management and web design) to apply and inquire about participating in the project, whether during the summer field school or during the academic year. And, of course: all aspiring archaeologists (to right: ISU student Hannah Torkelson) are always welcome to come and help us move dirt around the Italian countryside!

Fall 2025 Event Type Organization

By Diana Tran

In May 2024, we started with Daniel Feldman’s and Arla Hoxha’s event types as a basis to categorize events into ten different types, as shown below:

  • Mission (External)
  • Petition (Internal)
  • Campaign
  • Phenomenon
  • Birth/Death
  • Office (previously titled Succession)
  • Dispute
  • Celebration
  • Commentary
  • Construction

In Summer 2025, Churchill Couch and Tess Usher worked with Principal Investigator Jesse Torgerson to update the event types. From the original ten event types, they added an eleventh and updated certain event types to ensure higher accuracy. The changes are as follows:

  1. Travel/Embassy, replacing Mission
  2. Petition, split into Meeting and Assembly/Council

According to Professor Jesse Torgerson, here are the definitions of the new event types:

  1. Travel / Embassy: Ultimately this is about tracking the movement and exchanges between those who run political collectives – kings, popes, emperors, etc. “Travel” here is noted not because there is movement implied in the action, but because the action IS movement. E.g., the King going to Worms is in and of itself an event worth noting. Berengar (e.g.) going to Worms is not an event in and of itself (rare exception: a nobleman such as Berengar is ACTING like a King, i.e., putting together a rebellion), but rather will be part of another action such as a Campaign.
  2. Meeting: Here the event action is characterized by subjects of the text getting together for the purpose of exchanging something, or making some kind of communication. Examples include people bringing a petition to the King, or the King asking a nobleman to come to court for something not particularly negative. One issue to pursue / keep thinking about: currently, when one side in a campaign comes to meet with the other side and negotiate something (e.g., terms of surrender), we have tagged this as a Meeting.

Meeting is meant to encapsulate all meetings within the chronicles. These meetings don’t necessarily need to be antagonistic, but they do need to have two or more Persons involved. Professor Torgerson does acknowledge a possible conflict in the example he gave for Meeting: it could also be counted as Embassy, because it is a meeting between two warring factions (political collectives) as a result of a campaign.

Lastly, 

  1. Assembly/Council: The event action is defined typically by the word ‘assembly’ or ‘council’ within the event itself. It is typically an official meeting organized by an official, such as a king, emperor, or a papal leader. 

Petition was deleted in favor of Assembly/Council and Meeting for the sake of more specificity in our categorization. The move to create Assembly/Council is a callback to Daniel Feldman’s tag general assembly. This time, we’ve broadened the scope of this tag to include council meetings as well (based on context), and the specific keyword ‘assembly’ is a common indicator to use this tag.

A change we’ve made is that one passage can be tagged with two event types: the passage is split into two events, and each event is tagged with the relevant event type.
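That rule can be sketched as a minimal data structure (hypothetical, and much simpler than Nodegoat’s actual Object format): a passage needing two event types is split into two Events, each carrying exactly one type and each linked back to its passage.

```python
def split_passage(passage_name: str, event_types: list[str]) -> list[dict]:
    """Create one Event per required event type, each linked to the passage.

    A passage tagged with n event types becomes n Events; no single
    Event ever carries more than one type.
    """
    return [
        {"passage": passage_name, "event_number": i, "event_type": event_type}
        for i, event_type in enumerate(event_types, start=1)
    ]

# A passage whose narrative needs both types becomes two Events:
events = split_passage("AF 2 887 07", ["Travel/Embassy", "Meeting"])
```

Here `split_passage` and the dictionary fields are invented for illustration; only the one-type-per-event rule comes from the methodology above.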

Thus, the new event types are:

  1. Birth/Death
  2. Office/Succession
  3. Celebration
  4. Construction
  5. Travel/Embassy
  6. Meeting
  7. Assembly/Council
  8. Campaign
  9. Phenomenon
  10. Commentary
  11. Dispute

We are hopeful that this will be our final iteration of the event type categorizations; however, if major issues appear as we continue to utilize this methodology in Fall 2025, we will revisit it.

Fall 2025 Chronicles Methodology

By Diana Tran

A History:

In Spring 2024, we had written up a methodology to show newcomers to the Traveler’s Lab how to tag in Nodegoat. Under this methodology, the tagging would occur in the body of the chronicle entry. The body of the chronicle entry would be fully tagged like so:

The tagger would go through the entire entry and tag the People and Places first before beginning to tag the events, one by one, in the chronicle entry. With this methodology, problems would come up when an event was too long to tag and/or was prone to mistakes, especially as the tagger would be looking through thousands of words to tag. Context was lost during long rounds of tagging, so an ‘emperor’ might not be tagged as the correct one, which needed to be rectified by a second round of tagging. We regarded these two problems as major time sinks. When we met with the Nodegoat developers, they suggested we cease tagging in the body of chronicle entries and instead utilize the cross-reference function more often.

A Passage: 

Rather than tagging in the chronicle entry, the events would be split into Passages:

Then the individual passage would be tagged, Person/Place first and then Event. The Passage would be numbered by sequence, and the naming pattern would be the chronicle entry followed by the passage number within that entry. Shown below is what a complete, tagged passage would look like:

and the accompanying chronicle entry:

New System:

Over the summer, Churchill Couch, Tess Usher, and Principal Investigator Jesse Torgerson developed a new methodology of tagging. 

The tagger is instructed to read through the chronicle entry and manually split up the chronicle into passages (preferably on a separate document website) like so:

Once this is done, the Latin taggers can add the Latin version and re-input the entire text file back into the chronicle entry, like so:

The passage would be created with both the English and Latin versions in one passage:

Persons and Places are tagged within the passage. However, there is an ongoing discussion as to whether to tag the Latin or the English version, or both. We won’t be tagging the event within the passage now. The “Date Start” is not actually chronological: the ‘year’ is the chronicle entry year, the ‘month’ is the version of the annals [i.e., AF1 = 01, AF2 = 02], and the ‘day’ is the number of the passage [Passage 8 = 08]. Note that in titling the passages, any number less than 10 should be written [0X], because otherwise a coding/decimal issue messes up the order.
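The Date Start encoding just described can be sketched as a small helper (a hypothetical illustration; the field layout of year = entry year, month = version, day = passage number is the scheme above):

```python
def date_start(entry_year: int, version: int, passage_number: int) -> str:
    """Encode a passage's sort key as a pseudo-date string.

    Not a real date: the 'year' is the chronicle entry year, the
    'month' is the version of the annals (AF1 -> 01, AF2 -> 02), and
    the 'day' is the passage number, zero-padded so that passage 8
    ('08') sorts before passage 10 as a string.
    """
    return f"{entry_year}-{version:02d}-{passage_number:02d}"

# date_start(882, 2, 8) -> "882-02-08"
```

The zero-padding is what avoids the ordering problem noted above: without it, `"882-2-8"` would sort after `"882-2-10"` in a plain string sort.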

The event is going to only be written in Latin and the passage will be linked to the event via the ‘subobject’ within Event as can be seen here:

The event will take the event type. Multiple events can take the same passage, as can be seen here:

Latin has a different syntax and sentence structure than English, so what would be one event in the English translation could be two or three within the Latin syntax.

 

Comparing Chronicles at the IMC Leeds 2025: The Annals of Fulda—Events Over Years as a New History of the Text?

By Jesse W. Torgerson

On behalf of the Comparing Chronicles Team: Churchill Couch, Zaray Dewan, Arla Hoxha, Diana Tran, Tess Usher

 

The Comparing Chronicles Project took some significant methodological and theoretical steps in 2024-25. The project began in 2023 as an investigation into (1) frameworks of historical time, and (2) what historians could do with only partially accurate data. By the summer of 2024 the project had turned to nothing less than developing an alternative to the Historical-Critical Method of studying a text. The following narrates a stage on this journey, and a bit of how we got here.

The Comparing Chronicles project was invited by The Flow Project to participate in a panel at the July 2025 International Medieval Congress at Leeds University.

The Flow Project (led by Tobias Hodel at University of Bern and Silke Schwandt at Bielefeld Universität) purues “standardized digital workflows based on existing technology, making it easier for researchers to work with historical sources digitally.” This simple statement contains a significant advance in the Digital History (and Theory) landscape. “Digital workflows” are already a part of historical methods, but without concerted efforts to collaborate with each other scholars find themselves each inventing different but functional versions of the wheel (or, to give a more relevant example, parallel means of extracting machine readable text from handwritten sources).

Panel 544 Digital Data Flows promised difference means of processing medieval documents, and allowed us to present the work done in our Comparing Chronicles project as a unique example of what difference a standardized digital workflow could mean for comparative study of Early Medieval Chronicles from the example of three different versions of the Annales Fuldenses (Annals of Fulda).

The following is an abbreviated version of the remarks made at this panel:

Our Comparing Chronicles Project has been constructed using the web-based relational database tool Nodegoat. We have used the Nodegoat Go license subscription at Wesleyan University Library to create a fully collaborative research environment.

As our team builds this database together we are constantly re-thinking our processes (i.e., our Digital Data Flows) and as a result continually updating our methodology, which takes the form of a new structure to our relational database.

The most significant shift in the past year has been to shift from a TEI-based methodology to one that distinguishes different elements of the text and its structure as distinct but related datasets (or, in Nodegoat’s terms, “Objects”).

Each of the boxes in the above image indicates a distinct dataset. Chronicle is the title of the work (in the next stage of our project it will be a specific manuscript). The Chronicle Entry is the text under each year in that chronicle, in both Latin and (for reference) English, providing the text as it appears in each work (or, eventually: transcribed manuscript).

The next two datasets represent two levels of our distinctive analytical contribution to study of the text. Passage is the division of each annual entry’s text into distinct narrative sections, and the delineation of those narrative sections by the order (“Passage Number”) in which they occur in the annual entry. Thus the “name” of a passage might be AF 2 887 07, where AF 2 means “Annals of Fulda v.2,” 887 is the Chronicle Entry in which the passage occurs, and 07 denotes this as the seventh distinct narrative unit in the entry. We tag the text of the Passage for persons and places (the only use we make of TEI or text-tagging).

Event is where our analysis fully enters the picture. Here we give each Passage a label, in Latin (and translated into English for non-Latin-literate users). The Event name is based on the grammatical phrase which we have isolated as the focus of its narrative. The Event name uses the actual Latin of the text (whenever possible) to designate the central event of a passage. For instance, Celum Apertum is the event name for the 14th passage under 887 in the third version of the Annals of Fulda (or, in our shorthand: AF 3 887 14):

Et mirum in modum, usque dum honorifice Augensi ecclesia sepelitur, celum apertum multis cernentibus visum est, ut aperte monstraretur, qui spretus terrenae dignitatis ab hominibus exuitur, Deo dignus caelestis patriae vernula mereretur feliciter haberi.

If it is helpful, we have also considered calling Event instead Episode. Finally, we give each Event one (or in extreme cases of ambiguity, two) Event Type labels.

The following extended paragraph explains our use of the Event Type (if you are not concerned about it, feel free to skip down), which is not in fact central to our analysis or project. The important point is that these are helpful to our analyses, but they are not what we are analyzing about the chronicle texts. They are imposed analytical categories, which is why we have made them a distinct part of the database (it is possible to study the text without using these types as an analytic). At the same time we have taken great care and gone through many different versions of these through extensive internal debates before settling on a list of eleven. We have found this list to be sufficient for capturing the different sorts of events which the Annals of Fulda uses to fill out its annual entries. These are: Campaign, Birth/Death, Office/Succession, Meeting, Assembly/Council  Travel/Embassy, Dispute, Celebration, Construction,  Commentary, Phenomenon (note, this is what we assigned to AF 3 887 14, above). It needs to be understood that the Event Type is also NOT the goal of the analysis. These distinctions allow us to identify potential similarities between texts or between entries. In the future they will allow us to make some overall statements about about entries (i.e., an entire year’s entry), or about trends or emphases in different texts as a whole. But ultimately they are simply an analytical tool to understand the text, rather than being core to our argument about the text. All of this follows a central tenet of our database structure: to keep our analytical work (in the Passage and Event and then Event Type datasets or objects) separate from the digital text itself (in the Chronicle and Chronicle Entry datasets).

Welcome back. The key to our analytical interests—and so our intervention into the text and the history of the text—is the combination of Passage and Event.

Our data creation, and our analysis of it, is based on the following premises. Chronicles (chronographic, chronological, and annalistic texts) are rarely attributed to a single author, and even when they are, they are rarely the product of (a) a single moment and/or (b) a single person’s unique assemblage of information. Chronicles are much more often (1) collective, collaborative enterprises; (2) written over long periods, re-written, both, and more; and, (3) compiled from excerpted, rewritten, partially rewritten, and/or orally-transmitted pieces of information. As a result, what interests us as historians is not a textual-originalist approach (i.e., the Historical-Critical Method) which seeks to reproduce the text as it emanated from the quill of its author-scribe. Rather, what interests us are the social conditions of the production of the text. This has led to our analytical interest in two things:

  1. Any and all informational overlaps between extant versions (i.e., down to the variant manuscripts) of any chronicle text, since each of those overlaps indicates to us shared knowledge;
  2. Shared knowledge, in turn, indicates the social networks through which that knowledge was shared.

Our relational database has shifted its structure over the past two years to give us greater access to these two aspects of the evidence we find the text contains. What we have learned from observing the Flow Project is how to identify where digital tools are and are not useful for this study. Thus far they are useful in two areas: (1) tagging persons and places in each Passage through Named Entity Recognition (nodegoat’s built-in internal process for this is “reconciliation”); and, (2) analysis and visualization of the “informational overlaps” and “social networks” noted above. In the near future we will add (3) Handwritten Text Recognition to our workflow when we turn to using the actual surviving manuscripts for our Latin text rather than the printed critical editions (which we continue to use as we build our database model). Identifying our unique ‘digital flow’ emphasizes how much our work remains tied (by the necessity of our own standards of exactitude) to careful reading practices, double- and triple-checking all of our data as we proceed.
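As an illustration of step (1), the tagging logic can be sketched in a few lines of Python. This is a deliberately simplified stand-in, matching passages against a small invented gazetteer of names drawn from the examples below; it is not nodegoat’s actual reconciliation process:

```python
# A minimal stand-in for the Named Entity Recognition step of the workflow.
# nodegoat's built-in "reconciliation" matches against our Person and
# Places objects; here we simply intersect tokens with a tiny gazetteer.
KNOWN_PERSONS = {"Louis", "Wala", "Hadrian"}
KNOWN_PLACES = {"Koblenz", "Trier", "Fulda"}

def tag_passage(text: str) -> dict:
    """Return the known persons and places mentioned in a passage."""
    tokens = {t.strip(".,;") for t in text.split()}
    return {
        "persons": sorted(KNOWN_PERSONS & tokens),
        "places": sorted(KNOWN_PLACES & tokens),
    }

tags = tag_passage("Northmen burnt Koblenz and Trier; Bishop Wala attacked them.")
# tags == {"persons": ["Wala"], "places": ["Koblenz", "Trier"]}
```

The real workflow is considerably messier (declensions, variant spellings, disambiguation), which is exactly why the double- and triple-checking mentioned above remains necessary.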

At the time of the IMC Leeds presentation we were able to offer the following.

Having conducted an initial analysis of three different versions of the Annals of Fulda (AF 1, AF 2, AF 3 in our representation), we were able to display the overlap of the entries in each of them (i.e., where the entire text for a year was the same) in the following image.

AF 1 shares all of its entries with both AF 2 and AF 3, which in turn share a number of their entries with each other but also possess their own distinct entries. This is simply a visualization of the text history as manifest in the MGH critical edition.

What our Comparing Chronicles methodology does to the text is visible in the next image. This is a comparison between the entries for 882 in AF3 (to the left) and AF2 (to the right), which are each represented by the large blue nodes in the bottom corners.

The orange nodes clustered in descending numerical order on the left and right sides each indicate a discrete Event (i.e., narrative episode) in each text. We have activated the persons (bottom, pink) and the Event Types (top, orange) but de-activated the places (blue, floating in the middle) of each version to make it possible to see the encoded relations.

The key element of interest, above, for our purposes is the six Events (purple) for which the visualization displays the labels (in English here, rather than in Latin) of the events which we noted as the same event. That is, these six (“Louis III died,” “Comet Appeared,” “Army Returned,” “Northmen burnt Koblenz,” “Northmen burnt Trier,” and “Bishop Wala attacked Northmen”) we read as the same core episode even though the text describing or narrating each was different in each version. In numerical terms, AF 2 has 18 distinct Events, AF 3 has 20 distinct Events, and they share 6.

According to the Historical-Critical Method, these are completely divergent textual traditions. According to our Comparative Method, this single year’s entry in each text possesses evidence of a 23-25% overlap in the knowledge network between these two textual communities, as recorded in their respective annals. This is already an exciting and promising result, just from this small test case.
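The 23-25% range can be reproduced with a few lines of arithmetic, reading the counts above as Events unique to each version plus the six shared Events:

```python
# Reproducing the overlap range for the 882 entries.
shared = 6
af2_total = 18 + shared   # 24 Events in AF 2's entry for 882
af3_total = 20 + shared   # 26 Events in AF 3's entry for 882

overlap_af2 = shared / af2_total   # 0.25
overlap_af3 = shared / af3_total   # ~0.23

print(f"{overlap_af3:.0%}-{overlap_af2:.0%}")  # 23%-25%
```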

In the upcoming semester we will be applying our most updated methodology to an encoding of a combined Latin-English database of the three versions of the Annals of Fulda, while also applying this same methodology to the Annals of St. Bertin. This work will give us our final methodological prototype, making it possible to visualize what the evidence from a surviving text of a single knowledge network looks like—even when there are three versions of the text spread out amongst the sister monasteries of the monastic house of Fulda—as compared to a distinct knowledge network, that of St. Bertin.

However, our final visualization gives a hint of what is in store. This is a detail from the social network of the Entries and Events in both the Annals of Fulda and the Annales Bertiniani. While these networks are clearly distinct, the Events (in purple) which hang like spiderwebs between the two clouds indicate events which are the same episode. Our methodology already allows us to see that even though these are distinct textual traditions, there are meaningful links in the historical knowledge which they shared and recorded in common.

Connecticut Digital Humanities Conference: Traveler’s Lab Presentations

By Vasilia Yordanova

On February 22, students and professors collaborating on the CDER project and other digital humanities research projects at Wesleyan presented their work at the 2025 Connecticut Digital Humanities Conference at Central Connecticut State University. Will Markowitz, Arushi Khare, Akram Elkouraichi, and Professor Torgerson spoke about their progress on the CDER project, beginning by explaining its goal: accumulating data from across many sources into one digital platform to allow easier access to relationships between data and to facilitate interaction between academic disciplines. Students’ roles in the project include researching sources, incorporating them into the platform, and compiling the data there, all with Professor Torgerson’s guidance.

Will explained the NodeGoat platform, which the CDER project uses to accumulate and centralize data in an accessible digital format. Then, Akram discussed linked data and his work on modeling the Istanbul walls. Will spoke about relational data and examining relationships between different kinds of data, including geographical relationships (as the project is space-based). He demonstrated how these relationships appear visually in NodeGoat. Arushi described the work of linking seals from the Dumbarton Oaks collection in Georgetown to an open database of people living in Constantinople and to official buildings where offices would have been stationed. The goal is to link information extracted from sources back to those sources in NodeGoat. 

Arla Hoxha and Zaray Dewan spoke about their work on the Chronicles project, and students and professors working on the Life of Milarepa and Chinese language theaters in North America also presented on their recent work at the conference.

CDER Project update: Mapping Byzantine Seals and People

By Alex Williams

At the time of the last blog post (May 2024) about Constantinopolitana: A Database of East Rome (CDER) project, we celebrated the creation of a prototype database, consisting of multiple types of objects and artifacts from across Constantinople. Each object type had about 20-100 data points (to read more about this process, check out our last blog post!). Of course, lots of this data is directly integrated into CDER from preexisting datasets, many of them having thousands of entries. 

With permission from the owners, we want to develop a methodology for adding these items to CDER, to help find wider connections than one institution would be able to create or develop on its own. This involves combining large datasets, a more technical task compared to the prototype, as there is a great deal of data cleaning and preparation involved.

This summer, I worked on combining existing data on Byzantine lead seals with existing prosopographic data (data describing individuals alive during the Byzantine empire). 

Datasets and Process

Byzantine lead seals were used to seal letters sent across the empire. These seals contain information about the sender such as their name, title, and occupation, allowing insights into individual people and the bureaucracy. Imperial titles, known as dignities, demonstrated a seal owner’s place in the imperial hierarchy. An owner’s office, or occupation, also appears on a seal.

For this stage of the project, we worked with roughly 16,000 seals from the Dumbarton Oaks (DO) Byzantine Seal collection, in collaboration with Jonathan Shea. This collection has already been digitized. A crucial part of integrating this data into our model was making the seals data relational, meaning that some parts of the data (in this case dignities and offices) get tables separate from the seals table.
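A minimal sketch of that normalization, with invented sample rows rather than actual DO records, might look like this: dignities and offices each become lookup tables, and seal rows keep only references into them:

```python
# Invented sample rows: a flat seals table before normalization.
flat_seals = [
    {"seal_id": 1, "owner": "Niketas", "dignity": "patrikios", "office": "strategos"},
    {"seal_id": 2, "owner": "Leo", "dignity": "patrikios", "office": "dioiketes"},
]

# Separate lookup tables for dignities and offices.
dignities = {name: i for i, name in enumerate(sorted({r["dignity"] for r in flat_seals}))}
offices = {name: i for i, name in enumerate(sorted({r["office"] for r in flat_seals}))}

# Each seal row now holds foreign keys instead of repeated strings.
seals = [
    {"seal_id": r["seal_id"], "owner": r["owner"],
     "dignity_id": dignities[r["dignity"]], "office_id": offices[r["office"]]}
    for r in flat_seals
]
# Both seals now point at the single "patrikios" row in the dignities table.
```

The payoff is that a dignity such as patrikios exists once in the database, so every seal bearing it can be queried or visualized through a single shared node.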

We chose to work on data regarding people and seals in conjunction because of the large, preexisting, and already partially connected datasets. Byzantine prosopographic data has a wide overlap with seals, some of which is already documented. This is because the prosopographies use seals as evidence for the existence of people. For example, we might know new information about a person from a seal, or that might be the only record that a person existed. With this project, our goal was not to create insights or establish new ‘readings’ of the seals to create connections with people, but rather to digitize and store existing connections in a database, in addition to creating a model for incorporating other types of data into CDER. 

For the data regarding people, I worked with the Prosopography of the Byzantine World (PBW) database covering 1025-1180 AD. I also did some experimenting (hopefully more to come) with the Prosopography of the Middle Byzantine Period (PMBZ) covering 641-1025 AD.

There is already some data overlap between the DO seals collection and the PBW. A lot of my time was spent cleaning up the data and making it consistent and understandable to our model, which means that each row has to contain the same columns, and each column must contain its information in a similar format across rows. Many text analysis strategies were also used to extract information from how it was originally formatted in a sentence, phrase, paragraph, or description. We also learned methods for importing into Nodegoat (the digital humanities software we are using to store the data), as well as methods for making connections technically.
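As a hypothetical illustration of this cleaning step (the description strings below are invented, not actual DO or PBW fields), one such text-extraction strategy is pulling a known dignity phrase out of inconsistently formatted free text:

```python
import re

# Invented examples: the same dignity appears in free-text descriptions
# in different formats, and each row must end up with one consistent value.
raw_descriptions = [
    "Seal of Niketas, imperial protospatharios and strategos",
    "NIKETAS , imperial protospatharios",
    "Niketas imperial protospatharios.",
]

def extract_dignity(text: str) -> str:
    """Pull a known dignity phrase out of a free-text description."""
    match = re.search(r"imperial protospatharios", text, flags=re.IGNORECASE)
    return match.group(0).lower() if match else ""

# All three rows normalize to a single consistent column value.
assert {extract_dignity(d) for d in raw_descriptions} == {"imperial protospatharios"}
```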

Visualizations

Below are some graphs and descriptions of what is possible with this data as a database or dataset, and which point towards the next steps for analysis.

I want to preface that these graphs do not give a perfect, or even good, representation of what’s going on in history, or even in the seals dataset. First, the number of seals is not necessarily representative of the actual number of offices or dignities of a certain type, or the number of people in these roles. Some offices might have been sending letters more often (so there would be an overrepresentation of that office on seals). In addition, visualization is more complicated because of the way seals are dated. We know some of the seals are from a certain set of years, and many of them can be dated to a single century. But for some seals, we are not sure of the exact century, and so they would be dated to two or more centuries. In the graphs below, I wanted to avoid double counting the seals, so the date used is the first date in the estimate. For example, both a seal that we know is from the eighth century as well as one we know is either in the eighth or ninth century would be represented in these graphs as part of the eighth century.
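That dating rule can be sketched as follows, assuming each seal’s estimate is stored as a (start, end) year range (an invented representation for illustration):

```python
def century_bucket(date_range: tuple[int, int]) -> int:
    """Bucket a seal by the first date of its estimate, so a seal dated
    to the 8th OR 9th century is counted once, under the 8th."""
    start, _end = date_range
    return start // 100 + 1  # treating e.g. the 700s as the 8th century

assert century_bucket((700, 799)) == 8   # firmly an eighth-century seal
assert century_bucket((700, 899)) == 8   # eighth- or ninth-century seal
```

This avoids double counting at the cost of slightly front-loading ambiguous seals into earlier centuries, which is worth keeping in mind when reading the figures.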

Figure 1

Figure 1 depicts the number of seals over time, with colors representing the offices of the seal’s owner. Figure 2 shows the seal count for the top six overall dignities over different centuries.

Figure 2

 

Figure 3

Figure 3 focuses on the dignity ‘Patrikios’, which was high-ranking in the 8th-10th centuries before losing importance through the 11th (Shea, 2020; Kazhdan, 1991). It shows the different offices on seals bearing the dignity Patrikios.

A historian can use these visualizations to understand how to develop their inquiry. For example, a historian interested in the hierarchy of a specific office such as the dioiketes could use a visualization such as Fig. 3 to understand which other offices ranked similarly over time. 

Figure 5

Figure 5 shows a section of a network graph for the dignity Patrikios. The largest other nodes are the dignities ‘imperial protospatharios’ and ‘anypatos’, as well as the office ‘strategos’, the name for a military general. You can also see tight clusters of seals in the graph: these clusters are made up of parallel and related seals, some of which are identical. Some broader clusters might be hints to explore certain connections further. The connection with people (white dots in Fig. 5) could allow historians to better date seals, if there are other sources about a specific person, or to understand an individual’s trajectory through offices and dignities with more context.

Next Steps

There are still some technical steps to finish up this part of the project, consisting of changing data types and storage methods to store values more efficiently. We are also planning to connect the seals data to the PMBZ. These connections are slightly more complicated, as the PMBZ does not contain direct references to the new format of seals, so bibliographic information on each seal involved has to be extracted, normalized, and then matched to bibliographic information in the DO collection. For the lab as a whole, this semester we are going to work on building structures for other objects (statues), as well as continuing our focus on buildings and incorporating location and geographical information into the data.

Note: This blog post focuses more on the conceptual aspects of the project. If you are interested in any technical details, reach out to apwilliams@wesleyan.edu. 

Summer 2024 Chronicles Project Update

By Arla Hoxha

During summer 2024, Lab Manager Arla Hoxha ’25 continued developing Traveler’s Lab Comparing Chronicles Project through a QAC (Quantitative Analysis Center) Summer Fellowship. Throughout the summer, she experimented with different statistical methods and software and explored how new methods might be utilized to better compare different manuscripts from the Annals of Fulda. The summer research process culminated in a poster presentation session summarizing the progress of Chronicles up to that point, as well as visual representations of our work.

The presentation started off by giving a definition of the event unit for a general audience unfamiliar with our work. In recent years, there has been an effort in the field to shift the focus from the chronological development of historical texts towards the development of the narrative. Events do not always show up in a narrative as they occur chronologically; they are compiled by an author who chooses what events to include and how to arrange them. Studying chronicles using ‘years’ as units reveals little about the chronicle and even less about its author, as the year is too broad a unit to capture the nuance of meaning in the language and the way the text is ordered. The intervention of the Chronicles Project so far is an alternative unit of observation: the event, defined as a string of text, spanning from a few words to several paragraphs, with a central theme (event type) and consistent named-entity tags (characters and setting), terminating with a change of temporal identifiers or agents in the narrative. We expand more on these definitions in our previous methodologies.

Using a narrative-focused approach in the study of chronicles and the unit of the ‘event’, we explore chronicle entries from the Annals of Fulda—the text which continues to be our main focus—and determine how events differ at the manuscript level. The focus of the summer research was reducing events to their differentiating components and using different text analysis tools to compare these components across events to determine event similarity. We experimented with Python text mining libraries such as spaCy to filter the entries. spaCy is an open-source library used for Natural Language Processing (NLP) in Python. The program we developed takes a CSV file with the event titles, years, and manuscripts. It splits each title into its components, then, using spaCy’s tokenization feature, tags each component with its part of speech and tries to match entries based on named entities, verbs, etc. This kind of filtering can also easily be accomplished through Nodegoat. The idea behind using event titles for the comparison is that they are supposed to capture the event, using specific, representative verbs from the event.
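A simplified stand-in for that program might look like the following. The real version uses spaCy’s tokenizer and part-of-speech tags; here capitalization serves as a crude proxy for named entities, and the sample rows are invented:

```python
import csv, io

# Invented sample of the CSV input: event titles with year and manuscript.
data = io.StringIO(
    "title,year,manuscript\n"
    "Northmen attacked,882,AF2\n"
    "Northmen plundered,882,AF3\n"
    "Comet appeared,882,AF3\n"
)
rows = list(csv.DictReader(data))

def entities(title: str) -> set[str]:
    """Crude named-entity proxy: capitalized words in the title."""
    return {w for w in title.split() if w[0].isupper()}

# Candidate matches: same year, different manuscripts, shared entity.
matches = [
    (a["title"], b["title"])
    for a in rows for b in rows
    if a["year"] == b["year"] and a["manuscript"] < b["manuscript"]
    and entities(a["title"]) & entities(b["title"])
]
# matches == [("Northmen attacked", "Northmen plundered")]
```

The output pairs are only candidates: whether the shared entity plus similar verbs really marks the same event still requires the similarity check and human judgment described below.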

An interesting function spaCy offers is comparing words and giving their similarity as a percentage (a cosine similarity model), which can be used to compare two similar verbs from event titles. This could match events that are potentially the same although they have been labeled differently, forgoing the issue of human error. ‘Northmen attacked’ and ‘Northmen plundered’ do not have the same label, nor are their passages textually the same. But we can filter events by year, look across different manuscripts, and differentiate between events based on their type categorizations; checking for cosine similarity above a certain value can help us determine whether ‘plundered’ and ‘attacked’ have a similar meaning, so that ‘Northmen plundered’ and ‘Northmen attacked’ would be understood as referring to the same event. However, this functionality comes with its own problems; it is limited by the library’s vocabulary, and its efficiency is undermined by a lack of accuracy. Moreover, even though the verbs used in the titles are important, this method overemphasizes the way events are titled over other elements. The accuracy could be increased by training a model on data specific to our project. For now, we determine whether two verbs are similar enough using a human reader, which is a slower but more accurate process. Although this is beyond the scope of our project as of now, it might be interesting in the future to train a model specific to the Chronicles Project, which could prove useful in automating part of the process of detecting the same event.
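The cosine similarity measure behind spaCy’s `.similarity()` can be illustrated directly. The vectors below are invented toy values; in spaCy they would come from a trained model’s word vectors (e.g. `en_core_web_md`), and the threshold would need tuning against human judgments:

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity: the dot product over the product of magnitudes."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# Toy vectors standing in for trained word vectors.
plundered = [0.9, 0.8, 0.1]
attacked  = [0.8, 0.9, 0.2]
appeared  = [0.1, 0.2, 0.9]

THRESHOLD = 0.8  # a cutoff to be tuned against human judgments
assert cosine(plundered, attacked) > THRESHOLD   # similar verbs
assert cosine(plundered, appeared) < THRESHOLD   # dissimilar verbs
```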

Returning to the main topic of comparing events: two events are the same as long as they speak of the same occurrence, that is, they have the same event type (see Diana Tran’s Event Type methodology here) and the same named entities. Whether they are textually the same is less important. We found events that spoke of the same occurrence by filtering (through the method described above as well as Nodegoat filters) for events with the same title happening in the same year but in different manuscripts. An instance of this is ‘Pope Hadrian dies’ in year entry 885 for both manuscripts 2 and 3 of the Annals of Fulda:

The passage differs between manuscripts, but the central idea captured in the title remains: ‘Pope Hadrian died.’ Despite the first event including more details surrounding the death of Pope Hadrian, and despite the difference in length of the two event entries, both refer to the same event. Beyond the categorization under the same title, the events in both manuscripts are cataloged under the event type ‘Birth/Death’ and both have ‘Pope Hadrian’ listed as a principal actor. The way the event is named is helpful in communicating the ideas of the passage and helping us identify them correctly. Here we see how ‘Event Types’ can be a powerful tool in determining event similarity as well as in understanding the distribution of events across time. We believe that the categorization of events by event types will increase the accuracy of determining ‘same’ events.

Of course, over-relying on Event Types presents the caveat of human error that is built into this categorization. We observe that chronicle year entries with largely the same events have different distributions of event type tags:

The data in use is the cross-referenced chronicle transcript from the different manuscripts of the Annals of Fulda stored through the Nodegoat environment. The parsing of the data and visuals were completed using Google Sheets for data management and plotting libraries in R.

What is the importance of determining events that are the same across manuscripts and chronicles? By pointing out the similarities between events we are able to discern their differences as well and start asking questions about authorship and the context in which different manuscripts emerged. The same methodology we have been using on events from different manuscripts of Fulda is to be applied to events that pertain to different but overlapping chronicles in the future.

Through the work done this summer determining event similarities, we found overlap in a few events from different manuscripts. We attempted to tag text from different manuscripts describing the same event under the same event tag, so we tried to list the passages of events from different manuscripts under the same event object entry. Perhaps the most important result from our work this summer was noticing the issues with our old model of event categorization. Our old model required the passage to be written out in the event description; in the case of two events we would have to do so for both passages, and so on depending on the number of manuscripts describing the same event. Our future goal is to be able to cross-reference the English text with the original Latin, which would require us to list yet another passage under the event entry. We realized that this methodology would become increasingly unsustainable. Moreover, the work of comparing chronicles in this way made apparent certain redundancies in our data. Most notably, named entities appeared twice: as tags in the chronicle object and cross-referenced in the object descriptors of events.

To reduce the redundancy in our data and to make the process more efficient moving forward, we implemented a mass restructuring. The new objects are as follows:

Chronicle Entry is now the object that contains passages. Passage is the object where named entities are tagged—previously this was done in Chronicle Entry. Now events are linked to the chronicle entry through the Passage object. The passage is cross-referenced in the object description of the event, but the event is not tagged in Passage, so as to reduce redundancy. Named entities, on the other hand, appear as tags in Passage, but they are not cross-referenced in Event, because Passage already contains them, and Passage is cross-listed in Event.
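The restructured objects can be sketched as Python dataclasses (an illustration of the relationships only, not our actual nodegoat configuration):

```python
from dataclasses import dataclass, field

# Links run one way to avoid the redundancy described above: Passage
# tags named entities and points at its Chronicle Entry; Event
# cross-references Passage but is not tagged back inside it.
@dataclass
class ChronicleEntry:
    chronicle: str          # e.g. "AF 2"
    year: int

@dataclass
class Passage:
    in_entry: ChronicleEntry
    number: int             # position within the year's entry
    text: str
    persons: list[str] = field(default_factory=list)
    places: list[str] = field(default_factory=list)

@dataclass
class Event:
    title: str
    event_type: str
    passages: list[Passage] = field(default_factory=list)  # one per version

entry = ChronicleEntry("AF 2", 885)
p = Passage(entry, 3, "Pope Hadrian died...", persons=["Pope Hadrian"])
ev = Event("Pope Hadrian dies", "Birth/Death", passages=[p])
# Persons are reachable from the Event via its Passage, not duplicated on it:
assert ev.passages[0].persons == ["Pope Hadrian"]
```

Because an Event simply lists Passage references, adding the Latin passage for the same event later means appending one more Passage rather than rewriting the Event’s description.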

Here is an example of the Passage object: ‘in entry’ is the link to the Chronicle entry, which in turn is linked to the object ‘Chronicle’ which stores the references to the source texts, i.e.: the English translations of Annals of Fulda. The text is the passage, as well as the text of the event in question. Passage number shows where the passage/event occurs in relation to the other events in that same year, emphasizing the progression of narrative over chronological progression, one of our main goals that we have previously had issues representing in Nodegoat.

The event no longer has description tags for ‘Places’ and ‘Person’ because those are tagged in Passage and can be cross-referenced through it. This also makes the work of referencing multiple passages under the same Event much easier; instead of writing out multiple passages in the description of the Event object, they can be cross-referenced. This will also aid us in the eventual transition to Latin. The only elements directly listed under Event are the event title, Chronicle entry, and, in the sub-object, the Event Type.

Other smaller but not insignificant improvements to the model included cleaning up the data of the Person and Places objects, where each entry was filled out with relevant information that was previously missing, such as birth and death dates for Person and coordinates for Places. Also in progress is the development of types of locations to categorize Places.

The Event-Based Narrative in the Annals of Fulda: Results

The Fulda project, a quantitative-qualitative analysis of the chronicle the Annals of Fulda through the platform Nodegoat, resulted in a fully-fledged database of chronicle entries, people, places, and events. The model used to map events is a novelty in the field of chronicle studies, and one we hope will continue to be replicated and improved upon. We hope our database will aid scholars reflecting on this time period or thinking about questions of narrative and the anatomy of the chronicle: why are chronicles put together in a certain way? Enriching the model and data and refining our processes are next for our team at Traveler’s Lab. Following in the footsteps of previous projects in the Lab, even though we started Fulda from zero, many of our goals were realized during this summer, some of which are outlined in this article.

The obvious, and maybe the most important achievement was the database itself, with the chronicle fully uploaded to Nodegoat. Anyone with access to the database can find, categorize and visualize elements of the chronicle, such as the people or places in it. The chronicle entries are tagged by which chronicle and manuscript they belong to and the text is fully mapped with object tags. This makes it easy to analyze the chronicle based on the elements that interest a researcher. Through the Nodegoat configuration, it is possible to see the way all the data is linked to each other; what events take place where, who is involved, how many times a name is referenced in a chronicle entry, comparisons between multiple entries or events, and more. 

A great feature of the database is the link to ancient locations through Pleiades. All events have a location tag which allows us to visualize the events on a geographical map. Interconnectivity is one of the best things about this model: not only do we have data on different places and events, but we know what happened where and who was involved.

Creating a model for determining events that allows us to follow the logic of the narrative was an important achievement this summer. The process involved much trial and error and remains a work in progress, but we were able to refine the bulk of the process, as explained in the last article. In creating the event objects for the database we made sure to use the text’s own language, and to date the events based on the sequence of the narrative rather than historical time (although a descriptor was provided for this, in case a specific date was present in the text). All this was done to shift the focus towards the narrative and follow the logic of the chronicle and what the text deems important, as opposed to our reading of it. We hope this model will inspire and allow for more thorough analysis that leaves less room for misinterpretation.

In a future ambition for expanding the project, we hope to use the comparative tools provided by Nodegoat and the construction of the model to run comparisons between different manuscripts as well as the English and Latin versions of the text. Onboarding a scholar of Latin to work with the team is an aspiration which would further enrich the Fulda project. 

As stated before, we hope to expand our model to include data from other Carolingian chronicles, such as The Royal Frankish Annals. We hope to inspire other scholars to use quantitative methods, especially those that centre the narrative, in their research of chronicles from all time periods. 

Although much progress was made during this summer, there is always room for improvement. In the upcoming semester, we anticipate using an API to fix our problem with accessing locations not covered by the Pleiades ancient locations database. We also hope to find ways to automate as much of the text-tagging process as possible. Some of this has already been done through the Reconciliation system in Nodegoat, but we wish to refine the process further. The use of AI shows great promise in this regard, as we discovered during an experimental session using OpenAI to tag people in the text. Implementing and integrating this process into Nodegoat object tagging is one of our goals for the future.

The Event-Based Narrative in the Annals of Fulda: Methodology

Introduction

In line with other Traveler’s Lab projects, this undertaking was the beginning of a long exploration of using quantitative methods in the study of medieval chronicles by following the logic of the text through its narration, rather than that of chronology.  This project, drawing from the 9th-century Carolingian chronicle, the Annals of Fulda, served as an experimental model that will inspire similar practices in the way we study chronicles. Describing the work of a whole summer, this article will focus on the methods used to study the Annals of Fulda, including the constructed models we hope will have a wider impact.

Methodology

The whole text of the chronicle the Annals of Fulda was parsed, scanned, and uploaded to Nodegoat, a web platform that allows for data modeling and contextualization through spatial and temporal elements. Nodegoat allowed us to create our own objects to map our data (from the text), such as Person (the historical people who are part of an event) and Places (the geographical area where an event happened). The text was systematically mapped with tags of Person, Places, and Event objects. A new object added to Fulda was the Religious tag, which is used to map religious celebrations, such as Easter or Christmas, that occur throughout the text. Starting to map Fulda without having used the platform before was made easier by following the example of Fulda’s sister project, The Royal Frankish Annals, modeled by Daniel Feldman. Many of the objects were therefore already set up, and only needed to be furnished with the new data. In order to have both projects share the same object database we created the Chronicle object, which differentiates between the two projects as well as between different manuscripts of Fulda.

The team had already started thinking about new ways to express events, so that they could help us better understand the narrative. The way the project defines the event is different from how we might ordinarily think of one. For instance, an event is not only a battle or a coronation, or an ‘important’ happening; anything can be an event. In fact, everything is. Every couple of sentences focusing on a specific narrative (following certain guidelines for time and place) was mapped as an event.

Determining what constitutes an event and creating the event dataset was a challenging experience and a process we are refining to date. With the intention of fully capturing the text of the chronicle, we started developing a model where every sentence would be an event, but soon realized that this would not fully capture the scope of the narrative. We then opted for a definition of the event that was more narrative-focused where the events would terminate depending on the change of temporal identifiers as well as agents in the narrative. To avoid bypassing the text (as the short titles do not allow for detail) we decided to add a ‘Passage’ descriptor, where the text of the particular event is disclosed. 

The event object was the most important yet most difficult to develop. We went through a long trial and error process figuring out what descriptors to attach to the object, in a way that was useful but not redundant. The event object is now linked to the chronicle entry (the text of the chronicle by year), person, places objects and has a sub-object denoting time. 

The places object is connected to Pleiades, a database of ancient locations (along with their longitude, latitude, and Pleiades id) which we imported into Nodegoat. The location identifiers in Places allow us to visualize the ancient locations where the mapped events happened.
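The shape of an imported Places record can be sketched as follows (the Pleiades ids here are placeholders and the coordinates approximate, purely for illustration):

```python
# Sketch of Places records imported from Pleiades: each location carries
# a Pleiades id and coordinates so mapped events can be plotted.
places = {
    "Trier": {"pleiades_id": "EXAMPLE-ID-1", "lat": 49.75, "lon": 6.64},
    "Mainz": {"pleiades_id": "EXAMPLE-ID-2", "lat": 50.00, "lon": 8.27},
}

def locate(place_name: str):
    """Return (lat, lon) for a tagged place, or None if not in the import."""
    rec = places.get(place_name)
    return (rec["lat"], rec["lon"]) if rec else None

assert locate("Trier") == (49.75, 6.64)
assert locate("Fulda") is None   # a location missing from the import
```

The `None` case is exactly the gap discussed elsewhere in these updates: locations outside the Pleiades dataset currently have no coordinates to plot.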

Dating the events was another issue, since only some of them have a time identifier. We decided that instead of following a chronological logic, using estimates and the dates the text provided to date events, we would follow a narrative logic by not ‘dating’ the events per se. Instead they are connected to each other sequentially, as dictated by the narrative determined by the chronicler; narrative and historical time are not always interchangeable. To preserve the information the text provides, we added a descriptor for ‘exact dates’, to be used in case the text provided one.

Having now created a database of objects, Nodegoat allows us to use the Reconciliation feature to map objects such as Places and Person to the remaining chronicle entries. Although not a flawless process, Reconciliation allows for semi-speedy execution of an otherwise laborious task. We are still working on ways to automate the process of text-tagging and potentially extend it to other objects, such as events.