Sonification and the Datini Letter Meta-data

Written by Adam Franklin-Lyons (history professor at Marlboro College) and Logan Davis (research and development engineer at Pairity, a computer software company in the Greater Boston area)

Which means what exactly?  It’s like a visualization, but instead of something you see, it’s something you hear.  Let me start with a little background…

A couple of years ago, we attempted a couple of “sonifications” (rendering complex data in sound) using the metadata from the letters sent by the Datini Company in 14th- and 15th-century Italy. (“We” in this context are Adam Franklin-Lyons, professor of history at Marlboro College, and Logan Davis, a skilled programming student at Marlboro, now an alum, with a strong background in music and sound.) The Datini data collection contains over 100,000 letters with multiple variables, including origin, destination, sender, receiver, travel time, and others. There is an earlier blogpost with more about Datini and some regular old visualizations from a conference talk. We made a few preliminary experiments, often connecting an individual person to a timbre and moving the pitch when that person changed locations. Here is a short version of one of our experiments in which three different individuals each “inhabit” an octave of space as they move around – we made both a MIDI-piano version and a synth-sound version. The sounds are built using a Python sound generator that attaches certain pieces of data (in this case, the locations of three named agents of the Datini company: Boni, Tieri, and Gaddi) to numeric markers, which the generator then translates into specific pitches, timbres, decay lengths, and so on. What follows here are some of our thoughts about what sonification is and how you might create your own. This post does not go into specific tools, which can be complicated, but is more of a general introduction to the idea. Hopefully in the future we will include another couple of posts that talk about the technical side of things.
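The generator itself is not included in this post, but to make the basic pipeline concrete, here is a minimal sketch of the same idea using only the Python standard library. The sample events, octave and scale-degree assignments, and output file name are all invented for illustration rather than drawn from the real dataset: each agent gets an octave, each location a scale degree within it, and each letter becomes one decaying tone in a WAV file.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100

# Hypothetical slice of the letter metadata: (year, agent, location).
# The real Datini collection has far more fields and rows than this.
events = [
    (1395, "Boni", "Prato"),
    (1395, "Tieri", "Firenze"),
    (1396, "Boni", "Pisa"),
    (1396, "Gaddi", "Avignone"),
    (1397, "Tieri", "Prato"),
]

# Each agent "inhabits" one octave; each location becomes a scale degree
# within that octave (values are MIDI note numbers, chosen arbitrarily here).
agent_octave = {"Boni": 48, "Tieri": 60, "Gaddi": 72}
location_degree = {"Prato": 0, "Firenze": 2, "Pisa": 4, "Avignone": 7}

def midi_to_hz(note):
    return 440.0 * 2 ** ((note - 69) / 12)

def tone(freq, seconds=0.5, volume=0.4):
    """Render one decaying sine tone as a list of 16-bit samples."""
    n = int(SAMPLE_RATE * seconds)
    samples = []
    for i in range(n):
        decay = 1.0 - i / n  # simple linear decay
        value = volume * decay * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
        samples.append(int(value * 32767))
    return samples

# One tone per letter event, ordered by year so that time in the clip
# tracks time in the data.
audio = []
for year, agent, location in sorted(events):
    note = agent_octave[agent] + location_degree[location]
    audio.extend(tone(midi_to_hz(note)))

with wave.open("datini_sketch.wav", "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)  # 16-bit samples
    out.setframerate(SAMPLE_RATE)
    out.writeframes(struct.pack("<%dh" % len(audio), *audio))
```

A real run would, of course, pull thousands of events from the cleaned metadata rather than a hand-typed list, but the shape of the translation – data field to number, number to sound parameter – stays the same.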

Although sonification is not widely used, you are probably already familiar with the basic idea. Several well-known modern tools (the Geiger counter is the most widely cited example) use a sonic abstraction to portray data inputs that we cannot otherwise sense: for the Geiger counter, beeps or clicks indicate the quantity of radiation emitted, and basic metal detectors work similarly. In contrast, researchers portray vast amounts of data in visual forms – graphs, charts, maps, videos, and so on. Perhaps this is because of the dominance of visual input for most people, perhaps not. Either way, the goal is the same: how do you take a large quantity of data and distill or organize it into a form that demonstrates patterns or meaningful structures to the person trying to understand the data?

Fields like statistics and data science teach and use visualization constantly, including many established methods for comparing data sets, measuring variance, or testing changes over time. Researchers have also studied the reliability of different types of visualizations. To take one example, visual perception judges length much more accurately than area, so people consistently read more accurate values from bar graphs than from pie charts. The goals of sonification thus present one important question: what types of patterns or structures in the data would actually become clearer when heard rather than seen? Are there particular kinds of patterns that lend themselves better to abstraction in audio than in visuals? (And I will be honest here – I have talked to at least a couple of people who do stats work who have said, “well, there probably aren’t any. Visual is almost bound to be better.” But admittedly, neither of them were particularly “auditory” people anyway – they do not, for instance, share my love of podcasts…their loss.)

Thus, the most difficult aspect is avoiding the mere duplication of what visualizations already do well – a sonification of communication practices in which the volume tracks the number of messages, getting louder and louder over a 45-second clip and then dropping off precipitously, doesn’t actually communicate more than a standard bar graph. It would take less than 45 seconds to grasp the same concept in its visual form. Visualizations employ color, saturation, pattern, size, and other visual aspects to encode multiple variables. Combining aspects like the attack and decay of notes, pitch level, and volume could similarly allow multiple related pieces of data to become part of even a fairly simple sonic line. As with visualizations, certain forms of sound patterns will catch our attention better or provide a more accurate rendition of the data. Researchers have not studied the advantages and disadvantages of sound to the same extent, which makes these questions ripe for exploration.
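As a small, hypothetical illustration of several related variables riding on a single note, the sketch below maps each variable to a different aspect of the sound; none of the field names or scalings come from our Datini experiments.

```python
# Hypothetical sketch: four related variables encoded on one note.
# Field names and scalings are invented for illustration only.

def letter_to_note(destination_index, letters_that_month, travel_days, is_reply):
    """Each variable controls a different aspect of the same note."""
    return {
        "pitch": 48 + destination_index,               # which city the letter went to
        "volume": min(1.0, letters_that_month / 50),   # how busy that route was
        "decay": max(0.1, 1.0 - travel_days / 60),     # slow letters fade out quickly
        "attack": 0.01 if is_reply else 0.1,           # replies start more sharply
    }

print(letter_to_note(destination_index=7, letters_that_month=23,
                     travel_days=18, is_reply=True))
# {'pitch': 55, 'volume': 0.46, 'decay': 0.7, 'attack': 0.01}
```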

So what are some examples? There is at least one professional group that has been dedicated to this research for a number of years: the International Community for Auditory Display (ICAD). Their website has a number of useful links and studies (look particularly at the examples). Although they are not the most recent, there is a good handbook from 2011 and a review article from 2005 that describe some of the successes and failures of sonification. Many of their examples and suggestions recommend reducing the quantity of data so as not to overload the auditory output, much as you would not want to draw thousands of lines of data on a single graph. However, at least a couple of recent experiments have moved toward methods of including very large quantities of data. Here is a video, promotional in nature, demonstrating how Robert Alexander used the concept to help NASA examine solar wind data.

So, how to proceed? First, the work of sonification does not escape the day-to-day tasks of data science, especially the normalization of data. If your sonification cannot reasonably handle minor syntactic differences in the data (for example, “PRATO” vs. “prato” vs. “Prato, Italy”), then your ability to leverage your dataset will be limited, just as it would be with visualizations. The work of normalization, and the choices you make in it, can be made far more efficient with a little legwork at the beginning.
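A very small sketch of that kind of normalization might look like the following; the alias table and the fallback behavior are assumptions made for illustration, and a real project would probably need fuzzy matching or a gazetteer of period place names.

```python
import re

# Hand-built alias table collapsing common variants to one canonical name.
ALIASES = {
    "prato": "Prato",
    "prato, italy": "Prato",
    "firenze": "Florence",
    "florence": "Florence",
}

def normalize_place(raw):
    """Collapse case, spacing, and known variants before any sound is assigned."""
    key = re.sub(r"\s+", " ", raw.strip().lower())
    return ALIASES.get(key, raw.strip().title())

for raw in ["PRATO", "prato", "Prato, Italy", "  firenze "]:
    print(raw, "->", normalize_place(raw))
# PRATO -> Prato, prato -> Prato, Prato, Italy -> Prato, firenze -> Florence
```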

Like visualizations, sonifications should be tailored to the data set at hand. You will then have to make choices about which aspects of sound to relate to which data points. This is the main intellectual question of sonification. What are we voicing? What is time representing? What does timbre (or voice – different wave forms) give us here? Timbre and pitch convey the proper nouns and verbs of a data set nicely. Timbre has a far more accessible (articulated) range of possible expressions for data with higher dimensions (though for a particularly trained ear, micro-tonalism may erase a great deal of that advantage). Decay, in my experience, can carry interesting metadata, such as the confidence or freshness of a fact; the action of the tone relates to how concretely we know something in the data.
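One hedged way to illustrate these choices is to write the assignments out as explicit tables so they are easy to revise and argue about; the names, pitches, and confidence score below are invented, and the waveform names simply stand in for timbre.

```python
# A sketch of explicit mapping choices: person -> timbre, location -> pitch,
# and confidence in the record -> decay. Values are illustrative only.

TIMBRE_BY_AGENT = {"Boni": "sine", "Tieri": "square", "Gaddi": "sawtooth"}
PITCH_BY_LOCATION = {"Prato": 60, "Firenze": 62, "Pisa": 64, "Avignone": 67}

def encode_event(agent, location, confidence):
    """Turn one letter event into the sound parameters discussed above."""
    return {
        "timbre": TIMBRE_BY_AGENT.get(agent, "sine"),   # who wrote or carried the letter
        "pitch": PITCH_BY_LOCATION.get(location, 72),   # unknown places sit an octave up
        "decay": 0.2 + 0.8 * confidence,                 # shaky records die away quickly
    }

print(encode_event("Tieri", "Pisa", confidence=0.9))
# {'timbre': 'square', 'pitch': 64, 'decay': 0.92}
```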

After the cleaning and the pitch, timbre, and decay assignments, you listen. Much of what sonification is good for is finding hot-spots in data sets. What stands out? Are there motifs or harmonic patterns that seem especially prevalent? Some of these questions, obviously, will relate to how the data has been coded, but every time we have tried this, there have also been at least a few surprising elements. And finally, is it beautiful? (A question becoming more popular in visualization circles, too…) Particularly when working with some of the wild data sets available today, what is the sound world created? Are there tweaks to the encoding that will make observations about the data clearer while also making the sound more enjoyable to listen to? When creating an auditory representation of data, you are quite literally choosing what parts are worth hearing.