Storytelling, archives and linked data
Yesterday on Twitter I saw Owen Stephens (@ostephens) post a reference to a presentation titled "Every Story has a Beginning", by Tim Sherratt (@wragge), "digital historian, web developer and cultural data hacker" from Canberra, Australia.
The presentation content is available here, and the text of the talk is here. I think you really need to read the text in one window and click through the presentation in another. I found it fascinating, and pretty inspiring, from several perspectives.
First, I enjoyed the form of the presentation itself. The content is built up incrementally on the screen, with an engaging element of "dynamism" but kept simple enough to avoid the sort of vertiginous barrage that seems to characterise the few Prezi presentations I've witnessed. And perhaps most important of all, the presentation itself is very much "a thing of the Web": many of the images are hyperlinked through to the "live" resources pictured, providing not only a record of "provenance" for the examples, but a direct gateway into the data sources themselves, allowing people to explore the broader context of those individual records or fragments or visualisations.
Second, it provides some compelling examples of how digitised historical materials and data extracted or derived from them can be brought together in new combinations and used to uncover and (re)tell stories - and stories not just of the "famous", the influential and powerful, but of ordinary people whose life events were captured in historical records of various forms. (Aside: Kate Bagnall has some thoughtful posts looking at some of the ethical issues of making people who were "invisible" "visible").
Finally, what really intrigued me from the technical perspective was that - if I understand correctly - the presentation is being driven by a set of RDF data. (Tim said on Twitter he'd post more explaining some of the mechanics of what he has done, and I admit I'm jumping the gun somewhat in writing this post, so I apologise for any misinterpretations.) In his presentation, Tim says:
What we need is a data framework that sits beneath the text, identifying people, dates and places, and defining relationships between them and our documentary sources. A framework that computers could understand and interpret, so that if they saw something they knew was a placename they could head off and look for other people associated with that place. Instead of just presenting our research we’d be creating a whole series of points of connection, discovery and aggregation.
Sounds a bit far-fetched? Well it’s not. We have it already — it’s called the Semantic Web.
The Semantic Web exposes the structures that are implicit in our web pages and our texts in ways that computers can understand. The Linked Data movement takes the basic ideas of the Semantic Web and turns them into a collaborative activity. You share vocabularies, so that other people (and computers) know when you’re talking about the same sorts of things. You share identifiers, so that other people (and computers) know that you’re talking about a specific person, place, object or whatever.
Linked Data is Storytelling 101 for computers. It doesn’t have the full richness, complexity and nuance that we invest in our narratives, but it does at least help computers to fit all the bits together in meaningful ways. And if we talk nice to them, then they can apply their newly-acquired interpretative skills to the things that they’re already good at — like searching, aggregating, or generating the sorts of big pictures that enable us to explore the contexts of our stories.
So, if we look at the RDF data for Tim's presentation, it includes "descriptions" of many different "things", including people, like Alexander Kelley, the subject of his first "story" (to save space, I've skipped the prefix declarations in these snippets but I hope they convey the sense of the data):
story:kelley a foaf1:Person ; bio:death story:kelley_death ; bio:event story:kelley_cremation, story:kelley_discharge, story:kelley_enlistment, story:kelley_reunion, story:kelley_wounded_1, story:kelley_wounded_2 ; foaf1:familyName "Kelley"@en-US ; foaf1:givenName "Alexander"@en-US ; foaf1:isPrimaryTopicOf story:kelley_moa ; foaf1:name "Alexander Dewar Kelley"@en-US ; foaf1:page <https://discontents.com.au/shoebox/every-story-has-a-beginning> .
There is data about events in his life:
story:kelley_discharge a bio:Event ; rdfs:label "Discharged from the Australian Imperial Force."@en-US ; dc:date "1918-11-22"@en-US . story:kelley_enlistment a bio:Event ; rdfs:label "Enlistment in the Australian Imperial Force for service in the First World War."@en-US ; dc:date "1916-01-22"@en-US . story:kelley_ww1_service a bio:Interval ; bio:concludingEvent story:kelley_discharge ; bio:initiatingEvent story:kelley_enlistment ; foaf1:isPrimaryTopicOf story:kelley_ww1_record .
and about the archival materials that record/describe those events:
story:kelley_ww1_record a locah:ArchivalResource ; locah:accessProvidedBy <https://dbpedia.org/resource/National_Archives_of_Australia> ; dc:identifier "B2455, KELLEY ALEXANDER DEWAR"@en-US ; bibo:uri "https://www.aa.gov.au/cgi-bin/Search?O=I&Number=7336927"@en-US .
The presentation itself, the conference at which it was presented, various projects and researchers mentioned - all of these are also "things" described in the data.
I'd be interested in hearing more about how this data was created, the extent to which it was possible to extract the description of people, events, archival resources etc directly from existing data sources and the extent to which it was necessary to "hand craft" parts of it.
But I get very excited when I think about the potential in this sort of area if (when!?) we do have the data for historical records available as linked data (and available under open licences that support its free use).
Imagine having a "story building tool" which enables a "narrator" to visit a linked data page provided by the National Archives of Australia or the Archives Hub or one of the other projects Tim refers to, and to select and "intelligently clip" a chunk of data which you can then arrange into the "story" you are constructing - in much the way that bookmarklets for tools like Tumblr and Posterous enable you to do for text and images now. That "clipped chunk of data" could include a description of a person and some of their life events and metadata about digitised archival resources, including URIs of images - as in Tim's examples. You might follow pointers to other datasets from which additional data could be pulled. You might supplement the "clipped" data with your own commentary. Then imagine doing the same with data from the BBC describing a TV programme or radio broadcast related to the same person or events, or with data from a research repository describing papers about the person or events. The tool could generate some "provenance data" for each "chunk" saying "this subgraph was part of that graph over there, which was created by agency A on date D" in much the way that the blogging bookmarklets provide backlinks to their sources.
And the same components might be reorganised, or recombined with others, to tell different stories, or variants of the same story.
Now, yes, I'm sure there are some thorny issues to grapple with here, and coming up with an interface that balances usability and the required degree of control may be a challenge - so maybe I'm getting carried way with my enthusiasm, but it doesn't seem to be an entirely unreasonable proposition.
I think it's important here that, as Tim emphasises towards the end of his text, it is the human "narrator", not an algorithm, who decides on the structure of the story and selects its components and brings them into (possibly new) relationships with each other.
I'm aware that there's other work in this area of narratives and stories, particularly from some of the people at the BBC, but I admit I haven't been able to keep up with it in detail. See e.g. Tristan Ferne on "The Mythology Engine" and Michael Smethurst's thoughtful "Storytellin'".
For me, Tim's very concrete examples made the potential of these approaches seem very real. They suggest a vision of Linked Data not as one more institutional "output", but as a basis for individual and collective creativity and empowerment, for the (re)telling of stories that have been at least partly concealed - stories which may even challenge the "dominant" stories told by the powerful. It seems all too infrequent these days that I come across something that reminds me why I bothered getting interested in metadata and data on the Web in the first place: Tim's presentation was one of those things.