Digital Network Analysis of Dramatic Texts

Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke

Göttingen Centre for Digital Humanities / Göttingen State and University Library / Herzog August Library Wolfenbüttel / University of Göttingen

Sydney, DH2015, 2 July 2015

(Presentation licensed under CC-BY 4.0.)


  1. Approach
  2. Data Mining
  3. Data Editing
  4. Display & Analysis
  5. Further Research


Basic Ideas

  • following the tradition of structuralist approaches in Literary Studies (Barthes 1972, Lotman 1977, Titzmann 1977, etc.)
  • basing the structural analysis on automated data analysis
  • long-term objective: provide structural data which can be used, for example, to describe different compositional types of plays


Different Styles of Structural Composition

Examples: two plays written by Goethe

[Figures: network graphs of Goethe's Iphigenie auf Tauris (1787) and Götz von Berlichingen (1773)]


The Digital Spectator

  • combining Literary Studies with Social Network Analysis (many corresponding publications since the early 2000s, see Bibliography)
  • specific definition of structure (inspired by Solomon Marcus, 1973): two characters are linked to each other if both are performing a speech act in a given segment of a play (act, scene)
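
The linking rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's own code: the toy play, the character names and the function name `build_network` are hypothetical, and counting shared segments as an edge weight is an added assumption (the definition itself only says whether two characters are linked).

```python
from itertools import combinations

def build_network(segments):
    """Link two characters whenever both speak in the same segment.

    `segments` is a list of speaker lists, one per act or scene.
    The returned dict maps each character pair to the number of
    segments they share (the weight is an assumption, see lead-in).
    """
    edges = {}
    for speakers in segments:
        # Every unordered pair of distinct speakers in this segment
        # creates an edge or strengthens an existing one.
        for a, b in combinations(sorted(set(speakers)), 2):
            edges[(a, b)] = edges.get((a, b), 0) + 1
    return edges

# Hypothetical toy play: three scenes, speakers in order of speech.
segments = [
    ["Anna", "Bert", "Anna"],
    ["Bert", "Clara"],
    ["Anna", "Bert", "Clara"],
]
edges = build_network(segments)
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

Choosing acts instead of scenes as the segment size only changes the input lists; larger segments generally produce denser networks, since more characters co-occur.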


465 Network Graphs

[Figure: poster of 465 drama networks]

At a glance: 465 German-language dramas from 1731 to 1929 (figshare).



Data Mining → Data Editing → Display & Analysis

Data Mining


  • TextGrid Repository: biggest TEI-tagged corpus of German literary texts (contains 666 dramatic texts, cf. blog post)
  • workflow optimised to work with problematic data (faulty TEI, bad OCR, etc.)

DLINA Corpus 15.07 (»Codename Sydney«)

  • included texts only from 1731 to 1929
  • excluded texts following these criteria:
    • translations of foreign-language plays
    • texts w/o actual speakers (e.g., pantomime plays)
    • fragments
    • plays with very defective markup
  • result: 465 dramatic texts (Sydney corpus)
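
As a rough sketch, the selection criteria above can be read as a single predicate over per-play metadata. The `Play` record and all of its field names are hypothetical; the slides do not say how these properties were recorded or checked.

```python
from dataclasses import dataclass

@dataclass
class Play:
    # All fields are illustrative assumptions, not the project's metadata model.
    title: str
    year: int
    is_translation: bool = False
    has_speakers: bool = True
    is_fragment: bool = False
    markup_usable: bool = True

def in_corpus(play):
    """Apply the inclusion period and the four exclusion criteria."""
    return (
        1731 <= play.year <= 1929
        and not play.is_translation
        and play.has_speakers
        and not play.is_fragment
        and play.markup_usable
    )

# Hypothetical candidates, one per rule.
candidates = [
    Play("Kept play", 1800),
    Play("Too early", 1700),
    Play("Translated play", 1800, is_translation=True),
    Play("Pantomime", 1800, has_speakers=False),
    Play("Fragment", 1800, is_fragment=True),
    Play("Broken markup", 1800, markup_usable=False),
]
selected = [p.title for p in candidates if in_corpus(p)]
print(selected)
```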

Data Editing

Extracting Structural Data

  • left the original TEI files untouched and only extracted the data we were interested in
  • introduction of intermediary format ("zwischenformat", XML, cf. blog post):
    • validated against a specific RNG schema
    • zwischenformat file created for each drama
    • stores metadata, structural data, documentation
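
To illustrate the read-only extraction step, the sketch below pulls the speakers of each scene out of a TEI document without modifying it. It assumes scenes are marked as `<div type="scene">` and that every `<sp>` carries a `<speaker>` label, which real corpus files may not do consistently. The sample document is invented, and the output is a plain Python list rather than the zwischenformat, whose schema is not reproduced here.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented minimal TEI document for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="scene">
      <sp><speaker>Anna.</speaker><p>...</p></sp>
      <sp><speaker>Bert.</speaker><p>...</p></sp>
      <sp><speaker>Anna.</speaker><p>...</p></sp>
    </div>
    <div type="scene">
      <sp><speaker>Bert.</speaker><p>...</p></sp>
    </div>
  </body></text>
</TEI>"""

def speakers_per_scene(tei_xml):
    """Return one list of distinct speaker labels per scene, in order."""
    root = ET.fromstring(tei_xml)
    scenes = []
    for div in root.iterfind(".//tei:div[@type='scene']", TEI_NS):
        names = []
        for sp in div.iterfind(".//tei:sp", TEI_NS):
            label = sp.findtext("tei:speaker", default="", namespaces=TEI_NS)
            # Drop surrounding whitespace and the trailing full stop or colon.
            name = label.strip().rstrip(".:")
            if name and name not in names:
                names.append(name)
        scenes.append(names)
    return scenes

print(speakers_per_scene(SAMPLE))
```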

Editing Process

Extracted structural data was still full of bugs:

  • Errors due to automated conversion:
    • OCR errors
    • ...
  • Intrinsic problems:
    • variation of character names
    • ...

Complete editing rules including examples can be found on our blog.
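
One way to picture the handling of name variation is an alias table that maps every spelling of a speaker label to a single identifier. The sketch below is an assumption about how such a step might look, not the project's editing rules; the variant spellings and the `normalise` helper are invented for illustration.

```python
def normalise(name, aliases):
    """Map a raw speaker label to a canonical name via an alias table."""
    cleaned = name.strip().rstrip(".:")
    # Look up a case-insensitive key; unknown labels pass through cleaned.
    return aliases.get(cleaned.casefold(), cleaned)

# Hypothetical alias table: spelling variants and one OCR-style typo.
aliases = {
    "der prinz": "Prinz",
    "prinz": "Prinz",
    "pinz": "Prinz",
}

raw_labels = ["PRINZ.", "Der Prinz", "Pinz.", "Marinelli."]
canonical = [normalise(label, aliases) for label in raw_labels]
print(canonical)
```

Labels missing from the table pass through unchanged apart from the trimming, so unresolved variants stay visible for manual review instead of being merged silently.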

Correction of structural bugs with a crowd-editing approach:

[Screenshot: gamification]

Display & Analysis

One homepage for each of the 465 dramas linking to four types of visualisation + source files
(all individual pages listed here):

  • networks (sticky-node and static)
  • matrixes
  • amounts
  • intermediary format files

Example: G. E. Lessing's "Emilia Galotti" (1772)

[Figures: analysis thumbnails 1–4]

Skit: The biggest chatterboxes in German literature

[Figure: list of the most talkative characters in German theatre plays]

Cf. corresponding blog post.

Network size (median) by decade (1730–1930):

[Chart: network size (median) by decade]

Cf. blog post "200 Years of Literary Network Data".
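
A minimal sketch of how such a per-decade median could be computed, assuming that "network size" means the number of characters (nodes) in a play's network. The year and size pairs are invented placeholders, not values from the corpus.

```python
from collections import defaultdict
from statistics import median

def median_size_by_decade(plays):
    """Group (year, network_size) pairs by decade and take the median."""
    by_decade = defaultdict(list)
    for year, size in plays:
        decade = year // 10 * 10  # e.g. 1787 -> 1780
        by_decade[decade].append(size)
    return {d: median(sizes) for d, sizes in sorted(by_decade.items())}

# Invented placeholder data: (year, number of characters).
plays = [(1772, 10), (1775, 14), (1778, 30), (1781, 8), (1787, 12)]
medians = median_size_by_decade(plays)
print(medians)
```

The median is less sensitive than the mean to a single play with an unusually large cast, which matters when a decade contains only a few texts.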

Network density (mean) by genre and century:

[Chart: network density (mean) by genre and century]

Upcoming blog post "Network Values by Genre".
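
For reference, the sketch below uses the standard density of an undirected network, i.e. the share of possible links that are actually present: 2m / (n(n − 1)) for n characters and m links. Whether the project computes density in exactly this way is an assumption, and the grouped records are invented placeholders.

```python
from collections import defaultdict
from statistics import mean

def density(n_nodes, n_edges):
    """Density of an undirected graph without self-loops."""
    if n_nodes < 2:
        return 0.0  # no pair of characters can be linked
    return 2 * n_edges / (n_nodes * (n_nodes - 1))

def mean_density_by_group(plays):
    """Average density per (genre, century) group."""
    groups = defaultdict(list)
    for genre, century, n_nodes, n_edges in plays:
        groups[(genre, century)].append(density(n_nodes, n_edges))
    return {key: mean(values) for key, values in groups.items()}

# Invented placeholder records: (genre, century, characters, links).
plays = [
    ("comedy", 18, 3, 3),
    ("comedy", 18, 4, 3),
    ("tragedy", 19, 5, 4),
]
result = mean_density_by_group(plays)
print(result)
```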

Further Research

  • more statistical data
  • bigger (German-language) corpus
  • foreign-language corpora
  • to sum it all up: using literary network data to evaluate and contribute to traditional Literary Studies

Bibliography (1/3)

Literary Theory

  • Roland Barthes, The Structuralist Activity, in: Roland Barthes, Critical Essays, Evanston, Ill. 1972, 213–220.
  • Jurij M. Lotman, The Structure of the Artistic Text, Ann Arbor 1977.
  • Solomon Marcus, Mathematische Poetik, Frankfurt/M. 1973.
  • Michael Titzmann, Strukturale Textanalyse. Theorie und Praxis der Interpretation, München 1977.

Social Network Analysis

  • Stanley Wasserman & Katherine Faust, Social Network Analysis. Methods and Applications, New York 1994.
  • John Scott & Peter J. Carrington (eds.), The SAGE Handbook of Social Network Analysis, London et al. 2011.

Bibliography (2/3)

Literary Studies & SNA

Bibliography (3/3)

Literary Studies & SNA (cont'd)

  • James Stiller, Daniel Nettle & Robin I. M. Dunbar, The Small World of Shakespeare's Plays, in: Human Nature 14 (2003), 397–408.
  • James Stiller & Matthew Hudson, Weak Links and Scene Cliques Within the Small World of Shakespeare, in: Journal of Cultural and Evolutionary Psychology 3 (2005), 57–73.
  • Peer Trilcke, Social Network Analysis (SNA) als Methode einer textempirischen Literaturwissenschaft, in: Philip Ajouri, Katja Mellmann & Christoph Rauen (eds.), Empirie in der Literaturwissenschaft, Münster 2013, 201–247.