Digital Network Analysis of Dramatic Texts

https://dlina.github.io/

Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke

Göttingen Centre for Digital Humanities / Göttingen State and University Library / Herzog August Library Wolfenbüttel / University of Göttingen

Sydney, DH2015, 2 July 2015

(Presentation licensed under CC-BY 4.0.)

ToC

  1. Approach
  2. Data Mining
  3. Data Editing
  4. Display & Analysis
  5. Further Research

Approach

Basic Ideas

  • following the tradition of structuralist approaches in Literary Studies (Barthes 1972, Lotman 1977, Titzmann 1977, etc.)
  • grounding this approach in automated data analysis
  • long-term objective: provide structural data which can be used, for example, to describe different compositional types of plays

Approach

Different Styles of Structural Composition

Examples: two plays written by Goethe

Network graphs: Iphigenie auf Tauris (1787) and Götz von Berlichingen (1773)

Approach

The Digital Spectator

  • combining Literary Studies with Social Network Analysis (many corresponding publications since the early 2000s, see Bibliography)
  • specific definition of structure (inspired by Solomon Marcus, 1973): two characters are linked to each other if both are performing a speech act in a given segment of a play (act, scene)
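
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the project's own code: the function name `build_network`, the toy scenes, and the edge weight (number of shared segments) are assumptions added here; the slide itself only defines when two characters are linked.

```python
from itertools import combinations

def build_network(segments):
    """Return undirected edges from a list of segments.

    Each segment is the collection of characters who speak in it.
    The edge weight (an assumption of this sketch) counts the
    segments in which both characters speak.
    """
    edges = {}
    for speakers in segments:
        # Every unordered pair of distinct speakers in a segment is linked.
        for a, b in combinations(sorted(set(speakers)), 2):
            edges[(a, b)] = edges.get((a, b), 0) + 1
    return edges

# Hypothetical toy play with three scenes (not taken from the corpus).
scenes = [
    ["Anna", "Bert"],
    ["Bert", "Clara", "Anna"],
    ["Clara"],
]
network = build_network(scenes)
```

A character who speaks alone in a segment, like Clara in the third scene, adds no edge, because a link needs two speakers in the same segment.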

Approach

465 Network Graphs

Poster of 465 drama networks

At a glance: 465 German-language dramas from 1731 to 1929 (figshare).

Approach

Workflow

Data Mining → Data Editing → Display & Analysis

Data Mining

Corpus

  • TextGrid Repository: biggest TEI-tagged corpus of German literary texts (contains 666 dramatic texts, cf. blog post)
  • workflow optimised to work with problematic data (faulty TEI, bad OCR, etc.)

Data Mining

DLINA Corpus 15.07 (»Codename Sydney«)

  • included texts only from 1731 to 1929
  • excluded texts following these criteria:
    • translations of foreign-language plays
    • texts w/o actual speakers (e.g., pantomime plays)
    • fragments
    • plays with very defective markup
  • result: 465 dramatic texts (Sydney corpus)
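
The selection criteria above amount to a simple filter. The sketch below is illustrative only: the `Play` record, its field names, and the sample entries are hypothetical, and in practice judging "very defective markup" was presumably a manual decision rather than a boolean flag.

```python
from dataclasses import dataclass

@dataclass
class Play:
    # All fields are hypothetical names chosen for this sketch.
    title: str
    year: int
    is_translation: bool = False
    has_speakers: bool = True
    is_fragment: bool = False
    markup_ok: bool = True

def in_corpus(play):
    """Apply the inclusion period and the four exclusion criteria."""
    return (
        1731 <= play.year <= 1929
        and not play.is_translation
        and play.has_speakers
        and not play.is_fragment
        and play.markup_ok
    )

# Made-up candidates, not actual corpus entries.
candidates = [
    Play("Play A", 1772),
    Play("Play B", 1720),                       # outside the period
    Play("Play C", 1800, is_translation=True),  # translation
    Play("Play D", 1850, is_fragment=True),     # fragment
]
selected = [p.title for p in candidates if in_corpus(p)]
```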

Data Editing

Extracting Structural Data

  • left the original TEI files untouched and only extracted the data we were interested in
  • introduction of intermediary format ("zwischenformat", XML, cf. blog post):
    • validated against a specific RNG schema
    • zwischenformat file created for each drama
    • stores metadata, structural data, documentation
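
As a rough idea of what such an extraction step could look like, the sketch below reads speakers per segment from a TEI document without modifying it. It assumes that each segment is a `<div>` and that each speech is an `<sp>` element with a `who` attribute; real corpus files may lack `who` or nest their divisions differently. The output is a plain Python list, not the zwischenformat itself, whose schema is not reproduced here.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical miniature TEI file, for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="scene">
      <sp who="#anna"><speaker>Anna</speaker><p>...</p></sp>
      <sp who="#bert"><speaker>Bert</speaker><p>...</p></sp>
    </div>
    <div type="scene">
      <sp who="#bert"><speaker>Bert</speaker><p>...</p></sp>
    </div>
  </body></text>
</TEI>"""

def speakers_per_segment(tei_xml):
    """Return one sorted list of speaker IDs per <div> that contains speeches."""
    root = ET.fromstring(tei_xml)
    segments = []
    for div in root.iter(TEI + "div"):
        # Only direct <sp> children, so nested divs are not counted twice.
        ids = [sp.get("who", "").lstrip("#") for sp in div.findall(TEI + "sp")]
        if ids:
            segments.append(sorted(set(ids)))
    return segments

segments = speakers_per_segment(SAMPLE)
```

The source string is only parsed, never written back, which mirrors the decision to leave the original files untouched.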

Data Editing

Editing Process

Extracted structural data was still full of bugs:

  • Errors due to automated conversion:
    • OCR errors
    • ...
  • Intrinsic problems:
    • variation of character names
    • ...

Complete editing rules including examples can be found on our blog.
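
One of the intrinsic problems, variation of character names, can be illustrated with a small normalisation sketch. The cleaning steps and the `ALIASES` table are assumptions made for this example and do not reproduce the project's editing rules.

```python
import re

# Hypothetical alias table: maps a cleaned label to one canonical ID.
ALIASES = {"der prinz": "prinz"}

def normalise(label):
    """Collapse whitespace, drop trailing punctuation, lower-case, resolve aliases."""
    label = re.sub(r"\s+", " ", label).strip().rstrip(".:,").lower()
    return ALIASES.get(label, label)

# Speaker labels as they might appear in a single play (made up).
labels = ["Der Prinz.", "PRINZ", "Prinz", "Marinelli.", "marinelli"]
canonical = sorted({normalise(label) for label in labels})
```

Without this step the five labels would yield five network nodes instead of two.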

Data Editing

Outlook

Correction of structural bugs with crowd-editing approach:

Screenshot Gamification

Display & Analysis

One homepage for each of the 465 dramas linking to four types of visualisation + source files
(all individual pages listed here):

  • networks (sticky-node and static)
  • matrices
  • amounts
  • intermediary format files

Display & Analysis

Example: G. E. Lessing's "Emilia Galotti" (1772)

Display & Analysis

Skit: The biggest chatterboxes in German literature

List of most talkative characters in German theatre plays

Cf. corresponding blog post.

Display & Analysis

Network size (median) by decade (1730–1930):

Cf. blog post "200 Years of Literary Network Data".
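
A median per decade can be computed as sketched below. The sketch assumes that "network size" means the number of characters (nodes) in a play's network; the function name and the sample values are made up and are not corpus data.

```python
from collections import defaultdict
from statistics import median

def median_size_by_decade(plays):
    """plays: iterable of (year, network_size) pairs."""
    by_decade = defaultdict(list)
    for year, size in plays:
        # 1774 -> 1770, 1785 -> 1780, and so on.
        by_decade[year // 10 * 10].append(size)
    return {decade: median(sizes) for decade, sizes in sorted(by_decade.items())}

# Hypothetical values for illustration.
sample = [(1771, 11), (1774, 30), (1778, 13), (1785, 5)]
result = median_size_by_decade(sample)
```

The median is less sensitive than the mean to single plays with very large casts, which may be why it was chosen for this chart.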

Display & Analysis

Network density (mean) by genre and century:

Upcoming blog post "Network Values by Genre".
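
For reference, the density of an undirected network with n nodes and m edges is conventionally 2m / (n(n − 1)), the share of possible links that actually exist. The sketch below assumes this standard definition and groups by genre and century; the genre labels, the century rule, and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def density(n_nodes, n_edges):
    """Share of possible undirected links that are present."""
    if n_nodes < 2:
        return 0.0  # no pair of nodes, so no possible link
    return 2 * n_edges / (n_nodes * (n_nodes - 1))

def mean_density_by_group(plays):
    """plays: iterable of (genre, year, n_nodes, n_edges) tuples."""
    groups = defaultdict(list)
    for genre, year, n_nodes, n_edges in plays:
        century = year // 100 + 1  # simplification: 1750 -> 18
        groups[(genre, century)].append(density(n_nodes, n_edges))
    return {key: mean(values) for key, values in groups.items()}

# Made-up values, not corpus data.
sample = [
    ("comedy", 1750, 4, 6),   # fully connected: density 1.0
    ("comedy", 1760, 4, 3),   # density 0.5
    ("tragedy", 1810, 5, 5),  # density 0.5
]
result = mean_density_by_group(sample)
```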

Further Research

  • more statistical data
  • bigger (German-language) corpus
  • foreign-language corpora
  • to sum it all up: using literary network data to evaluate and contribute to traditional Literary Studies

Bibliography (1/3)

Literary Theory

  • Roland Barthes, The Structuralist Activity, in: Roland Barthes, Critical Essays, Evanston, Ill. 1972, 213–220.
  • Jurij M. Lotman, The Structure of the Artistic Text, Ann Arbor 1977.
  • Solomon Marcus, Mathematische Poetik, Frankfurt/M. 1973.
  • Michael Titzmann, Die strukturalistische Tätigkeit. Theorie und Praxis der Interpretation, München 1977.

Social Network Analysis

  • Stanley Wasserman & Katherine Faust, Social Network Analysis. Methods and Applications, New York 1994.
  • John Scott & Peter J. Carrington (eds.), The SAGE Handbook of Social Network Analysis, London et al. 2011.

Bibliography (2/3)

Literary Studies & SNA

Bibliography (3/3)

Literary Studies & SNA (cont'd)

  • James Stiller, Daniel Nettle & Robin I. M. Dunbar, The Small World of Shakespeare's Plays, in: Human Nature 14 (2003), 397–408.
  • James Stiller & Matthew Hudson, Weak Links and Scene Cliques Within the Small World of Shakespeare, in: Journal of Cultural and Evolutionary Psychology 3 (2005), 57–73.
  • Peer Trilcke, Social Network Analysis (SNA) als Methode einer textempirischen Literaturwissenschaft, in: Philip Ajouri, Katja Mellmann & Christoph Rauen (eds.), Empirie in der Literaturwissenschaft, Münster 2013, 201–247.