Network Analysis of Dramatic TextsJekyll2019-05-25T10:16:46+02:00https://dlina.github.io/Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io/https://dlina.github.io/Potsdam-Hackathon-20172017-12-22T00:00:00+01:002017-12-22T00:00:00-00:00Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io
<p>Thanks to the funding we received from the University of Potsdam (<a href="http://www.uni-potsdam.de/foerderung/6-international.html">KoUP 1</a>) and the Higher School of Economics (<a href="https://www.hse.ru/science/scifund/nug/">НУГ</a>), we were able to organise two hackathons this year, one in September in Moscow, another one earlier this month at <a href="http://www.fontanearchiv.de/startseite.html">Fontane Archive</a> in Potsdam. The latter concluded with a <a href="https://www.uni-potsdam.de/lit-19-jhd/digitale-literaturwissenschaft/potsdamer-arbeitstreffen/no2-2017.html">mini conference</a>.</p>
<p>The network analysis of literary texts remains the main business of our German-Russian research group. In 2017, though, we rebuilt our whole infrastructure so we’re able to look beyond network-analytical research questions and combine the network approach with other (quantitative) methods. Some of the scientific outcome of our efforts throughout this year was presented at the mini conference and on Twitter via the hashtag <a href="https://twitter.com/hashtag/potsdam_digilit?f=tweets&vertical=default&src=hash">#potsdam_digilit</a>, some will find its way into our upcoming research papers.</p>
<p>To capture a bit of the hackathon spirit, this end-of-the-year blog post will just roll out some pics from our December meeting, so here goes:</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/berlin-hbf-magenta.jpg" alt="Arriving at Berlin main station, magenta style." style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Arriving at Berlin Central Station, magenta style.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/walking-down-friedrichstrasse.jpg" alt="Walking down Friedrichstrasse." style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Walking down Friedrichstrasse.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/flashmob.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">The bunch, first morning.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/martinez.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Welcome to Fontane Archive!</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/hackathon-b-and-w.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Literary hackathon in black and white.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/discussing-shinyapp.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Fine-tuning our new <a href="https://shiny.dracor.org/">Shiny app</a>.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/discussing-chekhov-poster.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Discussing our Chekhov conference poster for <a href="http://dhd2018.uni-koeln.de/programm-donnerstag/">DHd2018</a>.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/discussing-dramavis.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Testing the new version of <a href="https://dlina.github.io/dramavis/">dramavis</a> on Russian plays (a.k.a. "laptop-sticker competition").</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/chris-and-peer.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Discussing next steps.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/discussing-api.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">New API!</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/promenade.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Lunch break: <a href="https://fr.wikipedia.org/wiki/Le_Neveu_de_Rameau"><i>Lui</i> and <i>Moi</i></a> on their way to the Café de la Régence.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/gg-and-ff.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">GG and FF.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/gg-and-ff-meta.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">And a meta perspective.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/ducks-and-swans.jpg" alt="alternative text" style="height:400px; width:533px;" />
</figure>
<p style="text-align:center;">Let's study ducks and swans.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/sans-souci.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">A quick visit to <a href="https://de.wikipedia.org/wiki/Das_Komma_von_SANS,_SOUCI.">the comma of SANS, SOUCI.</a></p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/walking-in-park.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Let's go back.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/studying-flaischlen.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Studying a real-life copy of Cäsar Flaischlen's <a href="http://weltliteratur.net/A-Giant-1890-Flowchart-of-Foreign-Influences-on-German-Literature/">"Graphische Litteratur-Tafel"</a> (1890).</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/hackathoning.jpg" alt="alternative text" style="height:400px; width:533px;" />
</figure>
<p style="text-align:center;">Still hacking.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/frank-and-peer.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">@peertrilcke and @umblaetterer looking at things. (<a href="https://twitter.com/umblaetterer/status/616232211952500736">context</a>)</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/ira-and-danya.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Still hacking.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/danya-on-tolstoy.jpg" alt="alternative text" style="height:400px; width:700px;" />
</figure>
<p style="text-align:center;">Danya on Tolstoy.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/conference-break.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Conference break.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/night-walk.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">The inevitable night walk.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/ice-skating-selfie.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">Visit to the Potsdam Christmas market …</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/ice-skating.jpg" alt="alternative text" style="height:400px; width:400px;" />
</figure>
<p style="text-align:center;">… and some ice-skating.</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/en-garde.jpg" alt="alternative text" style="height:400px; width:700px;" />
</figure>
<p style="text-align:center;">Restaging a random swashbuckler movie …</p>
<figure>
<img src="https://dlina.github.io/images/photos/potsdam-2017/emilia-galotti.jpg" alt="alternative text" style="height:400px; width:700px;" />
</figure>
<p style="text-align:center;">… and a jump cut to the final scene of <a href="https://en.wikipedia.org/wiki/Emilia_Galotti"><i>Emilia Galotti</i></a>, "crazy Odoardo" edition.</p>
<p style="text-align:center;">Best wishes and see you all next year.</p>
<p><a href="https://dlina.github.io/Potsdam-Hackathon-2017/">December Hackathon in Potsdam</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on December 22, 2017.</p>https://dlina.github.io/Subgraphs2017-10-03T00:00:00+02:002017-10-03T00:00:00-00:00Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io
<p>The network analysis of literary texts rests on a number of algorithmic
foundations, which are often not sufficiently reflected in the field. In
this regard, one problematic case is the existence of detached subgraphs.
Here’s a classic example, the network of Goethe’s <em>Faust, Part One</em> (1808),
visualised with our online tool <a href="https://dlina.github.io/ezlinavis/"><strong>ezlinavis</strong></a>
(<em>Faust</em> being one of the examples you can select from the pull-down menu
in the upper right corner):</p>
<figure>
<img src="https://dlina.github.io/images/faust-ezlinavis.png" alt="Faust, generated with Ezlinavis" style="width:900px;" />
</figure>
<p>We can visually distinguish three subgraphs:</p>
<ul>
<li>the main graph revolving around Faust and Mephisto, which basically
comprises the entire plot of the play, except for two detached single scenes:
<ul>
<li>Vorspiel auf dem Theater (Prelude in the Theater)</li>
<li>Walpurgisnachtstraum (Walpurgis Night’s Dream)</li>
</ul>
</li>
</ul>
<p>The two latter scenes do not feature any character from the main graph, which is
problematic when starting to calculate network metrics. For example, if we want
to calculate the <a href="https://en.wikipedia.org/wiki/Average_path_length">average path length</a>,
which is the average of all average distances from one node to all other nodes, how long is the distance
between, say, Faust and any of the characters in the detached Walpurgis Night’s Dream?
<strong>It is, well, infinite.</strong> If we still want to calculate things like the average
distance, we can do that, we just have to find a way to deal with unconnected
pairs of nodes. In any case: “Computing the average distance in disconnected
graphs needs careful consideration.”
(<a href="https://books.google.com/books?id=MpNjDQAAQBAJ&pg=PA223">Zweig 2016, p. 223</a>).</p>
<p>There are different ways to implement this, and even if you’re just using
network tools out of the box, you should be aware of the kind of algorithm
that is used to calculate network metrics in unconnected graphs.</p>
<p>One way is to consider only the paths that actually exist and ignore all
other pairs of nodes. If we use that option, the results for six selected
characters from <em>Faust, Part One</em> are as follows:</p>
<table>
<thead>
<tr>
<th>Character</th>
<th>Degree</th>
<th>Average Distance</th>
<th>Closeness Centrality</th>
</tr>
</thead>
<tbody>
<tr>
<td>Faust</td>
<td>55</td>
<td>1.11</td>
<td>0.90</td>
</tr>
<tr>
<td>Mephistopheles</td>
<td>35</td>
<td>1.44</td>
<td>0.70</td>
</tr>
<tr>
<td>Wagner</td>
<td>25</td>
<td>1.71</td>
<td>0.58</td>
</tr>
<tr>
<td>Margarete</td>
<td>9</td>
<td>1.85</td>
<td>0.54</td>
</tr>
<tr>
<td>…</td>
<td>…</td>
<td>…</td>
<td>…</td>
</tr>
<tr>
<td>Weltkind</td>
<td>35</td>
<td>1.0</td>
<td>1.0</td>
</tr>
<tr>
<td>Sternschnuppe</td>
<td>35</td>
<td>1.0</td>
<td>1.0</td>
</tr>
<tr>
<td>…</td>
<td>…</td>
<td>…</td>
<td>…</td>
</tr>
</tbody>
</table>
<p>This actually makes sense. Characters/speakers in Walpurgis Night’s Dream
(represented by Weltkind and Sternschnuppe) are not interacting directly with
characters in other scenes and “stay among themselves”, so to speak, which is
why they all have an average distance of 1.0. – Yet if it is true that the
central character, the protagonist if you will, is “the character that minimize[s]
the sum of the distances to all other vertices” (<a href="https://arxiv.org/abs/cond-mat/0202174v1">Alberich/Miro-Julia/Rosselló
2002</a>), we have a problem, because
<strong>Faust stops being the protagonist of <em>Faust</em></strong>, overrun by the 36 speakers of
the Walpurgis Night’s Dream. In other words: <strong>Goethe’s Walpurgis Night’s Dream,
in terms of network theory, is a <a href="https://en.wikipedia.org/wiki/Link_farm">link farm</a>.</strong></p>
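<p>To make the effect tangible, here is a minimal sketch in plain Python. The toy graph is hypothetical (it is not the real <em>Faust</em> network) and the function names are our own for this illustration. Closeness is taken as the reciprocal of the average distance, which is how the values in the table above relate to each other.</p>

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest-path lengths from source to every node it can reach."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def average_distance_existing_paths(adjacency, source):
    """Average distance from source, ignoring all nodes it cannot reach."""
    distances = bfs_distances(adjacency, source)
    reachable = [d for node, d in distances.items() if node != source]
    if not reachable:  # isolated node: no existing path at all
        return float("inf")
    return sum(reachable) / len(reachable)


# Hypothetical toy graph: a small main component and a detached pair.
toy_graph = {
    "Faust": {"Mephistopheles", "Wagner"},
    "Mephistopheles": {"Faust", "Margarete"},
    "Wagner": {"Faust"},
    "Margarete": {"Mephistopheles"},
    "Weltkind": {"Sternschnuppe"},
    "Sternschnuppe": {"Weltkind"},
}

average_distance = {
    name: average_distance_existing_paths(toy_graph, name) for name in toy_graph
}
closeness = {name: 1 / value for name, value in average_distance.items()}

for name in toy_graph:
    print(f"{name:15} {average_distance[name]:.2f} {closeness[name]:.2f}")
```

<p>Even in this tiny example, the two detached speakers end up with an average distance of 1.0 and therefore beat every character in the main component.</p>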
<p>If we still want network metrics to be meaningful when it comes to determining
who the central character of a play could be, we had better rely on a different
option. For practical reasons, the distance between two unconnected nodes is
sometimes defined as the length of the longest existing path, plus one. If we use
this method to assume an (artificial) distance for every pair of nodes, the
above table would look like this:</p>
<table>
<thead>
<tr>
<th>Character</th>
<th>Degree</th>
<th>Average Distance</th>
<th>Closeness Centrality</th>
</tr>
</thead>
<tbody>
<tr>
<td>Faust</td>
<td>55</td>
<td>1.81</td>
<td>0.55</td>
</tr>
<tr>
<td>Mephistopheles</td>
<td>35</td>
<td>2.33</td>
<td>0.42</td>
</tr>
<tr>
<td>Wagner</td>
<td>25</td>
<td>2.78</td>
<td>0.35</td>
</tr>
<tr>
<td>Margarete</td>
<td>9</td>
<td>3.02</td>
<td>0.33</td>
</tr>
<tr>
<td>…</td>
<td>…</td>
<td>…</td>
<td>…</td>
</tr>
<tr>
<td>Weltkind</td>
<td>35</td>
<td>2.88</td>
<td>0.34</td>
</tr>
<tr>
<td>Sternschnuppe</td>
<td>35</td>
<td>2.88</td>
<td>0.34</td>
</tr>
<tr>
<td>…</td>
<td>…</td>
<td>…</td>
<td>…</td>
</tr>
</tbody>
</table>
<p>And … <strong>Faust is back!</strong> Shortest average distance! – For our upcoming paper
on the different ways of extracting protagonists from plays, we are using this
method to calculate average distances. But, having said that, it cannot be
emphasised enough that since the concept of the protagonist is such a rich
concept, we should not rely on a single simple measure to automatically
determine such entities. Which is something we’ll address in said paper, stay
tuned. 😊</p>
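<p>The “longest existing path, plus one” option can be sketched as follows. Again, this is a plain-Python illustration on a hypothetical toy graph with function names of our own choosing, not the code behind the tables above.</p>

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest-path lengths from source to every node it can reach."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def average_distance_with_penalty(adjacency, source, penalty):
    """Average distance from source; unreachable nodes count as `penalty`."""
    distances = bfs_distances(adjacency, source)
    total = sum(
        distances.get(node, penalty) for node in adjacency if node != source
    )
    return total / (len(adjacency) - 1)


# Hypothetical toy graph: a small main component and a detached pair.
toy_graph = {
    "Faust": {"Mephistopheles", "Wagner"},
    "Mephistopheles": {"Faust", "Margarete"},
    "Wagner": {"Faust"},
    "Margarete": {"Mephistopheles"},
    "Weltkind": {"Sternschnuppe"},
    "Sternschnuppe": {"Weltkind"},
}

# Longest shortest path that actually exists anywhere in the graph.
longest_existing = max(
    max(bfs_distances(toy_graph, name).values()) for name in toy_graph
)
penalty = longest_existing + 1

average_distance = {
    name: average_distance_with_penalty(toy_graph, name, penalty)
    for name in toy_graph
}

for name in toy_graph:
    print(f"{name:15} {average_distance[name]:.2f}")
```

<p>With the penalty in place, the detached pair no longer comes out on top, because every missing path to the main component now costs something.</p>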
<p>Ok, let’s consider one last way to calculate distance values between unconnected
nodes. E.g., when we used <strong>igraph</strong> as our network library (before switching to
<strong>networkx</strong>), we saw results that were totally different, because we used a
fallback that determined that
<a href="http://igraph.org/r/doc/distances.html">“the length of the missing paths are counted having length <code class="highlighter-rouge">vcount(graph)</code>, one longer than the longest possible geodesic in the network”</a>
(i.e., <strong>vcount</strong> being the number of vertices of a graph). The resulting metrics,
although calculated correctly, don’t make much sense:</p>
<table>
<thead>
<tr>
<th>Character</th>
<th>Degree</th>
<th>Average Distance</th>
<th>Closeness Centrality</th>
</tr>
</thead>
<tbody>
<tr>
<td>Faust</td>
<td>55</td>
<td>40.07</td>
<td>0.02</td>
</tr>
<tr>
<td>Mephistopheles</td>
<td>35</td>
<td>40.27</td>
<td>0.02</td>
</tr>
<tr>
<td>Wagner</td>
<td>25</td>
<td>40.44</td>
<td>0.02</td>
</tr>
<tr>
<td>Margarete</td>
<td>9</td>
<td>40.52</td>
<td>0.02</td>
</tr>
<tr>
<td>…</td>
<td>…</td>
<td>…</td>
<td>…</td>
</tr>
<tr>
<td>Weltkind</td>
<td>35</td>
<td>67.0</td>
<td>0.01</td>
</tr>
<tr>
<td>Sternschnuppe</td>
<td>35</td>
<td>67.0</td>
<td>0.01</td>
</tr>
<tr>
<td>…</td>
<td>…</td>
<td>…</td>
<td>…</td>
</tr>
</tbody>
</table>
<p>In this approach, the assumed paths when bridging the infinite distance between
two subgraphs are much longer than with the previous algorithms, and almost equal:
differences in the average distances really only become visible after the
decimal point. So while this approach might make sense in some contexts,
it is not very helpful in our case.</p>
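<p>For comparison, here is the same kind of sketch with the penalty set to the number of vertices, as described in the quoted documentation. It is written in plain Python and does not call igraph. The toy graph and function names are hypothetical, and in a graph this small the effect is much milder than in the <em>Faust</em> table above.</p>

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest-path lengths from source to every node it can reach."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def average_distance_with_penalty(adjacency, source, penalty):
    """Average distance from source; unreachable nodes count as `penalty`."""
    distances = bfs_distances(adjacency, source)
    total = sum(
        distances.get(node, penalty) for node in adjacency if node != source
    )
    return total / (len(adjacency) - 1)


# Hypothetical toy graph: a small main component and a detached pair.
toy_graph = {
    "Faust": {"Mephistopheles", "Wagner"},
    "Mephistopheles": {"Faust", "Margarete"},
    "Wagner": {"Faust"},
    "Margarete": {"Mephistopheles"},
    "Weltkind": {"Sternschnuppe"},
    "Sternschnuppe": {"Weltkind"},
}

# Every missing path is counted as long as the graph has vertices.
vertex_count = len(toy_graph)

average_distance = {
    name: average_distance_with_penalty(toy_graph, name, vertex_count)
    for name in toy_graph
}

for name in toy_graph:
    print(f"{name:15} {average_distance[name]:.2f}")
```

<p>The more vertices a graph has, the more the penalty term outweighs the real distances inside each subgraph, which is why the values for the main characters in the table above differ only after the decimal point.</p>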
<p>All told, our maxim really has to be, and not only when confronted with
subgraphs: <strong>Know your implementation!</strong></p>
<p><a href="https://dlina.github.io/Subgraphs/">Know Your Implementation: Subgraphs in Literary Networks</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on October 03, 2017.</p>https://dlina.github.io/Gogol-Leaving-the-Theatre2017-07-09T00:00:00+02:002017-07-09T00:00:00-00:00Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io
<p>A couple of days ago, we presented a first version of our TEI-encoded Russian Drama Corpus (RusDraCor) at the <a href="https://events.spbu.ru/events/anons/corpora-2017/">CORPORA 2017 conference</a> in St. Petersburg (<a href="https://dlina.github.io/presentations/2017-spb/">slides</a>). Our goal is to assemble hundreds of Russian plays from the 1740s (Sumarokov) up to the 1930s with authors like Gorky and Mayakovsky.</p>
<p>Right in the middle, chronologically, our corpus features a number of plays by Gogol, one of which is “Театральный разъезд после представления новой комедии” (“Leaving the Theatre after the Presentation of a New Comedy”; <a href="http://ilibrary.ru/text/1557/p.1/index.html">full text at ilibrary.ru</a>).</p>
<p>We don’t concentrate so much on individual networks in our research; we focus more on the structural evolution of a large body of literary texts over time. But some networks are just special enough to warrant a bit more attention. So here is the network graph for “Leaving the Theatre”, extracted from <a href="https://raw.githubusercontent.com/dracor-org/rusdracor/master/tei/gogol-teatralnyi-razezd.xml">our TEI version of the play</a> and embellished with Gephi:</p>
<figure>
<img src="https://dlina.github.io/data/gogol-leaving-the-theatre/gogol-teatralnyi-razezd-gephi.png" alt="Character Network of Gogol's 'Leaving the Theatre'" style="width:900px;" />
</figure>
<p>This is a ridiculously big social network for a theatre play (99 characters; it is hard to find plays with more). The reason is that Gogol’s “Leaving the Theatre” is a <strong>metaplay</strong>. Gogol started to draft it right after his infamous <a href="https://en.wikipedia.org/wiki/The_Government_Inspector">“Revizor”</a> was released in 1836, but he didn’t publish “Leaving the Theatre” until 1842.</p>
<p>The plot, if we can call it that: A playwright is eavesdropping on the audience leaving the theatre after the presentation of his new play. We hear him comment sometimes, but he doesn’t directly interact with any of the other characters, and neither do they. They are just the exiting audience, ranting or raving about the play they just saw. They have no names; Gogol uses type descriptions to introduce their speech acts. They go by names such as …</p>
<ul>
<li>“Светский человек, щеголевато одетый” (“A society man, smartly dressed”)</li>
<li>“Господин, несколько беззаботный насчет литературы” (“A gentleman a little careless about literature”)</li>
<li>“Чиновник разговорчивого свойства” (“An official of talkative qualities”)</li>
<li>etc.</li>
</ul>
<p>As mentioned above, we can distinguish <strong>99 characters (or voices)</strong> in this play. Most of the people are just pouring out of the theatre, alone or in groups of two or three, contributing their bit, then vaporising into the evening. We cannot really apply our understanding of social interaction here (the <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/2/2">‘digital spectator’</a>), but with a little tweak we can create a meaningful graph.</p>
<p>The play has no acts or scenes, so we segmented it to catch what Manfred Pfister called ‘configurations’, subsets of the character list of a play, i.e., groups of people present on the stage at a certain point during the play. For all characters present in the same segment, we would establish a relation. That way, we’d end up with many small, unconnected subnets. And here comes our tweak: Since our “author” character eavesdrops on all conversations, we added him to all <strong>37 ‘configurations’</strong>, ending up with the star-like network you’ve seen above.</p>
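<p>The tweak can be illustrated with a few lines of Python. This is only a sketch: the three segments and their speaker labels are made up for the example (they are not the real 37 configurations), and the function name and its <code>always_present</code> parameter are our own invention for this illustration.</p>

```python
from itertools import combinations


def build_cooccurrence_edges(configurations, always_present=None):
    """Link all characters who share a segment.

    If `always_present` is given, that character is added to every
    segment before the links are created.
    """
    edges = set()
    for speakers in configurations:
        present = set(speakers)
        if always_present is not None:
            present.add(always_present)
        # Sorting makes each undirected edge appear in one orientation only.
        for first, second in combinations(sorted(present), 2):
            edges.add((first, second))
    return edges


# Hypothetical segments, for illustration only.
segments = [
    ["Society man", "Gentleman"],
    ["Official"],
    ["Lady", "Officer", "Merchant"],
]

without_author = build_cooccurrence_edges(segments)
with_author = build_cooccurrence_edges(segments, always_present="Author")
author_degree = sum(1 for edge in with_author if "Author" in edge)

print(len(without_author), len(with_author), author_degree)
```

<p>Without the author, the lone “Official” has no edge at all and the other two groups stay separate; with the author added everywhere, every speaker hangs off one central node.</p>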
<p>Of course, this is an experimental extension of our approach, but it still helps to better understand the structure of Gogol’s metaplay. For example, we can easily tell apart single characters uttering their opinion and larger conversations involving a group of people, something that doesn’t become as clear when close-reading the play.</p>
<p>Btw, the underlying CSV file for “Leaving the Theatre” can be found <strong><a href="https://raw.githubusercontent.com/lehkost/RusDraCor/master/csv/Gogol_-_Teatralnyi_razezd_-_ilibrary.csv">here</a></strong>.</p>
<h2 id="a-note-on-laughter">A Note on Laughter</h2>
<p>Although we spent a lot of time getting our network data right, there’s still at least one shortcoming when we look at this nice quote from the concluding speech of Gogol’s alter ego in the play:</p>
<blockquote>
<p>“Странно: мне жаль, что никто не заметил честного лица, бывшего в моей пьесе. Да, было одно честное, благородное лицо, действовавшее в нем во все продолжение ее. Это честное, благородное лицо был – <em>смех</em>.”</p>
</blockquote>
<blockquote>
<p>“It’s strange: I regret that no one noticed the one honest person in the play. Yes, there was an honest, noble person acting in it throughout its continuance. This honest, noble person was – <em>laughter</em>.” (our trans.)</p>
</blockquote>
<p>Our current algorithms aren’t able to extract an abstract entity like “laughter” as part of a communication network, but who knows, involving more actor–network theory might bring us a whole bunch of new ideas.</p>
<h2 id="russian-drama-network-as-shiny-app">Russian Drama Network as Shiny App</h2>
<p>On a different note, we also released a Shiny App for the analysis of our networks at the aforementioned conference. It looks like this …</p>
<figure>
<img src="https://dlina.github.io/presentations/2017-spb/images/Screenshot_Shinyapp_2017-06-21.jpg" alt="RusDraCor as Shiny App (screenshot)" style="width:760px;margin-top:15px;margin-bottom:30px;" />
</figure>
<p>… and can be accessed at <strong><a href="https://rusdracor.shinyapps.io/showcase/">https://rusdracor.shinyapps.io/showcase/</a></strong>. It features live data, so to speak, continuously generated from our TEI files as the corpus grows. “Leaving the Theatre” is among the plays, as are works by Blok, Bulgakov, Chekhov, Fonvizin, Gorky, Gumilyov, Krylov, Mayakovsky, Ostrovsky, Plavilschikov, Prutkov ☺, Pushkin, Sumarokov, Leo Tolstoy and Turgenev. And more is to come.</p>
<p>Oh, our project will also be presented at the <a href="https://digitizingthestage.wordpress.com/">“Digitizing the Stage” conference</a> starting tomorrow at the University of Oxford.</p>
<p>Etc. etc. etc.</p>
<p><a href="https://dlina.github.io/Gogol-Leaving-the-Theatre/">Network Analysis of Gogol's Metaplay "Leaving the Theatre …" (1842)</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on July 09, 2017.</p>https://dlina.github.io/Mayakovsky-Klop2016-09-18T00:00:00+02:002016-09-18T00:00:00-00:00Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io
<p>We don’t know if you noticed, but the LINA research field (LIterary Network Analysis) has come up with pretty good PR videos lately. Look at <a href="https://www.youtube.com/watch?v=KX7rzQMswEw">this fancy YouTube clip</a> produced by the “Nation, Genre & Gender” project at University College Dublin (their project homepage is <a href="http://www.nggprojectucd.ie/">here</a>). The NG+G project applies Social Network Analysis to Irish and British Fiction (1800–1922); their corpus involves 46 novels from 29 authors (according to the video they identified 9,630 unique fictional characters). And although the automated extraction of characters from novels has made progress in recent years (see, for example, <a href="http://dh2016.adho.org/abstracts/297">Jannidis et al.’s paper from DH2016</a>), it is still rough around the edges. That’s why the UCD project chose manual annotation as their approach, and that’s why their data is of such high quality (but also limited in scope).</p>
<p>If you’re working with dramatic texts, automated character extraction is far less of a problem, since this kind of text comes pre-structured, so to speak. If you work with one of the many TEI-tagged corpora, it is even easier to pull out interactions and start analysing them with network metrics. Although, admittedly, sometimes it’s harder than it seems, depending on the quality and depth of the mark-up (we covered that issue <a href="/recent/">in multiple postings</a> last year).</p>
<p>But what do you do if you can’t rely on a fine-grained TEI corpus? That’s what we’re confronted with when gathering network data from Russian drama. If you assemble all the plays that you can find on <a href="http://az.lib.ru/type/index_type_9-1.shtml">lib.ru</a>, <a href="http://rvb.ru/">rvb.ru</a> and <a href="https://ru.wikisource.org/wiki/%D0%9A%D0%B0%D1%82%D0%B5%D0%B3%D0%BE%D1%80%D0%B8%D1%8F:%D0%A0%D0%B5%D0%B2%D0%B8%D0%B7%D0%BE%D1%80_(%D0%93%D0%BE%D0%B3%D0%BE%D0%BB%D1%8C)">ru.wikisource.org</a>, you’ve got yourself a pretty good working corpus. The sustainable way would be to assemble all the works, transform them into TEI and share them with the community. But corpus building is a task of its own and needs a lot of dedication. And after all, we “just” need some kind of network data, not a polished digital edition of the works. So one idea to go forward is to exploit the HTML structure of the texts.</p>
<h2 id="mayakovskys-the-bedbug">Mayakovsky’s “The Bedbug”</h2>
<p>At the beginning of July, we taught a Network Analysis course at the First Moscow-Tartu Digital Humanities Summer School in Yasnaya Polyana (<a href="https://dlina.github.io/presentations/2016-yasnaya-polyana/">if you speak Russian, slides are here</a>). Originally, we wanted to analyse 19th-century drama, but one of the participants preferred to confront our methods with one of <a href="https://en.wikipedia.org/wiki/Vladimir_Mayakovsky">Vladimir Mayakovsky</a>’s plays (hi G.! :-). He chose “Klop” (translated as “The Bedbug”, see <a href="https://en.wikipedia.org/wiki/The_Bedbug">en.wikipedia.org</a>; an English adaptation by Snoo Wilson is <a href="http://snoowilson.co.uk/The%20Bedbug.pdf">here as PDF</a>; a concise English summary can be found at <a href="http://www.sovlit.net/bedbug/">sovlit.net</a>), written in 1928 and first published the year after.</p>
<p>“Klop” is definitely one of the challenging plays when it comes to character extraction. And now, two months after the summer school, we tried to automate the extraction process and used “Klop” as an example. Before we get into the details, this is the end result (visualised in Gephi 0.9.1 using its built-in modularity algorithm; the image is licensed under <a href="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</a>):</p>
<figure>
<img src="https://dlina.github.io/data/mayakovsky-klop/mayakovsky-klop-network-graph-gephi-cc-by-40.gif" alt="Character Network of Mayakovsky's 'Klop'" style="width:900px;" />
</figure>
<h2 id="network-driven-synopsis">Network-Driven Synopsis</h2>
<p>It’s the late 1920s in a mid-sized town in Soviet Russia. The protagonist in “Klop”, “Pierre Skripkin” (who changed his name from “Prisypkin”), abandons his socialist ideals, because after all the fighting and suffering he wants to start benefiting from what has been achieved. And because this is such an unusual play, we can actually base our synopsis on the network graph. The play consists of nine scenes:</p>
<ul>
<li>In scene 1, we see Skripkin (dark-green, central node) with his friend Bayan and his soon-to-be mother-in-law Rozaliya (both orange) strolling through a warehouse where merchants praise their products (dark-green cluster).</li>
<li>In scene 2, Skripkin discusses his lifestyle with the characters in the light-brown/beige cluster.</li>
<li>Scene 3 shows Skripkin’s wedding with his bourgeois bride Elsevira (orange cluster). However, a fire breaks out and everybody dies, except for Skripkin who, …</li>
<li>… in scene 4, goes unnoticed by the firefighters and is preserved in the icy water in the cellar. The firefighters and their captain are depicted in the red cluster, which is detached from the other clusters.</li>
<li>In scene 5, the play reaches the future, jumping 50 years ahead in time. It is now the end of the 1970s, and a global socialist state has been created (kind of an aseptic one, though). We follow a call-in discussion among several participants led by an operator, depicted in the light-blue cluster. They discuss whether Skripkin’s recovered body should be defrosted or not, and a majority votes in favour of unfreezing. Just like the red cluster, this light-blue one is also detached from the main cluster. <strong>So the transitional scenes between present and future are detached, character-wise, from the rest of the play, which is a nice structure-related finding: Skripkin is kind of tunnelling through these scenes into the 1970s.</strong></li>
<li>In scene 6, we meet Skripkin’s ex-girlfriend Zoya Beryozkina, who already appeared in the first two scenes and who is the only other person besides Skripkin who makes it from the present to the future in this play. She shares scene 6 with the professor (purple), some doctors (dark-green) and the resurrected protagonist.</li>
<li>In scene 7, we see a journalist reporting about the “resurrected mammal” (purple cluster). It is said that Skripkin is dangerous since he started to spread these ancient diseases among the people (like dancing, drinking beer and falling in love). In the same scene, the equally dangerous bedbug, which was defrosted along with Skripkin, is hunted down. The eponymous insect, which clearly serves as a symbol in the play, is not featured in the network graph, since no speech act can be attributed to it. 😉 (Although you might well think of a different approach including the little bug in the network analysis.)</li>
<li>Scene 8 presents a disappointed Skripkin who doesn’t like this aseptic future and declares that he would have preferred to stay frozen. The scene is mainly shared between him, Zoya and the professor.</li>
<li>Scene 9 takes place in the zoo, where Skripkin and the bedbug are presented as attractions (light-green cluster). When Skripkin is released from his cage, he gives a speech, but people are appalled and he’s put behind bars again and, further on, “displayed as a specimen of society’s primitive past, where school children can feed him with cigarettes and alcohol” (<a href="http://www.dramaonlinelibrary.com/plays/the-bedbug-iid-135405">dramaonlinelibrary.com</a>).</li>
</ul>
<h2 id="extracting-the-network-data">Extracting the Network Data</h2>
<p>Coming back to where we started: how did we extract the character network? The play was digitised <a href="https://ru.wikisource.org/wiki/%D0%9A%D0%BB%D0%BE%D0%BF_(%D0%9C%D0%B0%D1%8F%D0%BA%D0%BE%D0%B2%D1%81%D0%BA%D0%B8%D0%B9)">at Wikisource</a>. After having a closer look at the underlying HTML it was clear that extraction was easy: we just needed clear indicators for the beginning of a new scene and all speakers involved. A little Bash script (making use of xmllint) extracted the info like this:</p>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">I
Разносчик пуговиц
Разносчик кукол
Разносчица яблок
(…)
Присыпкин (Пьер Скрипкин)
Розалия Павловна
Присыпкин (Пьер Скрипкин)
Баян
(…)
II
Босой
Уборщик
Босой
Молодой рабочий
Девушка
Парень
(…)
III
Эльзевира
Присыпкин (Пьер Скрипкин)
Эльзевира
Присыпкин (Пьер Скрипкин)
Гость
(…)</code></pre></figure>
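<p>To show how little structure this intermediate format needs, here is a minimal Python sketch that reads it back into a scene-to-speakers mapping. It is not the Bash script mentioned above; the function name <code class="highlighter-rouge">parse_scenes</code> and the rule that a line consisting only of Roman numerals opens a new scene are assumptions based on the excerpt.</p>

```python
import re

# Assumption: a line made up only of the letters I, V, X is a scene heading.
SCENE_HEADING = re.compile(r"^[IVX]+$")


def parse_scenes(text):
    """Return {scene_label: [speaker, ...]}, speakers in order of appearance."""
    scenes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "(…)":  # skip blank lines and elision markers
            continue
        if SCENE_HEADING.match(line):
            current = line
            scenes[current] = []
        elif current is not None:
            scenes[current].append(line)
    return scenes


sample = """I
Разносчик пуговиц
Разносчик кукол
II
Босой
Уборщик
Босой
"""
scenes = parse_scenes(sample)
```

<p>Repeated speakers are kept at this stage, so the list still reflects the order of speech acts within each scene.</p>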
<h2 id="disambiguation">Disambiguation</h2>
<p>Now came the tricky part. Since we’re relying on character names, just like the author put them in his play, we had to deal with plenty of ambiguities. This wouldn’t happen with proper TEI, where every <code class="highlighter-rouge">&lt;sp&gt;</code>eech act provides IDs for all involved characters. An additional problem is that you have different entities going by the same name, like “Голоса” (“Voices”) in the second and third scene.</p>
<p>So what we had to account for to get a really clean character network is the following:</p>
<ul>
<li>“Зоя” = “Зоя Берёзкина”</li>
<li>“Присыпкин” and “Скрипкин” were combined to “Присыпкин (Пьер Скрипкин)” (since the protagonist proactively changed his name, see above)</li>
<li>1st scene: “Пуговичный разносчик” = “Разносчик пуговиц”</li>
<li>2nd scene: “Босой парень” and “Босой” are the same</li>
<li>2nd scene: “Молодой рабочий” and “Парень” are the same (just like “Парень с метлой”)</li>
<li>2nd scene: the “Девушка” in this scene is not the same as in scene 7 (disambiguation by numbering)</li>
<li>3rd scene: “Посажёный отец—бухгалтер” = “Бухгалтер”</li>
<li>3rd scene: “Крики” at the end eliminated</li>
<li>4th scene: “Пожарные” deleted (for the same reasons for which “Все” was deleted)</li>
<li>5th scene: “Старший и младший” deleted</li>
<li>5th scene: the incoming messages from the several outposts are not marked with their speakers (as a result, they don’t appear in the network)</li>
<li>6th scene: “Хором” deleted</li>
<li>9th scene: “Голос из толпы” occurs three times; all three voices are apparently different, so we numbered them</li>
<li>9th scene: “Председатель совета” and “Председатель” are the same</li>
</ul>
<p>We also eliminated all occurrences of “Все” (“All”): the idea is that characters contained in the “Все” already participate in the corresponding scene. That way, we avoid having “Все” as an additional character in the network. For the same reason we could have eliminated all occurrences of “Голоса” (“Voices”), but that’s a different thing since voices can come from unmentioned characters that don’t otherwise contribute to a speech act. So we left those in.</p>
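<p>Rules like these can be written down as small lookup tables. The following Python sketch covers only a few of the cases listed above and is an illustration, not the procedure we actually used; the names <code class="highlighter-rouge">normalise</code>, <code class="highlighter-rouge">GLOBAL_ALIASES</code>, <code class="highlighter-rouge">SCENE_ALIASES</code>, <code class="highlighter-rouge">SCENE_SCOPED</code> and <code class="highlighter-rouge">DROPPED</code> are hypothetical, and treating every deleted label as deleted in all scenes is a simplification.</p>

```python
# Aliases that hold for the whole play (taken from the list above).
GLOBAL_ALIASES = {
    "Зоя": "Зоя Берёзкина",
    "Присыпкин": "Присыпкин (Пьер Скрипкин)",
    "Скрипкин": "Присыпкин (Пьер Скрипкин)",
}

# Aliases that only apply inside one scene.
SCENE_ALIASES = {
    "I": {"Пуговичный разносчик": "Разносчик пуговиц"},
}

# Labels that name a different entity in each scene get the scene number
# appended, e.g. "Голоса II" and "Голоса III".
SCENE_SCOPED = {"Голоса"}

# Collective labels removed from the network (simplification: dropped in
# every scene, not only in the scene where they were found).
DROPPED = {"Все", "Крики", "Пожарные", "Старший и младший", "Хором"}


def normalise(scene, speaker):
    """Return the canonical speaker name, or None if the label is dropped."""
    if speaker in DROPPED:
        return None
    speaker = SCENE_ALIASES.get(scene, {}).get(speaker, speaker)
    speaker = GLOBAL_ALIASES.get(speaker, speaker)
    if speaker in SCENE_SCOPED:
        return f"{speaker} {scene}"
    return speaker


examples = [
    normalise("I", "Пуговичный разносчик"),
    normalise("II", "Голоса"),
    normalise("IV", "Все"),
    normalise("VIII", "Зоя"),
]
```

<p>Keeping the rules in data rather than in code makes it easier to review them later, which matters once you process more than one play.</p>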
<p>(The resulting TXT file can be found here: <a href="/data/mayakovsky-klop/mayakovsky-klop-speakers-per-scene.txt">“mayakovsky-klop-speakers-per-scene.txt”</a>.)</p>
<p>In comparison, the intermediary XML format we introduced when starting to work with our corpus of German drama <a href="/Introducing-Our-Zwischenformat/">can be much more fine-grained</a>, because we’re working with a TEI-encoded corpus there. <strong>One of the purposes of this article, though, is to demonstrate that you can already do stuff with the most basic of interactional data.</strong></p>
<h2 id="building-the-csv-file">Building the CSV File</h2>
<p>After we had cleaned the names of all speakers, we wrote another small script, this time in Python, to generate a CSV file containing all the edges of the network. Here’s a little excerpt:</p>
<figure class="highlight"><pre><code class="language-csv" data-lang="csv">Source,Target,Weight
Баян,Босой,1
Баян,Бухгалтер,1
Баян,Голос,1
Баян,Голоса II,1
Баян,Голоса III,1
(…)
Баян,Присыпкин (Пьер Скрипкин),3
(…)</code></pre></figure>
<p>It really just contains info on who shares how many scenes with whom. (The CSV file can be obtained here: <a href="/data/mayakovsky-klop/mayakovsky-klop-edges.csv">“mayakovsky-klop-edges.csv”</a>. This, of course, was the data we fed into Gephi to visualise the network shown above.)</p>
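<p>The step from speakers per scene to weighted edges is short enough to sketch here. This is a reconstruction under assumptions, not the original script: every pair of distinct speakers in a scene gets an edge, the weight counts the scenes a pair shares, and the function names <code class="highlighter-rouge">build_edges</code> and <code class="highlighter-rouge">edges_to_csv</code> are made up for illustration.</p>

```python
import csv
import io
from collections import Counter
from itertools import combinations


def build_edges(scenes):
    """Count, for every pair of speakers, the number of scenes they share."""
    weights = Counter()
    for speakers in scenes.values():
        # set() removes repeated speakers; sorted() gives each undirected
        # pair one stable orientation.
        for a, b in combinations(sorted(set(speakers)), 2):
            weights[(a, b)] += 1
    return weights


def edges_to_csv(weights):
    """Serialise the edge weights with the header Source,Target,Weight."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), weight in sorted(weights.items()):
        writer.writerow([a, b, weight])
    return out.getvalue()


# Toy data with placeholder names, not taken from the play.
toy_scenes = {
    "I": ["A", "B", "A", "C"],
    "II": ["A", "B"],
}
weights = build_edges(toy_scenes)
csv_text = edges_to_csv(weights)
```

<p>In the toy data, A and B share two scenes and therefore get weight 2, while the other two pairs get weight 1.</p>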
<h2 id="some-network-values">Some Network Values</h2>
<p>The network graph does well in demonstrating the structural uniqueness of Mayakovsky’s play. It is rather unusual that almost every scene can be identified as an individual cluster in the graph. The number of characters (= network size) is 94, the network density is fairly low, 0.17 (i.e., 17% of all possible connections between nodes are actually happening). The node-degree distribution shows traits of a power law, but it’s hard to draw any conclusions from that, since the play is so short and the interactional mode of the play so unique.</p>
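<p>For readers who want to check the density figure themselves: in an undirected network without self-loops, density is the number of existing edges divided by the number of possible ones, n·(n−1)/2. With 94 nodes that makes 4371 possible pairs. The Python sketch below applies this standard formula to a toy edge list; the function names are hypothetical and the toy edges are not data from the play.</p>

```python
def density(n_nodes, n_edges):
    """Density of an undirected graph without self-loops."""
    if n_nodes < 2:
        return 0.0
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def density_from_edges(edges):
    """Derive node and edge counts from a list of (source, target) pairs."""
    nodes = set()
    unique_edges = set()
    for a, b in edges:
        nodes.update((a, b))
        unique_edges.add(frozenset((a, b)))  # A-B and B-A count once
    return density(len(nodes), len(unique_edges))


# Toy graph: 4 nodes, 4 of 6 possible edges.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
toy_density = density_from_edges(toy_edges)
possible_pairs_94 = 94 * 93 // 2
```

<p>Note that this only counts nodes that appear in at least one edge; a character who never shares a scene with anyone would have to be added separately.</p>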
<p>If you have a look at the CSV file, almost all weights are “1”, meaning that characters share exactly one scene. The play is really about showing Pierre Skripkin in different contexts, in the present and the future. His closest contacts are his former lover Zoya Beryozkina and Oleg Bayan (3 shared scenes each), Rozaliya Pavlovna (bride’s mother) and the professor in the future (2 shared scenes each).</p>
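<p>Looking up a character’s closest contacts in the edge file is a matter of a few lines. The following Python sketch assumes the CSV layout shown above (<code class="highlighter-rouge">Source,Target,Weight</code>); the function name <code class="highlighter-rouge">closest_contacts</code> is hypothetical, and the sample rows are the ones from the excerpt, not the full file.</p>

```python
import csv
import io
from collections import defaultdict


def closest_contacts(csv_text, character, top=3):
    """Return the neighbours of `character`, heaviest edges first."""
    neighbours = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        weight = int(row["Weight"])
        # Edges are undirected, so the character may be on either side.
        if row["Source"] == character:
            neighbours[row["Target"]] += weight
        elif row["Target"] == character:
            neighbours[row["Source"]] += weight
    ranked = sorted(neighbours.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top]


sample_csv = """Source,Target,Weight
Баян,Босой,1
Баян,Бухгалтер,1
Баян,Присыпкин (Пьер Скрипкин),3
"""
best = closest_contacts(sample_csv, "Баян", top=1)
```

<p>Run against the full edge file with the protagonist’s name, the same function would list the contacts discussed in the paragraph above.</p>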
<h2 id="something-like-a-conclusion">Something Like a Conclusion</h2>
<p>You cannot reflect enough on the practice of character extraction from literary texts. The method you use has a big impact on the numbers that you’re working with later. You not only have to “know your corpus”, but you also have to keep in mind the rationale on which you based the information extraction. Especially if you want to process not just one file (like we did in this post) but hundreds or thousands of them.</p>
<p><a href="https://dlina.github.io/Mayakovsky-Klop/">Extracting Network Data from Mayakovsky's Play "The Bedbug" (1928/29)</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on September 18, 2016.</p>
<p>Three weeks ago, we attended the annual Digital Humanities conference of the German-speaking countries (<a href="http://www.dhd2016.de/">DHd2016</a>), this time taking place at the University of Leipzig. We delivered two papers (more on them later) and a poster, and we were really excited <a href="http://www.dig-hum.de/gewinner-des-posterawards-2016">to be awarded the prize for the best poster</a> out of 78 poster submissions (<a href="http://dhd2016.de/sites/default/files/dhd2016/files/PosterAward_Leipzig_2016.pdf">listed in this PDF</a>).</p>
<p>I will try to quickly explain what we tried to do when creating our poster. But first and foremost, <em>this</em> is the poster we’re talking about. Its full title is “Distant-Reading Showcase: 200 Years of German Drama History at a Glance”.</p>
<figure>
<img src="https://dlina.github.io/images/distant-reading-showcase-poster-dhd2016-leipzig-900px.jpg" alt="Distant-Reading Showcase, poster for DHd2016, 900px version" style="width:900px;" />
</figure>
<p>A full-res version can be downloaded <strong><a href="https://dx.doi.org/10.6084/m9.figshare.3101203.v1">from Figshare</a></strong> (PDF; 28.88 MB).</p>
<p>What we set out to do with this poster was to produce a data-driven showcase for what they™ call <em>distant reading</em>. We have a working definition of ‘distant reading’ that differs from the one that underlies Franco Moretti’s articles on the matter since he first coined the term in 2000. Peer and I gave a talk on the matter last November in Vienna, at <a href="http://www.iwk.ac.at/events/distant-reading-und-diskursanalyse">a workshop dedicated to “Distant Reading and Discourse Analysis”</a>. (The corresponding article will appear shortly; we just finished the final editing.) Let us just point to two aspects: Moretti never talks about programming or code and neither describes nor provides his working corpus so that anybody could reproduce his findings, two things we consider essential and tried to address throughout the course of the DLINA project (<a href="/recent/">see our older postings</a>). ‘Data-driven’ means that we wanted the computer to generate the better part of the poster, a job done by our tool <strong><a href="https://github.com/lehkost/dramavis">dramavis</a></strong> which was completely rewritten from scratch just weeks before the conference (current version is v0.2).</p>
<p>In order to be a convincing <strong>Distant-Reading Showcase</strong> our poster should really <em>show</em> visualised data that could actually <em>be read</em> by viewers. The 465 character networks showing German-language dramas written/published between 1730 and 1930 are sorted chronologically, and one thing people should be able to spot is the decisive decade in which German authors started to binge-read and adapt Shakespeare. All of a sudden in the 1770s, they started to build character networks far bigger than the ones before: Goethe’s play <a href="https://en.wikipedia.org/wiki/G%C3%B6tz_von_Berlichingen_(Goethe)">“Götz von Berlichingen”</a> is one of the first that, instead of only 8 or 12 or 16 characters, let more than 70 characters appear on stage. You can witness this ‘explosion’ in the 3rd row from above, 3rd column from the right. There are other things you can actually recognise in the poster, just take the network built from Schnitzler’s “Der Reigen” (<a href="https://en.wikipedia.org/wiki/La_Ronde_(play)">“La Ronde”</a>), which describes a circle in correspondence with the symptomatic course of the play (6th row from below, 7th column from the right; see also <a href="https://twitter.com/gimsieke/status/707855735070322688">Gerrit Imsieke’s tweet on the matter</a>).</p>
<p>At some point (when pottering about with Illustrator trying to open and convert a 20+ MB SVG) we had the notion that next time we should aim at generating the entire poster directly as script-driven SVG. But okay, this time we still managed to undertake the finishing steps on an old 2×2.8 GHz Quad-Core Intel Xeon Mac Pro with just about 6 GB of RAM using InDesign to properly fill the rest of the poster with descriptive info and some additional stuff: The two diagrams in the lower left of the sidebar already show further parts of our research, one of them the number of dramas with ‘small world’ characteristics, something <a href="https://www.conftool.pro/dh2016/index.php?page=browseSessions&form_session=42">we will also talk about at the DH2016 in Krakow, on July 14</a>.</p>
<p>To add a bit of suspense, we arrived in Leipzig with a still unfinished poster. A tiny little night shift at <a href="http://www.cafe-telegraph.de/">Café Telegraph</a> settled things and on Wednesday, the very day of the poster presentations, we printed the actual poster on glossy paper in <a href="https://en.wikipedia.org/wiki/ISO_216#A_series">A0 format</a> at the local print shop <a href="https://www.sedruck-leipzig.de/"><em>sedruck</em></a>, their store at Beethovenstraße 23. The result was amazing, one of the best A0 printing experiences we had so far.</p>
<h2 id="credits">Credits</h2>
<p>Creating this poster was a team effort:</p>
<figure>
<img src="https://dlina.github.io/images/distant-reading-showcase-poster-team-dhd2016.png" alt="The team responsible for the Distant Reading-Showcase poster…" style="width:600px;" />
</figure>
<h2 id="some-criticism">Some Criticism</h2>
<p>It was <a href="http://www.dhd2016.de/Abschluss">keynote speaker</a> Daniel Keim himself who uttered some criticism when discussing the poster with us later that evening, broaching the problems of spring-embedder algorithms. And we couldn’t agree more: Spring embedders have “an undeniable aesthetic appeal, […] yet a random layout is nearly always the default” (<a href="http://gdea.informatik.uni-koeln.de/1327/">source</a>). One side effect of this is that graphs always look a tad different when generating them anew. Thus, similar graphs don’t always look similar. This is a mere graph-visualisation problem and not too relevant for the actual research we’re conducting with the network measures we calculate with our <em>dramavis</em> tool. But feel free to give us a hint on how to normalise graphs generated with spring-embedding algorithms.</p>
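<p>One partial remedy for the “always a tad different” effect is to fix the random starting positions with a seed, so that the same graph at least yields the same picture on every run. The Python sketch below is a deliberately tiny spring embedder in the Fruchterman-Reingold style, written only to illustrate the idea; it is not the layout code used by <em>dramavis</em> or Gephi, and all names and parameter values in it are assumptions. Graph libraries usually offer the same switch (NetworkX’s <code class="highlighter-rouge">spring_layout</code>, for instance, accepts a <code class="highlighter-rouge">seed</code> argument).</p>

```python
import math
import random


def spring_layout(nodes, edges, seed=0, iterations=50):
    """Tiny spring embedder; the same seed always gives the same positions."""
    rng = random.Random(seed)  # private generator with a fixed seed
    pos = {n: [rng.random(), rng.random()] for n in nodes}
    k = math.sqrt(1.0 / len(nodes))  # ideal distance between nodes
    temp = 0.1  # maximum step per round, cooled down below
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force
        # Attraction along edges.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force
        # Move each node, but never further than the current temperature.
        for n in nodes:
            length = max(math.hypot(disp[n][0], disp[n][1]), 1e-9)
            step = min(length, temp)
            pos[n][0] += disp[n][0] / length * step
            pos[n][1] += disp[n][1] / length * step
        temp *= 0.95
    return {n: (p[0], p[1]) for n, p in pos.items()}


nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "D")]
first = spring_layout(nodes, edges, seed=7)
second = spring_layout(nodes, edges, seed=7)
```

<p>A fixed seed makes a single graph reproducible, but it does not make <em>similar</em> graphs look similar, so the normalisation question raised above remains open.</p>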
<h2 id="closing-words">Closing Words</h2>
<p>Despite the usual time pressure, it was great fun to plan, design and discuss our poster and to face some real competition. A big shout-out to our fellow winners who ranked 2nd (<a href="https://twitter.com/cutuchiqueno/status/707839351720419328">“Digitales Publizieren. Bedingungen – Optionen – Empfehlungen”</a>) and 3rd (<a href="https://twitter.com/ARockenberger/status/707584563447513088">“Das Tool LAKomp und seine Anwendung auf Texte nichtstandardisierter Sprachstufen”</a>). Right after the ceremony, we enjoyed a nice little dinner with the runners-up and some other friends at the dimly lit restaurant located in the <a href="https://de.wikipedia.org/wiki/Alte_Nikolaischule_(Leipzig)">Alte Nikolaischule building</a>, of which there is a twitpic <a href="https://twitter.com/peertrilcke/status/707997860386750464">here</a>.</p>
<p>See y’all next year at the <a href="http://www.dig-hum.de/dhd-2017">DHd2017 conference in Berne, CH</a>.</p>
<p><a href="https://dlina.github.io/Distant-Reading-Showcase-Poster-DHd2016-Leipzig/">“Distant-Reading Showcase”: Designing Our DHd2016 Conference Poster</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on March 30, 2016.</p>
<p>This short article is a follow-up to our last posting, <a href="/The-Birth-and-Death-of-German-Playwrights/">“The Birth & Death of German Playwrights”</a>. Plotting the birth and death places of our 178 authors onto a map brought us closer to understanding the character of our corpus which – <a href="/Introducing-DLINA-Corpus-15-07-Codename-Sydney/">codenamed “Sydney”</a> – contains 465 German-language plays. But it didn’t bring us close enough to understanding who the authors are. So let’s build a gallery with their portraits, <strong>a <a href="https://en.wikipedia.org/wiki/Face_book">facebook</a> of German playwrights</strong>, so to speak, and let’s do that automatically.</p>
<p>We’re relying on <strong>Wikidata</strong> again and, for each author, extract a link to their principal image which leads to the actual portrait file on Wikimedia Commons. We do this with nothing more than an <strong>XSLT</strong> transformation. Some simple Bash scripting was added to build the actual gallery for this post. The male and female silhouettes for authors who still lack an image on Commons <a href="http://blog.ruthreiche.de/profilbilder/">were designed by our accomplice Ruth Reiche</a> (thanks!). For some more details on how we did all this, scroll down to the end of the gallery. And now without further ado, this is the gallery (click on an image to get to the source file on Commons):</p>
<div id="portraitgallery">
<ul>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Friederike_Caroline_Neuber_1898_Neuer_Theater-Almanch.pn%67"><img src="https://dlina.github.io/images/authorpics/1697_neuber.jpg" alt="Neuber, Friederike Caroline (1697–1760)" /></a><figcaption>Neuber, Friederike Caroline <br />(1697–1760)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Bodmer.jp%67"><img src="https://dlina.github.io/images/authorpics/1698_bodmer.jp%67" alt="Bodmer, Johann Jacob (1698–1783)" /></a><figcaption>Bodmer, Johann Jacob <br />(1698–1783)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Johann_Christoph_Gottsched.jp%67"><img src="https://dlina.github.io/images/authorpics/1700_gottsched.jpg" alt="Gottsched, Johann Christoph (1700–1766)" /></a><figcaption>Gottsched, Johann Christoph <br />(1700–1766)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jp%67" alt="Borkenstein, Hinrich (1705–1777)" /><figcaption>Borkenstein, Hinrich <br />(1705–1777)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Gottschedin.jp%67"><img src="https://dlina.github.io/images/authorpics/1713_gottsched.jpg" alt="Gottsched, Luise Adelgunde Victorie (1713–1762)" /></a><figcaption>Gottsched, Luise Adelgunde Victorie <br />(1713–1762)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Christian_Fürchtegott_Gellert.jp%67"><img src="https://dlina.github.io/images/authorpics/1715_gellert.jpg" alt="Gellert, Christian Fürchtegott (1715–1769)" /></a><figcaption>Gellert, Christian Fürchtegott <br />(1715–1769)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Kurz_Stich.jp%67"><img src="https://dlina.github.io/images/authorpics/1717_kurz.jpg" alt="Kurz, Joseph von (1717–1784)" /></a><figcaption>Kurz, Joseph von <br />(1717–1784)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jp%67" alt="Schlegel, Johann Elias (1719–1749)" /><figcaption>Schlegel, Johann Elias <br />(1719–1749)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Leopold-von-auenbrugger.jp%67"><img src="https://dlina.github.io/images/authorpics/1722_auenbrugger.jpg" alt="Auenbrugger, Johann Leopold von (1722–1809)" /></a><figcaption>Auenbrugger, Johann Leopold von <br />(1722–1809)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Mylius, Christlob (1722–1754)" /><figcaption>Mylius, Christlob <br />(1722–1754)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Quistorp, Theodor Johann (1722–1776)" /><figcaption>Quistorp, Theodor Johann <br />(1722–1776)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Krüger, Johann Christian (1723–1750)" /><figcaption>Krüger, Johann Christian <br />(1723–1750)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Klopstock_(Füßli).jp%67"><img src="https://dlina.github.io/images/authorpics/1724_klopstock.jpg" alt="Klopstock, Friedrich Gottlieb (1724–1803)" /></a><figcaption>Klopstock, Friedrich Gottlieb <br />(1724–1803)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Christian_felix_weisse.jp%67"><img src="https://dlina.github.io/images/authorpics/1726_weisse.jpg" alt="Weiße, Christian Felix (1726–1804)" /></a><figcaption>Weiße, Christian Felix <br />(1726–1804)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Gotthold_Ephraim_Lessing_Kunstsammlung_Uni_Leipzig.jp%67"><img src="https://dlina.github.io/images/authorpics/1729_lessing.jpg" alt="Lessing, Gotthold Ephraim (1729–1781)" /></a><figcaption>Lessing, Gotthold Ephraim <br />(1729–1781)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Anton_Graff_Salomon_Gessner.jp%67"><img src="https://dlina.github.io/images/authorpics/1730_gessner.jpg" alt="Gessner, Salomon (1730–1788)" /></a><figcaption>Gessner, Salomon <br />(1730–1788)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Johann_Friedrich_Freiherr_von_Cronegk.jp%67"><img src="https://dlina.github.io/images/authorpics/1731_cronegk.jpg" alt="Cronegk, Johann Friedrich von (1731–1758)" /></a><figcaption>Cronegk, Johann Friedrich von <br />(1731–1758)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Hafner, Philipp (1731–1764)" /><figcaption>Hafner, Philipp <br />(1731–1764)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Pfeil, Johann Gottlob Benjamin (1732–1800)" /><figcaption>Pfeil, Johann Gottlob Benjamin <br />(1732–1800)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Ayrenhoff,_Cornelius_von1.jp%67"><img src="https://dlina.github.io/images/authorpics/1733_ayrenhoff.jpg" alt="Ayrenhoff, Cornelius Hermann von (1733–1819)" /></a><figcaption>Ayrenhoff, Cornelius Hermann von <br />(1733–1819)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Christoph_Martin_Wieland_by_Jagemann_1805.jp%67"><img src="https://dlina.github.io/images/authorpics/1733_wieland.jpg" alt="Wieland, Christoph Martin (1733–1813)" /></a><figcaption>Wieland, Christoph Martin <br />(1733–1813)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Johann_Christian_Brandes.jp%67"><img src="https://dlina.github.io/images/authorpics/1735_brandes.jpg" alt="Brandes, Johann Christian (1735–1799)" /></a><figcaption>Brandes, Johann Christian <br />(1735–1799)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Klemm, Christian Gottlob (1736–1802)" /><figcaption>Klemm, Christian Gottlob <br />(1736–1802)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Helfrich_Peter_Sturz_1.jp%67"><img src="https://dlina.github.io/images/authorpics/1736_sturz.jpg" alt="Sturz, Helfrich Peter (1736–1779)" /></a><figcaption>Sturz, Helfrich Peter <br />(1736–1779)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Heinrich_W._von_Gerstenberg.jp%67"><img src="https://dlina.github.io/images/authorpics/1737_gerstenberg.jpg" alt="Gerstenberg, Heinrich Wilhelm von (1737–1823)" /></a><figcaption>Gerstenberg, Heinrich Wilhelm von <br />(1737–1823)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Brawe, Joachim Wilhelm von (1738–1758)" /><figcaption>Brawe, Joachim Wilhelm von <br />(1738–1758)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Johann_Jacob_Engel.jp%67"><img src="https://dlina.github.io/images/authorpics/1741_engel.jpg" alt="Engel, Johann Jakob (1741–1802)" /></a><figcaption>Engel, Johann Jakob <br />(1741–1802)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Theodor_Gottlieb_von_Hippel_d._Ä..JP%47"><img src="https://dlina.github.io/images/authorpics/1741_hippel.jpg" alt="Hippel, Theodor Gottlieb von (1741–1796)" /></a><figcaption>Hippel, Theodor Gottlieb von <br />(1741–1796)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Gottlieb_Stephanie_der_Jüngere.pn%67"><img src="https://dlina.github.io/images/authorpics/1741_stephanie.jpg" alt="Stephanie, Johann Gottlieb (der Jüngere) (1741–1800)" /></a><figcaption>Stephanie, Johann Gottlieb (d. J.) <br />(1741–1800)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Friedrich_Ludwig_Schröder_(1744-1816).jp%67"><img src="https://dlina.github.io/images/authorpics/1744_schroeder.jpg" alt="Schröder, Friedrich Ludwig (1744–1816)" /></a><figcaption>Schröder, Friedrich Ludwig <br />(1744–1816)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Weidmann, Paul (1744–1801)" /><figcaption>Weidmann, Paul <br />(1744–1801)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:JohannFriedrichWilhelmGotter.jp%67"><img src="https://dlina.github.io/images/authorpics/1746_gotter.jpg" alt="Gotter, Friedrich Wilhelm (1746–1797)" /></a><figcaption>Gotter, Friedrich Wilhelm <br />(1746–1797)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Wagner, Heinrich Leopold (1747–1779)" /><figcaption>Wagner, Heinrich Leopold <br />(1747–1779)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Bretzner, Christoph Friedrich (1748–1807)" /><figcaption>Bretzner, Christoph Friedrich <br />(1748–1807)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Goethe_(Stieler_1828).jp%67"><img src="https://dlina.github.io/images/authorpics/1749_goethe.jpg" alt="Goethe, Johann Wolfgang von (1749–1832)" /></a><figcaption>Goethe, Johann Wolfgang von <br />(1749–1832)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:FriedrichMueller.jp%67"><img src="https://dlina.github.io/images/authorpics/1749_mueller.jpg" alt="Müller, Friedrich (Maler Müller) (1749–1825)" /></a><figcaption>Müller, Friedrich (Maler Müller) <br />(1749–1825)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:JMRLenz.jp%67"><img src="https://dlina.github.io/images/authorpics/1751_lenz.jpg" alt="Lenz, Jakob Michael Reinhold (1751–1792)" /></a><figcaption>Lenz, Jakob Michael Reinhold <br />(1751–1792)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Emanuel_Schikaneder.jp%67"><img src="https://dlina.github.io/images/authorpics/1751_schikaneder.jpg" alt="Schikaneder, Johann Emanuel (1751–1812)" /></a><figcaption>Schikaneder, Johann Emanuel <br />(1751–1812)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Friedrich_Maximilian_von_Klinger.jp%67"><img src="https://dlina.github.io/images/authorpics/1752_klinger.jpg" alt="Klinger, Friedrich Maximilian (1752–1831)" /></a><figcaption>Klinger, Friedrich Maximilian <br />(1752–1831)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Johann_Anton_Leisewitz_(Schröder).pn%67"><img src="https://dlina.github.io/images/authorpics/1752_leisewitz.jpg" alt="Leisewitz, Johann Anton (1752–1806)" /></a><figcaption>Leisewitz, Johann Anton <br />(1752–1806)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Joseph_August_von_Toerring1.jp%67"><img src="https://dlina.github.io/images/authorpics/1753_toerring.jpg" alt="Törring, Josef August von (1753–1826)" /></a><figcaption>Törring, Josef August von <br />(1753–1826)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Soden, Julius von (1754–1831)" /><figcaption>Soden, Julius von <br />(1754–1831)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Gemmingen-Hornberg, Otto Heinrich von (1755–1836)" /><figcaption>Gemmingen-Hornberg, Otto Heinrich von <br />(1755–1836)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Schink, Johann Friedrich (1755–1835)" /><figcaption>Schink, Johann Friedrich <br />(1755–1835)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:August_Wilhelm_Iffland_Johann_Stephan_Decker.jp%67"><img src="https://dlina.github.io/images/authorpics/1759_iffland.jpg" alt="Iffland, August Wilhelm (1759–1814)" /></a><figcaption>Iffland, August Wilhelm <br />(1759–1814)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Hensler, Karl Friedrich (1759–1825)" /><figcaption>Hensler, Karl Friedrich <br />(1759–1825)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Friedrich_Schiller_by_Ludovike_Simanowiz.jp%67"><img src="https://dlina.github.io/images/authorpics/1759_schiller.jpg" alt="Schiller, Friedrich (1759–1805)" /></a><figcaption>Schiller, Friedrich <br />(1759–1805)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:August_von_Kotzebue.pn%67"><img src="https://dlina.github.io/images/authorpics/1761_kotzebue.jpg" alt="Kotzebue, August von (1761–1819)" /></a><figcaption>Kotzebue, August von <br />(1761–1819)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Benkowitz, Karl Friedrich (1764–1807)" /><figcaption>Benkowitz, Karl Friedrich <br />(1764–1807)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Joseph_Sonnleithner.jp%67"><img src="https://dlina.github.io/images/authorpics/1766_sonnleithner.jpg" alt="Sonnleithner, Joseph Ferdinand von (1766–1835)" /></a><figcaption>Sonnleithner, Joseph Ferdinand von <br />(1766–1835)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:August_Wilhelm_von_Schlegel.jp%67"><img src="https://dlina.github.io/images/authorpics/1767_schlegel.jpg" alt="Schlegel, August Wilhelm (1767–1845)" /></a><figcaption>Schlegel, August Wilhelm <br />(1767–1845)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Knädig_-_Johann_Friedrich_Kind.jp%67"><img src="https://dlina.github.io/images/authorpics/1768_kind.jpg" alt="Kind, Johann Friedrich (1768–1843)" /></a><figcaption>Kind, Johann Friedrich <br />(1768–1843)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Julius_von_Voss.pn%67"><img src="https://dlina.github.io/images/authorpics/1768_voss.jpg" alt="Voß, Julius von (1768–1832)" /></a><figcaption>Voß, Julius von <br />(1768–1832)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Zacharias_werner.jp%67"><img src="https://dlina.github.io/images/authorpics/1768_werner.jpg" alt="Werner, Zacharias (1768–1823)" /></a><figcaption>Werner, Zacharias <br />(1768–1823)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Zschokke.jp%67"><img src="https://dlina.github.io/images/authorpics/1771_zschokke.jpg" alt="Zschokke, Heinrich (1771–1848)" /></a><figcaption>Zschokke, Heinrich <br />(1771–1848)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Gleich, Joseph Alois (1772–1841)" /><figcaption>Gleich, Joseph Alois <br />(1772–1841)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Schlegelvers1829.jp%67"><img src="https://dlina.github.io/images/authorpics/1772_schlegel.jpg" alt="Schlegel, Friedrich (1772–1829)" /></a><figcaption>Schlegel, Friedrich <br />(1772–1829)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Ludwig_Tieck.jp%67"><img src="https://dlina.github.io/images/authorpics/1773_tieck.jpg" alt="Tieck, Ludwig (1773–1853)" /></a><figcaption>Tieck, Ludwig <br />(1773–1853)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Jugendfoto.JP%47"><img src="https://dlina.github.io/images/authorpics/1773_weissenthurn.jpg" alt="Weißenthurn, Johanna von (1773–1847)" /></a><figcaption>Weißenthurn, Johanna von <br />(1773–1847)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Breuning, Stephan von (1774–1827)" /><figcaption>Breuning, Stephan von <br />(1774–1827)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Adolf_Müllner.jp%67"><img src="https://dlina.github.io/images/authorpics/1774_muellner.jpg" alt="Müllner, Adolph (1774–1829)" /></a><figcaption>Müllner, Adolph <br />(1774–1829)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Meisl, Karl (1775–1853)" /><figcaption>Meisl, Karl <br />(1775–1853)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Georg.Friedrich.Treitschke.jp%67"><img src="https://dlina.github.io/images/authorpics/1776_treitschke.jpg" alt="Treitschke, Georg Friedrich (1776–1842)" /></a><figcaption>Treitschke, Georg Friedrich <br />(1776–1842)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Kleist,_Heinrich_von.jp%67"><img src="https://dlina.github.io/images/authorpics/1777_kleist.jpg" alt="Kleist, Heinrich von (1777–1811)" /></a><figcaption>Kleist, Heinrich von <br />(1777–1811)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Bonaventura.jp%67"><img src="https://dlina.github.io/images/authorpics/1777_klingemann.jpg" alt="Klingemann, Ernst August Friedrich (1777–1831)" /></a><figcaption>Klingemann, Ernst August Friedrich <br />(1777–1831)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Friedrich_de_la_Motte-Fouqué_in_Husarenuniform.jp%67"><img src="https://dlina.github.io/images/authorpics/1777_motte-fouque.jpg" alt="Fouqué, Friedrich de la Motte (1777–1843)" /></a><figcaption>Fouqué, Friedrich de la Motte <br />(1777–1843)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Clemens_Brentano.jp%67"><img src="https://dlina.github.io/images/authorpics/1778_brentano.jpg" alt="Brentano, Clemens (1778–1842)" /></a><figcaption>Brentano, Clemens <br />(1778–1842)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Bernard, Josef Karl (1780–1850)" /><figcaption>Bernard, Josef Karl <br />(1780–1850)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Karoline_von_guenderode.jp%67"><img src="https://dlina.github.io/images/authorpics/1780_guenderode.jpg" alt="Günderode, Karoline von (1780–1806)" /></a><figcaption>Günderode, Karoline von <br />(1780–1806)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Ludwig_Achim_von_Arnim.jp%67"><img src="https://dlina.github.io/images/authorpics/1781_arnim.jpg" alt="Arnim, Ludwig Achim von (1781–1831)" /></a><figcaption>Arnim, Ludwig Achim von <br />(1781–1831)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Helmina_von_Chézy_2.pn%67"><img src="https://dlina.github.io/images/authorpics/1783_chezy.jpg" alt="Chézy, Helmina von (1783–1856)" /></a><figcaption>Chézy, Helmina von <br />(1783–1856)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Ernst_Benjamin_Salomo_Raupach.pn%67"><img src="https://dlina.github.io/images/authorpics/1784_raupach.jpg" alt="Raupach, Ernst (1784–1852)" /></a><figcaption>Raupach, Ernst <br />(1784–1852)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Adolf_Baeuerle.jp%67"><img src="https://dlina.github.io/images/authorpics/1786_baeuerle.jpg" alt="Bäuerle, Adolf (1786–1859)" /></a><figcaption>Bäuerle, Adolf <br />(1786–1859)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Uhland.jp%67"><img src="https://dlina.github.io/images/authorpics/1787_uhland.jpg" alt="Uhland, Ludwig (1787–1862)" /></a><figcaption>Uhland, Ludwig <br />(1787–1862)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Eichendorff.jp%67"><img src="https://dlina.github.io/images/authorpics/1788_eichendorff.jpg" alt="Eichendorff, Joseph von (1788–1857)" /></a><figcaption>Eichendorff, Joseph von <br />(1788–1857)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Ferdinand_Raimund.jp%67"><img src="https://dlina.github.io/images/authorpics/1790_raimund.jpg" alt="Raimund, Ferdinand (1790–1836)" /></a><figcaption>Raimund, Ferdinand <br />(1790–1836)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Grillparzer.jp%67"><img src="https://dlina.github.io/images/authorpics/1791_grillparzer.jpg" alt="Grillparzer, Franz (1791–1872)" /></a><figcaption>Grillparzer, Franz <br />(1791–1872)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Karl_Theodor_Körner.jp%67"><img src="https://dlina.github.io/images/authorpics/1791_koerner.jpg" alt="Körner, Theodor (1791–1813)" /></a><figcaption>Körner, Theodor <br />(1791–1813)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Kupelwieser, Josef (1791–1866)" /><figcaption>Kupelwieser, Josef <br />(1791–1866)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Carl_Balthasar_Malß.jp%67"><img src="https://dlina.github.io/images/authorpics/1792_malss.jpg" alt="Malß, Karl (1792–1848)" /></a><figcaption>Malß, Karl <br />(1792–1848)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Gehe, Eduard Heinrich (1795–1830)" /><figcaption>Gehe, Eduard Heinrich <br />(1795–1830)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Wohlbrück, Wilhelm August (1795–1848)" /><figcaption>Wohlbrück, Wilhelm August <br />(1795–1848)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Carl_Leberecht_Immermann.%67if"><img src="https://dlina.github.io/images/authorpics/1796_immermann.jpg" alt="Immermann, Karl (1796–1840)" /></a><figcaption>Immermann, Karl <br />(1796–1840)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:AugustGrafVonPlaten-Hallermuende.jp%67"><img src="https://dlina.github.io/images/authorpics/1796_platen.jpg" alt="Platen, August von (1796–1835)" /></a><figcaption>Platen, August von <br />(1796–1835)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:F.v.S..jp%67"><img src="https://dlina.github.io/images/authorpics/1796_schober.jpg" alt="Schober, Franz von (1796–1882)" /></a><figcaption>Schober, Franz von <br />(1796–1882)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Annette_von_Droste-Huelshoff_-_1845.jp%67"><img src="https://dlina.github.io/images/authorpics/1797_droste-huelshoff.jpg" alt="Droste-Hülshoff, Annette von (1797–1848)" /></a><figcaption>Droste-Hülshoff, Annette von <br />(1797–1848)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Heinrich-heine_1.jp%67"><img src="https://dlina.github.io/images/authorpics/1797_heine.jpg" alt="Heine, Heinrich (1797–1856)" /></a><figcaption>Heine, Heinrich <br />(1797–1856)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Holtei.jp%67"><img src="https://dlina.github.io/images/authorpics/1798_holtei.jpg" alt="Holtei, Karl von (1798–1880)" /></a><figcaption>Holtei, Karl von <br />(1798–1880)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Michael_beer.jp%67"><img src="https://dlina.github.io/images/authorpics/1800_beer.jpg" alt="Beer, Michael (1800–1833)" /></a><figcaption>Beer, Michael <br />(1800–1833)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Charlotte_Birch-Pfeiffer.jp%67"><img src="https://dlina.github.io/images/authorpics/1800_birch-pfeiffer.jpg" alt="Birch-Pfeiffer, Charlotte (1800–1868)" /></a><figcaption>Birch-Pfeiffer, Charlotte <br />(1800–1868)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Eduard_Devrient.jp%67"><img src="https://dlina.github.io/images/authorpics/1801_devrient.jpg" alt="Devrient, Philipp Eduard (1801–1877)" /></a><figcaption>Devrient, Philipp Eduard <br />(1801–1877)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Christian_Dietrich_Grabbe_by_Joseph_Wilhelm_Pero.jp%67"><img src="https://dlina.github.io/images/authorpics/1801_grabbe.jpg" alt="Grabbe, Christian Dietrich (1801–1836)" /></a><figcaption>Grabbe, Christian Dietrich <br />(1801–1836)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Albert_Lortzing-Stahlstich.jp%67"><img src="https://dlina.github.io/images/authorpics/1801_lortzing.jpg" alt="Lortzing, Albert (Gustav) (1801–1851)" /></a><figcaption>Lortzing, Albert (Gustav) <br />(1801–1851)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Nestroy.jp%67"><img src="https://dlina.github.io/images/authorpics/1801_nestroy.jpg" alt="Nestroy, Johann (1801–1862)" /></a><figcaption>Nestroy, Johann <br />(1801–1862)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Bauernfeld.jp%67"><img src="https://dlina.github.io/images/authorpics/1802_bauernfeld.jpg" alt="Bauernfeld, Eduard von (1802–1890)" /></a><figcaption>Bauernfeld, Eduard von <br />(1802–1890)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Karl_Johann_Braun_von_Braunthal.jp%67"><img src="https://dlina.github.io/images/authorpics/1802_braun_von_braunthal.jpg" alt="Braun von Braunthal, Karl Johann (1802–1866)" /></a><figcaption>Braun von Braunthal, Karl Johann <br />(1802–1866)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Karl_Joseph_Simrock.jp%67"><img src="https://dlina.github.io/images/authorpics/1802_simrock.jpg" alt="Simrock, Karl (1802–1876)" /></a><figcaption>Simrock, Karl <br />(1802–1876)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Franz_von_Kobell.jp%67"><img src="https://dlina.github.io/images/authorpics/1803_kobell.jpg" alt="Kobell, Franz von (1803–1882)" /></a><figcaption>Kobell, Franz von <br />(1803–1882)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Karl_Haffner.jp%67"><img src="https://dlina.github.io/images/authorpics/1804_haffner.jpg" alt="Haffner, Carl (1804–1876)" /></a><figcaption>Haffner, Carl <br />(1804–1876)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Riese, Friedrich Wilhelm (1805–1879)" /><figcaption>Riese, Friedrich Wilhelm <br />(1805–1879)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Die_Gartenlaube_(1856)_b_249.jp%67"><img src="https://dlina.github.io/images/authorpics/1806_halm.jpg" alt="Halm, Friedrich (1806–1871)" /></a><figcaption>Halm, Friedrich <br />(1806–1871)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:H_Laube_Portrait.jp%67"><img src="https://dlina.github.io/images/authorpics/1806_laube.jpg" alt="Laube, Heinrich (1806–1884)" /></a><figcaption>Laube, Heinrich <br />(1806–1884)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Vischer.jp%67"><img src="https://dlina.github.io/images/authorpics/1807_vischer.jpg" alt="Vischer, Friedrich Theodor (1807–1887)" /></a><figcaption>Vischer, Friedrich Theodor <br />(1807–1887)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Robert_Schumann.jp%67"><img src="https://dlina.github.io/images/authorpics/1810_schumann.jpg" alt="Schumann, Robert (1810–1856)" /></a><figcaption>Schumann, Robert <br />(1810–1856)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Roderich_Benedix..jp%67"><img src="https://dlina.github.io/images/authorpics/1811_benedix.jpg" alt="Benedix, Julius Roderich (1811–1873)" /></a><figcaption>Benedix, Julius Roderich <br />(1811–1873)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:K_Gutzkow_06_cropped.jp%67"><img src="https://dlina.github.io/images/authorpics/1811_gutzkow.jpg" alt="Gutzkow, Karl (1811–1878)" /></a><figcaption>Gutzkow, Karl <br />(1811–1878)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Georg_Büchner.pn%67"><img src="https://dlina.github.io/images/authorpics/1813_buechner.jpg" alt="Büchner, Georg (1813–1837)" /></a><figcaption>Büchner, Georg <br />(1813–1837)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Fritz-Hebbel.jp%67"><img src="https://dlina.github.io/images/authorpics/1813_hebbel.jpg" alt="Hebbel, Friedrich (1813–1863)" /></a><figcaption>Hebbel, Friedrich <br />(1813–1863)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Fotothek_df_rp-c_0200071_Triebischtal-Semmelsberg._Otto_Ludwig,_Porträt,_Zeichnung_(Stadtarchiv_Meißen,_Graphiksammlung).jp%67"><img src="https://dlina.github.io/images/authorpics/1813_ludwig.jpg" alt="Ludwig, Otto (1813–1865)" /></a><figcaption>Ludwig, Otto <br />(1813–1865)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:RichardWagner.jp%67"><img src="https://dlina.github.io/images/authorpics/1813_wagner.jpg" alt="Wagner, Richard (1813–1883)" /></a><figcaption>Wagner, Richard <br />(1813–1883)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Kaisepor.jp%67"><img src="https://dlina.github.io/images/authorpics/1814_Kaiser.jpg" alt="Kaiser, Friedrich (1814–1874)" /></a><figcaption>Kaiser, Friedrich <br />(1814–1874)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Niebergall, Ernst Elias (1815–1843)" /><figcaption>Niebergall, Ernst Elias <br />(1815–1843)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Gustav_Freytag_by_Karl_Stauffer-Bern_1886-1887.jp%67"><img src="https://dlina.github.io/images/authorpics/1816_freytag.jpg" alt="Freytag, Gustav (1816–1895)" /></a><figcaption>Freytag, Gustav <br />(1816–1895)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Robert_Eduard_Prutz.jp%67"><img src="https://dlina.github.io/images/authorpics/1816_prutz.jpg" alt="Prutz, Robert Eduard (1816–1872)" /></a><figcaption>Prutz, Robert Eduard <br />(1816–1872)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Christian_Pfann_Albert_Dulk_1867.jp%67"><img src="https://dlina.github.io/images/authorpics/1819_dulk.jpg" alt="Dulk, Albert (1819–1884)" /></a><figcaption>Dulk, Albert <br />(1819–1884)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Roeber, Friedrich (1819–1901)" /><figcaption>Roeber, Friedrich <br />(1819–1901)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Die_Gartenlaube_(1867)_b_205_David_Kalisch.jp%67"><img src="https://dlina.github.io/images/authorpics/1820_kalisch.jpg" alt="Kalisch, David (1820–1872)" /></a><figcaption>Kalisch, David <br />(1820–1872)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Salomon_Hermann_Mosenthal.jp%67"><img src="https://dlina.github.io/images/authorpics/1821_mosenthal.jpg" alt="Mosenthal, Salomon Hermann von (1821–1877)" /></a><figcaption>Mosenthal, Salomon Hermann von <br />(1821–1877)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Luckhardt_-_Richard_Genée_(ÖNB_9819483).jp%67"><img src="https://dlina.github.io/images/authorpics/1823_genee.jpg" alt="Genée, Richard (1823–1895)" /></a><figcaption>Genée, Richard <br />(1823–1895)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Peter_cornelius.jp%67"><img src="https://dlina.github.io/images/authorpics/1824_cornelius.jpg" alt="Cornelius, Peter (1824–1874)" /></a><figcaption>Cornelius, Peter <br />(1824–1874)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Ferdinandlasalle.jp%67"><img src="https://dlina.github.io/images/authorpics/1825_lassalle.jpg" alt="Lassalle, Ferdinand (1825–1864)" /></a><figcaption>Lassalle, Ferdinand <br />(1825–1864)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Nicola_Perscheid_-_Gustav_von_Moser.jp%67"><img src="https://dlina.github.io/images/authorpics/1825_moser.jpg" alt="Moser, Gustav von (1825–1903)" /></a><figcaption>Moser, Gustav von <br />(1825–1903)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Adolf_Friedrich_Erdmann_von_Menzel_042.jp%67"><img src="https://dlina.github.io/images/authorpics/1830_heyse.jpg" alt="Heyse, Paul (1830–1914)" /></a><figcaption>Heyse, Paul <br />(1830–1914)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Berg, O. F. (1833–1886)" /><figcaption>Berg, O. F. <br />(1833–1886)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Schaefer, Wilhelm (1835–1908)" /><figcaption>Schaefer, Wilhelm <br />(1835–1908)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Bunge, Rudolf (1836–1907)" /><figcaption>Bunge, Rudolf <br />(1836–1907)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Wilbrandt_1882.jp%67"><img src="https://dlina.github.io/images/authorpics/1837_wilbrandt.jpg" alt="Wilbrandt, Adolf von (1837–1911)" /></a><figcaption>Wilbrandt, Adolf von <br />(1837–1911)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Berlin,_Kreuzberg,_Mehringdamm_21,_Friedhof_III_Jerusalems-_und_Neue_Kirche,_Grab_Adolph_L%27Arronge,_Portraitrelief.jp%67"><img src="https://dlina.github.io/images/authorpics/1838_larronge.jpg" alt="L'Arronge, Adolph (1838–1908)" /></a><figcaption>L'Arronge, Adolph <br />(1838–1908)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Ludwig_Anzengruber.jp%67"><img src="https://dlina.github.io/images/authorpics/1839_anzengruber.jpg" alt="Anzengruber, Ludwig (1839–1889)" /></a><figcaption>Anzengruber, Ludwig <br />(1839–1889)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Ignaz_Schnitzer_1839–1921.jp%67"><img src="https://dlina.github.io/images/authorpics/1839_schnitzer.jpg" alt="Schnitzer, Ignaz (1839–1921)" /></a><figcaption>Schnitzer, Ignaz <br />(1839–1921)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Hermann_goetz.jp%67"><img src="https://dlina.github.io/images/authorpics/1840_goetz.jpg" alt="Goetz, Hermann (1840–1876)" /></a><figcaption>Goetz, Hermann <br />(1840–1876)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:KarlMay_Raupp.jp%67"><img src="https://dlina.github.io/images/authorpics/1842_may.jpg" alt="May, Karl (1842–1912)" /></a><figcaption>May, Karl <br />(1842–1912)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Widmann, Joseph Viktor (1842–1911)" /><figcaption>Widmann, Joseph Viktor <br />(1842–1911)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Wildenbruch.JP%47"><img src="https://dlina.github.io/images/authorpics/1845_wildenbruch.jpg" alt="Wildenbruch, Ernst von (1845–1909)" /></a><figcaption>Wildenbruch, Ernst von <br />(1845–1909)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Franz_von_Schönthan_1909_Pietzner.pn%67"><img src="https://dlina.github.io/images/authorpics/1849_schoenthan.jpg" alt="Schönthan, Franz von (1849–1913)" /></a><figcaption>Schönthan, Franz von <br />(1849–1913)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Gustav_Kadelburg_(BerlLeben_1904-12).jp%67"><img src="https://dlina.github.io/images/authorpics/1851_kadelburg.jpg" alt="Kadelburg, Gustav (1851–1925)" /></a><figcaption>Kadelburg, Gustav <br />(1851–1925)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Oscar_Blumenthal_1905.jp%67"><img src="https://dlina.github.io/images/authorpics/1852_blumenthal.jpg" alt="Blumenthal, Oskar (1852–1917)" /></a><figcaption>Blumenthal, Oskar <br />(1852–1917)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Panizza1.jp%67"><img src="https://dlina.github.io/images/authorpics/1853_panizza.jpg" alt="Panizza, Oskar (1853–1921)" /></a><figcaption>Panizza, Oskar <br />(1853–1921)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Wenzl_Weis_-_Ludwig_Ganghofer.jp%67"><img src="https://dlina.github.io/images/authorpics/1855_ganghofer.jpg" alt="Ganghofer, Ludwig (1855–1920)" /></a><figcaption>Ganghofer, Ludwig <br />(1855–1920)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Jacoby, Wilhelm (1855–1925)" /><figcaption>Jacoby, Wilhelm <br />(1855–1925)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Ferdinand_Avenarius,_portrait_2.jp%67"><img src="https://dlina.github.io/images/authorpics/1856_avenarius.jpg" alt="Avenarius, Ferdinand (1856–1923)" /></a><figcaption>Avenarius, Ferdinand <br />(1856–1923)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Nicola_Perscheid_-_Hermann_Sudermann_nach_1925.jp%67"><img src="https://dlina.github.io/images/authorpics/1857_sudermann.jpg" alt="Sudermann, Hermann (1857–1928)" /></a><figcaption>Sudermann, Hermann <br />(1857–1928)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Carl_Hauptmann.jp%67"><img src="https://dlina.github.io/images/authorpics/1858_hauptmann.jpg" alt="Hauptmann, Carl (1858–1921)" /></a><figcaption>Hauptmann, Carl <br />(1858–1921)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Laufs, Carl (1858–1900)" /><figcaption>Laufs, Carl <br />(1858–1900)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_female.jpg" alt="Wette, Adelheid (1858–1916)" /><figcaption>Wette, Adelheid <br />(1858–1916)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Bleibtreu, Karl (1859–1928)" /><figcaption>Bleibtreu, Karl <br />(1859–1928)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Jerschke, Oskar (1861–1928)" /><figcaption>Jerschke, Oskar <br />(1861–1928)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Lovis_Corinth_Porträt_des_Dichters_Josef_Ruederer_1904.jp%67"><img src="https://dlina.github.io/images/authorpics/1861_ruederer.jpg" alt="Ruederer, Josef (1861–1915)" /></a><figcaption>Ruederer, Josef <br />(1861–1915)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Alberti, Konrad (1862–1918)" /><figcaption>Alberti, Konrad <br />(1862–1918)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Johannes_Schlaf,_portrait.jp%67"><img src="https://dlina.github.io/images/authorpics/1862_schlaf.jpg" alt="Schlaf, Johannes (1862–1941)" /></a><figcaption>Schlaf, Johannes <br />(1862–1941)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Arthur_Schnitzler_1912.jp%67"><img src="https://dlina.github.io/images/authorpics/1862_schnitzler.jpg" alt="Schnitzler, Arthur (1862–1931)" /></a><figcaption>Schnitzler, Arthur <br />(1862–1931)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:R-dehmel_1905.jp%67"><img src="https://dlina.github.io/images/authorpics/1863_dehmel.jpg" alt="Dehmel, Richard (1863–1920)" /></a><figcaption>Dehmel, Richard <br />(1863–1920)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Arno_holz.jp%67"><img src="https://dlina.github.io/images/authorpics/1863_holz.jpg" alt="Holz, Arno (1863–1929)" /></a><figcaption>Holz, Arno <br />(1863–1929)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Paul_Scheerbarth.jp%67"><img src="https://dlina.github.io/images/authorpics/1863_scheerbart.jpg" alt="Scheerbart, Paul (1863–1915)" /></a><figcaption>Scheerbart, Paul <br />(1863–1915)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Otto_Erich_Hartleben_Portraet.jp%67"><img src="https://dlina.github.io/images/authorpics/1864_hartleben.jpg" alt="Hartleben, Otto Erich (1864–1905)" /></a><figcaption>Hartleben, Otto Erich <br />(1864–1905)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:FrankWedekind1883.JP%47"><img src="https://dlina.github.io/images/authorpics/1864_wedekind.jpg" alt="Wedekind, Frank (1864–1918)" /></a><figcaption>Wedekind, Frank <br />(1864–1918)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Hedwig_Lachmann_-_1865-1918.jp%67"><img src="https://dlina.github.io/images/authorpics/1865_lachmann.jpg" alt="Lachmann, Hedwig (1865–1918)" /></a><figcaption>Lachmann, Hedwig <br />(1865–1918)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:FerruccioBusoni1913.jp%67"><img src="https://dlina.github.io/images/authorpics/1866_busoni.jpg" alt="Busoni, Ferruccio (1866–1924)" /></a><figcaption>Busoni, Ferruccio <br />(1866–1924)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Beatrice_Dovsky_1893_Vilimek.jp%67"><img src="https://dlina.github.io/images/authorpics/1866_dovsky.jpg" alt="Dovsky, Beatrice (1866–1923)" /></a><figcaption>Dovsky, Beatrice <br />(1866–1923)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Ludwig_thoma_karl_klimsch1909.jp%67"><img src="https://dlina.github.io/images/authorpics/1867_thoma.jpg" alt="Thoma, Ludwig (1867–1921)" /></a><figcaption>Thoma, Ludwig <br />(1867–1921)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Gerhäuser, Emil (1868–1917)" /><figcaption>Gerhäuser, Emil <br />(1868–1917)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Rosenow, Emil (1871–1904)" /><figcaption>Rosenow, Emil <br />(1871–1904)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Hofmannsthal_1893.jp%67"><img src="https://dlina.github.io/images/authorpics/1874_hofmannsthal.jpg" alt="Hofmannsthal, Hugo von (1874–1929)" /></a><figcaption>Hofmannsthal, Hugo von <br />(1874–1929)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Heny-von-Heiseler01.jp%67"><img src="https://dlina.github.io/images/authorpics/1875_heiseler.jpg" alt="Heiseler, Henry von (1875–1928)" /></a><figcaption>Heiseler, Henry von <br />(1875–1928)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Rainer_Maria_Rilke,_1900.jp%67"><img src="https://dlina.github.io/images/authorpics/1875_rilke.jpg" alt="Rilke, Rainer Maria (1875–1926)" /></a><figcaption>Rilke, Rainer Maria <br />(1875–1926)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Fritz_Stavenhagen_Grab_Ohlsdorf.jp%67"><img src="https://dlina.github.io/images/authorpics/1876_stavenhagen.jpg" alt="Stavenhagen, Fritz (1876–1906)" /></a><figcaption>Stavenhagen, Fritz <br />(1876–1906)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Boßdorf, Hermann (1877–1921)" /><figcaption>Boßdorf, Hermann <br />(1877–1921)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:HermannEssigErichBüttner1917klein.jp%67"><img src="https://dlina.github.io/images/authorpics/1878_essig.jpg" alt="Essig, Hermann (1878–1918)" /></a><figcaption>Essig, Hermann <br />(1878–1918)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Bundesarchiv_Bild_146-1981-003-08,_Erich_Mühsam.jp%67"><img src="https://dlina.github.io/images/authorpics/1878_muehsam.jpg" alt="Mühsam, Erich (1878–1934)" /></a><figcaption>Mühsam, Erich <br />(1878–1934)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:GorchFock.jp%67"><img src="https://dlina.github.io/images/authorpics/1880_fock.jpg" alt="Fock, Gorch (1880–1916)" /></a><figcaption>Fock, Gorch <br />(1880–1916)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Heinrich_Lautensack_Gedenktafel_v2.jp%67"><img src="https://dlina.github.io/images/authorpics/1881_lautensack.jpg" alt="Lautensack, Heinrich (1881–1919)" /></a><figcaption>Lautensack, Heinrich <br />(1881–1919)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Rubiner, Ludwig (1881–1920)" /><figcaption>Rubiner, Ludwig <br />(1881–1920)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Anton_Wildgans_(1881–1932)_1932.jp%67"><img src="https://dlina.github.io/images/authorpics/1881_wildgans.jpg" alt="Wildgans, Anton (1881–1932)" /></a><figcaption>Wildgans, Anton <br />(1881–1932)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Hugoball.jp%67"><img src="https://dlina.github.io/images/authorpics/1886_ball.jpg" alt="Ball, Hugo (1886–1927)" /></a><figcaption>Ball, Hugo <br />(1886–1927)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Ertler, Bruno (1889–1927)" /><figcaption>Ertler, Bruno <br />(1889–1927)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:Orlik_Klabund.JP%47"><img src="https://dlina.github.io/images/authorpics/1890_klabund.jpg" alt="Klabund (1890–1928)" /></a><figcaption>Klabund <br />(1890–1928)</figcaption></figure></li>
<li><figure><a href="https://commons.wikimedia.org/wiki/File:RJSorge1.jp%67"><img src="https://dlina.github.io/images/authorpics/1892_sorge.jpg" alt="Sorge, Reinhard (1892–1916)" /></a><figcaption>Sorge, Reinhard <br />(1892–1916)</figcaption></figure></li>
<li><figure><img src="https://dlina.github.io/images/authorpics/noimage_male.jpg" alt="Kaltneker, Hans (1895–1919)" /><figcaption>Kaltneker, Hans <br />(1895–1919)</figcaption></figure></li>
</ul>
</div>
<div style="clear:left;" />
<h2 id="some-details-on-how-it-was-done">Some Details on How It Was Done</h2>
<p>The XSLT file for the automatic generation of the gallery out of the TEI files that comprise our corpus can be found <strong><a href="https://github.com/dlina/project/blob/master/apps/scripts/tei-author-portrait.xsl">here</a></strong>.</p>
<p>The image files were renamed with a few regexps in Bash; the images were then converted and crunched to a height of 150px with the ImageMagick command-line tool <code class="highlighter-rouge">convert</code> in a simple <code class="highlighter-rouge">for</code> loop:</p>
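<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="k">for </span>file <span class="k">in</span> <span class="s2">"</span><span class="nv">$SOURCE_DIR</span><span class="s2">"</span>/<span class="k">*</span>
<span class="k">do</span>
  <span class="c"># Swap a trailing .gif/.png extension for .jpg; other names stay as they are</span>
  <span class="nv">target</span><span class="o">=</span><span class="k">$(</span><span class="nb">basename</span> <span class="s2">"</span><span class="nv">$file</span><span class="s2">"</span> | sed <span class="s1">'s/\.\(gif\|png\)$/.jpg/'</span><span class="k">)</span>
  convert <span class="s2">"</span><span class="nv">$file</span><span class="s2">"</span> <span class="nt">-strip</span> <span class="nt">-resize</span> x150 <span class="nt">-quality</span> 60 <span class="s2">"</span><span class="nv">$TARGET_DIR</span><span class="s2">/</span><span class="nv">$target</span><span class="s2">"</span>
<span class="k">done</span></code></pre></figure>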
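<p>The renaming commands themselves are not shown in this post. As a rough sketch only, the file names in the gallery (<code class="highlighter-rouge">1774_muellner.jpg</code>, <code class="highlighter-rouge">1773_weissenthurn.jpg</code>) suggest a scheme of birth year plus lower-cased, transliterated surname. The helper <code class="highlighter-rouge">portrait_name</code> below is hypothetical and only assumes that scheme; it does not cover every name in the gallery.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the authors' script: derive a portrait file name
# such as "1774_muellner.jpg" from a birth year and a surname.
portrait_name() {
  local year=$1 surname=$2 slug
  # Transliterate umlauts, sharp s and e-acute, drop apostrophes,
  # turn spaces into underscores, then lower-case the remaining ASCII letters.
  slug=$(printf '%s' "$surname" \
    | sed -e 's/ä/ae/g' -e 's/ö/oe/g' -e 's/ü/ue/g' \
          -e 's/Ä/ae/g' -e 's/Ö/oe/g' -e 's/Ü/ue/g' \
          -e 's/ß/ss/g' -e 's/é/e/g' \
          -e "s/'//g" -e 's/ /_/g' \
    | tr '[:upper:]' '[:lower:]')
  printf '%s_%s.jpg\n' "$year" "$slug"
}

portrait_name 1774 'Müllner'        # prints 1774_muellner.jpg
portrait_name 1773 'Weißenthurn'    # prints 1773_weissenthurn.jpg
```

<p>Each substitution is a separate <code class="highlighter-rouge">sed</code> expression rather than a bracket expression, so the multi-byte characters are matched as whole strings regardless of locale.</p>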
<h2 id="gender-data-and-placeholder-images">Gender Data and Placeholder Images</h2>
<p>Our XSLT file also extracts gender information from Wikidata, which shows that only 10 of the 178 authors are women.</p>
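<p>For readers who want to try a similar lookup without XSLT, here is a minimal shell sketch. It is a swapped-in technique, not our actual stylesheet: it assumes the author is identified by a GND identifier (Wikidata property P227) and asks the public Wikidata SPARQL endpoint for the "sex or gender" statement (P21). The function names <code class="highlighter-rouge">gender_query</code> and <code class="highlighter-rouge">fetch_gender</code> are made up for this example.</p>

```shell
#!/usr/bin/env bash
# Sketch only: a shell/SPARQL stand-in for the XSLT-based lookup.

# Print a SPARQL query that finds the Wikidata item carrying the given
# GND identifier (P227) and returns the label of its "sex or gender" (P21).
gender_query() {
  printf '%s\n' \
    'SELECT ?author ?genderLabel WHERE {' \
    "  ?author wdt:P227 \"$1\" ;" \
    '          wdt:P21 ?gender .' \
    '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }' \
    '}'
}

# Send the query to the Wikidata endpoint and print the JSON answer.
# Needs curl and network access; nothing calls it automatically.
fetch_gender() {
  curl -sG 'https://query.wikidata.org/sparql' \
    -H 'Accept: application/sparql-results+json' \
    --data-urlencode "query=$(gender_query "$1")"
}
```

<p>Calling <code class="highlighter-rouge">fetch_gender "$GND_ID"</code> with a real identifier from one of the TEI files should return a small JSON result; keeping the query builder separate from the network call makes it easy to inspect the query first.</p>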
<p>As stated above, the placeholder images were designed by Ruth Reiche. We chose one for female and one for male authors out of <a href="http://blog.ruthreiche.de/profilbilder/">a whole bunch of silhouettes</a> she created for her <a href="http://blog.ruthreiche.de/neun-geschichten-neun-netzwerke/">network visualisations of characters in Daniel Kehlmann’s novel “Ruhm”</a>.</p>
<p>(End of transmission.)</p>
<p><a href="https://dlina.github.io/The-Facebook-of-German-Playwrights/">The Facebook of German Playwrights</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on January 28, 2016.</p>
<p>https://dlina.github.io/The-Birth-and-Death-of-German-Playwrights, 2015-10-22, Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke</p>
<p>“If your metadata is good, it can help you in many ways,” mumbled Captain Obvious when we last met, and we couldn’t agree more. So let’s toy around with some metadata today to get a better impression of what our corpus of roughly half a thousand German-language theatre plays actually contains.</p>
<p>You surely have seen the piece in <em>Science</em>, <a href="http://www.sciencemag.org/content/345/6196/558">“A Network Framework of Cultural History”</a>, and the corresponding <a href="https://www.youtube.com/watch?v=4gIhRkCcD4U">lifetime-curve videos</a>. Max Schich et al. set out to visualise “intellectual mobility” based on “spatiotemporal birth and death information (…) of more than 150,000 notable individuals”. That’s a lot of people, and we wouldn’t even dare to compare this little blog post to what they did. But anyway, we dabbled in telling the story of <strong>the birth and death of German playwrights</strong> by using a similar method with a much (like, <em>much</em>) smaller set of people – 178 authors altogether who wrote 465 plays published between 1731 and 1929.</p>
<p>The <strong>tl;dr version</strong> of how we did that: Wrote an XQuery script that uses the <a href="http://www.dnb.de/gnd">GND identifier</a> for each author in our XML files to find our way to corresponding Wikidata objects where we extracted dates and places of birth and death of all the authors contained in our corpus. Generated two KML files and put them into the GeoBrowser – mission accomplished (feel free to zoom in a bit):</p>
<iframe id="geobrowser" src="https://geobrowser.de.dariah.eu/embed/?kml1=https://dlina.github.io/data/geobrowser/lina-birth.kml&kml2=https://dlina.github.io/data/geobrowser/lina-death.kml&currentStatus=mapChanged=Historical+Map+of+1880"></iframe>
<p><a href="https://geobrowser.de.dariah.eu/embed/?kml1=https://dlina.github.io/data/geobrowser/lina-birth.kml&kml2=https://dlina.github.io/data/geobrowser/lina-death.kml&currentStatus=mapChanged=Historical+Map+of+1880" target="_blank">view full screen</a></p>
<h2 id="workflow-bit-more-detailed">Workflow, Bit More Detailed</h2>
<p>Our <a href="/Introducing-DLINA-Corpus-15-07-Codename-Sydney/">Sydney corpus</a> – which was derived from the <a href="https://textgrid.de/digitale-bibliothek">“Digitale Bibliothek” corpus</a> within the TextGrid Repository – holds <strong>465 dramatic pieces from 1731 to 1929</strong>, written by <strong>178 authors</strong> altogether. By plotting the places of birth and death of all of them onto a map we would probably find out if our corpus was balanced or if there were any (regional) biases we weren’t aware of.</p>
<p>All the documents in our repository contain authorship information, including GND identifiers. Their values are stored in an XML attribute (<code class="highlighter-rouge">key</code>) as follows (<a href="http://www.dnb.de/EN/gnd">for legacy reasons</a>, the value starts with <code class="highlighter-rouge">pnd</code>, not <code class="highlighter-rouge">GND</code>):</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><author</span> <span class="na">key=</span><span class="s">"pnd:118540238"</span><span class="nt">></span>Goethe, Johann Wolfgang von<span class="nt"></author></span></code></pre></figure>
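<p>To illustrate, here is a minimal Python sketch of how such a key could be mapped to the address of the corresponding GND record (the <code class="highlighter-rouge">http://d-nb.info/gnd/…</code> pattern). The function name <code class="highlighter-rouge">gnd_uri</code> is made up for this example and is not part of our project code:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">def gnd_uri(key):
    """Turn a key such as 'pnd:118540238' into a d-nb.info address."""
    prefix, _, identifier = key.partition(":")
    # Accept the legacy 'pnd' prefix as well as 'gnd'
    if prefix.lower() not in ("pnd", "gnd") or not identifier:
        raise ValueError("unexpected key format: " + repr(key))
    return "http://d-nb.info/gnd/" + identifier


print(gnd_uri("pnd:118540238"))</code></pre></figure>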
<p>We had to update our schema to insert this attribute into our <a href="/Introducing-Our-Zwischenformat/">intermediary format</a> (<a href="https://github.com/dlina/project/commit/4811e0cd6bb81b0230a7afbd0ecfc34bc7f4b83e">here’s the commit</a>) to fully benefit from the beauty of linked open data (LOD). If you read German, there’s a nice chapter on the topic in the <a href="http://www.univerlag.uni-goettingen.de/bitstream/handle/3/Neuroth_TextGrid/TextGrid_book.pdf">TextGrid compendium</a> published last year (pp. 91 ff., “Metadaten, LOD und der Mehrwert standardisierter und vernetzter Daten”, authored by <a href="https://twitter.com/delaiglesia">Martin de la Iglesia</a>, Nicolas Moretto and Max Brodhun).</p>
<p>The identifier stored in <code class="highlighter-rouge">@key</code> refers to an entry in the Integrated Authority File (the English name for the GND, Gemeinsame Normdatei) hosted by the German National Library. They provide an HTML view of the data, but you can also directly download the RDF and other representations. Let’s have a look at the data set on Goethe at <a href="http://d-nb.info/gnd/118540238">http://d-nb.info/gnd/118540238</a>. You’ll find basic info on him: aliases, occupation, dates and places of birth and death. In most cases, the places given have their own GND identifier, which is contained in the RDF file of each personal record. In the case of Goethe we’re pointed to his birthplace Frankfurt am Main like this:</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><gndo:placeOfBirth></span>
<span class="nt"><rdf:Description</span> <span class="na">rdf:about=</span><span class="s">"http://d-nb.info/gnd/4018118-2"</span><span class="nt">></span>
<span class="nt"><gndo:preferredNameForThePlaceOrGeographicName></span>Frankfurt am Main<span class="nt"></gndo:preferredNameForThePlaceOrGeographicName></span>
<span class="nt"></rdf:Description></span>
<span class="nt"></gndo:placeOfBirth></span></code></pre></figure>
<p>Eventually, the Frankfurt am Main record gives away the geographical coordinates of the city:</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><geo:hasGeometry</span> <span class="na">rdf:parseType=</span><span class="s">"Resource"</span><span class="nt">></span>
<span class="nt"><rdf:type</span> <span class="na">rdf:resource=</span><span class="s">"http://www.opengis.net/ont/sf#Point"</span> <span class="nt">/></span>
<span class="nt"><geo:asWKT</span> <span class="na">rdf:datatype=</span><span class="s">"http://www.opengis.net/ont/geosparql#wktLiteral"</span><span class="nt">></span>Point ( +008.684166 +050.115277 )<span class="nt"></geo:asWKT></span>
<span class="nt"></geo:hasGeometry></span></code></pre></figure>
<p>We just had to trim the string to <code class="highlighter-rouge">+008.684166 +050.115277</code> and hand it over to a KML file (which can be interpreted by the majority of geo-visualisation tools) like this:</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><kml></span>
<span class="nt"><Placemark></span>
<span class="nt"><address></span>Frankfurt am Main<span class="nt"></address></span>
<span class="nt"><description></span>Place of Birth; 28 August 1749<span class="nt"></description></span>
<span class="nt"><name></span>Gūta, Yūhān Wulfgāng fun<span class="nt"></name></span>
<span class="nt"><Point></span>
<span class="nt"><coordinates></span>+008.684166 +050.115277<span class="nt"></coordinates></span>
<span class="nt"></Point></span>
<span class="nt"><TimeStamp></span>
<span class="nt"><when></span>1749<span class="nt"></when></span>
<span class="nt"></TimeStamp></span>
<span class="nt"></Placemark></span>
<span class="nt"></kml></span></code></pre></figure>
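<p>The trimming and the hand-over to KML can be sketched in a few lines of Python. This is only an illustration using the standard library, not the XQuery code we actually ran; the function names are made up. One assumption to flag: the KML reference describes coordinates as comma-separated <code class="highlighter-rouge">longitude,latitude</code> tuples, so the sketch writes them that way rather than separated by a space:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">import re
import xml.etree.ElementTree as ET


def parse_wkt_point(wkt):
    """Return (longitude, latitude) as floats from a WKT point literal."""
    numbers = re.findall(r"[-+]?\d+(?:\.\d+)?", wkt)
    if len(numbers) != 2:
        raise ValueError("not a two-dimensional WKT point: " + repr(wkt))
    lon, lat = (float(n) for n in numbers)
    return lon, lat


def placemark(name, place, description, year, wkt):
    """Build a Placemark element for one birth or death event."""
    lon, lat = parse_wkt_point(wkt)
    pm = ET.Element("Placemark")
    ET.SubElement(pm, "address").text = place
    ET.SubElement(pm, "description").text = description
    ET.SubElement(pm, "name").text = name
    point = ET.SubElement(pm, "Point")
    # Assumption: comma-separated "longitude,latitude", as in the KML reference
    ET.SubElement(point, "coordinates").text = "{:.6f},{:.6f}".format(lon, lat)
    stamp = ET.SubElement(pm, "TimeStamp")
    ET.SubElement(stamp, "when").text = str(year)
    return pm


kml = ET.Element("kml")
kml.append(placemark("Goethe, Johann Wolfgang von", "Frankfurt am Main",
                     "Place of Birth; 28 August 1749", 1749,
                     "Point ( +008.684166 +050.115277 )"))
print(ET.tostring(kml, encoding="unicode"))</code></pre></figure>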
<p>Easy enough, we just had to repeat this for the other authors to fill up our KML file and we’d be all set, we thought.</p>
<h2 id="wikidata-comes-into-play">Wikidata Comes Into Play</h2>
<p>But there was a catch. We only found coordinates for about two thirds of the places. Now, instead of manually adding the missing data, we wanted to find out whether <em>Wikidata</em> offered a way out of this problem. We are keen followers of Magnus Manske’s <a href="https://twitter.com/MagnusManske">Twitter</a> and <a href="http://magnusmanske.de/wordpress/">blog</a>, and he’s making great efforts to enhance Wikidata, so our expectations were high.</p>
<p>There’s probably a more elegant way to do this, but we went in brute force, extracted the Wikipedia link from the RDF representations over at the GND, fetched the Wikipedia page, extracted the Q identifier from it and went over to the corresponding Wikidata record. Luckily, there’s a simple way to obtain the RDF representation of a single Wikidata object, <a href="https://twitter.com/umblaetterer/status/656836110107213824">something that Magnus helped us find out via Twitter</a> (thanks again!).</p>
<p>Once we could directly examine the XML/RDF representation it was dead easy to get hold of all the geographical coordinates. We put the two resulting KML files on our GitHub:</p>
<ul>
<li>https://dlina.github.io/data/geobrowser/lina-birth.kml</li>
<li>https://dlina.github.io/data/geobrowser/lina-death.kml</li>
</ul>
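<p>For readers who prefer Python, the last hop (from a Wikidata item to the coordinates of a birthplace) might look roughly like the sketch below. Mind the assumptions: we worked with the RDF/XML representation in XQuery, whereas the sketch uses the JSON representation served by <code class="highlighter-rouge">Special:EntityData</code> for brevity; the function names and the User-Agent string are made up. P19, P20 and P625 are the Wikidata properties for place of birth, place of death and coordinate location:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">import json
from urllib.request import Request, urlopen

ENTITY_DATA = "https://www.wikidata.org/wiki/Special:EntityData/{}.json"


def fetch_entity(qid):
    """Download the JSON record of one Wikidata item (needs network access)."""
    request = Request(ENTITY_DATA.format(qid),
                      headers={"User-Agent": "dlina-sketch/0.1 (example)"})
    with urlopen(request) as response:
        return json.load(response)["entities"][qid]


def linked_item(entity, prop):
    """Return the Q identifier a property such as P19 points to, or None."""
    for claim in entity.get("claims", {}).get(prop, []):
        snak = claim["mainsnak"]
        if snak.get("snaktype") == "value":
            return snak["datavalue"]["value"]["id"]
    return None


def coordinates(entity):
    """Return (longitude, latitude) from P625, or None if there is no value."""
    for claim in entity.get("claims", {}).get("P625", []):
        snak = claim["mainsnak"]
        if snak.get("snaktype") == "value":
            value = snak["datavalue"]["value"]
            return value["longitude"], value["latitude"]
    return None


# Usage (commented out because it goes over the network):
# author = fetch_entity("Q5879")  # to our knowledge, Goethe's item
# birthplace = fetch_entity(linked_item(author, "P19"))
# print(coordinates(birthplace))</code></pre></figure>
<p>Places without a P625 statement simply yield <code class="highlighter-rouge">None</code>, which makes it easy to list the records that still need manual attention.</p>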
<h2 id="pushing-our-data-into-the-geobrowser">Pushing Our Data Into the GeoBrowser</h2>
<p>Now we could finally feed the files into the GeoBrowser, our spatio-temporal visualisation playground of choice (after years in beta, <a href="http://dhd-blog.org/?p=5705">it finally went 1.0 just this month</a>). GeoBrowser supports both CSV and KML files. There is a pretty nice datasheet editor with autofill of coordinates based on the <a href="https://en.wikipedia.org/wiki/Getty_Thesaurus_of_Geographic_Names">Getty Thesaurus of Geographic Names</a> for those who want to copy/paste lists of place names. You can also spice up your KML files with HTML elements and link back to your edition or to wherever you like. And btw, if you want to feed the GeoBrowser directly from your own server, just <a href="https://wiki.de.dariah.eu/display/publicde/Geo-Browser+FAQ#Geo-BrowserFAQ-WarumkannichunterLoadDatakeineKML,KMZundCSV-Dateien%C3%BCberKML/KMZ/CSVFileURLeinbinden?">ask the developers</a> to add your domain to the whitelist.</p>
<p>You already viewed the result, and thus the story of the birth and death of (some) German playwrights in the 18th, 19th and 20th centuries, in the iframe above.</p>
<h1 id="analysis">Analysis</h1>
<p>As with most visualisations in the Humanities, this one needs a bit of explanation. First off, orange circles indicate places of birth, purple circles indicate places of death. As background map we chose the 1880 one. Bearing in mind that our corpus covers texts from ca. 1730 to 1930, you can also change the layout to a 1783, 1815, 1914 or 1920 map up in the GeoBrowser interface.</p>
<p>Now what is it we can see there? Feel free to zoom in and out as you please. One first impression is that our corpus is pretty well-balanced since there is no regional bias, i.e., no over-representation of authors from specific regions (like, no emphasis on Hessian, or Swabian, or Saxon, or East Prussian writers, etc., plus we’ve got a fair handful of Swiss and Austrian writers, too).</p>
<p>The biggest bubbles surround Berlin (11 births, 15 deaths) and Vienna (13 births, 20 deaths), the two metropolises of the Holy Roman Empire (and later the German and Austro-Hungarian Empires). But again, the two do not dominate the whole picture. So the well-balancedness is something we can state, even if we know that birth and death places are just basic metadata that say nothing about where the authors spent most of their lives.</p>
<h2 id="some-geospatial-peculiarities">Some Geospatial Peculiarities</h2>
<p>Let’s take a look at <strong>geospatial extremities</strong> here. Of course, we cannot say anything about German-language literature <em>in general</em>, just about the 178 authors whose works are contained in our corpus of 465 German dramas. The outermost places are, clockwise:</p>
<table>
<thead>
<tr>
<th>Direction</th>
<th>Playwright</th>
<th>Place</th>
</tr>
</thead>
<tbody>
<tr>
<td>N</td>
<td>Henry von Heiseler</td>
<td>born 1875 in St. Petersburg</td>
</tr>
<tr>
<td>E</td>
<td>J.M.R. Lenz</td>
<td>died 1792 in Moscow</td>
</tr>
<tr>
<td>S</td>
<td>Ernst von Wildenbruch</td>
<td>born 1845 in Beirut</td>
</tr>
<tr>
<td>W</td>
<td>Christlob Mylius</td>
<td>died 1752 in London</td>
</tr>
</tbody>
</table>
<p>Lenz and Mylius surely add behavioural and artistic extremism to their geographical one (btw, there are some nice passages on Mylius in Hugh Barr Nisbet’s 2008 biography of Lessing, <a href="https://books.google.de/books?id=hcyc5ZA5KQYC&pg=PA51">start reading here, pp. 51 ff.</a>). Oh, and let’s not forget Heinrich Heine, the westward runner-up, who died in Paris in 1856.</p>
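<p>Finding such extremes by eye works for a small map, but it is just as easy to compute. A minimal Python sketch, assuming a dictionary of place names and coordinates; the values below are rounded by hand for illustration and are not taken from our KML files:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"># Approximate (latitude, longitude) pairs, rounded for illustration only
places = {
    "St. Petersburg": (59.9, 30.3),
    "Moscow": (55.8, 37.6),
    "Beirut": (33.9, 35.5),
    "London": (51.5, -0.1),
    "Paris": (48.9, 2.4),
    "Berlin": (52.5, 13.4),
    "Vienna": (48.2, 16.4),
}


def extremes(coords):
    """Return the northern-, eastern-, southern- and westernmost place names."""
    def latitude(name):
        return coords[name][0]

    def longitude(name):
        return coords[name][1]

    return {
        "N": max(coords, key=latitude),
        "E": max(coords, key=longitude),
        "S": min(coords, key=latitude),
        "W": min(coords, key=longitude),
    }


print(extremes(places))</code></pre></figure>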
<p>Another thing you can see in the visualisation is that some German-language authors preferred to die in Italy:</p>
<table>
<thead>
<tr>
<th>Author</th>
<th>Time and place</th>
</tr>
</thead>
<tbody>
<tr>
<td>Maler Müller</td>
<td>1825 in Rome</td>
</tr>
<tr>
<td>August von Platen</td>
<td>1835 in Syracuse, Sicily</td>
</tr>
<tr>
<td>Friedrich Wilhelm Riese</td>
<td>1879 in Naples</td>
</tr>
<tr>
<td>Richard Wagner</td>
<td>1883 in Venice</td>
</tr>
<tr>
<td>Otto Erich Hartleben</td>
<td>1905 in Salò</td>
</tr>
</tbody>
</table>
<h1 id="some-more-notes-on-the-balancedness-of-our-corpus">Some More Notes on the Balancedness of Our Corpus</h1>
<p>In addition to the regional well-balancedness of the corpus, there is also a temporal one, if we might say so. Have a look at the time-bar diagram right underneath the map (you can use the pull-down menus to change the scale). The first author appearing on the time bar, born in 1697, is Caroline Neuber. The first one to die is Johann Elias Schlegel, in 1749. Our youngest author is Hans Kaltneker, born in 1895. The last one to die is Johannes Schlaf, in 1941. The reason he is the most recent author is copyright, of course (German copyright expires 70 years after the author’s death).</p>
<h1 id="obstacles">Obstacles</h1>
<p>Some of the minor issues we encountered on our way were the usual amounts of strange (unrelatable) values and nonexistent data, like missing Wikipedia entries or missing properties on Wikidata (there were not many and we fixed them as we went along, i.e., two playwrights finally got their Wikipedia article, and Wikidata was filled with some new properties).</p>
<p>While building our bridge from the GND entries to the corresponding Wikipedia articles, we found a suitable relation in the RDF file – good. Yet it turned out that not every RDF file contains something like</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><foaf:page</span> <span class="na">rdf:resource=</span><span class="s">"http://de.wikipedia.org/wiki/Johann_Wolfgang_von_Goethe"</span><span class="nt">/></span></code></pre></figure>
<p>Instead, the HTML presentation of the data contains a link to Wikipedia, automatically generated with the help of a <a href="https://de.wikipedia.org/wiki/Wikipedia:BEACON">BEACON file</a>. So we had to parse the entire webpage. If we had encountered an XHTML page we could have made use of the <code class="highlighter-rouge">doc()</code> function. Alas, the German National Library uses redirects (not supported by the <code class="highlighter-rouge">doc()</code> function) rather than URL rewriting (supported by the <code class="highlighter-rouge">doc()</code> function), so we had to let the EXPath HTTP client grab the page.</p>
<p>The case of <a href="https://de.wikipedia.org/wiki/Karl_Haffner_(Dramatiker)">Karl Haffner</a> was a tad more complicated. The <a href="http://d-nb.info/gnd/120430231/about/lds">RDF file</a> did contain a link to Wikipedia, but it nowadays leads to a disambiguation page where we obviously wouldn’t find the corresponding Wikidata object. So we had to add an exception (just this one) to our crawler.</p>
<p>One last thing: in our initial data set we found an author who died in 1952, undercutting the 70-year copyright rule. A very early adopter in terms of open-source publishing, we thought. 😉 But the Wilhelm Schäfer <a href="http://d-nb.info/gnd/118794868">(pnd:118794868)</a> referenced in our source was not the author who should be referenced for writing <em><a href="https://www.textgridrep.de/browse.html?id=textgrid:trwv.0">Faustine, der weibliche Faust</a></em>. So we <a href="https://github.com/dlina/project/commit/da414793101e9187d53cb4b04feb57062adb7121">corrected the data</a> and pointed to the real Wilhelm Schaefer <a href="http://d-nb.info/gnd/117099309">(pnd:117099309)</a> instead. The same happened with one of Arno Schmidt’s favourite authors, Friedrich de la Motte Fouqué, who was mistaken for his grandson (<a href="https://github.com/dlina/project/commit/304c26aa30cdbad0aa0f71def6a11be814ea079e">correcting commit here</a>). When we started, we took over the wrong PNDs from the TextGrid Repository, and things can go wrong any time, sure, especially when you (have to) apply automated tagging. In this case, we only found two wrong identifiers, but just imagine a slightly bigger project where you cannot double-check everything anymore: a wee bit of a nightmare for LOD.</p>
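<p>A plausibility check of this kind is easy to automate. The sketch below is a simplified illustration in Python, assuming the rule of thumb that works stay protected until the end of the 70th year after the author’s death; the function name is made up, and the death years are the ones mentioned in our posts:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">def flag_suspicious(death_years, reference_year, term=70):
    """Return authors whose works would still be protected in reference_year."""
    return sorted(
        name for name, died in death_years.items()
        if died + term >= reference_year
    )


death_years = {
    "Schlaf, Johannes": 1941,
    "Kaltneker, Hans": 1919,
    "Schäfer, Wilhelm (pnd:118794868)": 1952,
}

# Anyone flagged here should not be in a public-domain corpus,
# so the identifier deserves a second look.
print(flag_suspicious(death_years, reference_year=2015))</code></pre></figure>
<p>Such a check only catches wrong identifiers that happen to violate the date range, of course; a mix-up between two authors who both died long ago would pass unnoticed.</p>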
<h1 id="conclusion">Conclusion</h1>
<p>So what did we achieve here? Nothing much, really. This is just one possible response to the imperative: <strong>“Know your data!”</strong> By automatically visualising the birth and death places of the playwrights whose works make up our corpus of dramatic texts, we added a useful layer of description. And this will help us to classify any new results that our research on the corpus might yield in the future.</p>
<p>(End of transmission.)</p>
<p><a href="https://dlina.github.io/The-Birth-and-Death-of-German-Playwrights/">The Birth and Death of German Playwrights</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on October 22, 2015.</p>
<p>https://dlina.github.io/dramavis, 2015-08-06, Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke</p>
<p>Some of you will have seen our distant-reading showcase poster, this one (<a href="http://dx.doi.org/10.6084/m9.figshare.1461761">hi-res version on figshare</a>):</p>
<figure>
<img src="https://dlina.github.io/presentations/2015-sydney/sydney-images/dlina-corpus-465-cleaned-drama-networks-superposter-900px.jpg" alt="At a glance: Character networks of 465 German-language dramas from 1731 to 1929." style="width:56.25rem; border-style:solid; border-color:#222222; border-width:1px;" />
</figure>
<p>These are the character networks of 465 German-language dramas from 1731 (upper left corner) to 1929 (bottom right) at a glance. You can see how networks change over time, with the first network explosions occurring in Klopstock’s “Hermanns Schlacht” (1769) and Goethe’s “Götz von Berlichingen” (1773): second row, fifth and second from the right.</p>
<p>The network of Klopstock’s piece can be studied in detail <a href="https://github.com/lehkost/dramavis/blob/master/output_(465_cleaned_graphs_from_sydney_corpus)/1769-Klopstock_Friedrich_Gottlieb-Hermanns_Schlacht-speakers.png">here</a>, the Goethe one <a href="https://github.com/lehkost/dramavis/blob/master/output_(465_cleaned_graphs_from_sydney_corpus)/1773-Goethe_Johann_Wolfgang_von-Götz_von_Berlichingen_mit_der_eisernen_Hand-speakers.png">here</a>. All 465 network graphs can be accessed <a href="https://github.com/lehkost/dramavis/tree/master/output_(465_cleaned_graphs_from_sydney_corpus)">in a folder on GitHub</a>.</p>
<h2 id="character-centric-data">Character-Centric Data</h2>
<p>Visualisations are nice, especially when they set you up with the ability to kind of <em>read</em> a large number of literary texts from a distance. But there are two other things we did with our data. We used network values like <strong>size</strong>, <strong>min</strong>/<strong>avg</strong>/<strong>max degree</strong>, <strong>density</strong> and <strong>avg path length</strong> to make assumptions about literary evolution over time, as described in our last posting (<a href="/Network-Values-by-Genre/">“Comedy vs. Tragedy: Network Values by Genre”</a>).</p>
<p>But we also calculated character-centric data, play by play, to make assumptions about single characters and their position in a network. We haven’t written anything about the character-centric data yet, but the data is all there (and will probably overwhelm you at first sight), <a href="http://htmlpreview.github.io/?https://raw.githubusercontent.com/lehkost/dramavis/master/output_(465_cleaned_graphs_from_sydney_corpus)/drama_character_values.html">in a single HTML document</a>.</p>
<p>For each character of a play, you will find the following values in the tables: <strong>degree</strong>, <strong>betweenness centrality</strong>, <strong>average distance</strong>, <strong>closeness centrality</strong>. Let me give you a small example of how to make all this data talk. In the second pamphlet of the <a href="http://litlab.stanford.edu/pamphlets/">Stanford Literary Lab series</a>, Franco Moretti takes a look at the average of the distance of a character to each of the other characters, suggesting that the one with the lowest score would be the protagonist of a play (cf. <a href="http://litlab.stanford.edu/LiteraryLabPamphlet2.pdf">“Network Theory, Plot Analysis”</a>).</p>
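<p>To make the average-distance idea concrete, here is a small self-contained Python sketch. It uses a plain breadth-first search instead of the igraph library and a made-up toy network of five characters, so treat it as an illustration of the measure rather than as the code behind our tables:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">from collections import deque

# Hypothetical toy network: who shares a scene with whom
network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}


def distances_from(graph, start):
    """Breadth-first search: shortest path length to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def average_distance(graph, node):
    """Mean shortest-path distance from node to all other reachable nodes."""
    dist = distances_from(graph, node)
    others = [d for target, d in dist.items() if target != node]
    return sum(others) / len(others)


scores = {name: average_distance(network, name) for name in network}
protagonist = min(scores, key=scores.get)
print(protagonist, scores[protagonist])</code></pre></figure>
<p>In this toy example, character A reaches everybody else in 1.25 steps on average and would therefore count as the protagonist in Moretti’s sense. Note that the sketch assumes a connected network with at least two characters.</p>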
<p>Another promising way to find the most important/most central person in a network of people is the betweenness-centrality score. To quote <a href="https://en.wikipedia.org/wiki/Betweenness_centrality">Wikipedia</a>: “A node with high betweenness centrality has a large influence on the transfer of items through the network, under the assumption that item transfer follows the shortest paths.” We still have to discuss in what way this could apply to character networks of dramatic texts (“items”, in this case, could be <em>information</em> passed on from character to character), but let’s assume for a minute that the betweenness-centrality score does correlate with the importance of a character. Then <em>the numbers</em> would tell us in an instant that, e.g., Emilia Galotti is not the most central character in Lessing’s play that bears her name. We knew this already, of course, but with this method we can easily generate a long list of plays whose title characters are not the most central ones, without having to actually read or reread any of the plays of our corpus. “Just think of this,” says Moretti, “I am discussing <em>Hamlet</em>, and saying nothing about Shakespeare’s words.” In fact, there <em>will be</em> some surprises if you look at <a href="http://htmlpreview.github.io/?https://raw.githubusercontent.com/lehkost/dramavis/master/output_(465_cleaned_graphs_from_sydney_corpus)/drama_character_values.html">our numbers</a>. Such as this one: Lessing’s eponymous Nathan the Wise only ranks second after the sultan, Saladin.</p>
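<p>For the curious, betweenness centrality can also be computed without any library. The following Python sketch implements Brandes’ algorithm for unweighted, undirected networks and returns unnormalised scores; it runs on a hypothetical five-character network and is not the igraph-based code that produced our numbers:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">from collections import deque

# Hypothetical toy network: who shares a scene with whom
network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}


def betweenness(graph):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalised)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        order = []                          # nodes in order of discovery
        parents = {v: [] for v in graph}    # predecessors on shortest paths
        paths = dict.fromkeys(graph, 0)     # number of shortest paths
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    parents[w].append(v)
        # Walk back from the farthest nodes and accumulate dependencies
        dependency = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in parents[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Every pair was counted from both ends, so halve the totals
    return {v: s / 2 for v, s in score.items()}


centrality = betweenness(network)
print(max(centrality, key=centrality.get))</code></pre></figure>
<p>Here A lies on the shortest paths of four pairs of other characters and D on those of three, while B, C and E lie on none.</p>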
<p>Certainly, these are only very simple examples of how to leverage all the data we calculated. Working on this kind of model rather than on the actual text can bring a whole set of new results and draw our attention to aspects that have gone unnoticed so far. Just look at the centrality scores of Schiller’s first play “Die Räuber”, which are in sharp disagreement with traditional research and our own intuition when reading the play. No doubt about it, there will have to be a lot of further research on these things.</p>
<h2 id="how-dramavis-works">How <em>dramavis</em> Works</h2>
<p>But now on to the main thing: the foremost reason for this post is to introduce you to the tool we developed for the purposes described. It is a Python script called <strong>dramavis</strong> and was written by <strong><a href="https://github.com/chreman">Christopher Kittel</a></strong> and me. You can find it on my GitHub account (<a href="https://github.com/lehkost/dramavis">https://github.com/lehkost/dramavis</a>). Feel free to use it for your own purposes. To facilitate that a little, here is how “dramavis.py” works:</p>
<ol>
<li>The script reads character networks of dramatic pieces from CSV files,</li>
<li>plots these networks into PNG graphs (using the igraph library and Fruchterman–Reingold as layout, things you can change in the code, of course),</li>
<li>writes drama network values to a CSV file,</li>
<li>writes drama character values to an HTML file (using the Django template language).</li>
</ol>
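<p>Steps 1 and 3 can be illustrated with a few lines of standard-library Python. This is a sketch under stated assumptions, not an excerpt from dramavis: the two-column <code class="highlighter-rouge">source,target</code> layout of the CSV is hypothetical (the real input files may be laid out differently), and the function names are made up:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">import csv
import io

# Hypothetical input: one co-occurrence of two characters per line
SAMPLE = """source,target
A,B
A,C
A,D
B,C
D,E
"""


def read_network(handle):
    """Build an undirected adjacency mapping from a source,target CSV."""
    graph = {}
    for row in csv.DictReader(handle):
        a, b = row["source"], row["target"]
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def network_values(graph):
    """Size, degree statistics and density of an undirected network."""
    n = len(graph)
    degrees = [len(neighbours) for neighbours in graph.values()]
    edges = sum(degrees) // 2
    return {
        "size": n,
        "min degree": min(degrees),
        "avg degree": sum(degrees) / n,
        "max degree": max(degrees),
        # Share of realised edges among all possible ones
        "density": 2 * edges / (n * (n - 1)) if n > 1 else 0.0,
    }


values = network_values(read_network(io.StringIO(SAMPLE)))
print(values)</code></pre></figure>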
<p>There are input/output directories <a href="https://github.com/lehkost/dramavis">on GitHub</a>, so if you clone the whole shebang to your hard drive and have all the necessary libraries installed, it should work out of the box and you can start adapting it to work with your own data.</p>
<p>As for a little history, the first version of the script was written in August 2014, during the <a href="http://www.gcdh.de/en/teaching/2014-dariah-international-dh-summer-school">DARIAH International Digital Humanities Summer School</a> in Göttingen. We were basically toying around with the networkx and igraph libraries and fed them with some literary network data. We showed some first results at workshops in <a href="http://www.germanistik.uni-wuerzburg.de/lehrstuehle/computerphilologie/aktuelles/veranstaltungen/auftaktworkshop_gattungsstilistik/">Würzburg</a> and <a href="/Conference_in_Munich/">Munich</a> and at DH conferences in <a href="/DHd-2015-Conference-in-Graz/">Graz</a> and <a href="/Our-Talk-at-DH2015/">Sydney</a> where some people were asking for the code. We didn’t wanna put it on GitHub until we revised the somewhat chaotic script (ha!), and that’s what we did at a spontaneous 2-day hackathon at the Göttingen Centre for Digital Humanities, at <a href="http://www.uni-goettingen.de/de/125323.html">Heyne-Haus</a>, in June 2015.</p>
<p>Depending on your machine, it can take up to five minutes or so to process the 465 standard input files and generate all the different outputs. So that you know the script is still running, we included <a href="https://twitter.com/umblaetterer/status/608349018113101824">a simple progress bar</a>. We want to add other things in the future (input formats other than CSV would be nice, for example), so if you have any suggestions, please bring them forward.</p>
<h2 id="other-approaches-to-visualise-literary-network-data">Other Approaches to Visualise Literary Network Data</h2>
<p>This Python-based approach runs parallel to another approach based on D3.js leveraging <a href="/Introducing-Our-Zwischenformat/">our intermediary XML format</a> to generate different kinds of outputs (as demonstrated <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/5/1">on this slide</a> from our <a href="/Our-Talk-at-DH2015/">talk at the DH2015 in Sydney</a>). You can have a look at all the data generated via this approach at <a href="https://dlina.github.io/linas">dlina.github.io/linas</a>. There are still some bugs we have to fix, but feel free to toy around a bit. This small collection of dynamic visualisations already has traits of a toolbox for the structural analysis of dramatic texts. Either way, that’s where we’re headed.</p>
<p>Nothing more to say today. <em>Happy distant reading!</em></p>
<p><a href="https://dlina.github.io/dramavis/">dramavis: A Tool for Visualising and Calculating Literary Network Data</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on August 06, 2015.</p>
<p>https://dlina.github.io/Network-Values-by-Genre, 2015-07-31, Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke</p>
<p>As described in <a href="/Introducing-Our-Zwischenformat/">a previous post</a>, our DLINA intermediary format stores structural data extracted from the full-text TEI files of the TextGrid Repository as well as various metadata, including the author’s name and date of origin of a play (and its publication and/or premiere date). In addition, the DLINA format also stores specific title information, three in total: the main title of a play, its subtitle (if available) and a genre title (only if a genre can be derived from the official subtitle of a play). To give an example, the first piece of our <a href="/Introducing-DLINA-Corpus-15-07-Codename-Sydney/">Sydney corpus</a>, Gottsched’s “Der sterbende Cato” from 1731, looks something like this:</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><header></span>
<span class="nt"><title></span>Der sterbende Cato<span class="nt"></title></span>
<span class="nt"><subtitle></span>Ein Trauerspiel<span class="nt"></subtitle></span>
<span class="nt"><genretitle></span>Trauerspiel<span class="nt"></genretitle></span>
[...]
<span class="nt"></header></span></code></pre></figure>
<p>As said before, we only inserted a <code class="highlighter-rouge"><genretitle></code> if the subtitle of a play contained a definite and largely conventional genre indication. Terms like “dramatische Skizze” (dramatic sketch) or an unspecified indication like “Drama” we did not regard as conventional genres, in the same way as we neglected unconventionally specified genres like “Ein Ammenmärchen in vier Akten” (An Old Wives’ Tale in Four Acts) or “Arabische Fantasia in zwei Akten” (Arabian Fantasy in Two Acts).</p>
<p>The resulting set of genre titles included the classic genres “Tragödie” and “Komödie”, or, “Trauerspiel” and “Lustspiel”, but also the general “Schauspiel”, “Posse” or “Oper”. These genre titles help us to better describe our corpus. We can now state that of 465 dramas in our Sydney corpus,</p>
<ul>
<li>101 are marked as tragedy (“Tragödie” or “Trauerspiel”) and</li>
<li>92 are marked as comedy (“Komödie” or “Lustspiel”).</li>
</ul>
<p>In addition, we were interested in how many of the texts were combined with music of any sort (i.e., “Opern”, “Operetten”, “Singspiele”, “Musikdramen”, etc.). For reasons of simplicity, we marked these texts as “Libretti”. Not all of these texts bear a corresponding genre indication in their subtitle. Wagner’s “Master-Singers of Nuremberg”, for example, doesn’t feature a subtitle we could directly use as <code class="highlighter-rouge"><genretitle></code>. In these cases we did a little research to identify all libretti. The result is that</p>
<ul>
<li>56 texts from our Sydney corpus are marked as “Libretti”.</li>
</ul>
<p>With this kind of metadata, we could now easily build generic subcorpora and have a differentiated, genre-specific look into our network data. The corresponding median values and averages look like this:</p>
<h3 id="table-1-network-measures-by-genre">Table 1: Network Measures, by Genre</h3>
<table>
<thead>
<tr>
<th>Subcorpus</th>
<th>N</th>
<th>Number of Characters (Median)</th>
<th>Max Degree (Median)</th>
<th>Average Degree (Average)</th>
<th>Density (Average)</th>
<th>Average Path Length (Average)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Corpus</td>
<td>465</td>
<td>16</td>
<td>13</td>
<td>9.01</td>
<td>0.59</td>
<td>1.46</td>
</tr>
<tr>
<td>Tragedy</td>
<td>101</td>
<td>19</td>
<td>16</td>
<td>9.57</td>
<td>0.52</td>
<td>1.56</td>
</tr>
<tr>
<td>Comedy</td>
<td>92</td>
<td>14</td>
<td>11</td>
<td>8.61</td>
<td>0.67</td>
<td>1.36</td>
</tr>
<tr>
<td>Libretto</td>
<td>56</td>
<td>16</td>
<td>13.5</td>
<td>9.09</td>
<td>0.64</td>
<td>1.39</td>
</tr>
<tr>
<td>Other</td>
<td>216</td>
<td>17</td>
<td>14</td>
<td>8.88</td>
<td>0.59</td>
<td>1.48</td>
</tr>
</tbody>
</table>
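<p>In case you want to build such a table for your own subcorpora: the aggregation boils down to grouping the plays by genre and taking medians and averages. A minimal Python sketch, assuming one record per play; the five records below are invented for illustration and are not values from our corpus:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">from statistics import mean, median

# Hypothetical records; the real values come from the 465 plays of the corpus
plays = [
    {"genre": "Tragedy", "characters": 19, "density": 0.50},
    {"genre": "Tragedy", "characters": 23, "density": 0.46},
    {"genre": "Comedy", "characters": 12, "density": 0.70},
    {"genre": "Comedy", "characters": 14, "density": 0.66},
    {"genre": "Comedy", "characters": 16, "density": 0.62},
]


def summarise(records):
    """Median network size and mean density for each genre."""
    by_genre = {}
    for record in records:
        by_genre.setdefault(record["genre"], []).append(record)
    return {
        genre: {
            "n": len(group),
            "median characters": median(r["characters"] for r in group),
            "average density": round(mean(r["density"] for r in group), 2),
        }
        for genre, group in by_genre.items()
    }


summary = summarise(plays)
for genre, row in summary.items():
    print(genre, row)</code></pre></figure>
<p>We report medians for network size and max degree because a handful of very large plays would otherwise pull the averages upwards.</p>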
<p>Let’s feed this data into some diagrams:</p>
<h3 id="fig-1-network-size-median-by-genre">Fig. 1: Network Size (Median), by Genre</h3>
<div class="blog-chart" id="barchart1"></div>
<script>
d3.tsv("/data/lit-genre-figs/lit-genre-fig01.tsv", type, function(error, data) {
  var margin = {top: 20, right: 20, bottom: 200, left: 60};

  // Scale the chart width with the number of bars
  var sum = data.length;
  var base = 8; // linechart: base = 10
  var width;
  if (sum < 7) { width = base * 12 * sum - margin.left - margin.right; }
  else if (sum < 25) { width = base * 3.5 * sum - margin.left - margin.right; }
  else if (sum < 50) { width = base * 1.7 * sum - margin.left - margin.right; }
  else { width = base * sum - margin.left - margin.right; }

  // Chart height as a percentage (here 80) of 800px
  var includeHeight = 80;
  var height = (800 * includeHeight / 100) - margin.top - margin.bottom;

  // Scales and axes
  var x = d3.scale.ordinal()
      .rangeRoundBands([0, width], .1);
  var y = d3.scale.linear()
      .range([height, 0]);
  var xAxis = d3.svg.axis()
      .scale(x)
      .orient("bottom");
  var yAxis = d3.svg.axis()
      .scale(y)
      .orient("left")
      .ticks(10);

  var svg = d3.select("#barchart1").append("svg")
      .attr("width", width + margin.left + margin.right)
      .attr("height", height + margin.top + margin.bottom)
    .append("g")
      .attr("transform", "translate(" + margin.left + "," + margin.top + ")");

  x.domain(data.map(function(d) { return d.label; }));
  y.domain([0, d3.max(data, function(d) { return d.value; })]);

  // x axis with rotated tick labels
  svg.append("g")
      .attr("class", "x axis")
      .attr("transform", "translate(0," + height + ")")
      .call(xAxis)
    .selectAll("text")
      .style("text-anchor", "end")
      .attr("dx", "-.8em")
      .attr("dy", ".15em")
      .attr("transform", "rotate(-65)");

  // y axis with its label
  svg.append("g")
      .attr("class", "y axis")
      .call(yAxis)
    .append("text")
      .attr("transform", "rotate(-90)")
      .attr("y", 6)
      .attr("dy", ".71em")
      .style("text-anchor", "end")
      .text("Frequency");

  // One bar per row of the TSV file
  svg.selectAll(".bar")
      .data(data)
    .enter().append("rect")
      .attr("class", "bar")
      .attr("x", function(d) { return x(d.label); })
      .attr("width", x.rangeBand())
      .attr("y", function(d) { return y(d.value); })
      .attr("height", function(d) { return height - y(d.value); });
});

// Coerce the value column from string to number while loading
function type(d) {
  d.value = +d.value;
  return d;
}
</script>
<p>We can see that tragedy peaks while comedy troughs: tragedies have the largest networks, comedies the smallest. The same contrast shows up, inverted, in other values such as the network density, where comedy values peak and tragedy values trough:</p>
<h3 id="fig-2-density-mean-by-genre">Fig. 2: Density (Mean), by Genre</h3>
<div class="blog-chart" id="barchart2"></div>
<script>
d3.tsv("/data/lit-genre-figs/lit-genre-fig02.tsv", type, function(error, data) {
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 8; //linechart: base = 10
if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "80" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("80") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart2").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
y.domain([0, d3.max(data, function(d) { return d.value; })]);
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style("text-anchor", "end")
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style("text-anchor", "end")
.text("Density");
svg.selectAll(".bar")
.data(data)
.enter().append("rect")
.attr("class", "bar")
.attr("x", function(d) { return x(d.label); })
.attr("width", x.rangeBand())
.attr("y", function(d) { return y(d.value); })
.attr("height", function(d) { return height - y(d.value); });
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<p>Results for the other values are similar, suggesting a connection between the size of a network and the other values. But we still need to examine this connection further, as it isn’t as simple as it looks. It can be assumed that typical genre conventions also have a strong influence on the values: in tragedies, for instance, we often have two (or more) opposing groups of people who don’t share the stage too often, and if they do, it is mainly in the form of single representatives. Comedies, on the other hand, tend to bring as many characters as possible back on stage together at the end, typically for the purpose of a wedding (or even multiple weddings). Just take George Bernard Shaw, who argued that comedies were <a href="http://www.gutenberg.org/ebooks/13084">“plays in which everyone was married in the last act”</a>. These genre conventions have a crucial influence on, e.g., the density values (many characters on stage at the same time make for higher density values, whereas density decreases if characters from two antagonistic parties hardly ever meet).</p>
<p>[Edit, 3 June 2018 – Nils and Marcus pointed us to this nice quote: “Somebody always has to die onstage, die or marry; that’s the only difference between a comedy and a tragedy as far as the world knows.” – from <a href="https://books.google.com/books?id=csL78WNZbA0C&pg=PA20">Mary di Michele, <em>Tenor of Love</em>, 2005, p. 20</a>]</p>
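<p>The effect of these genre conventions on density can be illustrated with a toy example. The sketch below assumes the usual definition of density for undirected graphs (realised edges divided by possible edges) and links two characters whenever they share a scene. The two miniature “plays” are invented for illustration, and the function name <code class="highlighter-rouge">networkDensity</code> is ours, not part of any library.</p>

```javascript
// Density of an undirected co-presence network:
// realised edges / possible edges, where possible = n * (n - 1) / 2.
// Each scene is a list of the characters who are on stage together.
function networkDensity(scenes) {
  var characters = {};
  var edges = {};
  scenes.forEach(function (scene) {
    scene.forEach(function (a) {
      characters[a] = true;
      scene.forEach(function (b) {
        // a < b stores each pair once under a canonical key
        if (a < b) { edges[a + "|" + b] = true; }
      });
    });
  });
  var n = Object.keys(characters).length;
  if (n < 2) { return 0; }
  return Object.keys(edges).length / (n * (n - 1) / 2);
}

// Invented "tragedy": two parties that only meet through single representatives.
var toyTragedy = [
  ["A", "B", "C"],
  ["X", "Y", "Z"],
  ["A", "X"]
];

// Invented "comedy": separate couples, then everyone on stage for the finale.
var toyComedy = [
  ["A", "B"],
  ["C", "D"],
  ["E", "F"],
  ["A", "B", "C", "D", "E", "F"]
];

console.log(networkDensity(toyTragedy));
console.log(networkDensity(toyComedy));
```

<p>Both toy plays have six characters, but the finale of the toy comedy connects everyone with everyone, whereas the toy tragedy realises only 7 of 15 possible edges.</p>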
<p>Regarding the density values, Figure 2 suggests a proximity between comedy and libretto. This is confirmed for network size if we consider the mean rather than the median values:</p>
<h3 id="fig-3-network-size-mean-by-genre">Fig. 3: Network Size (Mean), by Genre</h3>
<div class="blog-chart" id="barchart3"></div>
<script>
d3.tsv("/data/lit-genre-figs/lit-genre-fig03.tsv", type, function(error, data) {
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 8; //linechart: base = 10
if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "80" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("80") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart3").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
y.domain([0, d3.max(data, function(d) { return d.value; })]);
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style("text-anchor", "end")
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style("text-anchor", "end")
.text("Frequency");
svg.selectAll(".bar")
.data(data)
.enter().append("rect")
.attr("class", "bar")
.attr("x", function(d) { return x(d.label); })
.attr("width", x.rangeBand())
.attr("y", function(d) { return y(d.value); })
.attr("height", function(d) { return height - y(d.value); });
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<p>The structural similarity of comedy and libretto and their shared distance from tragedy also show up when we look at the temporal evolution over two centuries, another simple subdivision of our corpus. These are the values we calculated:</p>
<h3 id="table-2-network-size-median-by-genre-and-century">Table 2: Network Size (Median), by Genre and Century</h3>
<table>
<thead>
<tr>
<th>Genre</th>
<th>18th Century</th>
<th>19th Century</th>
<th>20th Century</th>
</tr>
</thead>
<tbody>
<tr>
<td>Tragedy</td>
<td>11,00</td>
<td>24,50</td>
<td>20,00</td>
</tr>
<tr>
<td>Comedy</td>
<td>9,00</td>
<td>16,50</td>
<td>16,00</td>
</tr>
<tr>
<td>Libretto</td>
<td>10,00</td>
<td>16,00</td>
<td>17,50</td>
</tr>
</tbody>
</table>
<p>Let’s put them into a diagram:</p>
<h3 id="fig-4-network-size-median-by-genre-and-century">Fig. 4: Network Size (Median), by Genre and Century</h3>
<div class="blog-chart-barline" id="barchart5"></div>
<script>
d3.tsv("/data/lit-genre-figs/lit-genre-fig05-transposed.tsv", function(error, data) {
// multi lines need multi colors
var color = d3.scale.ordinal().range(["#6495ed","#e51843","#475003","#9c8305","#d3c47c"]);
// multiple lines need preprocessing:
console.log("Initial Data", data);
var labelVar = 'Genre';
// collect all keys
var varNames = d3.keys(data[0]).filter(function (key) { return key !== labelVar });
console.log("varNames", varNames)
color.domain(varNames);
var seriesData = varNames.map(function (name) {
return {
name: name,
values: data.map(function (d) {
return {name: name, label: d[labelVar], value: +d[name]};
})
};
});
console.log("seriesData", seriesData);
// end preprocessing
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 10;
if (sum < 4) { var width = base * 20 * sum - margin.left - margin.right; }
else if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "70" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("70") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart5").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.Genre; }));
var minval = d3.min(seriesData, function (c) {
return d3.min(c.values, function (d) { return d.value; });
});
var maxval = d3.max(seriesData, function (c) {
return d3.max(c.values, function (d) { return d.value; });
});
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
// label the domain with density if there is no value greater than 1
if (maxval <= 1){ var ylabel = "Density"}
else {var ylabel = "Frequency" }
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style("text-anchor", "end")
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style("text-anchor", "end")
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("linear")
.tension("0.3");
var series = svg.selectAll(".series")
.data(seriesData)
.enter().append("g")
.attr("class", "series");
series.append("path")
.attr("class", "line")
.attr("d", function (d) { return line(d.values); })
.style("stroke", function (d) { return color(d.name); })
.style("stroke-width", "4px")
.style("fill", "none");
// data points
series.selectAll('circle')
.data(function (seriesData) { return seriesData.values; })
.enter().append('circle')
.attr('cx', function (d) { return x(d.label) + x.rangeBand() / 2 ; })
.attr('cy', function (d) { return y(d.value); })
.attr('r', 5)
.style("fill", function (d) { return color(d.name); });
series.selectAll(".label")
.data(function (seriesData) { return seriesData.values; })
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x( d.label ) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", function (d) { return color(d.name); })
.text( function(d) { return d.value ;} );
// draw legend
var legend = svg.selectAll(".legend")
.data(color.domain())
.enter().append("g")
.attr("class", "legend")
.attr("transform", function(d, i) { return "translate(0," + i * 20 + ")"; });
// draw legend colored rectangles
legend.append("rect")
.attr("x", width - 18)
.attr("width", 18)
.attr("height", 18)
.style("fill", color);
// draw legend text
legend.append("text")
.attr("x", width - 24)
.attr("y", 9)
.attr("dy", ".35em")
.style("text-anchor", "end")
.text(function(d) { return d });
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
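<p>The preprocessing step in the chart script above turns a “wide” table (one column per series) into one array of points per series. Here is a standalone sketch of that reshaping without D3. The row layout is an assumption modelled on Table 2; the actual TSV file behind the chart is not shown in this post and may be arranged differently.</p>

```javascript
// Rows as a TSV parser would deliver them: one object per line, all values strings.
// Assumed layout modelled on Table 2, with decimal points instead of decimal commas.
var rows = [
  {"Genre": "Tragedy", "18th Century": "11.00", "19th Century": "24.50", "20th Century": "20.00"},
  {"Genre": "Comedy", "18th Century": "9.00", "19th Century": "16.50", "20th Century": "16.00"},
  {"Genre": "Libretto", "18th Century": "10.00", "19th Century": "16.00", "20th Century": "17.50"}
];

// The column that labels the x axis; every other column becomes one line.
var labelVar = "Genre";
var varNames = Object.keys(rows[0]).filter(function (key) { return key !== labelVar; });

// One entry per series, each with one {label, value} point per row.
var seriesData = varNames.map(function (name) {
  return {
    name: name,
    values: rows.map(function (d) {
      // unary plus converts the string to a number
      return {name: name, label: d[labelVar], value: +d[name]};
    })
  };
});

console.log(JSON.stringify(seriesData, null, 2));
```

<p>Note that the unary plus only understands decimal points: a value written as <code class="highlighter-rouge">"24,50"</code> would become <code class="highlighter-rouge">NaN</code>, so the data file has to use points even where the tables in the text use commas.</p>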
<p>And now have a look at the table with the density values and the corresponding diagram:</p>
<h3 id="table-3-density-mean-by-genre-and-century">Table 3: Density (Mean), by Genre and Century</h3>
<table>
<thead>
<tr>
<th>Genre</th>
<th>18th Century</th>
<th>19th Century</th>
<th>20th Century</th>
</tr>
</thead>
<tbody>
<tr>
<td>Tragedy</td>
<td>0,56</td>
<td>0,49</td>
<td>0,58</td>
</tr>
<tr>
<td>Comedy</td>
<td>0,71</td>
<td>0,59</td>
<td>0,75</td>
</tr>
<tr>
<td>Libretto</td>
<td>0,67</td>
<td>0,60</td>
<td>0,75</td>
</tr>
</tbody>
</table>
<h3 id="fig-5-density-mean-by-genre-and-century">Fig. 5: Density (Mean), by Genre and Century</h3>
<div class="blog-chart-barline" id="barchart6"></div>
<script>
d3.tsv("/data/lit-genre-figs/lit-genre-fig06-transposed.tsv", function(error, data) {
// multi lines need multi colors
var color = d3.scale.ordinal().range(["#6495ed","#e51843","#475003","#9c8305","#d3c47c"]);
// multiple lines need preprocessing:
console.log("Initial Data", data);
var labelVar = 'Genre';
// collect all keys
var varNames = d3.keys(data[0]).filter(function (key) { return key !== labelVar });
console.log("varNames", varNames)
color.domain(varNames);
var seriesData = varNames.map(function (name) {
return {
name: name,
values: data.map(function (d) {
return {name: name, label: d[labelVar], value: +d[name]};
})
};
});
console.log("seriesData", seriesData);
// end preprocessing
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 10;
if (sum < 4) { var width = base * 20 * sum - margin.left - margin.right; }
else if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "65" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("65") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart6").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.Genre; }));
var minval = d3.min(seriesData, function (c) {
return d3.min(c.values, function (d) { return d.value; });
});
var maxval = d3.max(seriesData, function (c) {
return d3.max(c.values, function (d) { return d.value; });
});
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
// label the domain with density if there is no value greater than 1
if (maxval <= 1){ var ylabel = "Density"}
else {var ylabel = "Frequency" }
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style("text-anchor", "end")
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style("text-anchor", "end")
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("linear")
.tension("0.3");
var series = svg.selectAll(".series")
.data(seriesData)
.enter().append("g")
.attr("class", "series");
series.append("path")
.attr("class", "line")
.attr("d", function (d) { return line(d.values); })
.style("stroke", function (d) { return color(d.name); })
.style("stroke-width", "4px")
.style("fill", "none");
// data points
series.selectAll('circle')
.data(function (seriesData) { return seriesData.values; })
.enter().append('circle')
.attr('cx', function (d) { return x(d.label) + x.rangeBand() / 2 ; })
.attr('cy', function (d) { return y(d.value); })
.attr('r', 5)
.style("fill", function (d) { return color(d.name); });
series.selectAll(".label")
.data(function (seriesData) { return seriesData.values; })
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x( d.label ) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", function (d) { return color(d.name); })
.text( function(d) { return d.value ;} );
// draw legend
var legend = svg.selectAll(".legend")
.data(color.domain())
.enter().append("g")
.attr("class", "legend")
.attr("transform", function(d, i) { return "translate(0," + i * 20 + ")"; });
// draw legend colored rectangles
legend.append("rect")
.attr("x", width - 18)
.attr("width", 18)
.attr("height", 18)
.style("fill", color);
// draw legend text
legend.append("text")
.attr("x", width - 24)
.attr("y", 9)
.attr("dy", ".35em")
.style("text-anchor", "end")
.text(function(d) { return d });
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<p>So a closer look at the evolution over two centuries shows even more clearly the proximity of comedy and libretto and their persistent distance from tragedy. Let’s keep in mind that our corpus only contains texts from 1731 to 1929, so the 18th and the 20th century are only partially covered. Nevertheless, we can recognise some particularities at second glance.</p>
<p>First, it is interesting that the distances between the genres remain fairly constant for network density (Fig. 5), but not for network size (Fig. 4). In the 18th century especially, network-size differences between the three genres are not as clear as in the 19th century, whereas differences in network density are even slightly bigger than in the 19th century. This suggests that network size, i.e., the number of characters in a play, is indeed an important factor influencing all other values, but that there is no strict correlation. If there were one, the density values of the three genres would have to be very close to each other in the 18th century. This is not the case, which could indicate that the above-mentioned genre conventions are another crucial factor for all network values and shouldn’t be underestimated.</p>
<p>Second, we can observe how tragedy stands out in the 19th century. In other words: when looking at our network data, the 19th century proves to be a time of strong generic differences, at least with regard to the structural data we collected.</p>
<p>All in all, what we have presented in this post are mere indications so far. We will have to look further into our data in order to better understand the evolution of subgenres over time as well as the impact of genre conventions on network measures. We also want to build larger generic subcorpora in the future. For example, it is very tempting to analyse the structure of the corpus of bourgeois tragedies discussed in Cornelia Mönch’s dissertation <a href="https://books.google.com/books?id=mVXbspJS54kC&printsec=frontcover">“Abschrecken oder Mitleiden. Das deutsche bürgerliche Trauerspiel im 18. Jahrhundert. Versuch einer Typologie”</a> (1993). But, as they say, a lot of water will certainly flow down the river Rhine before we get there. We will continue to report in this blog. Stay tuned.</p>
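<p>The comparison of the 18th and 19th century can be checked with a little arithmetic on Tables 2 and 3. The sketch below takes the values from those tables (with decimal commas written as decimal points) and computes, per century, the spread between the highest and the lowest genre value. Using the spread as a yardstick for “distance between genres” is our simplification here, not a measure used elsewhere in this post.</p>

```javascript
// Values copied from Table 2 (median network size) and Table 3 (mean density),
// in column order: 18th, 19th, 20th century.
var centuries = ["18th", "19th", "20th"];
var medianSize = {
  Tragedy: [11.0, 24.5, 20.0],
  Comedy: [9.0, 16.5, 16.0],
  Libretto: [10.0, 16.0, 17.5]
};
var meanDensity = {
  Tragedy: [0.56, 0.49, 0.58],
  Comedy: [0.71, 0.59, 0.75],
  Libretto: [0.67, 0.60, 0.75]
};

// For each century: highest genre value minus lowest genre value,
// rounded to two decimals to hide floating-point noise.
function spreadByCentury(table) {
  return centuries.map(function (century, i) {
    var column = Object.keys(table).map(function (genre) { return table[genre][i]; });
    var spread = Math.max.apply(null, column) - Math.min.apply(null, column);
    return Math.round(spread * 100) / 100;
  });
}

console.log(spreadByCentury(medianSize));  // [ 2, 8.5, 4 ]
console.log(spreadByCentury(meanDensity)); // [ 0.15, 0.11, 0.17 ]
```

<p>The size spread is much smaller in the 18th century (2 characters) than in the 19th (8.5), while the density spread is slightly larger (0.15 vs. 0.11), which is the pattern described above.</p>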
<figure>
<img src="https://dlina.github.io/images/rheinschleife.jpg" alt="Great bow in the Rhine at Boppard. Source: Wikimedia Commons." style="width:56.25rem" />
</figure>
<center><small>Great bow in the Rhine at Boppard. Source: <a href="https://commons.wikimedia.org/wiki/File:Boppard_Rheinschleife.jpg">Wikimedia Commons</a>.</small></center>
<p><a href="https://dlina.github.io/Network-Values-by-Genre/">Comedy vs. Tragedy: Network Values by Genre</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on July 31, 2015.</p>
https://dlina.github.io/Our-Talk-at-DH2015 2015-07-13T00:00:00+02:00 2015-07-13T00:00:00-00:00 Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke https://dlina.github.io
<p><em>That’s right, we transcribed the talk we gave at DH2015 in Sydney, on 2 July 2015, entitled “Digital Network Analysis of Dramatic Texts”. Please note that our grammar might appear a bit jetlagged here and there. ;) We were the last group to speak in a very interesting network-analysis-centric session chaired by <a href="https://twitter.com/glennhroe">Glenn Roe</a>. If you take a veeery close look (hehe) at this panorama pic, you will recognise us setting up the room together with the other speakers, <a href="https://twitter.com/epyllia">Elisa Beshero-Bondar</a> and <a href="https://twitter.com/quadrismegistus">Ryan Heuser</a> (big hello there!):</em></p>
<figure>
<img src="https://dlina.github.io/images/photos/2015-07-02_10'52_setting-up_panorama_sydney.jpg" alt="Setting up the room for DH2015 talk" style="width:56.25rem" />
</figure>
<p><em>Since we used <a href="http://lab.hakim.se/reveal-js/">reveal.js</a> as our presentation framework, we can easily reference individual slides so you can follow both our transcript and the slides simultaneously. (For further reference: our original abstract can be found <a href="http://dh2015.org/abstracts/xml/FISCHER_Frank_Digital_Network_Analysis_of_Dramati/FISCHER_Frank_Digital_Network_Analysis_of_Dramatic_Text.html">here</a>.) But let’s now start with our presentation:</em></p>
<p><strong>Slide 0/0: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html">Title</a></strong></p>
<p><strong>Slide 1/0: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/1">TOC</a></strong></p>
<h2 id="1-approach">1. Approach</h2>
<p><strong>Slide 2/0: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/2">Basic Ideas</a></strong></p>
<p>The tradition of structural approaches in Literary Studies reaches back to (at least) the different flavours of European structuralism developed since the 1960s. Our project sets out to continue this tradition, but our new take on the issue is to apply automated data analysis to identify and characterise structural features of literary texts, or, more precisely, of dramatic texts. The long-term objective of our project is to gather and provide structural data which can be used, for example, to describe different compositional types of plays. What we mean by <em>compositional types</em>, or, <em>types of structural composition</em>, is best illustrated by an example.</p>
<p><strong>Slide 2/1: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/2/1">Different Styles of Structural Composition</a></strong></p>
<p>Let’s have a look at these two network graphs generated during our data analysis. The graph on the left visualises the scenic interactions between characters in Goethe’s neo-classical play from 1787, “Iphigenie auf Tauris” (<a href="https://en.wikipedia.org/wiki/Iphigenia_in_Tauris_(Goethe)">“Iphigenia in Tauris”</a>), which is influenced by Aristotelian poetics. The graph on the right side visualises the interactions in a work also written by Goethe, the historical play <a href="https://en.wikipedia.org/wiki/Götz_von_Berlichingen_(Goethe)">“Götz von Berlichingen”</a>, which, for its part, is strongly influenced by Shakespearean poetics. We cannot discuss this in detail here. But even at first glance you can clearly see the structural differences. The two works are composed in very different ways, exhibiting two very different types of structural or compositional style.</p>
<p><strong>Slide 2/2: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/2/2">The Digital Spectator</a></strong></p>
<p>In our project, we are looking into these different structural styles in the context of the history of dramatic texts. As mentioned before, we are stepping into the tradition of structuralist approaches, and we combine these approaches with methods developed in the field of Social Network Analysis – as done by several scholars since the early 21st century, by de Nooy, Rydberg-Cox or Moretti, to name but a few. We are borrowing our specific definition of structure from Social Network Analysis, meaning: The structure of a dramatic text we are analysing originates from interactions between the characters in a play. To be more precise: A relation between two characters, as we define it, exists if both characters perform a speech act in a given segment of a play, generally in a scene. So, if character A and character B are speaking in a scene, they are – following our definition – linked to each other.</p>
<p>This definition is inspired by works of Romanian mathematician <a href="https://en.wikipedia.org/wiki/Solomon_Marcus">Solomon Marcus</a>. In his intriguing study “Mathematical Poetics” from 1973, Marcus suggests a concept which one might call “the digital spectator”. This is right up our alley, because with our digitally driven method we are simulating this digital spectator who is not looking at a performance of the play on stage, but at an XML file. By means of our definition of interaction we are collecting our data; we are calculating network measures, generating graphs, running statistics and doing some other quantitative-analysis stuff.</p>
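<p>To make the definition concrete, here is a minimal sketch of how co-presence in a segment could be turned into network edges. The input shape (one array of speaker names per segment), the sample data and the function name <code class="highlighter-rouge">coPresenceEdges</code> are assumptions made for illustration only; this is not necessarily how the project's own extraction works.</p>

```javascript
// Hypothetical input: one array per segment (usually a scene), listing the
// characters who perform at least one speech act in that segment.
const segments = [
  ["A", "B"],
  ["A", "C"],
  ["B", "C", "D"],
  ["A", "B"], // a repeated encounter must not create a second edge
];

// Link every pair of characters who speak in the same segment.
function coPresenceEdges(segments) {
  const keys = new Set();
  for (const speakers of segments) {
    // Deduplicate and sort so each undirected pair gets one canonical key.
    const unique = [...new Set(speakers)].sort();
    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        keys.add(unique[i] + "\t" + unique[j]);
      }
    }
  }
  return [...keys].map((key) => key.split("\t"));
}

const edges = coPresenceEdges(segments);
console.log(edges.length); // 5 distinct pairs
```

<p>Because the edges are collected in a set, the sketch yields an unweighted network: two characters are either linked or not, however often they meet.</p>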
<p><strong>Slide 2/3: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/2/3">465 Network Graphs</a></strong></p>
<p>And we are doing this, at the moment, with a corpus of 465 German-language plays – which you can see here. (By the way: You can download this fancy poster containing the network graphs of our whole corpus of 465 texts <a href="http://dx.doi.org/10.6084/m9.figshare.1461761">from figshare</a>.) We are planning to include plays from other periods in our analysis, and we will, in the future, include plays written in other languages. But for the time being, our focus is on German plays from two centuries.</p>
<p><strong>Slide 2/4: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/2/4">Workflow</a></strong></p>
<p>Let’s now introduce our workflow and its three main steps: data mining, data editing and the display and analysis of our data.</p>
<h2 id="2-data-mining">2. Data Mining</h2>
<p><strong>Slide 3/0: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/3">Corpus</a></strong></p>
<p>What makes our analysis quite difficult is that we are dealing with (excuse our French:) dirty data. If you happen to work with TEI documents like the ones from the <a href="http://www.folgerdigitaltexts.org/">Folger Shakespeare Library</a> corpus, it’s pretty easy to do the kind of structural analysis we have in mind. But such a well-tagged TEI corpus is rare. Typically, we can consider ourselves lucky if we can obtain a digitised version of a focal text, and even luckier if it contains some basic markup. This applies, for example, to the bigger corpora containing German-language literature. For instance, the quite large and freely available TextGrid Repository (comprising texts from around 1500 to the 1930s) features TEI markup converted automatically from a proprietary format. For that reason, we have to deal with quite a basic TEI markup and loads of different tagging errors. We had two options, basically: We could try to initiate a 6-figure project and manually improve the faulty tagging over several years; or, we could try to extract just the specific data we need and take it from there, improving just the bits essential for our approach. Lacking several hundreds of thousands of Euros, we chose the second way.</p>
<p>In order to build our corpus, we first had to extract all dramatic texts from the TextGrid Repository. This might sound relatively easy, but it’s not. We wrote <a href="/A-Not-So-Simple-Question/">a larger blog post on this subject</a>, but to make it short: We ended up with 666 dramatic texts containing some basic TEI markup.</p>
<p><strong>Slide 3/1: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/3/1">DLINA Corpus 15.07 (“Codename Sydney”)</a></strong></p>
<p>Out of this corpus, we constructed our own subcorpus; its current version is entitled “DLINA Corpus 15.07 (Codename: Sydney)”. This corpus comprises 465 dramatic texts, so we discarded some 200 files. There are several reasons for excluding texts from our corpus. First, we wanted to limit our research to a specific time span, beginning with the modernisation of German drama at the outset of the Enlightenment era, which means – following a well-established academic position – to start with the 1730s works of <a href="https://en.wikipedia.org/wiki/Johann_Christoph_Gottsched">Johann Christoph Gottsched</a>. We further ruled out foreign-language plays, translations, mere pantomime plays, and fragments. Plus, we sorted out a few texts with very defective TEI markup. All in all, our Sydney corpus comprises 465 dramatic texts in German, ranging from 1731 to 1929.</p>
<p>This was just our starting point. The next step turned out to be another major task: the data editing.</p>
<h2 id="3-data-editing">3. Data Editing</h2>
<p><strong>Slide 4: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/4">Extracting Structural Data</a></strong></p>
<p>As mentioned before, the TEI markup in the TextGrid Repository data was quite rudimentary – and often erroneous. So we had to find a way to edit and improve the data. Because we are just interested in a specific kind of structural data, we decided not to edit the original XML files as a whole but to selectively extract the data we are interested in, and then edit just these specific data. This was the moment in which we invented what we call the “DLINA zwischenformat”, which roughly translates as ‘DLINA intermediary format’, or simply ‘DLINA data file’. This intermediary format can be considered a structural abstraction of the fulltext TEI documents in our corpus. It is an XML file which is validated against <a href="https://raw.githubusercontent.com/dlina/project/master/rules/lina.rnc">a specific RELAX NG schema</a>. A <em>zwischenformat</em> file is created for each drama. It stores metadata, the actual structural data, and documentation (optional).</p>
<p>The structural data stored are the acts and scenes, the speakers occurring in the respective segments, and (optionally) some counts, e.g., the number of speech acts, words, etc. for these characters. This DLINA data file not only makes improving the data quality easier but also allows for a relatively quick way of gathering new data by basically just writing down the structure of a play and the speakers in the segments. You may have a look <a href="/Introducing-Our-Zwischenformat/">at our blog post that introduces the DLINA intermediary format</a>. Or maybe you would like to explore all the 465 DLINA files which we generated from our corpus and which are <a href="https://github.com/dlina/project/tree/master/data/zwischenformat">stored on GitHub</a>.</p>
<p><strong>Slide 4/1: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/4/1">Editing Process</a></strong></p>
<p>However, just generating the DLINA data files was not enough. This is because the raw DLINA data files still retained some of the dirty data from the original XMLs – they were, in other words, full of bugs. So, manual intervention was often necessary to improve the data quality and correct errors in the source data.</p>
<p>In this editing process we had to face some errors that came about due to the automated conversion to TEI from the proprietary XML format; and we had to face some, so to speak, intrinsic problems, i.e., characteristics typical of a play.</p>
<p>One recurrent problem of the first group was OCR errors in the <code class="highlighter-rouge">&lt;speaker&gt;</code> name. One of the phenomena resulting from intrinsic characteristics of a play was that there were different ways of referring to a character. For example, a character’s full name might be given on the first appearance and only the first name on further appearances. In this case, we had to manually identify both names and attribute them to the one character they refer to.</p>
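<p>The attribution of several name variants to one character can be pictured as a hand-maintained lookup table. The following is only a sketch under stated assumptions: the alias entries (including the misspelling), the canonical id <code class="highlighter-rouge">emilia</code>, the stripping of a trailing full stop and the function name <code class="highlighter-rouge">resolveSpeaker</code> are all invented for illustration.</p>

```javascript
// Hypothetical alias table, curated by hand: every variant found in the
// speaker labels points to one canonical character id.
const aliases = new Map([
  ["Emilia Galotti", "emilia"], // full name on first appearance
  ["Emilia", "emilia"],         // first name on later appearances
  ["Emllia", "emilia"],         // invented OCR-style misspelling
]);

function resolveSpeaker(label, aliases) {
  // Assumption: labels may carry surrounding whitespace and a trailing full stop.
  const cleaned = label.trim().replace(/\.$/, "");
  // Unknown labels are returned unchanged so they can be reviewed manually.
  return aliases.has(cleaned) ? aliases.get(cleaned) : cleaned;
}

console.log(resolveSpeaker(" Emilia. ", aliases)); // "emilia"
console.log(resolveSpeaker("Odoardo", aliases));   // "Odoardo" (not in the table)
```

<p>Returning unknown labels untouched keeps the manual step visible: anything that is not yet in the table still shows up as its own node and can be checked by an editor.</p>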
<p>There were a lot more problems we had to solve while editing the DLINA intermediary format. We established a larger set of editing rules; a complete documentation including examples can be found <a href="/Editing-Rules/">on our blog</a>.</p>
<p><strong>Slide 4/2: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/4/2">Outlook</a></strong></p>
<p>We will further improve our editing process. At this time, for example, we are developing a GUI for some of the simpler editing procedures. This GUI will also include some gamification elements, so maybe we will have some crowd-editing option in the future.</p>
<h2 id="4-display-and-analysis">4. Display and Analysis</h2>
<p><strong>Slide 5/0: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/5">Four Types of Visualisation</a></strong></p>
<p>After editing and cleaning our 465 data files, we did two things: We started to publish our data and commented on it in larger blog posts, and we ran some statistics, also with detailed comments. As we stressed before and can’t stress enough, this project is still very much a work in progress, but we can and we will show you some promising first results. As mentioned before, all our data is stored on our GitHub, and therefore very transparent. On top of that, we built a small network-data publishing machine to provide easy access to our data. We created a homepage for each of the 465 plays. There is a list of all these plays if you click <a href="https://dlina.github.io/linas/">on this link</a>. On each of the homepages you’ll find 4 links leading to further information on the particular play.</p>
<p><strong>Slide 5/1: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/5/1">Example: G. E. Lessing’s “Emilia Galotti” (1772)</a></strong></p>
<p>One of the pages shows the network graph, with edges between all the people speaking in the same segment. There is a static graph and one with sticky nodes. Another page shows a matrix of encounters. And there is a page where you can have a look at our intermediary source file. That’s our data in plain daylight! Finally, there is one page that contains a bar chart with word counts for each character of the play. You can also interactively sort them.</p>
<p><strong>Slide 5/2: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/5/2">Skit (The Biggest Chatterboxes in German Literature)</a></strong></p>
<p>Which brings me to a little skit, a little interlude. With the kind of data we gathered, it was easy for us to make a list of the biggest chatterboxes in German literature, of course, only based on our middle-sized corpus. And for all of you who didn’t do their German Literature 101: It doesn’t matter, I’m sure you will at least know Faust and his counterpart Mephistopheles, from Goethe’s play “Faust, part 1”. And both of them are very talkative, earning places 3 and 4 of this top-10 of the biggest chatterboxes in German literature. Again, there’s <a href="/The-Biggest-Chatterbox-in-German-Literature/">a blog post on the subject</a>, but of course, this one is a bit “tongue-in-cheek” and not part of our actual research.</p>
<p>Let’s rather have a look at some more meaningful facts. We actually started out to process our data by means of Social Network Analysis. Again, our measures are currently very basic: for example, we’re computing the size of our drama networks, their density, their average degree and so on. For now, let us acquaint you with just two charts we’re currently discussing in the group and with other colleagues.</p>
<p><strong>Slide 5/3: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/5/3">Network Size (Median) by Decade (1730–1930)</a></strong></p>
<p>Here you can see the evolution of network sizes between 1730 and 1930. On the x-axis you can see the 20 decades. The y-axis features the median values of the number of characters of all the plays of a decade.</p>
<p>Let’s now try some cautious-cautious interpretation of this diagram. Something is happening there, that’s for sure, but what exactly? Well, some of the ups and downs we did expect. Like for example: The increase of this value in the second half of the 18th century might be associated with the beginning reception of Shakespeare in Germany, which led, among other things, to the rise of the Historical Play in German literature. Or, another quick glimpse, the dropping values at the end of the 19th century might be associated with the rise of the Naturalistic Drama, which – to make a long story short – returned to the ideas of something like an Aristotelian poetics.</p>
<p>We published many more charts on our website, we also started to discuss them there, and this process will continue, of course. If you like, please have a look – and join the discussion. For now, we just show you one last chart, one that introduces another idea we will address with our statistical approaches. I’m talking about concepts of genre, or rather: subgenre.</p>
<p><strong>Slide 5/4: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/5/4">Network Density (Mean) by Genre and Century</a></strong></p>
<p>While editing our intermediary files we also included basic genre data, with the main focus on the usual suspects, major genres like tragedy, comedy, and opera libretti. With this kind of genre data we could now build subcorpora to have a look at genres and their specific network measures.</p>
<p>And we immediately noticed some interesting things. What we can see here is a multiple-line chart featuring the arithmetic means of density values by genre and century. Just looking at this single value over time, we can conclude that comedies and libretti implement a very similar structural composition over the centuries, while character networks of tragedies (the lowermost line in our diagram) show a much lower density. What is more, the values shown are pretty consistent over the centuries. This might be a first indication that we could actually cluster genres of dramatic texts by just looking at a few basic measures.</p>
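<p>The aggregation behind such a chart can be sketched as a simple group-by with an arithmetic mean. The record layout, the density values and the function name <code class="highlighter-rouge">meanDensityByGenreAndCentury</code> below are invented for illustration and do not reproduce our actual figures; the century boundaries (1700 to 1799 counted as the 18th century) are likewise an assumption.</p>

```javascript
// Hypothetical records; the density values are made up for this sketch.
const plays = [
  { genre: "comedy", year: 1748, density: 0.75 },
  { genre: "comedy", year: 1767, density: 0.5 },
  { genre: "tragedy", year: 1772, density: 0.25 },
  { genre: "tragedy", year: 1804, density: 0.375 },
];

function meanDensityByGenreAndCentury(plays) {
  const groups = new Map();
  for (const play of plays) {
    // Assumption: 1700-1799 counts as the 18th century, 1800-1899 as the 19th.
    const century = Math.floor(play.year / 100) + 1;
    const key = play.genre + "/" + century;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(play.density);
  }
  const means = {};
  for (const [key, values] of groups) {
    means[key] = values.reduce((sum, v) => sum + v, 0) / values.length;
  }
  return means;
}

console.log(meanDensityByGenreAndCentury(plays));
```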
<p>But as stated before, today we’re only talking about very basic data. Actually, we calculated a lot more network data and started to look into them. But we should not jump to conclusions too fast. It is still a long way to integrate our network analysis of dramatic texts into a holistic study of literary evolution. We will be pushing out more data on our blog in the next few months. For example, we’re putting the finishing touches on an article on Network Values by Genre, should be ready in two weeks or three.</p>
<h2 id="5-further-research">5. Further Research</h2>
<p><strong>Slide 6/0: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/6">Yada Yada</a></strong></p>
<p>Wrapping things up, here is a slide with some notes on further research ideas. We need more statistical data, and we need to interpret them thoroughly. In addition, we will enlarge our German-language corpus. We will also look into existing foreign-language corpora which also opens up the field of comparative studies. I’m especially thinking of Paul Fièvre’s excellent corpus comprising more than 750 French plays, but we will also be looking into a collection of American drama and we’re also happy to cooperate with other scholars on the subject.
But our first and foremost task will be to find ways to contribute to traditional Literary Studies, to evaluate existing hypotheses reached by close-reading approaches, by traditional means, so to speak. Plus, we will try to arrive at our own set of interpretations and hold them against established hypotheses in the field of Literary Studies. That is and should be our long-term plan at least.</p>
<h2 id="6-bibliography">6. Bibliography</h2>
<p><strong>Slide 7/0: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/7">Literary Theory, Social Network Analysis</a></strong></p>
<p><strong>Slide 7/1: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/7/1">Literary Studies & SNA</a></strong></p>
<p><strong>Slide 7/2: <a href="https://dlina.github.io/presentations/2015-sydney/sydney.html#/7/2">Literary Studies & SNA (Cont’d)</a></strong></p>
<p><em>That was it. <strong>Thanks a lot</strong> for all the feedback we got, for the nice talks after the session and throughout the whole conference. Even though we aren’t even halfway there, it is nice to see how the network analysis of literary texts progresses. It’s definitely something to look out for at upcoming DH conferences.</em></p>
<p><a href="https://dlina.github.io/Our-Talk-at-DH2015/">Our Talk at DH2015 in Sydney (Full Text and Slides)</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on July 13, 2015.</p>https://dlina.github.io/200-Years-of-Literary-Network-Data2015-06-25T00:00:00+02:002015-06-25T00:00:00-00:00Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io
<p>After creating <a href="/Introducing-DLINA-Corpus-15-07-Codename-Sydney/">our corpus</a> and <a href="/Introducing-Our-Zwischenformat/">extracting the structural data</a> that are of interest to us, it’s time to run some statistics. As it is with statistical data, they can evoke manifold interpretations and sometimes have the inclination to speak in riddles. We will certainly need a few more months to make sense of all the values we computed and collected.</p>
<p>Nevertheless, we’re prepared to offer at least some observations and insights already, all of which is still very much a work in progress. Our statistical analyses are quite rudimentary for the time being; more complex calculations will follow. However, some things can already be recognised in our data, or at least we can put them in front of you and open them up for discussion.</p>
<p>We already gave an example of how to query our data in our last posting on the <a href="/The-Biggest-Chatterbox-in-German-Literature/">biggest chatterboxes in German literature</a>. But these kinds of rankings are only one thing; our main purpose is to look at the network values we computed in the context of Social Network Analysis (SNA). Again, we will start with very rudimentary data and concentrate on the following five measures:</p>
<ul>
<li><strong>Number of characters</strong>, i.e., the number of characters appearing in each drama network; equates to the ‘size’ of any given network.</li>
<li><strong>Maximum degree</strong>, i.e., the highest degree of an actor of a drama network; degree here refers to the number of other characters with whom a character shares at least one segment (that is, how many of the other characters a character ‘meets’/‘speaks to’ throughout the whole play).</li>
<li><strong>Average degree</strong>, i.e., the average of all character degrees of a dramatic text.</li>
<li><strong>Density</strong>, i.e., the ratio of the number of <em>actual</em> co-presences to the number of <em>potential</em> co-presences among all the characters of a play; the density value is always somewhere between 0 and 1: if it is 1, every character speaks to every other character at least once.</li>
<li><strong>Average path length</strong>, which is (<a href="https://en.wikipedia.org/wiki/Network_science#Average_path_length">quote Wikipedia:</a>) “calculated by finding the shortest path between all pairs of nodes, adding them up, and then dividing by the total number of pairs. This shows us, on average, the number of steps it takes to get from one member of the network to another.”</li>
</ul>
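<p>The five measures listed above can be computed from a node list and an undirected edge list. The following is a minimal sketch, not our production code: the function name <code class="highlighter-rouge">networkMeasures</code> and the toy network are invented, and for the average path length the sketch assumes that only pairs of characters that can actually reach each other are averaged, which is one of several possible conventions for disconnected networks.</p>

```javascript
// Assumes a non-empty node list and that every edge endpoint is in `nodes`.
function networkMeasures(nodes, edges) {
  const neighbours = new Map(nodes.map((node) => [node, new Set()]));
  for (const [a, b] of edges) {
    neighbours.get(a).add(b);
    neighbours.get(b).add(a);
  }
  const n = nodes.length;
  const degrees = nodes.map((node) => neighbours.get(node).size);
  const edgeCount = degrees.reduce((sum, d) => sum + d, 0) / 2;

  // Breadth-first search from every node yields all shortest path lengths.
  let pathSum = 0;
  let pathCount = 0;
  for (const start of nodes) {
    const dist = new Map([[start, 0]]);
    const queue = [start];
    for (let head = 0; head < queue.length; head++) {
      const current = queue[head];
      for (const next of neighbours.get(current)) {
        if (!dist.has(next)) {
          dist.set(next, dist.get(current) + 1);
          queue.push(next);
        }
      }
    }
    for (const [target, d] of dist) {
      if (target !== start) {
        pathSum += d;
        pathCount += 1;
      }
    }
  }

  return {
    size: n,
    maxDegree: Math.max(...degrees),
    averageDegree: (2 * edgeCount) / n,
    // Actual co-presences divided by the n * (n - 1) / 2 potential ones.
    density: n > 1 ? edgeCount / ((n * (n - 1)) / 2) : 0,
    averagePathLength: pathCount > 0 ? pathSum / pathCount : 0,
  };
}

// Toy network: A, B and C all meet each other; D only meets C.
const measures = networkMeasures(
  ["A", "B", "C", "D"],
  [["A", "B"], ["A", "C"], ["B", "C"], ["C", "D"]]
);
console.log(measures.size, measures.maxDegree, measures.averageDegree); // 4 3 2
```

<p>In the toy network, 4 of the 6 potential co-presences are realised, so the density is two thirds.</p>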
<p>As stated before, these are very basic measures. But let’s go ahead and have a look at what these measures tell us about our <a href="/Introducing-DLINA-Corpus-15-07-Codename-Sydney/">Sydney corpus</a> that includes 465 German-language plays from about 1730 to 1930.</p>
<p>In order to observe literary evolution throughout time, we grouped our dramatic texts by decades. This decision is contingent, of course, and we will also experiment with other periodisations (see below). But for a first look into the data, this approach will fulfill its purpose.</p>
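<p>Grouping by decade boils down to rounding each year down to a multiple of ten. Here is a small sketch; the record layout and the helper names <code class="highlighter-rouge">decadeOf</code> and <code class="highlighter-rouge">groupByDecade</code> are assumptions made for illustration, and the third sample record is purely hypothetical.</p>

```javascript
// Round a year down to the start of its decade, e.g. 1772 -> 1770.
function decadeOf(year) {
  return Math.floor(year / 10) * 10;
}

// Collect plays into a Map keyed by decade.
function groupByDecade(plays) {
  const groups = new Map();
  for (const play of plays) {
    const decade = decadeOf(play.year);
    if (!groups.has(decade)) groups.set(decade, []);
    groups.get(decade).push(play);
  }
  return groups;
}

const sample = [
  { title: "Emilia Galotti", year: 1772 },
  { title: "Iphigenie auf Tauris", year: 1787 },
  { title: "Hypothetical play", year: 1779 }, // invented record
];
const byDecade = groupByDecade(sample);
console.log([...byDecade.keys()]); // [ 1770, 1780 ]
```

<p>Swapping <code class="highlighter-rouge">decadeOf</code> for another bucketing function is all it takes to try a different periodisation.</p>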
<p>First example, a table referring to the “Number of characters” of a play, revealing the average, median and standard-deviation values:</p>
<h3 id="table-number-of-characters">Table: Number of Characters</h3>
<table>
<thead>
<tr><th>Decade</th><th>N (plays)</th><th>Average</th><th>Median</th><th>Standard Deviation</th></tr>
</thead>
<tbody>
<tr><td>1730</td><td>5</td><td>11.6</td><td>11</td><td>3.51</td></tr>
<tr><td>1740</td><td>18</td><td>8.33</td><td>8</td><td>2.4</td></tr>
<tr><td>1750</td><td>10</td><td>9.2</td><td>8.5</td><td>3.58</td></tr>
<tr><td>1760</td><td>15</td><td>11.2</td><td>10</td><td>9.65</td></tr>
<tr><td>1770</td><td>36</td><td>13.42</td><td>12.5</td><td>11.74</td></tr>
<tr><td>1780</td><td>20</td><td>18.1</td><td>15.5</td><td>11.36</td></tr>
<tr><td>1790</td><td>20</td><td>27.1</td><td>20.5</td><td>28.42</td></tr>
<tr><td>1800</td><td>23</td><td>27.96</td><td>15</td><td>27.26</td></tr>
<tr><td>1810</td><td>24</td><td>32.75</td><td>23</td><td>22.62</td></tr>
<tr><td>1820</td><td>31</td><td>27.29</td><td>25</td><td>14.24</td></tr>
<tr><td>1830</td><td>31</td><td>39.55</td><td>25</td><td>45.32</td></tr>
<tr><td>1840</td><td>43</td><td>19.35</td><td>17</td><td>11.09</td></tr>
<tr><td>1850</td><td>16</td><td>21.81</td><td>17.5</td><td>13.47</td></tr>
<tr><td>1860</td><td>11</td><td>24.45</td><td>21</td><td>18.83</td></tr>
<tr><td>1870</td><td>14</td><td>21.29</td><td>23</td><td>6.28</td></tr>
<tr><td>1880</td><td>14</td><td>24.86</td><td>23</td><td>12.7</td></tr>
<tr><td>1890</td><td>36</td><td>18.06</td><td>15</td><td>13.2</td></tr>
<tr><td>1900</td><td>49</td><td>11.88</td><td>9</td><td>8.83</td></tr>
<tr><td>1910</td><td>33</td><td>22.85</td><td>18</td><td>17.46</td></tr>
<tr><td>1920</td><td>16</td><td>29.25</td><td>24.5</td><td>15.7</td></tr>
</tbody>
</table>
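<p>For readers who want to recompute such a row from a list of character counts, here is a minimal sketch of the three statistics. The function names and the sample values are invented, and the sketch assumes the sample standard deviation (dividing by N − 1); whether the table was produced with the sample or the population variant is not stated here, so treat that choice as an assumption.</p>

```javascript
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function median(values) {
  // Sort a copy numerically; the default sort would compare as strings.
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Assumption: sample standard deviation, i.e. dividing by N - 1.
function sampleStandardDeviation(values) {
  const m = mean(values);
  const squared = values.reduce((sum, v) => sum + (v - m) ** 2, 0);
  return Math.sqrt(squared / (values.length - 1));
}

// Invented character counts for five hypothetical plays of one decade.
const counts = [4, 8, 6, 10, 12];
console.log(mean(counts));   // 8
console.log(median(counts)); // 8
console.log(sampleStandardDeviation(counts).toFixed(2));
```

<p>The median is less sensitive than the average to a single play with a very large cast, which is one reason to chart it per decade.</p>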
<p>Let’s now acquaint you with some visualisations by putting our data into some diagrams:</p>
<h3 id="fig-01-number-of-characters-median">Fig. 01: Number of Characters (Median)</h3>
<div class="blog-chart-barline" id="barchart1"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig01.tsv", type, function(error, data) {
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 10;
if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "50" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("50") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart1").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; })
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Number of Characters (Median)"
// label the domain with density if there is no value greater than 1
if (maxval <= 1){ var ylabel = "Density"}
else if ( passedLabel == "" ){var ylabel = "Frequency" }
else { var ylabel = "Number of Characters (Median)"}
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension("0.95");
svg.append("g")
.append("path")
.data(data)
.attr("class", "line")
.attr("d", line(data))
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".label")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<p>The standard-deviation values look like this:</p>
<h3 id="fig-02-number-of-characters-standard-deviation">Fig. 02: Number of Characters (Standard Deviation)</h3>
<div class="blog-chart-barline" id="barchart2"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig02.tsv", type, function(error, data) {
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 10;
if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "50" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("50") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart2").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; })
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Number of Characters (SD)"
// label the domain with density if there is no value greater than 1
if (maxval <= 1){ var ylabel = "Density"}
else if ( passedLabel == "" ){var ylabel = "Frequency" }
else { var ylabel = "Number of Characters (SD)"}
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension("0.95");
svg.append("g")
.append("path")
.data(data)
.attr("class", "line")
.attr("d", line(data))
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<p>As you can see, there is something going on in our corpus. For example, in the second half of the 18th century, we witness a gradual increase in the number of characters, which can be linked to the renunciation of classical drama poetics and the incipient reception of Shakespeare. We also recognise a peak in the 1830s, not least due to the success of historical drama in this period. In the late 19th century, we can observe a significant reduction in the number of characters, probably an effect of naturalistic drama and its recourse to classical poetics and their idea of the <a href="https://en.wikipedia.org/wiki/Classical_unities">three unities</a>.</p>
<p>At the same time, it is striking how the standard deviation rises towards the end of the 18th century. This indicates a growing variety of structural styles of drama composition. What we can observe here is a differentiation of dramatic production, in structural terms, away from the uniformity of the years from 1730 to 1750. This, however, changes again in the mid-19th century.</p>
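<p>To make the two measures a bit more tangible, here is a minimal sketch of how a median and a standard deviation per decade could be computed. It is not the script behind our charts: the <code>plays</code> records and all function names are invented for illustration, and the sketch uses the population variant of the standard deviation (dividing by n), which is an assumption.</p>

```javascript
// Hypothetical input: one record per play (year, number of characters).
// The values are invented for illustration and are not corpus data.
var plays = [
  { year: 1731, characters: 8 },
  { year: 1736, characters: 10 },
  { year: 1739, characters: 9 },
  { year: 1774, characters: 12 },
  { year: 1776, characters: 30 },
  { year: 1779, characters: 6 }
];

// Median of a list of numbers (mean of the two middle values for even lengths).
function median(values) {
  var sorted = values.slice().sort(function (a, b) { return a - b; });
  var mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Population standard deviation (divides by n; the sample variant divides by n - 1).
function standardDeviation(values) {
  var mean = values.reduce(function (s, v) { return s + v; }, 0) / values.length;
  var variance = values.reduce(function (s, v) {
    return s + (v - mean) * (v - mean);
  }, 0) / values.length;
  return Math.sqrt(variance);
}

// Group plays by decade (e.g. 1736 -> 1730) and compute both measures per group.
function statsByDecade(records) {
  var groups = {};
  records.forEach(function (p) {
    var decade = Math.floor(p.year / 10) * 10;
    (groups[decade] = groups[decade] || []).push(p.characters);
  });
  return Object.keys(groups).sort().map(function (decade) {
    return {
      decade: Number(decade),
      median: median(groups[decade]),
      sd: standardDeviation(groups[decade])
    };
  });
}

var decadeStats = statsByDecade(plays);
console.log(decadeStats);
```

<p>In this toy example both decades have a similar median, but the second one has a much larger standard deviation because it mixes very small and very large casts. That is the kind of spread the chart above is meant to capture.</p>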
<p>We don’t want to further discuss these statistical values at this point, especially because we don’t want to espouse any monocausal explanations.</p>
<p>Instead, let’s take a look at some more charts dedicated to the other values, i.e., Max Degree, Average Degree, Density and Average Path Length.</p>
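<p>For readers less familiar with these measures, the following sketch computes them for a tiny, made-up network. It assumes an undirected, unweighted graph without self-loops, given as a non-empty edge list; the function names are ours for this example only, and the sketch is not the code used to produce the figures.</p>

```javascript
// Hypothetical network: an edge means that two characters share a scene.
var edges = [["A", "B"], ["A", "C"], ["A", "D"], ["B", "C"]];

// Build an adjacency map; Object.create(null) avoids clashes with built-in keys.
function buildAdjacency(edgeList) {
  var adj = Object.create(null);
  edgeList.forEach(function (e) {
    adj[e[0]] = adj[e[0]] || Object.create(null);
    adj[e[1]] = adj[e[1]] || Object.create(null);
    adj[e[0]][e[1]] = true;
    adj[e[1]][e[0]] = true;
  });
  return adj;
}

// Breadth-first search: shortest path length from source to every reachable node.
function shortestPathLengths(adj, source) {
  var dist = Object.create(null);
  dist[source] = 0;
  var queue = [source];
  while (queue.length > 0) {
    var v = queue.shift();
    Object.keys(adj[v]).forEach(function (w) {
      if (!(w in dist)) {
        dist[w] = dist[v] + 1;
        queue.push(w);
      }
    });
  }
  return dist;
}

function networkMetrics(edgeList) {
  var adj = buildAdjacency(edgeList);
  var nodes = Object.keys(adj); // note: characters without any edge do not appear here
  var n = nodes.length;
  var degrees = nodes.map(function (v) { return Object.keys(adj[v]).length; });
  var degreeSum = degrees.reduce(function (s, d) { return s + d; }, 0);
  var m = degreeSum / 2; // every edge contributes to two degrees

  // Average path length over all ordered pairs of distinct, mutually reachable nodes.
  var pathSum = 0;
  var pathCount = 0;
  nodes.forEach(function (source) {
    var dist = shortestPathLengths(adj, source);
    Object.keys(dist).forEach(function (target) {
      if (target !== source) {
        pathSum += dist[target];
        pathCount += 1;
      }
    });
  });

  return {
    numberOfCharacters: n,
    maxDegree: Math.max.apply(null, degrees),
    averageDegree: degreeSum / n,
    density: n > 1 ? (2 * m) / (n * (n - 1)) : 0,
    averagePathLength: pathCount > 0 ? pathSum / pathCount : 0
  };
}

var metrics = networkMetrics(edges);
console.log(metrics);
```

<p>Density relates the number of existing edges to the number of possible ones, which is why it always lies between 0 and 1, whereas the other values grow with the size of the network.</p>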
<h3 id="fig-03-max-degree-median">Fig. 03: Max Degree (Median)</h3>
<div class="blog-chart-barline" id="barchart3"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig03.tsv", type, function(error, data) {
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 10;
if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "50" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("50") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart3").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; })
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Max Degree (Median)"
// label the domain with density if there is no value greater than 1
if (maxval <= 1){ var ylabel = "Density"}
else if ( passedLabel == "" ){var ylabel = "Frequency" }
else { var ylabel = "Max Degree (Median)"}
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension("0.95");
svg.append("g")
.append("path")
.data(data)
.attr("class", "line")
.attr("d", line(data))
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<h3 id="fig-04-average-degree-average">Fig. 04: Average Degree (Average)</h3>
<div class="blog-chart-barline" id="barchart4"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig04.tsv", type, function(error, data) {
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 10;
if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "50" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("50") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart4").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; })
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Average Degree (Average)"
// label the domain with density if there is no value greater than 1
if (maxval <= 1){ var ylabel = "Density"}
else if ( passedLabel == "" ){var ylabel = "Frequency" }
else { var ylabel = "Average Degree (Average)"}
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension("0.95");
svg.append("g")
.append("path")
.data(data)
.attr("class", "line")
.attr("d", line(data))
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<h3 id="fig-05-density-average">Fig. 05: Density (Average)</h3>
<div class="blog-chart-barline" id="barchart5"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig05.tsv", type, function(error, data) {
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 10;
if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "50" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("50") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart5").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; })
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Density (Average)";
// use the passed label if there is one; otherwise fall back to "Density" (no value greater than 1) or "Frequency"
var ylabel;
if ( passedLabel != "" ){ ylabel = passedLabel; }
else if (maxval <= 1){ ylabel = "Density"; }
else { ylabel = "Frequency"; }
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension("0.95");
svg.append("g")
.append("path")
.data(data)
.attr("class", "line")
.attr("d", line(data))
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<h3 id="fig-06-average-path-length-average">Fig. 06: Average Path Length (Average)</h3>
<div class="blog-chart-barline" id="barchart6"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig06.tsv", type, function(error, data) {
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 10;
if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "50" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("50") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart6").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; })
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Average Path Length (Average)"
// label the domain with density if there is no value greater than 1
if (maxval <= 1){ var ylabel = "Density"}
else if ( passedLabel == "" ){var ylabel = "Frequency" }
else { var ylabel = "Average Path Length (Average)"}
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension("0.95");
svg.append("g")
.append("path")
.data(data)
.attr("class", "line")
.attr("d", line(data))
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<p>As stated above, we will evaluate and discuss these results later.</p>
<p>Before closing this post, we want to suggest one more way to analyse our data. We already mentioned that the classification by decades is rather arbitrary. However, there’s another option to pursue this idea. Why don’t we sort our corpus by already established periodisations of German literature and take it from there? Does our data reproduce established divisions into literary epochs?</p>
<p>This question must be approached with great caution. Established divisions into literary epochs do not rely solely on a set of very specific structural elements (as our approach does); they are, of course, much richer. We are in no position to evaluate whether the known divisions into literary epochs are ‘correct’; that is not possible with this kind of structural data. We can, however, check how our data relates to the established divisions into literary periods.</p>
<p>For that purpose, we picked two different divisions into epochs. The first was developed in the context of German Structuralism (cf., inter alia, Titzmann 1991a, Titzmann 1991b, Titzmann 2002, Titzmann 2012a, Titzmann 2012b, Wünsch 1991, Wünsch 1998, Wünsch 2007). The other classification was pulled from the timespans of the separate volumes of “Hansers Sozialgeschichte der deutschen Literatur vom 16. Jahrhundert bis in die Gegenwart” (Grimminger 1980–2009).</p>
<p>In the context of German Structuralism, the following epoch classification is discussed (all time spans are approximate, of course):</p>
<ul>
<li>1720–1750: Literatursystem ‘Frühaufklärung’ (‘Early Enlightenment’)</li>
<li>1750–1770: Literatursystem ‘Empfindsamkeit’ (‘Sentimentalism’)</li>
<li>1770–1830: Literatursystem ‘Goethezeit’</li>
<li>1830–1850: Literatursystem ‘Biedermeier’</li>
<li>1850–1890: Literatursystem ‘Realismus’</li>
<li>1890–1930: Literatursystem ‘Frühe Moderne’</li>
</ul>
<p>The separate volumes of “Hansers Sozialgeschichte der deutschen Literatur” are divided like this:</p>
<ul>
<li>1680–1789 (Vol. 3)</li>
<li>1789–1815 (Vol. 4)</li>
<li>1815–1848 (Vol. 5)</li>
<li>1848–1890 (Vol. 6)</li>
<li>1890–1918 (Vol. 7)</li>
<li>1918–1933 (Vol. 8)</li>
</ul>
<p>So let’s see how our network values relate to these periodisations (this time around, we’re limiting this venture to the number of characters and network density).</p>
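<p>A rough sketch of how such a mapping could work: each play is assigned to a period by its year, and the values are then averaged per period. The play records are invented, and the function names exist only in this example. Treating the boundaries as half-open intervals (so that 1750 belongs to ‘Empfindsamkeit’) and counting the final year 1930 as part of the last period are assumptions of this sketch, since the time spans listed above are not sharp boundaries.</p>

```javascript
// Structuralist periodisation as listed above.
var structuralistPeriods = [
  { label: "Frühaufklärung", from: 1720, to: 1750 },
  { label: "Empfindsamkeit", from: 1750, to: 1770 },
  { label: "Goethezeit", from: 1770, to: 1830 },
  { label: "Biedermeier", from: 1830, to: 1850 },
  { label: "Realismus", from: 1850, to: 1890 },
  { label: "Frühe Moderne", from: 1890, to: 1930 }
];

// Assumption: intervals are half-open [from, to), except that the very last
// period also includes its end year.
function periodFor(year, periods) {
  for (var i = 0; i < periods.length; i++) {
    var p = periods[i];
    var isLast = i === periods.length - 1;
    if (year >= p.from && (year < p.to || (isLast && year === p.to))) {
      return p.label;
    }
  }
  return null; // year lies outside all periods
}

// Hypothetical records (invented values, not corpus data).
var plays = [
  { year: 1745, density: 0.8 },
  { year: 1750, density: 0.6 },
  { year: 1765, density: 0.7 },
  { year: 1930, density: 0.3 }
];

// Average one value per period. Periods without any play are skipped.
// The rows use label/value keys, like the data the charts read.
function meanByPeriod(records, periods, key) {
  return periods.map(function (p) {
    var values = records
      .filter(function (r) { return periodFor(r.year, periods) === p.label; })
      .map(function (r) { return r[key]; });
    if (values.length === 0) { return null; }
    var sum = values.reduce(function (s, v) { return s + v; }, 0);
    return { label: p.label, value: sum / values.length };
  }).filter(function (row) { return row !== null; });
}

var rows = meanByPeriod(plays, structuralistPeriods, "density");
console.log(rows);
```

<p>Swapping in the volume boundaries of “Hansers Sozialgeschichte” would only require a second list of periods; the grouping logic stays the same.</p>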
<p>The first four charts are dedicated to the Structuralist periodisation (since our Sydney corpus contains texts only from 1730 to 1930, the X-axes start at 1730):</p>
<h3 id="fig-07-number-of-characters-median-time-spans-according-to-structuralist-approach">Fig. 07: Number of Characters (Median), time spans according to Structuralist approach</h3>
<div class="blog-chart-barline" id="barchart7"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig07.tsv", type, function(error, data) {
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 10;
if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "50" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("50") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart7").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; })
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Number of Characters (Median)"
// label the domain with density if there is no value greater than 1
if (maxval <= 1){ var ylabel = "Density"}
else if ( passedLabel == "" ){var ylabel = "Frequency" }
else { var ylabel = "Number of Characters (Median)"}
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension("0.95");
svg.append("g")
.append("path")
.data(data)
.attr("class", "line")
.attr("d", line(data))
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<h3 id="fig-08-number-of-characters-standard-deviation-time-spans-according-to-structuralist-approach">Fig. 08: Number of Characters (Standard Deviation), time spans according to Structuralist approach</h3>
<div class="blog-chart-barline" id="barchart8"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig08.tsv", type, function(error, data) {
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 10;
if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "50" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("50") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart8").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; })
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Number of Characters (SD)"
// label the domain with density if there is no value greater than 1
if (maxval <= 1){ var ylabel = "Density"}
else if ( passedLabel == "" ){var ylabel = "Frequency" }
else { var ylabel = "Number of Characters (SD)"}
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension("0.95");
svg.append("g")
.append("path")
.data(data)
.attr("class", "line")
.attr("d", line(data))
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<h3 id="fig-09-density-average-time-spans-according-to-structuralist-approach">Fig. 09: Density (Average), time spans according to Structuralist approach</h3>
<div class="blog-chart-barline" id="barchart9"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig09.tsv", type, function(error, data) {
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 10;
if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "50" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("50") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart9").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; })
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Density (Average)";
// use the passed label if there is one; otherwise fall back to "Density" (no value greater than 1) or "Frequency"
var ylabel;
if ( passedLabel != "" ){ ylabel = passedLabel; }
else if (maxval <= 1){ ylabel = "Density"; }
else { ylabel = "Frequency"; }
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension("0.95");
svg.append("g")
.append("path")
.data(data)
.attr("class", "line")
.attr("d", line(data))
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<h3 id="fig-10-density-standard-deviation-time-spans-according-to-structuralist-approach">Fig. 10: Density (Standard Deviation), time spans according to Structuralist approach</h3>
<div class="blog-chart-barline" id="barchart10"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig10.tsv", type, function(error, data) {
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.map(function(d) { return d.name; }).length;
var base = 10;
if (sum < 7) { var width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { var width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50 ){ var width = ( base * 1.7 ) * sum - margin.left - margin.right; }
else { var width = ( base * 1 ) * sum - margin.left - margin.right; }
if ( "50" == "" ){var includeHeight = 100 ;}
else { var includeHeight = parseInt("50") ;}
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart10").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; })
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Density (SD)";
// use the passed label if there is one; otherwise fall back to "Density" (no value greater than 1) or "Frequency"
var ylabel;
if ( passedLabel != "" ){ ylabel = passedLabel; }
else if (maxval <= 1){ ylabel = "Density"; }
else { ylabel = "Frequency"; }
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension("0.95");
svg.append("g")
.append("path")
.data(data)
.attr("class", "line")
.attr("d", line(data))
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<p>Let’s now map our values onto the time spans suggested by the volumes of “Hansers Sozialgeschichte” (yet again: our Sydney corpus contains texts only from 1730 to 1930; hence, our X-axes are limited to this period of time):</p>
<h3 id="fig-11-number-of-characters-median-time-spans-according-to-hansers-sozialgeschichte">Fig. 11: Number of Characters (Median), time spans according to “Hansers Sozialgeschichte”</h3>
<div class="blog-chart-barline" id="barchart11"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig11.tsv", type, function(error, data) {
if (error) throw error; // abort if the TSV file could not be loaded
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.length; // number of rows (time spans) to plot
var base = 10;
// scale the chart width with the number of rows: fewer rows get wider bands
var width;
if (sum < 7) { width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50) { width = base * 1.7 * sum - margin.left - margin.right; }
else { width = base * sum - margin.left - margin.right; }
var includeHeight = 50; // chart height as a percentage of the 800px base
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart11").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; });
// pad the lower end of the y domain by a quarter of the value range, but never go below 0
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Number of Characters (Median)";
// label the y-axis "Density" if no value is greater than 1; otherwise use the passed label ("Frequency" if it is empty)
var ylabel;
if (maxval <= 1) { ylabel = "Density"; }
else if (passedLabel == "") { ylabel = "Frequency"; }
else { ylabel = passedLabel; }
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension(0.95);
svg.append("g")
.append("path")
.datum(data)
.attr("class", "line")
.attr("d", line)
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<h3 id="fig-12-number-of-characters-standard-deviation-time-spans-according-to-hansers-sozialgeschichte">Fig. 12: Number of Characters (Standard Deviation), time spans according to “Hansers Sozialgeschichte”</h3>
<div class="blog-chart-barline" id="barchart12"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig12.tsv", type, function(error, data) {
if (error) throw error; // abort if the TSV file could not be loaded
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.length; // number of rows (time spans) to plot
var base = 10;
// scale the chart width with the number of rows: fewer rows get wider bands
var width;
if (sum < 7) { width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50) { width = base * 1.7 * sum - margin.left - margin.right; }
else { width = base * sum - margin.left - margin.right; }
var includeHeight = 50; // chart height as a percentage of the 800px base
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart12").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; });
// pad the lower end of the y domain by a quarter of the value range, but never go below 0
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Number of Characters (SD)";
// label the y-axis "Density" if no value is greater than 1; otherwise use the passed label ("Frequency" if it is empty)
var ylabel;
if (maxval <= 1) { ylabel = "Density"; }
else if (passedLabel == "") { ylabel = "Frequency"; }
else { ylabel = passedLabel; }
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension(0.95);
svg.append("g")
.append("path")
.datum(data)
.attr("class", "line")
.attr("d", line)
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<h3 id="fig-13-density-average-time-spans-according-to-hansers-sozialgeschichte">Fig. 13: Density (Average), time spans according to “Hansers Sozialgeschichte”</h3>
<div class="blog-chart-barline" id="barchart13"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig13.tsv", type, function(error, data) {
if (error) throw error; // abort if the TSV file could not be loaded
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.length; // number of rows (time spans) to plot
var base = 10;
// scale the chart width with the number of rows: fewer rows get wider bands
var width;
if (sum < 7) { width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50) { width = base * 1.7 * sum - margin.left - margin.right; }
else { width = base * sum - margin.left - margin.right; }
var includeHeight = 50; // chart height as a percentage of the 800px base
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart13").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; });
// pad the lower end of the y domain by a quarter of the value range, but never go below 0
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Density (Average)";
// label the y-axis "Density" if no value is greater than 1; otherwise use the passed label ("Frequency" if it is empty)
var ylabel;
if (maxval <= 1) { ylabel = "Density"; }
else if (passedLabel == "") { ylabel = "Frequency"; }
else { ylabel = passedLabel; }
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension(0.95);
svg.append("g")
.append("path")
.datum(data)
.attr("class", "line")
.attr("d", line)
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<h3 id="fig-14-density-standard-deviation-time-spans-according-to-hansers-sozialgeschichte">Fig. 14: Density (Standard Deviation), time spans according to “Hansers Sozialgeschichte”</h3>
<div class="blog-chart-barline" id="barchart14"></div>
<script>
d3.tsv("/data/lit-history-figs/lit-history-fig14.tsv", type, function(error, data) {
if (error) throw error; // abort if the TSV file could not be loaded
var margin = {top: 20, right: 20, bottom: 200, left: 60};
var sum = data.length; // number of rows (time spans) to plot
var base = 10;
// scale the chart width with the number of rows: fewer rows get wider bands
var width;
if (sum < 7) { width = base * 12 * sum - margin.left - margin.right; }
else if (sum < 25) { width = base * 3.5 * sum - margin.left - margin.right; }
else if (sum < 50) { width = base * 1.7 * sum - margin.left - margin.right; }
else { width = base * sum - margin.left - margin.right; }
var includeHeight = 50; // chart height as a percentage of the 800px base
var height = (800 * includeHeight / 100) - margin.top - margin.bottom;
var x = d3.scale.ordinal()
.rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
.range([height, 0]);
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left")
.ticks(10);
var svg = d3.select("#barchart14").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
x.domain(data.map(function(d) { return d.label; }));
var minval = d3.min(data, function(d) { return d.value; });
var maxval = d3.max(data, function(d) { return d.value; });
// pad the lower end of the y domain by a quarter of the value range, but never go below 0
if ( maxval <= 1 ){ y.domain([0, 1]); }
else if ( (minval - ((maxval - minval) / 4)) < 0 ) { y.domain([0, maxval]); }
else { y.domain([minval - ((maxval - minval) / 4), maxval]); }
var passedLabel = "Density (SD)";
// label the y-axis "Density" if no value is greater than 1; otherwise use the passed label ("Frequency" if it is empty)
var ylabel;
if (maxval <= 1) { ylabel = "Density"; }
else if (passedLabel == "") { ylabel = "Frequency"; }
else { ylabel = passedLabel; }
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.selectAll("text")
.style({"text-anchor": "end", "font-size": "0.7em"})
.attr("dx", "-.8em")
.attr("dy", ".15em")
.attr("transform", "rotate(-65)" );
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style({"text-anchor": "end", "font-size": "0.7em"})
.text(ylabel);
// here we go for a line
var line = d3.svg.line()
.x(function(d) {
return x(d.label) + x.rangeBand() / 2;
})
.y(function(d) {
return y(d.value);
})
.interpolate("cardinal")
.tension(0.95);
svg.append("g")
.append("path")
.datum(data)
.attr("class", "line")
.attr("d", line)
.style( { "fill": "none", "stroke": "#6495ed", "stroke-width": "3px" });
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 5)
.attr("cx", function(d) { return x(d.label) + x.rangeBand() / 2 ;})
.attr("cy", function(d) { return y( d.value );})
.style("fill", "steelblue");
svg.selectAll(".dotlabel")
.data(data)
.enter().append("text")
.attr("class", "dotlabel")
.attr("x", function(d) { return x (d.label) + x.rangeBand() / 2 + 3 ;})
.attr("y", function(d) { return y( d.value ) - 5 ;})
.style("fill", "black").text( function(d) { return d.value ;} );
});
function type(d) {
d.value = +d.value;
return d;
}
</script>
<h2 id="disclaimer">Disclaimer</h2>
<p>All results we’re presenting here are initial explorations of our corpus of 465 dramatic pieces and the network data we pulled out of the texts. Their significance is limited. But we do have network data that can be toyed around with, and that is what we are going to do in the near future. We will have to readjust and we will have to recalculate things. On that note, always bear in mind never to trust any statistics you didn’t forge yourself. Right?</p>
<h3 id="bibliography">Bibliography</h3>
<ul>
<li>Rolf Grimminger et al., <em>Hansers Sozialgeschichte der deutschen Literatur vom 16. Jahrhundert bis in die Gegenwart</em>, München 1980–2009.</li>
<li>Michael Titzmann (ed.), <em>Modelle des literarischen Strukturwandels</em>, Tübingen 1991.</li>
<li>Michael Titzmann, <em>Skizze einer integrativen Literaturgeschichte und ihres Ortes in einer Systematik der Literaturwissenschaft</em>, in: Michael Titzmann (ed.), <em>Modelle des literarischen Strukturwandels</em>, Tübingen 1991, 395–438.</li>
<li>Michael Titzmann, <em>Epoche und Literatursystem. Ein terminologisch-methodologischer Vorschlag</em>, in: <em>Epochen. Mitteilungen des Deutschen Germanistenverbandes</em> 49.3 (2002), 294–307.</li>
<li>Michael Titzmann, <em>Probleme des Epochenbegriffs in der Literaturgeschichtsschreibung</em>, in: Michael Titzmann, <em>Anthropologie der Goethezeit. Studien zur Literatur- und Wissensgeschichte</em>, Berlin/Boston 2012, 31–67.</li>
<li>Michael Titzmann, <em>“Empfindung” und “Leidenschaft”. Strukturen, Kontexte, Transformationen der Affektivität/Emotionalität in der deutschen Literatur in der 2. Hälfte des 18. Jahrhunderts</em>, in: Michael Titzmann, <em>Anthropologie der Goethezeit. Studien zur Literatur- und Wissensgeschichte</em>, Berlin/Boston 2012, 333–371.</li>
<li>Marianne Wünsch, <em>Vom späten “Realismus” zur “Frühen Moderne”. Versuch eines Modells des literarischen Strukturwandels</em>, in: Michael Titzmann (ed.), <em>Modelle des literarischen Strukturwandels</em>, Tübingen 1991, 187–203.</li>
<li>Marianne Wünsch, <em>Die Fantastische Literatur der Frühen Moderne (1890–1930). Definition. Denkgeschichtlicher Kontext. Strukturen</em>, München 1998.</li>
<li>Marianne Wünsch, <em>Realismus (1850–1890). Zugänge zu einer literarischen Epoche</em>, Kiel 2007.</li>
</ul>
<p><a href="https://dlina.github.io/200-Years-of-Literary-Network-Data/">200 Years of Literary Network Data</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on June 25, 2015.</p>
<p>https://dlina.github.io/The-Biggest-Chatterbox-in-German-Literature | 2015-06-23T00:00:00+02:00 | 2015-06-23T00:00:00-00:00 | Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke | https://dlina.github.io</p>
<p>The DLINA <em>zwischenformat</em> we <a href="/Introducing-Our-Zwischenformat/">recently introduced</a> also stores the numbers of speech acts, words, lines, and chars (letters). Truth be told, we will always have to cope with some erroneous and inaccurate markup in the TextGrid Repository TEI files here and there, but we can now roughly specify how many speech acts each character performs, how many words each of them utters, and how many letters each of them uses. These values were extracted from all dramas of our <a href="/Introducing-DLINA-Corpus-15-07-Codename-Sydney/">Sydney corpus</a>, i.e., 465 dramas written or published between 1731 and 1929.</p>
<p>A complete list of all 9,913 characters contained in our corpus can be found <a href="https://github.com/dlina/project/blob/master/data/zwischenformat/output/amount-list.csv">here</a> (i.e., the average cast list of a play has 21 characters).</p>
<p>But today we’re not interested in the list as a whole (we’ll get back to that later), but in the <strong>top 20</strong>. So may we acquaint you with the biggest chatterboxes of German literature (omitting the ones who are not in our corpus, of course):</p>
<table>
<thead>
<tr>
<th> </th>
<th>Character</th>
<th>Title</th>
<th>Author</th>
<th>Chars</th>
<th>Words</th>
<th>Speech acts</th>
<th>Additional data</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>GEORG</td>
<td>Ignorabimus</td>
<td>Holz, Arno</td>
<td>133443</td>
<td>20859</td>
<td>952</td>
<td><a href="http://dlina.github.io/390/">http://dlina.github.io/390/</a></td>
</tr>
<tr>
<td>2</td>
<td>DUFROY</td>
<td>Ignorabimus</td>
<td>Holz, Arno</td>
<td>107534</td>
<td>16588</td>
<td>885</td>
<td><a href="http://dlina.github.io/390/">http://dlina.github.io/390/</a></td>
</tr>
<tr>
<td>3</td>
<td>FAUST</td>
<td>Faust. Der Tragödie erster Teil</td>
<td>Goethe, Johann Wolfgang</td>
<td>97546</td>
<td>9037</td>
<td>225</td>
<td><a href="http://dlina.github.io/243/">http://dlina.github.io/243/</a></td>
</tr>
<tr>
<td>4</td>
<td>MEPHISTOPHELES</td>
<td>Faust. Der Tragödie erster Teil</td>
<td>Goethe, Johann Wolfgang</td>
<td>92536</td>
<td>8408</td>
<td>257</td>
<td><a href="http://dlina.github.io/243/">http://dlina.github.io/243/</a></td>
</tr>
<tr>
<td>5</td>
<td>GOTHLAND</td>
<td>Herzog Theodor von Gothland</td>
<td>Grabbe, Christian Dietrich</td>
<td>86529</td>
<td>16325</td>
<td>508</td>
<td><a href="http://dlina.github.io/158/">http://dlina.github.io/158/</a></td>
</tr>
<tr>
<td>6</td>
<td>HOLLRIEDER</td>
<td>Sonnenfinsternis</td>
<td>Holz, Arno</td>
<td>85600</td>
<td>13544</td>
<td>663</td>
<td><a href="http://dlina.github.io/174/">http://dlina.github.io/174/</a></td>
</tr>
<tr>
<td>7</td>
<td>ONKEL LUDWIG</td>
<td>Ignorabimus</td>
<td>Holz, Arno</td>
<td>79066</td>
<td>12322</td>
<td>864</td>
<td><a href="http://dlina.github.io/390/">http://dlina.github.io/390/</a></td>
</tr>
<tr>
<td>8</td>
<td>LIBUSSA</td>
<td>Die Gründung Prags</td>
<td>Brentano, Clemens</td>
<td>70723</td>
<td>13139</td>
<td>308</td>
<td><a href="http://dlina.github.io/384/">http://dlina.github.io/384/</a></td>
</tr>
<tr>
<td>9</td>
<td>FRANZ</td>
<td>Franz von Sickingen</td>
<td>Lassalle, Ferdinand</td>
<td>67829</td>
<td>12445</td>
<td>219</td>
<td><a href="http://dlina.github.io/287/">http://dlina.github.io/287/</a></td>
</tr>
<tr>
<td>10</td>
<td>CARDENIO</td>
<td>Halle</td>
<td>Arnim, Ludwig Achim von</td>
<td>67167</td>
<td>12299</td>
<td>237</td>
<td><a href="http://dlina.github.io/301/">http://dlina.github.io/301/</a></td>
</tr>
<tr>
<td>11</td>
<td>MARIANNE</td>
<td>Ignorabimus</td>
<td>Holz, Arno</td>
<td>66707</td>
<td>10383</td>
<td>766</td>
<td><a href="http://dlina.github.io/390/">http://dlina.github.io/390/</a></td>
</tr>
<tr>
<td>12</td>
<td>ANATOL</td>
<td>Anatol</td>
<td>Schnitzler, Arthur</td>
<td>61885</td>
<td>11526</td>
<td>723</td>
<td><a href="http://dlina.github.io/89/">http://dlina.github.io/89/</a></td>
</tr>
<tr>
<td>13</td>
<td>FIESCO</td>
<td>Die Verschwörung des Fiesco zu Genua</td>
<td>Schiller, Friedrich</td>
<td>61633</td>
<td>10412</td>
<td>326</td>
<td><a href="http://dlina.github.io/451/">http://dlina.github.io/451/</a></td>
</tr>
<tr>
<td>14</td>
<td>MEPHISTOPHELES</td>
<td>Faust. Der Tragödie zweiter Teil</td>
<td>Goethe, Johann Wolfgang von</td>
<td>61231</td>
<td>10845</td>
<td>240</td>
<td><a href="http://dlina.github.io/201/">http://dlina.github.io/201/</a></td>
</tr>
<tr>
<td>15</td>
<td>CROMWELL</td>
<td>Ein Faust der That</td>
<td>Bleibtreu, Karl</td>
<td>61034</td>
<td>10581</td>
<td>257</td>
<td><a href="http://dlina.github.io/322/">http://dlina.github.io/322/</a></td>
</tr>
<tr>
<td>16</td>
<td>LA BELLA CENCI</td>
<td>Sonnenfinsternis</td>
<td>Holz, Arno</td>
<td>60956</td>
<td>10000</td>
<td>453</td>
<td><a href="http://dlina.github.io/174/">http://dlina.github.io/174/</a></td>
</tr>
<tr>
<td>17</td>
<td>DOKTOR FAUST</td>
<td>Doktor Faust</td>
<td>Soden, Julius von</td>
<td>60696</td>
<td>10640</td>
<td>543</td>
<td><a href="http://dlina.github.io/450/">http://dlina.github.io/450/</a></td>
</tr>
<tr>
<td>18</td>
<td>TASSO</td>
<td>Torquato Tasso</td>
<td>Goethe, Johann Wolfgang von</td>
<td>60095</td>
<td>11338</td>
<td>123</td>
<td><a href="http://dlina.github.io/82/">http://dlina.github.io/82/</a></td>
</tr>
<tr>
<td>19</td>
<td>FRANZ VON MOOR</td>
<td>Die Räuber</td>
<td>Schiller, Friedrich</td>
<td>57676</td>
<td>10303</td>
<td>172</td>
<td><a href="http://dlina.github.io/8/">http://dlina.github.io/8/</a></td>
</tr>
<tr>
<td>20</td>
<td>CARLOS</td>
<td>Don Carlos, Infant von Spanien</td>
<td>Schiller, Friedrich</td>
<td>55514</td>
<td>10444</td>
<td>333</td>
<td><a href="http://dlina.github.io/217/">http://dlina.github.io/217/</a></td>
</tr>
</tbody>
</table>
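<p>The table above appears to be ordered by the Chars column. As a minimal sketch of how such a ranking could be derived from a list of per-character counts, the following snippet sorts a few rows by their character count and keeps the first <code class="highlighter-rouge">n</code>. The field names (<code class="highlighter-rouge">character</code>, <code class="highlighter-rouge">chars</code>) and the helper <code class="highlighter-rouge">topByChars</code> are assumptions made for illustration; they are not necessarily the column names used in the CSV file, and this is not necessarily how the list was produced.</p>

```javascript
// Hypothetical helper: returns the n rows with the highest "chars" value.
// The field names are assumptions, not the actual CSV headers.
function topByChars(rows, n) {
  // slice() copies the array so the caller's list is not reordered
  return rows.slice().sort(function(a, b) { return b.chars - a.chars; }).slice(0, n);
}

// Four sample rows with values taken from the table above.
var sample = [
  {character: "FAUST", chars: 97546},
  {character: "GEORG", chars: 133443},
  {character: "TASSO", chars: 60095},
  {character: "DUFROY", chars: 107534}
];

var top3 = topByChars(sample, 3);
top3.forEach(function(row, i) {
  console.log((i + 1) + ". " + row.character + " (" + row.chars + " chars)");
});
```

<p>Sorting a copy rather than the original array keeps the input list untouched, which is convenient if the same rows are ranked by words or speech acts afterwards.</p>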
<p>The appearance of Arno Holz and his play <em>Ignorabimus</em> seems natural, given that it’s the longest play in our corpus (cf. <a href="/Longest-German-Language-Theatre-Plays/">the corresponding blog entry</a>).</p>
<p>But let’s have a closer look at the first four rows, which cover only two dramas: the aforementioned <em>Ignorabimus</em> and Goethe’s <em>Faust, part I</em>. Their values document quite some structural differences between the two texts, or rather, they indicate a completely different way of speaking:</p>
<table>
<thead>
<tr>
<th> </th>
<th>Character</th>
<th>Title</th>
<th>Author</th>
<th>Chars</th>
<th>Words</th>
<th>Speech acts</th>
<th>Additional data</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>GEORG</td>
<td>Ignorabimus</td>
<td>Holz, Arno</td>
<td>133443</td>
<td>20859</td>
<td>952</td>
<td><a href="http://dlina.github.io/390/">http://dlina.github.io/390/</a></td>
</tr>
<tr>
<td>2</td>
<td>DUFROY</td>
<td>Ignorabimus</td>
<td>Holz, Arno</td>
<td>107534</td>
<td>16588</td>
<td>885</td>
<td><a href="http://dlina.github.io/390/">http://dlina.github.io/390/</a></td>
</tr>
<tr>
<td>3</td>
<td>FAUST</td>
<td>Faust. Der Tragödie erster Teil</td>
<td>Goethe, Johann Wolfgang</td>
<td>97546</td>
<td>9037</td>
<td>225</td>
<td><a href="http://dlina.github.io/243/">http://dlina.github.io/243/</a></td>
</tr>
<tr>
<td>4</td>
<td>MEPHISTOPHELES</td>
<td>Faust. Der Tragödie erster Teil</td>
<td>Goethe, Johann Wolfgang</td>
<td>92536</td>
<td>8408</td>
<td>257</td>
<td><a href="http://dlina.github.io/243/">http://dlina.github.io/243/</a></td>
</tr>
</tbody>
</table>
<p>Simply put: In Arno Holz’s play, characters speak much more often, but their utterances are quite short and so are the words they use. In Goethe’s play, the characters speak less often, but their speeches are much longer, as are the words they use.</p>
<p>In any case, this difference can be explained by the entirely different eras that produced the two plays: the temporal distance amounts to more than a century. Perhaps we’re witnessing the difference between two historical styles here: isn’t this difference all about pre-modern vs. modern drama?
We will discuss this further, of course, since this is the kind of quantitative evidence we are looking for when researching structural styles of dramatic texts.</p>
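<p>The contrast between the two plays can be made explicit by deriving two ratios from the four rows above: chars per word and words per speech act. The following is only a rough sketch using the values from the table; the field names and the helper <code class="highlighter-rouge">ratios</code> are our own assumptions for illustration.</p>

```javascript
// Values copied from the four rows of the table above.
var rows = [
  {character: "GEORG", title: "Ignorabimus", chars: 133443, words: 20859, speechActs: 952},
  {character: "DUFROY", title: "Ignorabimus", chars: 107534, words: 16588, speechActs: 885},
  {character: "FAUST", title: "Faust I", chars: 97546, words: 9037, speechActs: 225},
  {character: "MEPHISTOPHELES", title: "Faust I", chars: 92536, words: 8408, speechActs: 257}
];

// Hypothetical helper: derives the two ratios for one row.
function ratios(row) {
  return {
    character: row.character,
    charsPerWord: row.chars / row.words,          // average word length as counted in the data
    wordsPerSpeechAct: row.words / row.speechActs // average length of an utterance in words
  };
}

var derived = rows.map(ratios);
derived.forEach(function(r) {
  console.log(r.character + ": " + r.charsPerWord.toFixed(1) + " chars/word, " +
    r.wordsPerSpeechAct.toFixed(1) + " words/speech act");
});
```

<p>With these numbers, Georg comes out at roughly 6.4 chars per word and 21.9 words per speech act, Faust at roughly 10.8 chars per word and 40.2 words per speech act. How exactly chars and words were counted in the source data is not specified here, so the ratios should be read as rough indicators only.</p>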
<p>But let’s leave the stage to the poets for now. First, we’ll have some text by our winner, Arno Holz, followed by some notorious Goethe lines:</p>
<h3 id="1124-characters-from-ignorabimus-1914">1124 characters from <em>Ignorabimus</em> (1914)</h3>
<blockquote>
<p>GEORG<br />
<em>in diesem Augenblick durch die Tür links; schlanke, nervöse Erscheinung; in ihrer ganzen Haltung den ehemaligen Offizier noch verratend; das dunkle Haar an den Schläfen bereits stark ergraut; Schnurrbart noch dunkel, die Augen hellgrau und durchdringend.</em><br />
Guten Morgen!<br />
MARIANNE<br />
<em>herzklopfend aufgestanden; ihn groß anstarrend; sie hat unwillkürlich versucht, die Blumen etwas zu verbergen.</em><br />
…<br />
GEORG<br />
<em>unruhig, dabei eine Zigarette rauchend, auf und ab; seine Sprechweise ist hastig knapp.</em><br />
Du brauchst die Dinger nicht zu verstecken! … Laßt euch nicht stören!<br />
ONKEL LUDWIG<br />
<em>die Blumen ergreifend und sie vor sich hinlegend; ruhig.</em><br />
Gib sie mir, Kind. Ich werde sie mir oben auf meine stille Stube stellen.<br />
MARIANNE<br />
<em>die sich erst jetzt etwas gefaßt hat; stockend; zu Georg.</em><br />
Hat dir der Diener … deinen Tee schon gebracht?<br />
GEORG<br />
<em>durch dessen Ton fast permanent etwas wie Unruhe, federnde Unzufriedenheit oder Gereiztheit klingt.</em><br />
Danke. Ich rauche! … Hatte nur so aus Gewohnheit geschellt. Reflexbewegung! Kann ihn wieder wegtragen. Pferdegetrappel.<br />
ONKEL LUDWIG<br />
<em>ablenkend; nach dem Garten hin.</em><br />
Eine Hitze draußen …<br />
GEORG<br />
<em>kurz; sachlich.</em><br />
Ja.</p>
</blockquote>
<h3 id="4565-characters-from-faust-der-tragödie-erster-teil-1808">4565 characters from <em>Faust. Der Tragödie erster Teil</em> (1808)</h3>
<blockquote>
<p>FAUST.<br />
Habe nun, ach! Philosophie,<br />
Juristerei und Medizin,<br />
Und leider auch Theologie<br />
Durchaus studiert, mit heißem Bemühn.<br />
Da steh’ ich nun, ich armer Tor,<br />
Und bin so klug als wie zuvor!<br />
Heiße Magister, heiße Doktor gar,<br />
Und ziehe schon an die zehen Jahr’<br />
Herauf, herab und quer und krumm<br />
Meine Schüler an der Nase herum –<br />
Und sehe, daß wir nichts wissen können!<br />
Das will mir schier das Herz verbrennen.<br />
Zwar bin ich gescheiter als alle die Laffen,<br />
Doktoren, Magister, Schreiber und Pfaffen;<br />
Mich plagen keine Skrupel noch Zweifel,<br />
Fürchte mich weder vor Hölle noch Teufel –<br />
Dafür ist mir auch alle Freud’ entrissen,<br />
Bilde mir nicht ein, was Rechts zu wissen,<br />
Bilde mir nicht ein, ich könnte was lehren,<br />
Die Menschen zu bessern und zu bekehren.<br />
Auch hab’ ich weder Gut noch Geld,<br />
Noch Ehr’ und Herrlichkeit der Welt;<br />
Es möchte kein Hund so länger leben!<br />
Drum hab’ ich mich der Magie ergeben,<br />
Ob mir durch Geistes Kraft und Mund<br />
Nicht manch Geheimnis würde kund;<br />
Daß ich nicht mehr mit sauerm Schweiß<br />
Zu sagen brauche, was ich nicht weiß;<br />
Daß ich erkenne, was die Welt<br />
Im Innersten zusammenhält,<br />
Schau’ alle Wirkenskraft und Samen,<br />
Und tu’ nicht mehr in Worten kramen.</p>
</blockquote>
<blockquote>
<p>O sähst du, voller Mondenschein,<br />
Zum letztenmal auf meine Pein,<br />
Den ich so manche Mitternacht<br />
An diesem Pult herangewacht:<br />
Dann über Büchern und Papier,<br />
Trübsel’ger Freund, erschienst du mir!<br />
Ach! könnt’ ich doch auf Bergeshöhn<br />
In deinem lieben Lichte gehn,<br />
Um Bergeshöhle mit Geistern schweben,<br />
Auf Wiesen in deinem Dämmer weben,<br />
Von allem Wissensqualm entladen,<br />
In deinem Tau gesund mich baden!</p>
</blockquote>
<blockquote>
<p>Weh! steck’ ich in dem Kerker noch?<br />
Verfluchtes dumpfes Mauerloch,<br />
Wo selbst das liebe Himmelslicht<br />
Trüb durch gemalte Scheiben bricht!<br />
Beschränkt von diesem Bücherhauf,<br />
Den Würme nagen, Staub bedeckt,<br />
Den, bis ans hohe Gewölb’ hinauf,<br />
Ein angeraucht Papier umsteckt;<br />
Mit Gläsern, Büchsen rings umstellt,<br />
Mit Instrumenten vollgepfropft,<br />
Urväter-Hausrat drein gestopft –<br />
Das ist deine Welt! das heißt eine Welt!</p>
</blockquote>
<blockquote>
<p>Und fragst du noch, warum dein Herz<br />
Sich bang in deinem Busen klemmt?<br />
Warum ein unerklärter Schmerz<br />
Dir alle Lebensregung hemmt?<br />
Statt der lebendigen Natur,<br />
Da Gott die Menschen schuf hinein,<br />
Umgibt in Rauch und Moder nur<br />
Dich Tiergeripp’ und Totenbein.</p>
</blockquote>
<blockquote>
<p>Flieh! auf! hinaus ins weite Land!<br />
Und dies geheimnisvolle Buch,<br />
Von Nostradamus’ eigner Hand,<br />
Ist dir es nicht Geleit genug?<br />
Erkennest dann der Sterne Lauf,<br />
Und wenn Natur dich unterweist,<br />
Dann geht die Seelenkraft dir auf,<br />
Wie spricht ein Geist zum andern Geist.<br />
Umsonst, daß trocknes Sinnen hier<br />
Die heil’gen Zeichen dir erklärt:<br />
Ihr schwebt, ihr Geister, neben mir;<br />
Antwortet mir, wenn ihr mich hört!</p>
</blockquote>
<blockquote>
<p><em>Er schlägt das Buch auf und erblickt das Zeichen des Makrokosmus.</em></p>
</blockquote>
<blockquote>
<p>Ha! welche Wonne fließt in diesem Blick<br />
Auf einmal mir durch alle meine Sinnen!<br />
Ich fühle junges, heil’ges Lebensglück<br />
Neuglühend mir durch Nerv’ und Adern rinnen.<br />
War es ein Gott, der diese Zeichen schrieb,<br />
Die mir das innre Toben stillen,<br />
Das arme Herz mit Freude füllen<br />
Und mit geheimnisvollem Trieb<br />
Die Kräfte der Natur rings um mich her enthüllen?<br />
Bin ich ein Gott? Mir wird so licht!<br />
Ich schau’ in diesen reinen Zügen<br />
Die wirkende Natur vor meiner Seele liegen.<br />
Jetzt erst erkenn’ ich, was der Weise spricht:<br />
›Die Geisterwelt ist nicht verschlossen;<br />
Dein Sinn ist zu, dein Herz ist tot!<br />
Auf, bade, Schüler, unverdrossen<br />
Die ird’sche Brust im Morgenrot!‹</p>
</blockquote>
<blockquote>
<p><em>Er beschaut das Zeichen.</em></p>
</blockquote>
<blockquote>
<p>Wie alles sich zum Ganzen webt,<br />
Eins in dem andern wirkt und lebt!<br />
Wie Himmelskräfte auf und nieder steigen<br />
Und sich die goldnen Eimer reichen!<br />
Mit segenduftenden Schwingen<br />
Vom Himmel durch die Erde dringen,<br />
Harmonisch all das All durchklingen!</p>
</blockquote>
<blockquote>
<p>Welch Schauspiel! Aber ach! ein Schauspiel nur!<br />
Wo fass’ ich dich, unendliche Natur?<br />
Euch Brüste, wo? Ihr Quellen alles Lebens,<br />
An denen Himmel und Erde hängt,<br />
Dahin die welke Brust sich drängt –<br />
Ihr quellt, ihr tränkt, und schmacht’ ich so vergebens?</p>
</blockquote>
<blockquote>
<p><em>Er schlägt unwillig das Buch um und erblickt das Zeichen des Erdgeistes.</em></p>
</blockquote>
<blockquote>
<p>Wie anders wirkt dies Zeichen auf mich ein!<br />
Du, Geist der Erde, bist mir näher;<br />
Schon fühl’ ich meine Kräfte höher,<br />
Schon glüh’ ich wie von neuem Wein,<br />
Ich fühle Mut, mich in die Welt zu wagen,<br />
Der Erde Weh, der Erde Glück zu tragen,<br />
Mit Stürmen mich herumzuschlagen<br />
Und in des Schiffbruchs Knirschen nicht zu zagen.<br />
Es wölkt sich über mir –<br />
Der Mond verbirgt sein Licht –<br />
Die Lampe schwindet!<br />
Es dampft – Es zucken rote Strahlen<br />
Mir um das Haupt – Es weht<br />
Ein Schauer vom Gewölb’ herab<br />
Und faßt mich an!<br />
Ich fühl’s, du schwebst um mich, erflehter Geist.<br />
Enthülle dich!<br />
Ha! wie’s in meinem Herzen reißt!<br />
Zu neuen Gefühlen<br />
All’ meine Sinnen sich erwühlen!<br />
Ich fühle ganz mein Herz dir hingegeben!<br />
Du mußt! du mußt! und kostet’ es mein Leben!</p>
</blockquote>
<p><a href="https://dlina.github.io/The-Biggest-Chatterbox-in-German-Literature/">The Biggest Chatterbox in German Literature</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on June 23, 2015.</p>
<p>https://dlina.github.io/Editing-Rules | 2015-06-22T00:00:00+02:00 | 2015-06-22T00:00:00-00:00 | Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke | https://dlina.github.io</p>
<h2 id="introduction">Introduction</h2>
<p>After the structural data have been extracted and put into the <a href="/Introducing-Our-Zwischenformat/">DLINA zwischenformat</a>, manual intervention is often necessary to improve the data quality and correct errors in the source data. The TextGrid data in particular proved to be quite problematic due to OCR errors and incorrect tagging.</p>
<p>Some of the “external” problems we encountered, that is, problems not inherent to the text itself but introduced when it was converted (automatically or manually) to a machine-readable format and marked up, are:</p>
<ul>
<li>no or insufficient structural data encoded,</li>
<li>OCR errors in <code class="highlighter-rouge"><speaker></code> names (strings),</li>
<li>stage directions interpreted as part of a speaker’s name.</li>
</ul>
<p>Additionally, there are a few “internal” phenomena – i.e. characteristics typical of a play – that have to be taken into account:</p>
<ul>
<li>different ways of referring to a person – e.g., the full name might be given on the first appearance and only the first name on further appearances,</li>
<li>collectives or groups of speakers, e.g., “Alle” (all), “Einige” (some), “Andere” (others),</li>
<li>indeterminate speakers, e.g., “Ein Diener” (a servant), “Erster Ritter” (first knight) which might refer to different characters throughout a play.</li>
</ul>
<p>In order to get around these problems, we had to manually edit the DLINA data files. We established a fixed set of rules (see below) to cover the most common problems and added comments to the data files if the changes involved non-trivial interpretation.</p>
<h2 id="rules-for-editing-our-zwischenformat-dlina-data-files">Rules for editing our <em>zwischenformat</em> (DLINA data files)</h2>
<ul>
<li>Rule 1 – Add the schema files as a PI</li>
<li>Rule 2 – Edit the metadata header</li>
<li>Rule 3 – Identification of characters</li>
<li>Rule 4 – Multiple speakers (explicit)</li>
<li>Rule 5 – Multiple speakers (implicit)</li>
<li>Rule 6 – Multiple speakers (collective)</li>
<li>Rule 7 – Same day, different shit</li>
<li>Rule 8 – Collectives as part of a collective</li>
</ul>
<h2 id="rule-1-add-the-schema-files-as-a-processing-instruction--example">Rule 1: Add the schema files as a Processing Instruction – example</h2>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="cp"><?xml version="1.0" encoding="UTF-8"?></span>
<span class="cp"><?xml-model href="http://raw.githubusercontent.com/DLiNa/project/master/rules/lina.rnc"?></span>
<span class="cp"><?xml-model href="http://raw.githubusercontent.com/DLiNa/project/master/rules/lina.sch"?></span></code></pre></figure>
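<p>A minimal Python sketch of how Rule 1 could be applied to a batch of files. The helper name <code class="highlighter-rouge">add_schema_pis</code> is hypothetical, and treating the file content as plain text instead of using an XML parser is an assumption made for brevity; the two schema URLs are the ones from the example above.</p>

```python
# Schema locations taken from the Rule 1 example.
SCHEMAS = [
    "http://raw.githubusercontent.com/DLiNa/project/master/rules/lina.rnc",
    "http://raw.githubusercontent.com/DLiNa/project/master/rules/lina.sch",
]


def add_schema_pis(xml_text, schemas=SCHEMAS):
    """Insert one xml-model processing instruction per schema (hypothetical helper).

    Schemas whose URL already occurs in the text are skipped, so running
    the function twice does not duplicate the instructions.
    """
    pis = "".join(
        '<?xml-model href="{}"?>\n'.format(href)
        for href in schemas
        if href not in xml_text
    )
    if xml_text.startswith("<?xml "):
        # Keep the XML declaration first and put the instructions right after it.
        declaration, end, rest = xml_text.partition("?>")
        return declaration + end + "\n" + pis + rest.lstrip("\n")
    return pis + xml_text


document = '<?xml version="1.0" encoding="UTF-8"?>\n<play/>'
with_schemas = add_schema_pis(document)
```

<p>The check on <code class="highlighter-rouge">"&lt;?xml "</code> (with a trailing space) keeps the declaration apart from an existing <code class="highlighter-rouge">xml-model</code> instruction at the start of a file.</p>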
<h2 id="rule-2-edit-the-metadata-header--example">Rule 2: Edit the metadata header – example</h2>
<p>The TextGrid sources come with incorrect and/or incomplete metadata tagging in their (usually two) <code class="highlighter-rouge"><tei:teiHeader></code> elements. This information has to be brought into a consistent state, and crucial information has to be added. This usually means:</p>
<ul>
<li>removing surplus <code class="highlighter-rouge"><title></code> tags,</li>
<li>if applicable, adding <code class="highlighter-rouge"><subtitle></code> and <code class="highlighter-rouge"><genretitle></code> (the former usually including a self-attributed genre like “Ein Trauerspiel in 5 Akten” and the latter containing the genre in a normalised way, in this case: “Trauerspiel”; to make things comparable, we’re considering adding attribute lists for the major genres),</li>
<li>adding known dates (when the play was written, first printed and premiered),</li>
<li>adding the URI of the data source(s) – in case we had to add structural information, a second <code class="highlighter-rouge"><source></code> tag is added.</li>
</ul>
<h3 id="before-editing">Before editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"> <span class="nt"><head></span>
<span class="nt"><title></span>Dramen<span class="nt"></title></span>
<span class="nt"><title></span>Gottsched, Johann Christoph<span class="nt"></title></span>
<span class="nt"><title></span>Der sterbende Cato<span class="nt"></title></span>
<span class="nt"><author></span>Gottsched, Johann Christoph<span class="nt"></author></span>
<span class="nt"><date</span> <span class="na">when=</span><span class="s">"1730"</span><span class="nt">/></span>
<span class="nt"><source></span> <span class="nt"></source></span>
<span class="nt"></head></span></code></pre></figure>
<h3 id="after-editing">After editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><header></span>
<span class="nt"><title></span>Der sterbende Cato<span class="nt"></title></span>
<span class="nt"><subtitle></span>Ein Trauerspiel<span class="nt"></subtitle></span>
<span class="nt"><genretitle></span>Trauerspiel<span class="nt"></genretitle></span>
<span class="nt"><author></span>Gottsched, Johann Christoph<span class="nt"></author></span>
<span class="nt"><date</span> <span class="na">type=</span><span class="s">"print"</span> <span class="na">when=</span><span class="s">"1732"</span><span class="nt">></span>1732<span class="nt"></date></span>
<span class="nt"><date</span> <span class="na">type=</span><span class="s">"premiere"</span> <span class="na">when=</span><span class="s">"1731"</span><span class="nt">></span>1731<span class="nt"></date></span>
<span class="nt"><date</span> <span class="na">type=</span><span class="s">"written"</span> <span class="na">when=</span><span class="s">"1730"</span><span class="nt">></span>1730<span class="nt"></date></span>
<span class="nt"><source></span>https://textgridlab.org/1.0/tgcrud-public/rest/textgrid:nks0.0/data<span class="nt"></source></span>
<span class="nt"></header></span></code></pre></figure>
<h2 id="rule-3-identification-of-characters--example1">Rule 3: Identification of characters – example 1</h2>
<p>The easiest case is two similar and easily understandable names for one character. Often, a character is introduced by a full name, possibly including a title or an article, and later referred to only by the given name or the title alone. Another possibility is a simple typo in a character’s name.
Here, we move the <code class="highlighter-rouge"><alias></code> of one <code class="highlighter-rouge"><character></code> (usually the less frequent, or the one containing a typo) to the “right” one.</p>
<h3 id="before-editing-1">Before editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><personae></span>
<span class="nt"><character></span>
<span class="nt"><name></span>ODOARDO GALOTTI<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"odoardo_galotti"</span><span class="nt">></span>
<span class="nt"><name></span>ODOARDO GALOTTI<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>ODOARDO <span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"odoardo"</span><span class="nt">></span>
<span class="nt"><name></span>ODOARDO<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"></personae></span></code></pre></figure>
<h3 id="after-editing-1">After editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><personae></span>
<span class="nt"><character></span>
<span class="nt"><name></span>ODOARDO GALOTTI<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"odoardo_galotti"</span><span class="nt">></span>
<span class="nt"><name></span>ODOARDO GALOTTI<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"odoardo"</span><span class="nt">></span>
<span class="nt"><name></span>ODOARDO<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"></personae></span></code></pre></figure>
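<p>The merge can also be scripted. The following is a minimal Python sketch, not the tooling we actually used: the function names are hypothetical, and it assumes un-namespaced elements exactly as printed in the examples (real files referring to <code class="highlighter-rouge">lina:character</code> would need the namespace added to the tag names).</p>

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under this expanded name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

BEFORE = """<personae>
  <character>
    <name>ODOARDO GALOTTI</name>
    <alias xml:id="odoardo_galotti"><name>ODOARDO GALOTTI</name></alias>
  </character>
  <character>
    <name>ODOARDO </name>
    <alias xml:id="odoardo"><name>ODOARDO</name></alias>
  </character>
</personae>"""


def find_character(personae, alias_id):
    """Return the <character> owning the <alias> with the given xml:id."""
    for character in personae.findall("character"):
        for alias in character.findall("alias"):
            if alias.get(XML_ID) == alias_id:
                return character
    raise KeyError(alias_id)


def merge_characters(personae, keep_id, surplus_id):
    """Move every <alias> of the surplus character to the one that is kept."""
    keep = find_character(personae, keep_id)
    surplus = find_character(personae, surplus_id)
    if keep is surplus:
        return  # both aliases already belong to the same character
    for alias in surplus.findall("alias"):
        keep.append(alias)
    personae.remove(surplus)


personae = ET.fromstring(BEFORE)
merge_characters(personae, "odoardo_galotti", "odoardo")
merged_ids = [alias.get(XML_ID) for alias in personae.iter("alias")]
```

<p>Which of the two characters is kept remains an editorial decision; the sketch only performs the move once that decision has been made.</p>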
<h2 id="rule-3-identification-of-characters--example2">Rule 3: Identification of characters – example 2</h2>
<p>A second, less obvious possibility is that a character is not visible on stage but its voice can be heard. In these cases, we add an <code class="highlighter-rouge"><alias></code> to the <code class="highlighter-rouge">lina:character</code> and give it the attribute <code class="highlighter-rouge">@type="voiceOf"</code>.
The attribute makes it possible to differentiate later between a character actually on stage and one merely heard.</p>
<h3 id="before-editing-2">Before editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><personae></span>
<span class="nt"><character></span>
<span class="nt"><name></span>MARIANE<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"mariane"</span><span class="nt">></span>
<span class="nt"><name></span>MARIANE<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span> MARIANENS STIMME <span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"marianens_stimme"</span><span class="nt">></span>
<span class="nt"><name></span>MARIANENS STIMME<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"></personae></span></code></pre></figure>
<h3 id="after-editing-2">After editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><personae></span>
<span class="nt"><character></span>
<span class="nt"><name></span>MARIANE<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"mariane"</span><span class="nt">></span>
<span class="nt"><name></span>MARIANE<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"marianens_stimme"</span> <span class="na">type=</span><span class="s">"voiceOf"</span><span class="nt">></span>
<span class="nt"><name></span>MARIANENS STIMME<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"></personae></span></code></pre></figure>
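<p>To illustrate what the attribute is good for, here is a small Python sketch that separates on-stage aliases from voice-only ones. The function name <code class="highlighter-rouge">split_aliases</code> is hypothetical, and the element names are assumed to be un-namespaced as in the example.</p>

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under this expanded name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

PERSONAE = """<personae>
  <character>
    <name>MARIANE</name>
    <alias xml:id="mariane"><name>MARIANE</name></alias>
    <alias xml:id="marianens_stimme" type="voiceOf"><name>MARIANENS STIMME</name></alias>
  </character>
</personae>"""


def split_aliases(character):
    """Return (on_stage_ids, voice_only_ids) for one <character>."""
    on_stage, voice_only = [], []
    for alias in character.findall("alias"):
        # Aliases marked type="voiceOf" are heard but not seen.
        target = voice_only if alias.get("type") == "voiceOf" else on_stage
        target.append(alias.get(XML_ID))
    return on_stage, voice_only


character = ET.fromstring(PERSONAE).find("character")
on_stage, voice_only = split_aliases(character)
```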
<h2 id="rule-4-multiple-speakers-explicit--example">Rule 4: Multiple speakers (explicit) – example</h2>
<p>A common “internal” phenomenon of plays is two or more characters speaking at the same time. In the easy cases they are explicitly named, separated by a comma or a conjunction like “und”/“and”. In these cases, we split the <code class="highlighter-rouge">@who</code> of the <code class="highlighter-rouge">//lina:text//lina:sp</code> into its constituents, removing any comma or conjunction. Additionally, the <code class="highlighter-rouge">lina:character</code> for the combined name is deleted from <code class="highlighter-rouge">lina:personae</code>.</p>
<h3 id="before-editing-3">Before editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><text></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#madame_welldorf_und_luise"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"5"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"21"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"></text></span></code></pre></figure>
<h3 id="after-editing-3">After editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><text></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#madame_welldorf #luise"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"5"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"21"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"></text></span></code></pre></figure>
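<p>A hedged Python sketch of the split: it assumes that the conjunction survives in the generated ID as <code class="highlighter-rouge">_und_</code> or <code class="highlighter-rouge">_and_</code>, as in the example, and it only rewrites <code class="highlighter-rouge">@who</code> when every part is a known character ID. The function name and the <code class="highlighter-rouge">known_ids</code> parameter are hypothetical; deleting the combined character from the cast list is left out.</p>

```python
import re
import xml.etree.ElementTree as ET

# Assumed form of the conjunction inside a generated ID.
SEPARATOR = re.compile(r"_(?:und|and)_")


def split_explicit_speakers(sp, known_ids):
    """Rewrite who="#a_und_b" to who="#a #b" if both a and b are known IDs.

    Returns True if the attribute was changed, False otherwise.
    """
    who = sp.get("who", "")
    parts = SEPARATOR.split(who.lstrip("#"))
    if len(parts) > 1 and all(part in known_ids for part in parts):
        sp.set("who", " ".join("#" + part for part in parts))
        return True
    return False


sp = ET.fromstring('<sp who="#madame_welldorf_und_luise"/>')
changed = split_explicit_speakers(sp, {"madame_welldorf", "luise"})
```

<p>Checking against the known IDs guards against names that merely contain the conjunction; anything the function leaves untouched still needs a manual look.</p>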
<h2 id="rule-5-multiple-speakers-implicit--example">Rule 5: Multiple speakers (implicit) – example</h2>
<p>In the “implicit” case, the speakers are not named individually but referred to by their role or some attribute they have in common.
Here, the surplus <code class="highlighter-rouge"><character></code> is deleted and the <code class="highlighter-rouge">@who</code> is expanded to contain pointers to all the individual characters.</p>
<h3 id="before-editing-4">Before editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><personae></span>
<span class="nt"><character></span>
<span class="nt"><name></span>ERSTE MAGD<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"erste_magd"</span><span class="nt">></span>
<span class="nt"><name></span>ERSTE MAGD<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>ZWEITE MAGD<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"zweite_magd"</span><span class="nt">></span>
<span class="nt"><name></span>ZWEITE MAGD<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>DIE BEIDEN MÄGDE<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"die_beiden_mägde"</span><span class="nt">></span>
<span class="nt"><name></span>DIE BEIDEN MÄGDE<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"></personae></span>
<span class="nt"><text></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#die_beiden_mägde"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"7"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"32"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"></text></span></code></pre></figure>
<h3 id="after-editing-4">After editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><personae></span>
<span class="nt"><character></span>
<span class="nt"><name></span>ERSTE MAGD<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"erste_magd"</span><span class="nt">></span>
<span class="nt"><name></span>ERSTE MAGD<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>ZWEITE MAGD<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"zweite_magd"</span><span class="nt">></span>
<span class="nt"><name></span>ZWEITE MAGD<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"></personae></span>
<span class="nt"><text></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#erste_magd #zweite_magd"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"7"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"32"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"></text></span></code></pre></figure>
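<p>Once it is clear who belongs to the group, the expansion itself is mechanical. The following Python sketch shows one way to do it; the function name is hypothetical, the <code class="highlighter-rouge"><play></code> wrapper element is an assumption (the examples only show <code class="highlighter-rouge"><personae></code> and <code class="highlighter-rouge"><text></code>), and the list of members has to be supplied by the editor.</p>

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under this expanded name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# <play> is a hypothetical wrapper around the two sections shown in the example.
SAMPLE = """<play>
  <personae>
    <character><name>ERSTE MAGD</name>
      <alias xml:id="erste_magd"><name>ERSTE MAGD</name></alias></character>
    <character><name>ZWEITE MAGD</name>
      <alias xml:id="zweite_magd"><name>ZWEITE MAGD</name></alias></character>
    <character><name>DIE BEIDEN MÄGDE</name>
      <alias xml:id="die_beiden_mägde"><name>DIE BEIDEN MÄGDE</name></alias></character>
  </personae>
  <text>
    <sp who="#die_beiden_mägde"><amount n="1" unit="speech_acts"/></sp>
  </text>
</play>"""


def expand_collective(root, collective_id, member_ids):
    """Delete the collective <character> and point its speeches at the members."""
    personae = root.find("personae")
    for character in personae.findall("character"):
        alias_ids = [alias.get(XML_ID) for alias in character.findall("alias")]
        if collective_id in alias_ids:
            personae.remove(character)
    pointer = "#" + collective_id
    members = ["#" + member for member in member_ids]
    for sp in root.iter("sp"):
        pointers = sp.get("who", "").split()
        if pointer in pointers:
            expanded = []
            for p in pointers:
                expanded.extend(members if p == pointer else [p])
            sp.set("who", " ".join(expanded))


root = ET.fromstring(SAMPLE)
expand_collective(root, "die_beiden_mägde", ["erste_magd", "zweite_magd"])
remaining_ids = [alias.get(XML_ID) for alias in root.iter("alias")]
new_who = root.find("text/sp").get("who")
```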
<h2 id="rule-6-multiple-speakers-collective--example1">Rule 6: Multiple speakers (collective) – example 1</h2>
<p>When the speakers are not named explicitly but form an easily discernible collective, the <code class="highlighter-rouge"><character></code> for the collective name is deleted and the <code class="highlighter-rouge">@who</code> is edited to point to all characters speaking.</p>
<h3 id="before-editing-5">Before editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><personae></span>
<span class="nt"><character></span>
<span class="nt"><name></span>MADAME WELLDORF<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"madame_welldorf"</span><span class="nt">></span>
<span class="nt"><name></span>MADAME WELLDORF<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>LUISE<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"luise"</span><span class="nt">></span>
<span class="nt"><name></span>LUISE<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>BEIDE<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"beide"</span><span class="nt">></span>
<span class="nt"><name></span>BEIDE<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"></personae></span>
<span class="nt"><text></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#beide"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"5"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"21"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"></text></span></code></pre></figure>
<h3 id="after-editing-5">After editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><personae></span>
<span class="nt"><character></span>
<span class="nt"><name></span>MADAME WELLDORF<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"madame_welldorf"</span><span class="nt">></span>
<span class="nt"><name></span>MADAME WELLDORF<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>LUISE<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"luise"</span><span class="nt">></span>
<span class="nt"><name></span>LUISE<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"></personae></span>
<span class="nt"><text></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#madame_welldorf #luise"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"5"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"21"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"></text></span></code></pre></figure>
<h2 id="rule-6-multiple-speakers-collective--example2">Rule 6: Multiple speakers (collective) – example 2</h2>
<p>Often, multiple speakers are not named explicitly; instead, a collective reference is given, e.g., “Einige” (“some”), “Alle” (“all”), “the Borg”, etc.
In these cases it is often necessary to resort to close reading to discern who is actually meant. Usually, we add a <code class="highlighter-rouge"><change></code> to the <code class="highlighter-rouge"><documentation></code> section if the expansion to explicit names is not obvious or requires lengthy close reading or a lot of interpretation.</p>
<h3 id="before-editing-6">Before editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><div></span>
<span class="nt"><head></span>1. Akt<span class="nt"></head></span>
<span class="nt"><div></span>
<span class="nt"><head></span>Erster Akt<span class="nt"></head></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#mana"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"12"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"177"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"10"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"966"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#sora"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"18"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"193"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"15"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"972"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#feria"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"14"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"168"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"11"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"958"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#lato"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"13"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"88"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"13"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"439"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#alle"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"3"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"9"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"3"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"41"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#andrason"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"39"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1428"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"23"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"7989"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#mela"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"6"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"38"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"6"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"228"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"></div></span>
<span class="nt"></div></span></code></pre></figure>
<p>Inspecting the speech act and the stage direction:</p>
<blockquote>
<p><em>Andrason kommt.</em><br />
FERIA.<br />
Sei uns willkommen! herzlich willkommen!<br />
ALLE.<br />
Willkommen!<br />
ANDRASON.<br />
Ich umarme dich, meine Schwester! Ich grüße euch, meine Kinder!
Eure Freude macht mich glücklich, eure Liebe tröstet mich.<br /></p>
</blockquote>
<h3 id="after-editing-6">After editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><div></span>
<span class="nt"><head></span>1. Akt<span class="nt"></head></span>
<span class="nt"><div></span>
<span class="nt"><head></span>Erster Akt<span class="nt"></head></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#mana"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"12"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"177"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"10"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"966"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#sora"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"18"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"193"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"15"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"972"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#feria"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"14"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"168"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"11"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"958"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#lato"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"13"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"88"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"13"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"439"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#mana #sora #feria #lato #mela"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"3"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"9"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"3"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"41"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#andrason"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"39"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1428"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"23"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"7989"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#mela"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"6"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"38"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"6"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"228"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"></div></span>
<span class="nt"></div></span></code></pre></figure>
<h2 id="rule-7-same-name-for-different-characters--example">Rule 7: Same name for different characters – example</h2>
<p>Sometimes, two different characters are referred to by the same name, e.g., a servant to the president and a servant to the prince are both named “servant”.
Here, it is necessary to add a <code class="highlighter-rouge"><character></code> for the second individual, give both characters an easily recognisable name and ID, and edit the <code class="highlighter-rouge">@who</code> attributes to reflect which of the two each speech refers to.</p>
<h3 id="before-editing-7">Before editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><personae></span>
<span class="nt"><character></span>
<span class="nt"><name></span>EIN KAMMERDIENER<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"ein_kammerdiener"</span><span class="nt">></span>
<span class="nt"><name></span>EIN KAMMERDIENER<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>KAMMERDIENER<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"kammerdiener"</span><span class="nt">></span>
<span class="nt"><name></span>KAMMERDIENER<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"></personae></span></code></pre></figure>
<p>Inspecting speech acts and stage directions</p>
<blockquote>
<p><strong>1. Akt</strong><br />
<strong>Fünfte Szene</strong><br />
[…]<br />
PRÄSIDENT.<br />
Zwar du bist mir gewiß. Ich halte dich an deiner eigenen Schurkerei, wie den Schröter am Faden!<br />
EIN KAMMERDIENER<br />
<em>tritt herein.</em><br />
Hofmarschall von Kalb –<br />
PRÄSIDENT.<br />
Kommt, wie gerufen. – Er soll mir angenehm sein.<br />
<em>Kammerdiener geht.</em></p>
</blockquote>
<blockquote>
<p><strong>2. Akt</strong><br />
<strong>Zweite Szene</strong><br />
<em>Ein alter Kammerdiener des Fürsten, der ein Schmuckkästchen trägt.</em><br />
[…]<br />
KAMMERDIENER.<br />
Seine Durchlaucht der Herzog empfehlen sich Mylady zu Gnaden, und
schicken Ihnen diese Brillanten zur Hochzeit. Sie kommen soeben erst aus Venedig.</p>
</blockquote>
<h3 id="after-editing-7">After editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><personae></span>
<span class="nt"><character></span>
<span class="nt"><name></span>EIN KAMMERDIENER (PRÄSIDENT)<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"ein_kammerdiener_präsident"</span><span class="nt">></span>
<span class="nt"><name></span>EIN KAMMERDIENER<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>EIN KAMMERDIENER (FÜRST)<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"kammerdiener_fürst"</span><span class="nt">></span>
<span class="nt"><name></span>EIN KAMMERDIENER (FÜRST)<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"></personae></span></code></pre></figure>
<h2 id="rule-8-collectives-as-part-of-a-collective--example">Rule 8: Collectives as part of a collective – example</h2>
<p>Especially in dramas with several large crowds, subdivisions of these crowds take action and speak out while there is no explicit reference to who is actually part of this subdivision (no Six-of-Twelve here). Usually, these groups include none of the major characters and the utterances – while important for the atmosphere of the setting – are quite short.
Here, we decided not to partition the collective but to build it up: “Some of the crowd”, “Others of the crowd”, etc. are each treated as an <code class="highlighter-rouge"><alias></code> of the larger collective’s <code class="highlighter-rouge"><character></code>.</p>
<h3 id="before-editing-8">Before editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><personae></span>
<span class="nt"><character></span>
<span class="nt"><name></span>DAS VOLK<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"das_volk"</span><span class="nt">></span>
<span class="nt"><name></span>DAS VOLK<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>DAS GANZE VOLK<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"das_ganze_volk"</span><span class="nt">></span>
<span class="nt"><name></span>DAS GANZE VOLK<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>EINIGE VOM VOLK<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"einige_vom_volk"</span><span class="nt">></span>
<span class="nt"><name></span>EINIGE VOM VOLK<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>STIMMEN AUS DEM VOLK<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"stimmen_aus_dem_volk"</span><span class="nt">></span>
<span class="nt"><name></span>STIMMEN AUS DEM VOLK<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"></personae></span></code></pre></figure>
<h3 id="after-editing-8">After editing</h3>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><character></span>
<span class="nt"><name></span>DAS VOLK<span class="nt"><name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"das_volk"</span><span class="nt">></span>
<span class="nt"><name></span>DAS VOLK<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"das_ganze_volk"</span><span class="nt">></span>
<span class="nt"><name></span>DAS GANZE VOLK<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"stimmen_aus_dem_volk"</span><span class="nt">></span>
<span class="nt"><name></span>STIMMEN AUS DEM VOLK<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"einige_vom_volk"</span><span class="nt">></span>
<span class="nt"><name></span>EINIGE VOM VOLK<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span></code></pre></figure>
<h2 id="conclusion-and-caveat">Conclusion and caveat</h2>
<p>Using these rules, we were able to work around most of the problems. The resulting data are much more consistent than what we started out with.
But one always has to bear in mind that improving the data is still limited by some constraints of the source texts:</p>
<ul>
<li>We had to assume that the structure given in the source files was generally correct. In a few cases where the results were grossly wrong, we manually added the missing information to the sources, as with Goethe’s “Götz von Berlichingen”, where no scenes were tagged.</li>
<li>Characters that are not tagged as a <code class="highlighter-rouge"><speaker></code> will not be recognised. If two speakers speak collectively and are tagged <code class="highlighter-rouge"><sp>Kolja und Mitja</sp></code> in the source, the script will correctly recognise both speakers. However, there are instances of incorrect tagging where only one speaker is tagged (and the other might “disappear” into a stage direction). In these cases, the second speaker will not be recognised and thus not be present in the <em>zwischenformat</em> data. Usually, it is impossible to recognise these errors at first glance.</li>
<li>Stage directions might be tagged as parts of a speech, and vice versa. This will result in erroneous amounts in the <em>zwischenformat’s</em> <code class="highlighter-rouge"><lina:sp></code>. Our worst case is a missing speaker, for example if all utterances of a character were falsely tagged as stage directions.</li>
</ul>
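<p>A rough sanity check for one class of these problems, speeches without a usable <code class="highlighter-rouge"><speaker></code>, might look like the following Python sketch. It assumes TEI P5 sources in the standard TEI namespace; the sample fragment and the function name are invented for the example. A speaker who has “disappeared” into a stage direction of an otherwise well-formed speech would not be caught by it.</p>

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical source fragment: the second speech lacks a <speaker>.
SAMPLE = """
<body xmlns="http://www.tei-c.org/ns/1.0">
  <sp><speaker>KOLJA.</speaker><p>Guten Tag.</p></sp>
  <sp><stage>Mitja tritt ein.</stage><p>Guten Abend.</p></sp>
</body>
"""


def speeches_without_speaker(root):
    """Return all <sp> elements that have no non-empty <speaker> child."""
    suspicious = []
    for sp in root.iter(TEI + "sp"):
        speaker = sp.find(TEI + "speaker")
        if speaker is None or not "".join(speaker.itertext()).strip():
            suspicious.append(sp)
    return suspicious


flagged = speeches_without_speaker(ET.fromstring(SAMPLE))
print(len(flagged))
```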
<p><a href="https://dlina.github.io/Editing-Rules/">Editing Rules</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on June 22, 2015.</p>https://dlina.github.io/Introducing-Our-Zwischenformat2015-06-21T00:00:00+02:002015-06-21T00:00:00-00:00Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io
<p>Our research interest focuses primarily on <em>structural aspects</em> of dramatic texts. The structural data is extracted from <a href="/Introducing-DLINA-Corpus-15-07-Codename-Sydney/">the 465 dramatic texts that constitute our Sydney corpus</a> and then screened and edited before it can be evaluated statistically with regard to literary history.</p>
<p>The structural abstraction is provided by a PHP script that processes the TEI files, collects all the data needed for our purpose and puts it in our own <em>zwischenformat</em> (roughly translates as ‘intermediary format’, the DLINA data format we developed for this project and announced <a href="/Introducing-DLINA-Corpus-15-07-Codename-Sydney/">in our previous post</a>). The script and what it produces, our <em>zwischenformat</em>, represent a structure-oriented form of data mining, so to speak.</p>
<p>Let’s assume that the basic structure of a drama looks as follows (without paratexts):</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><segment></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#speaker1"</span><span class="nt">></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#speaker2"</span><span class="nt">></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#speaker3"</span><span class="nt">></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#speaker1"</span><span class="nt">></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#speaker3"</span><span class="nt">></sp></span>
...
<span class="nt"></segment></span>
<span class="nt"><segment></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#speaker4"</span><span class="nt">></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#speaker2"</span><span class="nt">></sp></span>
...
<span class="nt"></segment></span>
...</code></pre></figure>
<p>The <code class="highlighter-rouge"><segment></code>s represent the predefined structures of a drama: acts and scenes. Our script will extract the structure of segments and speakers from the full-text TEI files and write it into our <em>zwischenformat</em>. The actual content of the speeches is disregarded and represented by the number of speech acts, words, lines, and string length (in characters) instead, each of which are summarised per occuring character identified via its <code class="highlighter-rouge">who</code> attribute. Now we’re able to see at a glance how many words each character is contributing to a play, and we’re able to do that for the whole Sydney corpus. Stay tuned for a post on the greatest chatterboxes in German literature, hehe!</p>
<p>Anyhow, the result looks something like this:</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><text></span>
<span class="nt"><div></span>
<span class="nt"><head></span>Vierte Szene<span class="nt"></head></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#ferdinand"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"7"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"481"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"2"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"2585"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#luise"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"7"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"208"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"3"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"1057"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"></div></span>
<span class="nt"></text></span></code></pre></figure>
<p>The representation of drama structure (segmentations, speakers) is at the core of our <em>zwischenformat</em>. But it does even more. It captures metadata and it creates complete cast lists for each drama by making use of the <code class="highlighter-rouge">who</code> attributes.</p>
<p>Our <em>zwischenformat</em> consists of three main parts (each of which is required):</p>
<ul>
<li><code class="highlighter-rouge"><header></code> (the metadata)</li>
<li><code class="highlighter-rouge"><personae></code> (a cast list created by help of all <code class="highlighter-rouge">who</code> attributes)</li>
<li><code class="highlighter-rouge"><text></code> (drama segmentation and speakers)</li>
</ul>
<p>Plus, there is also an optional part:</p>
<ul>
<li><code class="highlighter-rouge"><documentation></code> (for documenting non-trivial editing decisions)</li>
</ul>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><documentation></span>
<span class="nt"><change</span> <span class="na">n=</span><span class="s">"1"</span> <span class="na">type=</span><span class="s">"expandCollective"</span> <span class="na">who=</span><span class="s">"peertrilcke"</span><span class="nt">></span>
<span class="nt"><path></span>/play/text[1]/div[4]/div[2]/div[1]<span class="nt"></path></span>
<span class="nt"><orig></span>#die_abziehenden<span class="nt"></orig></span>
<span class="nt"><corr></span>#fritz_kleinmichel #berta #kämpe #frau_piepenbrink #bellmaus #bolz #piepenbrink<span class="nt"></corr></span>
<span class="nt"><comment></span>Siehe Text: "Fritz Kleinmichel mit seiner Braut, Kämpe mit Kleinmichel, Frau Piepenbrink mit Bellmaus, zuletzt Bolz mit Piepenbrink"; "Braut" i.e. Berta<span class="nt"></comment></span>
<span class="nt"></change></span>
<span class="nt"></documentation></span></code></pre></figure>
<p>A complete yet very short and simple one-act drama would be represented like this by our <em>zwischenformat</em>:</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="cp"><?xml version="1.0" encoding="UTF-8"?></span>
<span class="cp"><?xml-model href="http://raw.githubusercontent.com/DLiNa/project/master/rules/lina.rnc"?></span>
<span class="cp"><?xml-model href="http://raw.githubusercontent.com/DLiNa/project/master/rules/lina.sch"?></span>
<span class="nt"><play</span> <span class="na">xmlns=</span><span class="s">"http://lina.digital"</span><span class="nt">></span>
<span class="nt"><header></span>
<span class="nt"><title></span>Die Urgrossmutter<span class="nt"></title></span>
<span class="nt"><subtitle></span>Eine Tragi-Komödie in einem Aufzuge<span class="nt"></subtitle></span>
<span class="nt"><genretitle></genretitle></span>
<span class="nt"><author></span>Scheerbart, Paul<span class="nt"></author></span>
<span class="nt"><date</span> <span class="na">type=</span><span class="s">"print"</span> <span class="na">when=</span><span class="s">"1904"</span> <span class="nt">/></span>
<span class="nt"><date</span> <span class="na">type=</span><span class="s">"premiere"</span> <span class="nt">/></span>
<span class="nt"><date</span> <span class="na">type=</span><span class="s">"written"</span> <span class="nt">/></span>
<span class="nt"><source></span>https://textgridlab.org/1.0/tgcrud-public/rest/textgrid:tv6f.0/data<span class="nt"></source></span>
<span class="nt"></header></span>
<span class="nt"><personae></span>
<span class="nt"><character></span>
<span class="nt"><name></span>URGROSSMUTTER<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"urgrossmutter"</span><span class="nt">></span>
<span class="nt"><name></span>URGROSSMUTTER<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>MANELLA<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"manella"</span><span class="nt">></span>
<span class="nt"><name></span>MANELLA<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"><character></span>
<span class="nt"><name></span>CONSTANTIN<span class="nt"></name></span>
<span class="nt"><alias</span> <span class="na">xml:id=</span><span class="s">"constantin"</span><span class="nt">></span>
<span class="nt"><name></span>CONSTANTIN<span class="nt"></name></span>
<span class="nt"></alias></span>
<span class="nt"></character></span>
<span class="nt"></personae></span>
<span class="nt"><text></span>
<span class="nt"><div></span>
<span class="nt"><head></span>Personen<span class="nt"></head></span>
<span class="nt"></div></span>
<span class="nt"><div></span>
<span class="nt"><head></span>[Stücktext]<span class="nt"></head></span>
<span class="nt"><div></span>
<span class="nt"><head></span>[Stücktext]<span class="nt"></head></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#urgrossmutter"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"17"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"497"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"7"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"2795"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#manella"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"3"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"22"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"3"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"154"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"><sp</span> <span class="na">who=</span><span class="s">"#constantin"</span><span class="nt">></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"13"</span> <span class="na">unit=</span><span class="s">"speech_acts"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"154"</span> <span class="na">unit=</span><span class="s">"words"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"10"</span> <span class="na">unit=</span><span class="s">"lines"</span><span class="nt">/></span>
<span class="nt"><amount</span> <span class="na">n=</span><span class="s">"948"</span> <span class="na">unit=</span><span class="s">"chars"</span><span class="nt">/></span>
<span class="nt"></sp></span>
<span class="nt"></div></span>
<span class="nt"></div></span>
<span class="nt"></text></span>
<span class="nt"></play></span></code></pre></figure>
<p>The <em>zwischenformat</em> is validated against:</p>
<ul>
<li><a href="http://raw.githubusercontent.com/dlina/project/master/rules/lina.rnc">http://raw.githubusercontent.com/dlina/project/master/rules/lina.rnc</a></li>
<li><a href="http://raw.githubusercontent.com/dlina/project/master/rules/lina.sch">http://raw.githubusercontent.com/dlina/project/master/rules/lina.sch</a></li>
</ul>
<p>The raw <em>zwischenformat</em> versions of our Sydney corpus can be found here (i.e., the 465 files extracted from the TextGrid Repository before we started editing them):</p>
<ul>
<li><a href="https://github.com/dlina/project/tree/master/data/zwischenformat/raw_lina_data">https://github.com/dlina/project/tree/master/data/zwischenformat/raw_lina_data</a></li>
</ul>
<p>The edited <em>zwischenformat</em> files can be found here (this is the deluxe version of our corpus, so to speak, the basis for all further analyses and visualisations; our editing rules will be published at a later point):</p>
<ul>
<li><a href="https://github.com/dlina/project/tree/master/data/zwischenformat">https://github.com/dlina/project/tree/master/data/zwischenformat</a></li>
</ul>
<p>And now:</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><div></span>
<span class="nt"><sp</span><span class="err">="#everybody_and_their_aunt"</span><span class="nt">></span>
<span class="nt"><p></span>Long live the zwischenformat!<span class="nt"></p></span>
<span class="nt"></sp></span>
<span class="nt"></div></span></code></pre></figure>
<p><a href="https://dlina.github.io/Introducing-Our-Zwischenformat/">Introducing Our 'Zwischenformat'</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on June 21, 2015.</p>https://dlina.github.io/Introducing-DLINA-Corpus-15-07-Codename-Sydney2015-06-20T00:00:00+02:002015-06-20T00:00:00-00:00Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io
<p>Our working corpus is based on the 666 dramas extracted from the TextGrid Repository (<a href="http://dlina.github.io/A-Not-So-Simple-Question/">the not-so simple extraction process was described by Frank and Mathias in an earlier post</a>). This blog post will describe the criteria for selecting <strong>465 dramas</strong> from said repository to represent our working corpus. The version number <strong>15.07</strong> is referring to ‘July 2015’ as we’re going to present our results at the DH2015 conference on July 2, 2015. Further versions of the DLINA Corpus will receive according versioning numbers. As the imminent reason for needing a reliable corpus with clean data is the upcoming conference in Sydney, it was also very easy to pick a codename for the corpus.</p>
<p>Anyway, in order to build our corpus for Sydney we started with a quick survey and picked out 497 plays that seemed suitable. I.e., we ruled out 169 of the TextGrid plays by following these assumptions:</p>
<ul>
<li>
<p>Our corpus should be limited to a specific time span: We will start with the German Enlightenment drama, focussing on the modernisation of the German drama in the first half of the 18th century, a process associated with the name of <a href="https://en.wikipedia.org/wiki/Johann_Christoph_Gottsched">Johann Christoph Gottsched</a>. It is a well-established academic position that Gottsched’s dramatic writings as well as his dramatic theory mark a turning point in the history of German drama (see, for example, Rochow 1994, Catholy 1982, Koopmann 1979). Therefore, we ruled out 147 dramas that saw the light of day <em>before</em> Gottsched’s <em>Der sterbende Cato</em> (printed in 1732).</p>
</li>
<li>
<p>We also discarded:</p>
<ul>
<li>foreign-language originals,</li>
<li>translations,</li>
<li>mere pantomime plays, that is to say, plays that don’t feature <code class="highlighter-rouge"><sp></code>eech elements,</li>
<li>fragments, i.e., texts that were clearly left unfinished by their author.</li>
</ul>
</li>
</ul>
<p>While we were editing our data using our very own <em>zwischenformat</em> (roughly translatable as “intermediate format”; a corresponding blog post will be published shortly), we removed another 32 texts for the following reasons:</p>
<ul>
<li>if the TEI markup was too defective (missing <code class="highlighter-rouge"><speaker></code> elements and such),</li>
<li>if additional texts turned out to be fragments that had escaped our attention before,</li>
<li>if the structure of a text proved to be too complicated (the treatment of 11 dramas had to be postponed for this reason).</li>
</ul>
<p>All in all, our <strong>DLINA Corpus 15.07 (Codename: Sydney)</strong> comprises <strong>465 dramatic texts</strong>, in the shape of 465 XML <em>zwischenformat</em> files.</p>
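<p>The selection arithmetic can be checked in a few lines of Python. All figures are taken from this post; the only derived number is the count of plays excluded in the survey for reasons other than their date, which assumes that the 169 exclusions consist solely of the two groups listed above.</p>

```python
TEXTGRID_PLAYS = 666          # dramas extracted from the TextGrid Repository
EXCLUDED_IN_SURVEY = 169      # ruled out during the quick survey
PRE_GOTTSCHED = 147           # of those: published before "Der sterbende Cato"
EXCLUDED_WHILE_EDITING = 32   # removed later, while editing the data

after_survey = TEXTGRID_PLAYS - EXCLUDED_IN_SURVEY
sydney_corpus = after_survey - EXCLUDED_WHILE_EDITING
# Derived under the assumption stated above: originals in other languages,
# translations, pantomimes and fragments.
other_survey_exclusions = EXCLUDED_IN_SURVEY - PRE_GOTTSCHED

print(after_survey, sydney_corpus, other_survey_exclusions)
# 497 465 22
```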
<h3 id="bibliography">Bibliography</h3>
<ul>
<li>Christian Rochow, <em>Das Drama hohen Stils. Aufklärung und Tragödie in Deutschland (1730–1790)</em>, Heidelberg 1994 (<a href="http://d-nb.info/940506319">DNB</a>)</li>
<li>Eckehard Catholy, <em>Das deutsche Lustspiel. Von der Aufklärung bis zur Romantik</em>, Stuttgart 1982 (<a href="http://d-nb.info/820164496">DNB</a>)</li>
<li>Helmut Koopmann, <em>Drama der Aufklärung. Kommentar zu einer Epoche</em>, München 1979 (<a href="http://d-nb.info/790381419">DNB</a>)</li>
</ul>
<p><a href="https://dlina.github.io/Introducing-DLINA-Corpus-15-07-Codename-Sydney/">Introducing DLINA Corpus 15.07 (Codename: Sydney)</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on June 20, 2015.</p>https://dlina.github.io/Working-With-Inconsistent-Metadata2015-06-19T00:00:00+02:002015-06-19T00:00:00-00:00Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io
<p>As we <u>underlined</u> before, we can’t stop celebrating the fact that there are so many literary corpora on the web today. Just a fortnight ago, Martin Müller released the <a href="https://scalablereading.northwestern.edu/2015/06/07/shakespeare-his-contemporaries-shc-released/">Shakespeare His Contemporares (SHC) collection</a>, a corpus of early English modern drama, encoded in <a href="https://github.com/TEIC/TEI-Simple">TEI Simple</a>. We will definitely look into this corpus at a later point, but today we will again be bothering you with the depths of the <a href="https://textgridrep.de/">TextGrid Repository</a>. No worries, today’s blog entry won’t be as excessive as the one we published <a href="/A-Not-So-Simple-Question/">yesterday</a>. ;)</p>
<p>If you’re trying to work with corpora you didn’t create yourself, you will always run into metadata problems. The metadata may be inconsistent, incomplete, or simply missing. Maybe the corpus builders just didn’t have the same metadata needs as you.</p>
<p>So let’s get back to our drama collection derived from the TextGrid Repository, kind of picking up our recent blog post on the <a href="/Longest-German-Language-Theatre-Plays/">top 10 longest German-language theatre plays contained in this very corpus</a>. Today we want to look at the available metadata and try to put all the hundreds of plays in chronological order by just relying on the (inconsistent) metadata provided in the documents.</p>
<p>There are many purposes for doing so, one being the creation of a subcorpus of, let’s say, 18th-century drama. For this, you will need metadata that tells you when a theatre piece was written, or published, or when it premiered. Now, TEI provides a <a href="http://www.tei-c.org/release/doc/tei-p5-doc/de/html/ref-creation.html"><code class="highlighter-rouge"><creation></code></a> element to include information like that. Yet, it is not used consistently in the TextGrid Repository. In many cases, the <code class="highlighter-rouge"><creation></code> slot is left empty. In other cases, it features something like this: <code class="highlighter-rouge"><date notBefore="1837" notAfter="1872"/></code>, the mentioned years being the lifespan of an author. In a way, this information is still helpful to narrow down a text’s date of origin, but it is as vague as can be, of course.</p>
<p>So for the sake of putting all the hundreds of theatre pieces in chronological order, we had to work around this problem. Luckily, the TextGrid Repository also provides some publication info within the <code class="highlighter-rouge"><note></code> element, something like this:</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><note></span>Erstdruck in: »Urania«, 1826. Uraufführung am 22.12.1823, Königliches Theater, Berlin.<span class="nt"></note></span></code></pre></figure>
<p>In this example, we’ve got two year specifications, 1823 for the premiere, 1826 for the first print. It is always possible that a piece was written years or decades before it premiered or before it was printed (take, for example, <a href="https://de.wikipedia.org/wiki/Urfaust">Goethe’s “Urfaust”</a>). If we had the resources, we would definitely try to add the missing metadata by hand. But what we are trying to do here is to work with what we have to narrow down the date of origin of a play. So in the mentioned example, we would opt for the earlier date, 1823.</p>
<p>Our decision tree would thus look something like this:</p>
<ol>
<li>Look for an exact year in <code class="highlighter-rouge"><creation></code>. If no such year is provided then:</li>
<li>Look for the earliest year mentioned within the <code class="highlighter-rouge"><note></code> element. If that doesn’t yield a satisfactory result then:</li>
<li>Take the author’s year of death as the latest possible year of creation of a piece.</li>
</ol>
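<p>To make the three rules a bit more tangible, here is a rough Python sketch of the decision tree. It is an illustration only, not the query used for the corpus: the function and parameter names are invented, and it assumes that the year from the creation element and the author’s year of death have already been read from the header.</p>

```python
import re

def earliest_year_in_note(note_text):
    """Return the smallest 3- or 4-digit number found in a note, or None."""
    years = [int(m) for m in re.findall(r"\b\d{3,4}\b", note_text or "")]
    return min(years) if years else None

def approximate_year(creation_year=None, note_text=None, author_death_year=None):
    """Apply the three rules in order and return the first year found."""
    # Rule 1: an exact year given in the creation element wins.
    if creation_year is not None:
        return creation_year
    # Rule 2: otherwise take the earliest year mentioned in the note.
    note_year = earliest_year_in_note(note_text)
    if note_year is not None:
        return note_year
    # Rule 3: fall back to the author's year of death (may still be None).
    return author_death_year

note = ("Erstdruck in: »Urania«, 1826. Uraufführung am 22.12.1823, "
        "Königliches Theater, Berlin.")
print(approximate_year(note_text=note))
```

<p>For the note quoted above, the sketch settles on the premiere year rather than the year of the first print, because rule 2 always picks the earliest year it can find.</p>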
<p>For easier processing, we decided to use the detected year as part of the filename, followed by the name of the author and the title of the play. You can have a look at the result <a href="https://github.com/DLiNa/project/tree/master/data/textgrid-repository-dramas">at the respective GitHub folder</a>. Due to our treatment, the plays are automatically listed in chronological order, with the little exception of the 10 Greek and Roman plays written BC (to be found at the end of the file list).</p>
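<p>A small Python sketch of such a filename schema might look like this. The helper name is hypothetical, and the sketch assumes that years BC are passed in as negative integers (the query itself simply prefixes three-digit years with “BC0”). The Shakespeare entry is only an illustrative input.</p>

```python
import re

def build_filename(year, author, title):
    """Join year, author and title into one file name.

    Assumption of this sketch: years BC are passed as negative integers.
    """
    year_part = f"BC{-year:04d}" if year < 0 else str(year)
    author_part = re.sub(r"\s+", "_", author.strip())
    title_part = re.sub(r"\s+", "_", title.strip())
    return f"{year_part}_{author_part}_-_{title_part}.xml"

names = [
    build_filename(-431, "Euripides", "Medea"),
    build_filename(1616, "Shakespeare, William", "Hamlet"),  # illustrative input
]
# Digits sort before letters, so the BC files end up at the end of the list.
print(sorted(names))
```

<p>Sorting the names as plain strings puts the “BC” files last, because digits come before letters in the sort order.</p>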
<p>As we stressed before, we chose this approach just to approximate the dates of origin. Such an approach never replaces the proper integration of metadata. For example, all Shakespeare plays are referenced by the year 1616 (rule 3 of our decision tree), due to the lack of better metadata. Again, we could start to repair this by hand, but that was not the purpose of this venture. If your corpus is big enough and you can’t just fix all the metadata with your bare hands, this is what you can do to get an approximation.</p>
<p>But let’s cut to the chase. Let’s have a look at the XQuery we used to work out the year specifications from the metadata provided. The query creates a list of <a href="https://en.wikipedia.org/wiki/Bash_(Unix_shell)">Bash</a> commands to replace the original filenames with the filename schema we described above. The last five lines starting with the <code class="highlighter-rouge">mv</code> command feature problematic filenames. It was a bit late yesterday and we, errm, decided to hardcode so we could eventually call it a day (the collection is still the same we used <a href="/A-Not-So-Simple-Question/">for our previous post</a>):</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">xquery version "3.0";
declare namespace tei = "http://www.tei-c.org/ns/1.0";
let $collection := '/db/data/textgrid-repository-dramas/'
return
((for $filename in xmldb:get-child-resources($collection)
    let $doc := doc($collection || $filename)
    (: rule 2: collect every token of the note that contains three or more consecutive digits :)
    let $noteStmt := for $item in tokenize($doc//tei:notesStmt/tei:note, '\W')[matches(., '\d{3,4}')] return number($item)
    let $noteStmt := number(min($noteStmt))[1]
    (: rule 1: an exact year in creation/date overrides the year taken from the note :)
    let $noteStmt := if ($doc//tei:creation/tei:date/@when or $doc//tei:creation/tei:date/string-length() = 4)
        then min((number(min($doc//tei:creation/tei:date/@when)), if ($doc//tei:creation/tei:date/string-length() = 4) then number($doc//tei:creation/tei:date/string()) else ()) )
        else $noteStmt
    (: a year later than @notAfter (the author's year of death) is capped at @notAfter :)
    let $noteStmt := if ($noteStmt gt min($doc//tei:profileDesc/tei:creation/tei:date/number(@notAfter)))
        then $doc//tei:profileDesc/tei:creation/tei:date/@notAfter
        else $noteStmt
    (: ok, if we still have no date, we look for the pubStmt and compare with creation@notAfter :)
    let $noteStmt :=
        if (string($noteStmt) = 'NaN')
        then
            let $pub := number($doc//tei:biblFull/tei:publicationStmt/tei:date/@when)
            let $creation := number($doc//tei:profileDesc/tei:creation/tei:date/@notAfter)
            return
                min(($pub, $creation))
        else
            $noteStmt
    (: rule 3: last resort, the author's year of death :)
    let $noteStmt :=
        if (string($noteStmt) = 'NaN') then number($doc//tei:profileDesc/tei:creation/tei:date/@notAfter) else $noteStmt
    (: three-digit years are treated as BC :)
    let $noteStmt := if (string-length($noteStmt) = 3) then 'BC0' || $noteStmt else $noteStmt
    (: filename schema: year, author, title, with whitespace replaced by underscores :)
    let $target := $noteStmt || '_' || replace(string(($doc//tei:author)[1]), '\s+', '_') || '_-_' || replace(($doc//tei:fileDesc[1]/tei:titleStmt/tei:title/string())[1], '\s+', '_')
    (: one mv command per file; single quotes in names are escaped for the shell :)
    let $mv :=
        "mv '" || replace(xmldb:decode($filename), "[']", "'\\$0'") || "' '" || replace($target, "[']", "'\\$0'") || ".xml'
"
    (:replace($mv, "[!|\(|\)|,|'|:|;|-]", '\\$0'):)
    return
        $mv)
, "
mv 'Aischylos_-_Der_gefesselte_Proemetheus_(-0525--0456).xml' 'BC0470_Aischylos_-_Der_gefesselte_Proemetheus.xml'
mv 'Aischylos_-_Die_Orestie_(-0525--0456).xml' 'BC0456_Aischylos_-_Die_Orestie.xml'
mv 'Euripides_-_Iphigenie_in_Aulis_(-0480--0406).xml' 'BC0406_Euripides_-_Iphigenie_in_Aulis.xml'
mv 'Euripides_-_Medea_(-0480--0406).xml' 'BC0431_Euripides_-_Medea.xml'
mv 'Plautus,_Titus_Maccius_-_Amphitryon_(-0250--0184).xml' 'BC0207_Plautus,_Titus_Maccius_-_Amphitryon.xml'
" )</code></pre></figure>
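<p>The query escapes single quotes by hand before wrapping the filenames in quotes. For comparison, here is a hedged Python sketch of the same idea, where <code class="highlighter-rouge">shlex.quote</code> from the standard library takes care of the escaping. The helper <code class="highlighter-rouge">mv_command</code> is hypothetical, and the filename containing an apostrophe is invented to show the tricky case.</p>

```python
import shlex

def mv_command(old_name, new_name):
    """Build a shell-safe mv command for renaming one file."""
    return f"mv {shlex.quote(old_name)} {shlex.quote(new_name)}"

# A filename taken from the hard-coded list, with parentheses in it.
print(mv_command("Euripides_-_Medea_(-0480--0406).xml",
                 "BC0431_Euripides_-_Medea.xml"))

# Invented example: an apostrophe is the case that needs escaping.
print(mv_command("Someone's_Play.xml", "1800_Someone_-_Someone's_Play.xml"))
```

<p>Names that contain only harmless characters are left unquoted, everything else is wrapped in single quotes, so the generated lines can be pasted into a Bash script as they are.</p>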
<p>Let’s conclude this rather dry blog post with some eye candy. We will introduce our <a href="https://github.com/lehkost/dramavis"><strong>“dramavis”</strong></a> script at a later point, but here is what it does. Among other things, it creates network graphs out of theatre pieces. The resulting PNGs can be glued together using ImageMagick and this is what we did to create a superposter of all the 666 dramas contained in the TextGrid Repository. Attention: In this initial version of the poster, the graphs are mostly erroneous due to inconsistent markup. We mainly used these graphs to find and correct markup errors since it’s a lot easier to look at a graph than read thousands of lines of TEI markup. The cleaning of dirty network data based on problematic markup is something we will address later. But for now, here’s a small version of our superposter in JPG format, <a href="http://dx.doi.org/10.6084/m9.figshare.1454476">the actual PNG version weighs 74 MB and was uploaded to Figshare</a> where you can download it in all its dubious beauty:</p>
<figure>
<img src="https://dlina.github.io/images/tgrep-untouched-dirty-data-superposter-900px.jpg" alt="TextGrid Repository Superposter" style="width:56.25rem" />
</figure>
<p>Well, this must be how <a href="https://en.wikipedia.org/wiki/Vasco_Núñez_de_Balboa">Núñez de Balboa</a> felt when he first saw the Pacific Ocean. ;) But apart from looking nice, this little superposter of 666 theatre plays can definitely be part of a distant-reading strategy once it is based on reliable network data, and this is definitely where we’re headed.</p>
<p><a href="https://dlina.github.io/Working-With-Inconsistent-Metadata/">Working With Inconsistent Metadata</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on June 19, 2015.</p>https://dlina.github.io/A-Not-So-Simple-Question2015-06-18T00:00:00+02:002015-06-18T00:00:00-00:00Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io
<h1 id="how-many-dramatic-pieces-are-contained-in-the-textgrid-repository">How Many Dramatic Pieces Are Contained in the TextGrid Repository?</h1>
<p><em>Simple question, seemingly. Before we try to answer it, a little heads-up: This blog post is <strong>ridiculously long</strong>. It can be regarded as a proof-of-concept of what <a href="https://twitter.com/mareike2405">Mareike König</a> recently said at the <a href="http://cms.uni-konstanz.de/wissenschaftsforum/veranstaltungen/veranstaltungsarchiv/veranstaltungen-2015/die-zukunft-der-wissensspeicher/">“Wissensspeicher” conference</a> in Düsseldorf at the beginning of March: “Blogs have no space constraints.” (<a href="http://www.lisa.gerda-henkel-stiftung.de/blogs_als_wissensorte_der_forschung?nav_id=5594">In <strong>this video</strong>, 17:45 mins. in.</a>) True that! So here we go:</em></p>
<p>Corpus building is a crucial task of many Digital Humanities projects and it is great to see a number of new corpora appear on a fairly regular basis. Many of these text collections feature markup following the <a href="http://www.tei-c.org/Guidelines/">TEI Guidelines</a>. However, the mere existence of a corpus and its application of standardised formats doesn’t relieve you from working your way through its peculiarities. The purpose of this article is to demonstrate how you can start this process, our example being the vast TextGrid Repository and its subset of German-language drama.</p>
<p>The TextGrid Repository is the largest TEI-tagged corpus of German literature and is released freely under a CC-BY 3.0 licence. It contains thousands of literary texts from around 1500 to the 1930s: novels, theatre pieces, poems, etc. The corpus is accessible through <a href="http://www.textgridrep.de/">a web interface here</a>, but it can also be downloaded in its entirety so you can toy around with it in your own environment.</p>
<h2 id="using-the-web-interface">Using the Web Interface</h2>
<p>The answer to the question posed in the title of this post seems to be a piece of cake. But it really isn’t, for several reasons. By trying to find the correct answer, we turn the corpus upside down, which will help us gain insight into what to expect from it when we start to build our theories around it.</p>
<p>Now, the first approach to answer our not-so-simple question leads us to <a href="http://www.textgridrep.de/">the TextGrid Rep search form</a>. If we make use of existing metadata and enter <a href="http://www.textgridrep.de/results.html?query=genre%3A&quot;drama&quot;&amp;target=both"><code class="highlighter-rouge">genre:"drama"</code></a> as search term, the TextGrid Rep search engine returns <strong>1462 results</strong>. That is far too many, because a search in the repository also considers ‘work objects’ according to TextGrid’s metadata schema (<a href="https://dev2.dariah.eu/wiki/download/attachments/12189756/Metadata-Cheatsheet.pdf?api=v2">see the corresponding cheat sheet here</a>).</p>
<p>If we limit our search to just XML documents we get a much better approximation, so let’s specify our search term: <a href="http://www.textgridrep.de/results.html?query=genre%3A&quot;drama&quot;+format%3A&quot;text%2Fxml&quot;&amp;target=both"><code class="highlighter-rouge">genre:"drama" format:"text/xml"</code></a>. <strong>And we’re down to 690!</strong> This is a promising answer and the good news is that we’re halfway there. Easy as pie, so far. But wait. We wouldn’t have written this article if it was that easy, right? The second half of our trip will take a lot (like, a lot) longer. But we will learn a plethora of things about our corpus and its constraints.</p>
<p>When we started getting acquainted with our corpus we found certain anomalies:</p>
<ul>
<li>Some dramas are split into parts, each of which comes in its own XML document frame and has its own TEI header with the genre information we took advantage of before. When we just look for genre info in TEI headers, each of these parts is counted as a separate drama, which distorts our results.</li>
<li>The second big problem is doublets. There are several dramatic pieces that appear twice or even three times. This happens due to co-authorship. E.g., O. F. Berg and David Kalisch both authored the dramatic text “Berlin, wie es weint und lacht” (1858). The full text appears only once in the corpus, but there’s a reference to the text for every co-author, and each reference carries its own genre value, which falsely increases the number of dramatic pieces we are counting.</li>
</ul>
<p>To get rid of those things, we need to dive deep and therefore we need tools that are a bit more flexible, in this case, an XML database that we can use to build our own queries. So let’s download the whole corpus and load it into a local eXist-db instance.</p>
<h2 id="exist-db-an-open-source-native-xml-database">eXist-db, an Open-Source Native XML Database</h2>
<p>If you haven’t done so already, please go ahead and <a href="http://exist-db.org/exist/apps/homepage/index.html">download eXist-db</a>. After installing and starting it, you can access it via your browser <a href="http://localhost:8080">on port 8080 of localhost</a>. Just let it run for the time being.</p>
<h2 id="loading-data-into-our-own-xml-database">Loading Data into Our Own XML Database</h2>
<p>The corpus can be downloaded as one integral ZIP file <a href="http://www.textgrid.de/Digitale-Bibliothek">from the TextGrid website</a>. There are two versions of the corpus. The differences are explained on the website but aren’t that noteworthy, let’s just go ahead and download <a href="http://www.textgrid.de/fileadmin/digitale-bibliothek/literatur-nur-texte-2.zip">the second version (390 MB, zipped)</a>. Unzip the file. All XML files are contained in the “12-publication” folder. There is one XML file for every author, 695 altogether (there are several Goethe files, but never mind). Apart from these exceptions, all the works of the same author are contained in one file.</p>
<p>Let’s load all the XML files into our XML database:</p>
<p>On the eXist-db <a href="http://localhost:8080">dashboard</a>, click on “Collections” and enter login and password (if you haven’t specified any login data, enter “admin” as login and leave the password field empty). Now click on the icon “New collection” (third from right) and create a new folder for our collection. Let’s call it “data” where from now on we will put all our data (hence the name, for good practice!). Let’s create a subfolder called “tgrep” for our repository and then click on “Upload resources” (the icon on the far right). Look for the folder we unzipped earlier, change into the “12-publication” folder, mark all XML files (CTRL + A is your friend) so all of them will be loaded into our collection. This will take some time, around 5 minutes, exactly the time you need to squeeze two or three oranges and set you up with a glass of fresh juice.</p>
<p>If you wonder what’s contained in the XML files, just double-click on one. A new browser tab will open and give away the plain XML with some syntax highlighting to easily differentiate between TEI elements, plain text, URLs, etc. The document starts with the <code class="highlighter-rouge"><teiCorpus></code> element, meaning that the file contains several works. According to the TextGrid metadata schema based on <a href="https://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records">FRBR</a> there may be several <code class="highlighter-rouge"><teiCorpus></code> nested within the root element. So there are several hierarchies which in this case are not uniform, but let’s leave that for now.</p>
<p>The genre of a text is specified within the TEI element <a href="http://www.tei-c.org/release/doc/tei-p5-doc/de/html/ref-textClass.html">textClass</a>, the schema (<a href="https://projects.gwdg.de/projects/digitale-bibliothek/repository/revisions/master/entry/schemas/tei_digibib.xsd#L6924">an .xsd file</a>) specifies that the genre info in this corpus is contained within <code class="highlighter-rouge"><tei:term></code>.</p>
<p>So once again, how many dramatic pieces are contained in the TextGrid Repository???</p>
<h2 id="building-an-xquery">Building an XQuery</h2>
<p>Let’s start with reproducing our 690-result with a basic XQuery. This is to show you that we can easily reproduce the results of the search form.</p>
<p>So we want to find all works that are marked as “drama” in the genre-specific metadata. As indicated before, the TEI element <code class="highlighter-rouge"><textClass></code> contains info on the genre. So let’s count all occurrences in the whole TextGrid Repository by using <strong>eXide</strong>, “a cool, handy, fully integrated editor for working with XQuery, XML, and other resources stored in eXist” (<a href="https://books.google.de/books?id=0evSBQAAQBAJ&pg=PA29">O’Reilly</a>). Close the Collection Browser and click on the “eXide – XQuery IDE” logo. You should see a fresh sheet for your own queries.</p>
<p>First of all, we need to declare a namespace for technical reasons, just insert as line two:</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">declare namespace tei = "http://www.tei-c.org/ns/1.0";</code></pre></figure>
<p>To address our imported collection we write in the next line:</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">collection('/db/data/tgrep/')</code></pre></figure>
<p>If we now want to count the occurrences, we can use a count function. Just wrap a <code class="highlighter-rouge">count()</code> around the specified collection. Then we have to determine what to count, so let’s have a look at the genre information as described above: <code class="highlighter-rouge">//tei:textClass/tei:keywords/tei:term[text() = 'drama']</code></p>
<p>Eventually, our query looks like this:</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">count(collection('/db/data/tgrep/')//tei:textClass/tei:keywords/tei:term[text() = 'drama'])</code></pre></figure>
<p>To evaluate it, just click on the “Eval” button and see what happens (after some seconds, anyway).</p>
<p>Most of the stuff in this query is a so-called <a href="https://en.wikipedia.org/wiki/XPath">XPath</a>. Basically, XPath is a language for browsing through and operating on your XML documents. XPath, XSLT and XQuery share the same function set. We can get the same results by using a loop, which helps us generate more readable and sometimes more efficient queries. This will become more important in a later step:</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">count(
for $occurrence in collection('/db/data/tgrep/')//tei:textClass/tei:keywords/tei:term[text() = 'drama']
return $occurrence)</code></pre></figure>
<p>Click on “Eval” and wait some seconds after which the output window returns a number, but what is this: 703? <strong>Are there, all of a sudden, 703 dramas in our corpus?</strong> Rhetorical question, of course not. So what happened? Obviously, there are some appearances of “drama” outside of TEI documents. So let’s specify our query and look just for occurrences of “drama” as a genre in TEI documents:</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">count(collection('/db/data/tgrep/')//tei:textClass[ancestor::tei:TEI]/tei:keywords/tei:term[text() = 'drama'])</code></pre></figure>
<p>We added the part <code class="highlighter-rouge">[ancestor::tei:TEI]</code> which tells the engine that we look for the occurrence in TEI documents only, and we leave the <code class="highlighter-rouge">teiCorpus</code> uncounted. “TEI” here is the root element of a TEI document. <strong>And look, we end up at 690, good!</strong> We just reproduced the result we got from the search form. The nice thing about reproducing this result is that we don’t stop here. With XQuery we can do much more.</p>
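<p>If you want to see the effect of the ancestor filter without a database at hand, the following Python sketch mimics it on an invented miniature corpus: one subcorpus header plus two TEI documents, all tagged as “drama”. ElementTree has no ancestor axis, so the sketch builds a child-to-parent map first. This is an illustration under these assumptions, not a replacement for the query.</p>

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

HEADER = ("<teiHeader><profileDesc><textClass><keywords>"
          "<term>drama</term></keywords></textClass></profileDesc></teiHeader>")

# Invented miniature corpus: one subcorpus header plus two TEI documents.
SAMPLE = (f'<teiCorpus xmlns="{TEI_NS}">{HEADER}'
          f"<TEI>{HEADER}</TEI><TEI>{HEADER}</TEI></teiCorpus>")

root = ET.fromstring(SAMPLE)
# Map every element to its parent so we can walk upwards.
parent_of = {child: parent for parent in root.iter() for child in parent}

def has_tei_ancestor(node):
    """True if some ancestor of node is a TEI element."""
    node = parent_of.get(node)
    while node is not None:
        if node.tag == f"{{{TEI_NS}}}TEI":
            return True
        node = parent_of.get(node)
    return False

terms = root.findall(".//tei:textClass/tei:keywords/tei:term", NS)
drama_terms = [t for t in terms if t.text == "drama"]
inside = [t for t in drama_terms if has_tei_ancestor(t)]
outside = [t for t in drama_terms if not has_tei_ancestor(t)]

# all hits, hits inside TEI documents, hits in subcorpus headers
print(len(drama_terms), len(inside), len(outside))
```

<p>The three numbers correspond to the unfiltered count, the count restricted to TEI documents, and the leftover hits that sit directly in a subcorpus header.</p>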
<p>For example, let’s try to subtract the 690 from the 703 pieces found earlier. This is interesting as it points us to a bunch of subcorpora in the repository containing a number of dramas. By executing the following query …</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">collection('/db/data/tgrep/')//tei:textClass[not(ancestor::tei:TEI)]/tei:keywords/tei:term[text() = 'drama']/base-uri()</code></pre></figure>
<p>… we get 13 hits. More precisely, we get the resource addresses within the database (comparable to the file name):</p>
<ul>
<li>/db/data/tgrep/Literatur-Arnim%2C-Ludwig-Achim-von.xml</li>
<li>/db/data/tgrep/Literatur-Goethe%2C-Johann-Wolfgang-001.xml</li>
<li>/db/data/tgrep/Literatur-Grabbe%2C-Christian-Dietrich.xml</li>
<li>/db/data/tgrep/Literatur-Hauptmann%2C-Carl.xml</li>
<li>/db/data/tgrep/Literatur-Hauptmann%2C-Carl.xml</li>
<li>/db/data/tgrep/Literatur-Hebbel%2C-Friedrich.xml</li>
<li>/db/data/tgrep/Literatur-Immermann%2C-Karl.xml</li>
<li>/db/data/tgrep/Literatur-Metastasio%2C-Pietro.xml</li>
<li>/db/data/tgrep/Literatur-Scheerbart%2C-Paul.xml</li>
<li>/db/data/tgrep/Literatur-Schiller%2C-Friedrich.xml</li>
<li>/db/data/tgrep/Literatur-Schnitzler%2C-Arthur.xml</li>
<li>/db/data/tgrep/Literatur-Scribe%2C-Eugene.xml</li>
<li>/db/data/tgrep/Literatur-Wagner%2C-Richard.xml</li>
</ul>
<p>So what about these 13 hits? Each of them belongs to the header of a <code class="highlighter-rouge">teiCorpus</code>, not to a TEI document. In other words, each describes a subcorpus aggregating several dramatic texts.</p>
<p>Why does this happen? Because some dramas are split into several TEI subdocuments. How do we find out which? Here’s our query:</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">collection('/db/data/tgrep/')//tei:textClass[not(ancestor::tei:TEI)]/tei:keywords/tei:term[text() = 'drama']/concat(base-uri(), ': ', (ancestor::tei:teiCorpus[1]//tei:fileDesc[1]/tei:titleStmt/tei:title/string())[1], ' > ', count(ancestor::tei:teiCorpus[1]//tei:TEI))</code></pre></figure>
<p>This yields the following output:</p>
<ul>
<li>/db/data/tgrep/Literatur-Arnim%2C-Ludwig-Achim-von.xml: Halle und Jerusalem > 4</li>
<li>/db/data/tgrep/Literatur-Goethe%2C-Johann-Wolfgang-001.xml: Faust. Eine Tragödie > 5</li>
<li>/db/data/tgrep/Literatur-Grabbe%2C-Christian-Dietrich.xml: Die Hohenstaufen > 2</li>
<li>/db/data/tgrep/Literatur-Hauptmann%2C-Carl.xml: Panspiele > 4</li>
<li>/db/data/tgrep/Literatur-Hauptmann%2C-Carl.xml: Die goldnen Straßen > 3</li>
<li>/db/data/tgrep/Literatur-Hebbel%2C-Friedrich.xml: Die Nibelungen > 5</li>
<li>/db/data/tgrep/Literatur-Immermann%2C-Karl.xml: Alexis > 3</li>
<li>/db/data/tgrep/Literatur-Metastasio%2C-Pietro.xml: L’isola disabitata > 2</li>
<li>/db/data/tgrep/Literatur-Scheerbart%2C-Paul.xml: Revolutionäre Theaterbibliothek > 23</li>
<li>/db/data/tgrep/Literatur-Schiller%2C-Friedrich.xml: Wallenstein > 4</li>
<li>/db/data/tgrep/Literatur-Schnitzler%2C-Arthur.xml: Marionetten > 3</li>
<li>/db/data/tgrep/Literatur-Scribe%2C-Eugene.xml: La dame blanche > 2</li>
<li>/db/data/tgrep/Literatur-Wagner%2C-Richard.xml: Der Ring des Nibelungen > 4</li>
</ul>
<p>The number at the end of each line shows us how many separate texts are contained in each subcorpus. So, Wagner’s “Ring of the Nibelung”: check. Etc. etc. But there are still problems. E.g., Hebbel’s <a href="https://de.wikipedia.org/wiki/Die_Nibelungen_(Hebbel)">“Nibelungs”</a>, in reality, consists of merely 3 parts, not 5. So let’s refine our query to leave out all TEI documents that aren’t marked as “drama”:</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">collection('/db/data/tgrep/')//tei:textClass[not(ancestor::tei:TEI)]/tei:keywords/tei:term[text() = 'drama']/concat(base-uri(), ': ', (ancestor::tei:teiCorpus[1]//tei:fileDesc[1]/tei:titleStmt/tei:title/string())[1], ' > ', count(ancestor::tei:teiCorpus[1]//tei:TEI[descendant::tei:term/text() = 'drama']))</code></pre></figure>
<ul>
<li>/db/data/tgrep/Literatur-Arnim%2C-Ludwig-Achim-von.xml: Halle und Jerusalem > 2</li>
<li>/db/data/tgrep/Literatur-Goethe%2C-Johann-Wolfgang-001.xml: Faust. Eine Tragödie > 5</li>
<li>/db/data/tgrep/Literatur-Grabbe%2C-Christian-Dietrich.xml: Die Hohenstaufen > 2</li>
<li>/db/data/tgrep/Literatur-Hauptmann%2C-Carl.xml: Panspiele > 4</li>
<li>/db/data/tgrep/Literatur-Hauptmann%2C-Carl.xml: Die goldnen Straßen > 3</li>
<li>/db/data/tgrep/Literatur-Hebbel%2C-Friedrich.xml: Die Nibelungen > 3</li>
<li>/db/data/tgrep/Literatur-Immermann%2C-Karl.xml: Alexis > 3</li>
<li>/db/data/tgrep/Literatur-Metastasio%2C-Pietro.xml: L’isola disabitata > 2</li>
<li>/db/data/tgrep/Literatur-Scheerbart%2C-Paul.xml: Revolutionäre Theaterbibliothek > 22</li>
<li>/db/data/tgrep/Literatur-Schiller%2C-Friedrich.xml: Wallenstein > 4</li>
<li>/db/data/tgrep/Literatur-Schnitzler%2C-Arthur.xml: Marionetten > 3</li>
<li>/db/data/tgrep/Literatur-Scribe%2C-Eugene.xml: La dame blanche > 2</li>
<li>/db/data/tgrep/Literatur-Wagner%2C-Richard.xml: Der Ring des Nibelungen > 4</li>
</ul>
<p>What do we have here? We received a list with all segmented dramas. How do we check if these numbers are reliable? Well, this one is not for the computer to decide, but for the humanist’s eye. Goethe’s “Faust”, in our repository, still consists of these <a href="http://www.textgridrep.de/browse.html?id=textgrid:11d4b.0">5 files</a>:</p>
<ul>
<li>Zueignung</li>
<li>Vorspiel auf dem Theater</li>
<li>Prolog im Himmel</li>
<li>Faust. Der Tragödie erster Teil</li>
<li>Faust. Der Tragödie zweiter Teil</li>
</ul>
<p>We could argue that the whole “Faust” is <em>one</em> integral piece. We could argue that Wagner’s “Ring of the Nibelung” is <em>one</em> piece. But we probably can’t declare the same thing for Scheerbart’s “Revolutionäre Theaterbibliothek” which consists of 22 pieces, and we probably shouldn’t count them as one.</p>
<p>Why this strange segmentation of some of the plays? This has to do with the origin of the TextGrid Repository, the zeno.org project. As we can see <a href="http://www.zeno.org/Literatur/M/Goethe,+Johann+Wolfgang/Dramen/Faust.+Eine+Tragödie">at the zeno.org website</a>, Goethe’s Faust is split into 5 parts there when it really should be split into 2 parts only, “Faust, part 1”, and “Faust, part 2”.</p>
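<p>As an aside, the two counts per subcorpus from the queries above (all parts versus only the parts tagged as “drama”) can also be put side by side outside the database. The Python sketch below does this for an invented example; the titles, genres and helper names are made up, and the real corpus is nested more irregularly than this.</p>

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

def header(title, genre):
    """Build a minimal TEI header with a title and one genre term."""
    return (f"<teiHeader><fileDesc><titleStmt><title>{title}</title></titleStmt></fileDesc>"
            f"<profileDesc><textClass><keywords><term>{genre}</term></keywords>"
            f"</textClass></profileDesc></teiHeader>")

# Invented author file containing one dramatic cycle with three parts.
SAMPLE = (
    f'<teiCorpus xmlns="{TEI_NS}">' + header("Author file", "verse")
    + "<teiCorpus>" + header("Example cycle", "drama")
    + "<TEI>" + header("Part one", "drama") + "</TEI>"
    + "<TEI>" + header("Part two", "drama") + "</TEI>"
    + "<TEI>" + header("Preface", "prose") + "</TEI>"
    + "</teiCorpus></teiCorpus>"
)

def genre_of(node):
    """Genre term from the node's own header (not from nested documents)."""
    return node.findtext(
        "tei:teiHeader/tei:profileDesc/tei:textClass/tei:keywords/tei:term",
        namespaces=NS)

def title_of(node):
    """Title from the node's own header."""
    return node.findtext(
        "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", namespaces=NS)

root = ET.fromstring(SAMPLE)
report = []
for corpus in root.iter(f"{{{TEI_NS}}}teiCorpus"):
    if genre_of(corpus) != "drama":
        continue  # only subcorpora whose own header says "drama"
    parts = corpus.findall(".//tei:TEI", NS)
    tagged = [p for p in parts if genre_of(p) == "drama"]
    report.append((title_of(corpus), len(parts), len(tagged)))

# (title, all parts, parts tagged as drama)
print(report)
```

<p>Wherever the two numbers differ, the subcorpus contains parts that are not tagged as drama, which is exactly what happened with Hebbel’s “Nibelungs”.</p>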
<p>So let’s use the human brain and some semesters of studying literature (hehe) and decide what to count as a separate text and what not:</p>
<ul>
<li>/db/data/tgrep/Literatur-Arnim%2C-Ludwig-Achim-von.xml: Halle und Jerusalem > 2
<ul>
<li>double drama, new amount of plays: 1</li>
</ul>
</li>
<li>/db/data/tgrep/Literatur-Goethe%2C-Johann-Wolfgang-001.xml: Faust. Eine Tragödie > 5
<ul>
<li>two original parts, new amount of plays: 2</li>
</ul>
</li>
<li>/db/data/tgrep/Literatur-Grabbe%2C-Christian-Dietrich.xml: Die Hohenstaufen > 2
<ul>
<li>remains 2</li>
</ul>
</li>
<li>/db/data/tgrep/Literatur-Hauptmann%2C-Carl.xml: Panspiele > 4
<ul>
<li>no overlaps in personnel, remains 4</li>
</ul>
</li>
<li>/db/data/tgrep/Literatur-Hauptmann%2C-Carl.xml: Die goldnen Straßen > 3
<ul>
<li>no overlaps in personnel, remains 3</li>
</ul>
</li>
<li>/db/data/tgrep/Literatur-Hebbel%2C-Friedrich.xml: Die Nibelungen > 3
<ul>
<li>Hebbel himself describes the 3 parts as “one integral tragedy”, new amount of plays: 1</li>
</ul>
</li>
<li>/db/data/tgrep/Literatur-Immermann%2C-Karl.xml: Alexis > 3
<ul>
<li>overlaps in personnel, new amount of plays: 1</li>
</ul>
</li>
<li>/db/data/tgrep/Literatur-Metastasio%2C-Pietro.xml: L’isola disabitata > 2
<ul>
<li>one of the 2 parts is the Italian original, new amount of plays: 1</li>
</ul>
</li>
<li>/db/data/tgrep/Literatur-Scheerbart%2C-Paul.xml: Revolutionäre Theaterbibliothek > 22
<ul>
<li>entirely separate dramas, remains 22</li>
</ul>
</li>
<li>/db/data/tgrep/Literatur-Schiller%2C-Friedrich.xml: Wallenstein > 4
<ul>
<li>new amount of plays: 1</li>
</ul>
</li>
<li>/db/data/tgrep/Literatur-Schnitzler%2C-Arthur.xml: Marionetten > 3
<ul>
<li>no overlaps in personnel, new amount of plays: 3</li>
</ul>
</li>
<li>/db/data/tgrep/Literatur-Scribe%2C-Eugene.xml: La dame blanche > 2
<ul>
<li>one of the 2 parts is the French original, new amount of plays: 1</li>
</ul>
</li>
<li>/db/data/tgrep/Literatur-Wagner%2C-Richard.xml: Der Ring des Nibelungen > 4
<ul>
<li>remains 4</li>
</ul>
</li>
</ul>
<p>You will notice that some of our decisions are contingent. E.g., there <em>are</em> overlaps in personnel in the two parts of Goethe’s “Faust”. And the two parts of “Faust” <em>have been</em> put on stage together (cf. Peter Stein’s <a href="https://de.wikipedia.org/wiki/Faust-Projekt">Faust-Projekt</a>). Yet we would still argue that they are two different pieces. Others may think otherwise.</p>
<p>So we have to subtract the results of this equation from our 690 found dramas:</p>
<p><code class="highlighter-rouge">690-((2-1)+(5-2)+(2-2)+(4-4)+(3-3)+(3-1)+(3-1)+(2-1)+(22-22)+(4-1)+(3-3)+(2-1)+(4-4)) = 690-13</code></p>
<p><strong>And we’re down to 677 dramas.</strong> We’re almost there! But there’s another thing we came across while working on the corpus: doublets.</p>
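<p>If you don’t trust our mental arithmetic, the subtraction can be re-checked with a few lines of Python. The pairs below are simply the numbers from the list of decisions above (parts found by the query, plays we decided to count); the variable names are our own choice for this sketch.</p>

```python
# title: (parts found by the query, plays we decided to count)
decisions = {
    "Halle und Jerusalem": (2, 1),
    "Faust. Eine Tragödie": (5, 2),
    "Die Hohenstaufen": (2, 2),
    "Panspiele": (4, 4),
    "Die goldnen Straßen": (3, 3),
    "Die Nibelungen": (3, 1),
    "Alexis": (3, 1),
    "L’isola disabitata": (2, 1),
    "Revolutionäre Theaterbibliothek": (22, 22),
    "Wallenstein": (4, 1),
    "Marionetten": (3, 3),
    "La dame blanche": (2, 1),
    "Der Ring des Nibelungen": (4, 4),
}

# Every part that was found but not kept is one drama too many in the count.
surplus = sum(found - kept for found, kept in decisions.values())
remaining = 690 - surplus
print(surplus, remaining)
```

<p>Keeping the decisions in one table like this also makes it easy to change a single verdict (say, counting “Faust” as one play) and see how the total shifts.</p>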
<h2 id="how-to-find-doublets">How to Find Doublets</h2>
<p>Due to the specific mapping in the repository, every work is assigned to all of its authors, which falsely doubles the number of dramas in cases of co-authorship. The full text can be found in only one of those documents and the others just contain the title and a reference (<code class="highlighter-rouge">tei:ref</code>) to the full text. If a piece has two authors, it has got two TEI headers. So when looking for occurrences of the string “drama” in the TEI element <code class="highlighter-rouge">textClass</code>, we’re counting the drama twice. But altogether, that one’s easy-peasy, we just have to subtract the redundant item.</p>
<p>But how do we find out how many theatre pieces are counted twice when using our previous query? This is the last step in order to answer our central question!</p>
<p>To determine the differences of the documents created by more than one author we have to look at the TEI code. The <code class="highlighter-rouge"><text></code> node we find <a href="https://textgridlab.org/1.0/tgcrud-public/rest/textgrid:qn2m.0/data">in Kalisch’s document</a> is not empty which makes it a bit more complicated:</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><text></span>
<span class="nt"><body></span>
<span class="nt"><div</span> <span class="na">type=</span><span class="s">"text"</span> <span class="na">xml:id=</span><span class="s">"tg4.3"</span><span class="nt">></span>
<span class="nt"><milestone</span> <span class="na">unit=</span><span class="s">"sigel"</span> <span class="na">n=</span><span class="s">"Berg-Berlin"</span> <span class="na">xml:id=</span><span class="s">"tg4.3.1"</span><span class="nt">/></span>
<span class="nt"><head</span> <span class="na">type=</span><span class="s">"h4"</span> <span class="na">xml:id=</span><span class="s">"tg4.3.3"</span><span class="nt">></span>O. F. Berg / David Kalisch<span class="nt"></head></span>
<span class="nt"><head</span> <span class="na">type=</span><span class="s">"h2"</span> <span class="na">xml:id=</span><span class="s">"tg4.3.4"</span><span class="nt">></span>
<span class="nt"><ref</span> <span class="na">cRef=</span><span class="s">"/Literatur/M/Berg, O. F./Drama/Berlin, wie es weint und lacht"</span> <span class="na">xml:id=</span><span class="s">"tg4.3.4.1"</span><span class="nt">></span>Berlin, wie es weint und lacht<span class="nt"></ref></span>
<span class="nt"></head></span>
<span class="nt"><head</span> <span class="na">type=</span><span class="s">"h4"</span> <span class="na">xml:id=</span><span class="s">"tg4.3.5"</span><span class="nt">></span>Volksstück mit Gesang<span class="nt"></head></span>
<span class="nt"><head</span> <span class="na">type=</span><span class="s">"h4"</span> <span class="na">xml:id=</span><span class="s">"tg4.3.6"</span><span class="nt">></span>in 3 Aufzügen und 11 Bildern<span class="nt"></head></span>
<span class="nt"></div></span>
<span class="nt"></body></span>
<span class="nt"></text></span></code></pre></figure>
<p>The <code class="highlighter-rouge">cRef</code> attribute tells us that the actual text is to be found in the XML file dedicated to the <em>other</em> co-author, in this case, O. F. Berg. Now, to be able to distinguish between actual documents containing the dramatic text and documents that only contain a reference, we have to find a distinctive feature. Let’s try this one: The <a href="http://textgridrep.org/textgrid:kjgj.0">referenced TEI document</a> contains <code class="highlighter-rouge"><div></code> elements featuring <code class="highlighter-rouge">subtype="work:no"</code> attributes (this is to make sure that single scenes are not marked as separate “works”). The <a href="https://textgridlab.org/1.0/tgcrud-public/rest/textgrid:qn2m.0/data">Kalisch document</a> doesn’t have this feature, so that’s a good way to differentiate between the two. Mind you, you can always pick other XML properties that suit you better, for example, <code class="highlighter-rouge"><sp></code> elements (Berg has them, Kalisch does not). But anyway, let’s execute a query that gives us all the documents lacking the mentioned <code class="highlighter-rouge">subtype="work:no"</code> attribute:</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">for $item in collection('/db/data/tgrep')//tei:TEI
where $item/tei:teiHeader//tei:keywords/tei:term/string() = 'drama' and not($item//tei:text//tei:div/@subtype="work:no")
return ($item//tei:title)[1]</code></pre></figure>
<p>The result is a list of 27 <code class="highlighter-rouge">tei:title</code> elements:</p>
<ul>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Der gefesselte Proemetheus</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Fidelio</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Die Geisterinsel</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Iphigenie in Aulis</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Medea</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Die Fledermaus</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Die jüngste Walpurgisnacht</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Zueignung</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Vorspiel auf dem Theater</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Prolog im Himmel</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Der Widerspenstigen Zähmung</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Pension Schöller</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Traumulus</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Im weißen Rößl</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Berlin, wie es weint und lacht</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Die Pfandung</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Der Besuch um Mitternacht</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Ablaßkrämer</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Doktor Faustus</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Das Mirakel</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Prolog</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Die Familie Selicke</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Genoveva</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">König Ödipus</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Antigone</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Gespenstersonate</title></code></li>
<li><code class="highlighter-rouge"><title xmlns="http://www.tei-c.org/ns/1.0">Fidelio</title></code></li>
</ul>
<p>The goal of our query was to find documents that don’t feature the actual text of a drama. But the very first result (<a href="http://textgridrep.org/textgrid:jkhx.0">“Der gefesselte Proemetheus”</a>) shows us that we have to refine our query because this play does contain text but does not feature any <code class="highlighter-rouge"><div></code> element with a specified <code class="highlighter-rouge">subtype="work:no"</code> attribute. To correct our results, let’s exclude all documents that contain a <a href="http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-l.html"><code class="highlighter-rouge"><tei:l></code></a> or a <a href="http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-p.html"><code class="highlighter-rouge"><tei:p></code></a> element (because, obviously, then they <em>do</em> contain running text):</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">count(for $item in collection('/db/data/tgrep')//tei:TEI
where $item/tei:teiHeader//tei:keywords/tei:term/string() = 'drama'
and not($item//tei:text//tei:div/@subtype="work:no")
and not($item//tei:text//tei:l)
and not($item//tei:text//tei:p)
return $item//tei:sourceDesc/tei:biblFull/tei:titleStmt/tei:title)</code></pre></figure>
<p>Ok, let’s try to translate this query into a human-readable form:</p>
<p>First off, we use a <code class="highlighter-rouge">for</code> loop as explained before. This separates the single TEI documents, each starting with a TEI node (<code class="highlighter-rouge">//tei:TEI</code>), from our whole data set (<code class="highlighter-rouge">collection('/db/data/tgrep')</code>). We now operate on single documents until the loop finishes. Then we specify a condition for the documents we want to focus on (the <code class="highlighter-rouge">where</code> part). We …</p>
<ul>
<li>select all documents where the genre specification is “drama”,</li>
<li>exclude all documents that contain a <code class="highlighter-rouge">tei:div</code> where the attribute subtype has a “work:no” value,</li>
<li>also exclude every document that contains at least a single <code class="highlighter-rouge">tei:l</code> and, finally,</li>
<li>exclude all documents with at least a single paragraph (<code class="highlighter-rouge">tei:p</code>).</li>
</ul>
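<p>For readers without an eXist-db instance at hand, the same four conditions can be sketched in Python with the standard library. This is only an illustration, not the query we actually ran: the function names and the two tiny sample documents are made up, and the header of a real TextGrid file is much richer than the simplified one assumed here.</p>

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"  # TEI namespace in ElementTree notation


def is_drama(root):
    """True if a <term> inside <keywords> in the header reads exactly 'drama'."""
    header = root.find(f"{TEI}teiHeader")
    if header is None:
        return False
    terms = header.findall(f".//{TEI}keywords/{TEI}term")
    return any("".join(t.itertext()) == "drama" for t in terms)


def has_running_text(root):
    """True if any <text> contains div[@subtype='work:no'], <l> or <p>."""
    for text in root.iter(f"{TEI}text"):
        for el in text.iter():
            if el.tag == f"{TEI}div" and el.get("subtype") == "work:no":
                return True
            if el.tag in (f"{TEI}l", f"{TEI}p"):
                return True
    return False


def is_reference_stub(xml_source):
    """A drama document that only points to the full text elsewhere."""
    root = ET.fromstring(xml_source)
    return is_drama(root) and not has_running_text(root)


# Hypothetical, heavily simplified sample documents (not real TextGrid data).
HEADER = "<teiHeader><keywords><term>drama</term></keywords></teiHeader>"
STUB = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">' + HEADER +
    '<text><body><div type="text"><head><ref cRef="/other/file">Title</ref>'
    "</head></div></body></text></TEI>"
)
FULL = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">' + HEADER +
    '<text><body><div subtype="work:no"><sp><p>Some speech.</p></sp>'
    "</div></body></text></TEI>"
)

print(is_reference_stub(STUB), is_reference_stub(FULL))
```

<p>Like the XQuery version, the sketch only inspects elements inside <code class="highlighter-rouge">tei:text</code>, so anything in the header cannot influence the result.</p>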
<p>Regarding the exclusion part, note that we take the ancestor elements into account: a document is excluded only if the <code class="highlighter-rouge">tei:div</code>, <code class="highlighter-rouge">tei:l</code> or <code class="highlighter-rouge">tei:p</code> in question sits inside <code class="highlighter-rouge">tei:text</code>. Wrapped in the <code class="highlighter-rouge">count</code> function, our loop returns the number of documents that match our pattern. If we omit <code class="highlighter-rouge">count</code>, we receive the actual title information from the <code class="highlighter-rouge">teiHeader</code> (<code class="highlighter-rouge">tei:sourceDesc/tei:biblFull/tei:titleStmt/tei:title</code>) and the author information as well. Our query returns 11 items:</p>
<ul>
<li>Arno Holz und Oskar Jerschke: Traumulus. Achtes bis zehntes Tausend, Dresden: Carl Reißner, 1909. Author in TG Rep: Jerschke, Oskar (<a href="http://textgridrep.de/browse.html?id=textgrid:qmsf.0">link</a>)</li>
<li>Carl Laufs: Pension Schöller. Nach einer Idee von W. Jacoby, elfte Auflage, Berlin: Eduard Bloch Theaterverlag, [o.J.]. Author in TG Rep: Jacoby, Wilhelm (<a href="http://textgridrep.de/browse.html?id=textgrid:qm9f.0">link</a>)</li>
<li>Hermann Goetz: Der Widerspenstigen Zähmung. Komische Oper in vier Akten, nach Shakespeares gleichnamigen Lustspiel frei bearbeitet von Joseph Viktor Widmann, Musik von Hermann Goetz, Zürich, Wien, München: Apollo-Verlag, [ca. 1925]. Author in TG Rep: Goetz, Hermann Gustav (<a href="http://textgridrep.de/browse.html?id=textgrid:nkbw.0">link</a>)</li>
<li>Johann Friedrich Reichardt: Die Geisterinsel. Ein Singspiel in drey Akten, in: Friedrich Wilhelm Gotter: Literarischer Nachlass, Gotha: J. Perthes, 1802, S. 419–564. Author in TG Rep: Einsiedel, Friedrich Hildebrand von (<a href="http://textgridrep.de/browse.html?id=textgrid:mv72.0">link</a>)</li>
<li>Johann Strauß: Die Fledermaus. Operette in drei Aufzügen, Text nach H. Meilhac und L. Halévy von C. Haffner und Richard Genée, hg. v. Wilhelm Zentner, Stuttgart: Reclam, 1976. Author in TG Rep: Genée, Richard (<a href="http://textgridrep.de/browse.html?id=textgrid:n7s2.0">link</a>)</li>
<li>Ludwig van Beethoven: Fidelio. Oper in zwei Aufzügen, hg. v. Wilhelm Zentner, Stuttgart: Reclam, 1970. Author in TG Rep: Breuning, Stephan von (<a href="http://textgridrep.de/browse.html?id=textgrid:krfk.0">link</a>)</li>
<li>Ludwig van Beethoven: Fidelio. Oper in zwei Aufzügen, hg. v. Wilhelm Zentner, Stuttgart: Reclam, 1970. Author in TG Rep: Treitschke, Georg Friedrich (<a href="http://textgridrep.de/browse.html?id=textgrid:wfsf.0">link</a>)</li>
<li>Naturalismus – Dramen. Lyrik. Prosa. Herausgegeben und mit einem Nachwort von Ursula Münchow, Band 1: 1885–1891, Berlin und Weimar: Aufbau, 1970. Author in TG Rep: Schlaf, Johannes (<a href="http://textgridrep.de/browse.html?id=textgrid:v18n.0">link</a>)</li>
<li>O.F. Berg und D[avid] Kalisch: Berlin, wie es weint und lacht. Leipzig: Verlag von Phillipp Reclam jun., [o.J.] [Universal-Bibliothek Nr. 4689]. Author in TG Rep: Kalisch, David (<a href="http://textgridrep.de/browse.html?id=textgrid:qn2n.0">link</a>)</li>
<li>Oskar Blumenthal und Gustav Kadelburg: Im weißen Rössl. 16. Auflage, Berlin: Eduard Bloch Verlag, [o.J.]. Author in TG Rep: Kadelburg, Gustav (<a href="http://textgridrep.de/browse.html?id=textgrid:qmt8.0">link</a>)</li>
<li>Robert Schumann: Genoveva. Oper in vier Akten nach Tieck und Hebbel, Berlin: Eduard Bloch, [1960]. Author in TG Rep: Schumann, Robert Alexander (<a href="http://textgridrep.de/browse.html?id=textgrid:vkgs.0">link</a>)</li>
</ul>
<p>Most of these texts are opera libretti written by two authors; one work was written by three collaborators (Beethoven’s “Fidelio”, to be precise), which is why it shows up twice in the list.</p>
<p>But let’s jump back to our initial question and to the final answer. How many dramas are contained in the TextGrid Rep? To answer that, we just have to subtract these 11 doublets from the 677 and we end up at: <strong>666 dramas!</strong> A bit diabolic, but, in the end, just a number. (Speaking of which, have you heard the story of Route 666 and how it was renamed to Route 491? <a href="https://en.wikipedia.org/wiki/U.S._Route_491#U.S._Route_666">It’s a fun story, you can read it on Wikipedia.</a>)</p>
<p>A list with all the 666 dramas can be obtained via our GitHub account. Or, you can generate it yourself using the following XQuery where we also added an option in order to prepare this list for a website. You can store this query (Shift+Ctrl+s), for example, within the <code class="highlighter-rouge">/db/apps/</code> collection using the filename <code class="highlighter-rouge">tgrep.xql</code> and call it via <a href="http://localhost:8080/exist/rest/db/apps/tgrep.xql">this link</a>.</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">xquery version "3.0";
declare namespace tei = "http://www.tei-c.org/ns/1.0";
declare option exist:serialize "method=html5 media-type=text/html";
<ol>
{for $item in collection('/db/data/tgrep')//tei:TEI
where
$item/tei:teiHeader//tei:keywords/tei:term/string() = 'drama'
and ($item//tei:text//tei:div/@subtype="work:no"
or $item//tei:text//tei:l
or $item//tei:text//tei:p)
order by ($item//tei:author)[1] || $item//tei:fileDesc[1]/tei:titleStmt/tei:title
return
<li>
{($item//tei:author)[1]/string() || ': ' || $item//tei:fileDesc[1]/tei:titleStmt/tei:title/string()}
</li>
}
</ol></code></pre></figure>
<p>Please mind that this list still contains 679 texts. We still have to subtract the texts that belong to an integral play. As described before, we decided to bundle 5 dramatic pieces that consist of several parts, gluing each of them together in a new XML file:</p>
<ul>
<li>Arnim: “Halle und Jerusalem”,</li>
<li>Goethe: “Faust, Teil 1”,</li>
<li>Hebbel: “Nibelungen”,</li>
<li>Immermann: “Alexis”,</li>
<li>Schiller: “Wallenstein”.</li>
</ul>
<p>Plus, we had to delete the two original (non-German) pieces (a French and an Italian one) to get down to our 666 pieces. Now our list only contains German-language texts of the genre ‘drama’. We uploaded the 666 XML files to our GitHub <a href="https://github.com/DLiNa/project/tree/master/data/textgrid-repository-dramas"><strong>here</strong></a>. A list of all the plays can be found <a href="https://github.com/DLiNa/project/blob/master/data/TextGrid-Repository---List-of-all-dramatic-texts.txt"><strong>here</strong> (in a .txt file)</a>.</p>
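<p>The figures 677, 679 and 666 may look confusing at first, but they result from applying the same two reductions in a different order. A quick sanity check, sketched in Python; the variable and function names are only illustrative, the numbers are the ones used in this post:</p>

```python
LABELLED_AS_DRAMA = 690   # starting figure of the subtraction formula in this post
MERGED_OR_REMOVED = 13    # multi-part plays bundled, non-German pieces deleted
REFERENCE_ONLY = 11       # co-author documents without the actual text


def remaining(total, *reductions):
    """Subtract any number of reductions from the starting total."""
    return total - sum(reductions)


# Order used in the main text: bundle first, then drop the doublets.
print(remaining(LABELLED_AS_DRAMA, MERGED_OR_REMOVED))                  # 677
# Order used by the list-generating query: doublets are dropped first.
print(remaining(LABELLED_AS_DRAMA, REFERENCE_ONLY))                     # 679
# Either way, both reductions together lead to the same result.
print(remaining(LABELLED_AS_DRAMA, MERGED_OR_REMOVED, REFERENCE_ONLY))  # 666
```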
<h2 id="conclusion">Conclusion</h2>
<p>Whenever you obtain a corpus on the web, one that you didn’t build yourself, you have to look into it closely to know your way around it. Trying to answer simple questions as we did in this blog post can help a great deal to lay the groundwork.</p>
<p>So now you’ve made it. This paragraph concludes this 30,000-character blog post. Tomorrow we will deliver a shorter piece revolving around inconsistent metadata and what you can do about it. Howgh!</p>
<p><a href="https://dlina.github.io/A-Not-So-Simple-Question/">A (Not So) Simple Question and a Somewhat Diabolic Answer</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on June 18, 2015.</p>https://dlina.github.io/Longest-German-Language-Theatre-Plays2015-06-08T00:00:00+02:002015-06-08T00:00:00-00:00Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io
<p>Ok, time for some Digital Humanities fun facts! We had another meeting today and, as always, were working our way through the vast <a href="http://www.textgridrep.de/">TextGrid Repository</a>. Since we’re only interested in the dramatic texts contained in the corpus, we had to find a way to automatically extract these kinds of texts, which isn’t as easy as it sounds. Anyway, we finally managed to do so and also wrote a small (well …) 30,000-character piece on the subject which is to appear later. For the time being, the extracted dramas can be found as single XML files <a href="https://github.com/DLiNa/project/tree/master/data/textgrid-repository-dramas">here on our GitHub</a>.</p>
<p>When we were looking at the files we had the quick idea to make a list of the <strong>top 10 longest German-language theatre plays contained in the TextGrid Repository</strong>. And here they are, measured by their file size:</p>
<ol>
<li>Holz, Arno: Ignorabimus (2.1 MB)</li>
<li>Schiller, Friedrich: Wallenstein (1.99 MB)</li>
<li>Fouqué, Friedrich de la Motte: Der Held des Nordens (1.88 MB)</li>
<li>Brentano, Clemens: Die Gründung Prags (1.81 MB)</li>
<li>Baggesen, Jens: Der vollendete Faust oder Romanien in Jauer (1.69 MB)</li>
<li>Hebbel, Friedrich: Die Nibelungen (1.61 MB)</li>
<li>Immermann, Karl: Alexis (1.49 MB)</li>
<li>Rosner, Ferdinand: Oberammergauer Passionspiel (1.48 MB)</li>
<li>Grabbe, Christian Dietrich: Herzog Theodor von Gothland (1.40 MB)</li>
<li>Arnim, Ludwig Achim von: Halle und Jerusalem (1.35 MB)</li>
</ol>
<p>At least two thirds of each file is TEI markup (wild guess). In some cases, the markup is really bloating the file size, so here is another version of our top 10, this time measured by the number of words inside <a href="http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-sp.html"><code class="highlighter-rouge"><sp></code></a> (since we’re talking about theatre plays here):</p>
<ol>
<li>Holz, Arno: Ignorabimus (100,283 words)</li>
<li>Arnim, Ludwig Achim von: Halle und Jerusalem (74,675 words)</li>
<li>Brentano, Clemens: Die Gründung Prags (70,672 words)</li>
<li>Fouqué, Friedrich de la Motte: Der Held des Nordens (63,074 words)</li>
<li>Schiller, Friedrich: Wallenstein (56,820 words)</li>
<li>Tieck, Ludwig: Prinz Zerbino oder die Reise nach dem guten Geschmack (56,759 words)</li>
<li>Holz, Arno: Sonnenfinsternis (53,909 words)</li>
<li>Rosner, Ferdinand: Oberammergauer Passionspiel (52,717 words)</li>
<li>Goethe, Johann Wolfgang: Faust. Der Tragödie zweiter Teil (46,180 words)</li>
<li>Müller, Friedrich (Maler Müller): Golo und Genovefa (45,904 words)</li>
</ol>
<p>As you can see, <a href="https://en.wikipedia.org/wiki/Arno_Holz">Arno Holz</a> rules them all! His monstrous naturalistic drama <em>Ignorabimus</em> from 1913 is a fair 500-pager, as <a href="http://d-nb.info/573829322">a quick glance at the catalogue of the German National Library</a> shows.</p>
<p>For the fans, this is our query for the second list, using eXist-db (“textgrid-repository-dramas” is the name of our collection):</p>
<figure class="highlight"><pre><code class="language-xquery" data-lang="xquery">xquery version "3.0";
declare namespace tei = "http://www.tei-c.org/ns/1.0";
(: Count the words inside tei:sp for every file, then sort by that count. :)
for $file in xmldb:get-child-resources('/db/data/textgrid-repository-dramas')
let $speeches := doc('/db/data/textgrid-repository-dramas/' || $file)//tei:sp
(: Split on non-word characters and drop the empty tokens. :)
let $words := count(tokenize(string-join($speeches), '\W+')[. != ''])
order by $words descending
return ($words, $file)</code></pre></figure>
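<p>If you’d rather try the counting on a single file outside of eXist-db, here is a rough Python equivalent using only the standard library. It is a sketch, not the query we ran: the function name and the sample document are made up, and since the regular expression flavours of XQuery and Python differ slightly, the counts are not guaranteed to match exactly.</p>

```python
import re
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"  # TEI namespace in ElementTree notation


def count_spoken_words(xml_source):
    """Count the word tokens found inside all <sp> elements of a document."""
    root = ET.fromstring(xml_source)
    # Concatenate the full text content of every <sp>, as string-join does.
    speeches = "".join("".join(sp.itertext()) for sp in root.iter(f"{TEI}sp"))
    # Split on runs of non-word characters and ignore empty tokens.
    return sum(1 for token in re.split(r"\W+", speeches) if token)


# Hypothetical miniature document; the stage direction sits outside <sp>.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <stage>Nacht.</stage>
      <sp>
        <speaker>FAUST</speaker>
        <p>Habe nun, ach! Philosophie</p>
      </sp>
      <sp>
        <speaker>WAGNER</speaker>
        <l>Verzeiht!</l>
      </sp>
    </body>
  </text>
</TEI>"""

print(count_spoken_words(SAMPLE))
```

<p>Note that, just like the query above, this counts everything inside <code class="highlighter-rouge">tei:sp</code>, including the speaker labels, while stage directions outside of a speech are ignored.</p>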
<p>Ok, there’s more where this came from, stay tuned! :-)</p>
<h3 id="update-one-hour-after-touchdown">Update (one hour after touchdown)</h3>
<p>Quickly answering a question raised by Nils <a href="https://twitter.com/umblaetterer/status/607945947348406273">on Twitter</a>: “Where is Karl Kraus: Die letzten Tage der Menschheit?!” Well, unfortunately, the <a href="http://de.wikipedia.org/wiki/Die_letzten_Tage_der_Menschheit">ultimate German-language mega drama</a> is not contained in the TextGrid Repository. But it would certainly crush all the other plays. We dug out the <a href="http://gutenberg.spiegel.de/">Gutenberg-DE DVD</a> and counted the words like this:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">w3m <span class="nt">-dump</span> <span class="nt">-I</span> <span class="s1">'iso-8859-1'</span> <span class="nt">-T</span> text/html letzttag.xml | wc <span class="nt">-w</span></code></pre></figure>
<p>This yielded <strong>187,696 words</strong>. In short: Karl Kraus beats Arno Holz any time. Please mind that we did not limit the Kraus word count to just the spoken words like we did with the XML files (by just counting the words uttered inside <code class="highlighter-rouge"><sp></code>). But even if we had to subtract a couple of thousand words, the result would remain the same.</p>
<p><a href="https://dlina.github.io/Longest-German-Language-Theatre-Plays/">Longest German-Language Theatre Plays</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on June 08, 2015.</p>https://dlina.github.io/Road-to-Sydney2015-05-08T00:00:00+02:002015-05-08T00:00:00-00:00Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io
<p>Met today to work on our stuff for <a href="http://dh2015.org/">Sydney</a>. Office panorama:</p>
<figure>
<img src="https://dlina.github.io/images/photos/2015-05-08_10'49_working_session_at_gcdh.jpg" alt="Office Panorama" style="width:56.25rem" />
</figure>
<p>Wanted to include a Sydney screenshot from International Karate (spirit of 1986!), but <strong><a href="http://csdb.dk/release/viewpic.php?id=108752&zoom=1">a link to the screenshot</a></strong> will do.</p>
<p><a href="https://dlina.github.io/Road-to-Sydney/">Road to Sydney</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on May 08, 2015.</p>https://dlina.github.io/Conference_in_Munich2015-03-10T00:00:00+01:002015-03-10T00:00:00-00:00Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilckehttps://dlina.github.io
<p>In a few days, on March 12/13, we’re taking part in a conference at the Bayerische Akademie der Wissenschaften, <strong>Computer-based analysis of drama and its uses for literary criticism and historiography</strong>:</p>
<ul>
<li>The CfP is <a href="http://dhd-blog.org/?p=3808">here</a>. The program can be found <a href="http://www.badw.de/de/veranstaltungen/_ergaenzungen/2015/402/2015_03_12_workshop-dennerlein_final.pdf">here (PDF)</a>.</li>
<li>Our presentation will be held on Thursday, 12 March 2015, 17:15, in German: <strong>Digitale Netzwerkanalyse dramatischer Texte</strong>.</li>
</ul>
<p>Update:</p>
<ul>
<li>The conference can be relived on Twitter: <strong><a href="https://twitter.com/search?q=%23CompDrama15">#CompDrama15</a></strong>.</li>
</ul>
<p><a href="https://dlina.github.io/Conference_in_Munich/">Conference in Munich</a> was originally published by Frank Fischer, Mathias Göbel, Dario Kampkaspar, Peer Trilcke at <a href="https://dlina.github.io">Network Analysis of Dramatic Texts</a> on March 10, 2015.</p>