On 23.10.2020 the eHumanities Center CRETA entered a new phase with the foundation of an association (a Verein under German law). On the one hand, the association reflects the fact that researchers beyond Stuttgart are now working towards similar goals and that CRETA is connected to them in many ways, for example through joint projects. On the other hand, some CRETA members no longer work at Stuttgart University or will soon leave it. CRETA is thus being transformed into a decentralized center, which will operate as an association in future. The association will continue many of the activities: coaching, workshops, hackatorials.
The approach to the “Visual Comparison of Networks”, which the working group of visualization experts describes in the section Tools and Demos, was accepted for presentation at the PacificVis conference (April 23-26 in Bangkok).
This approach supports the analysis of how a narrative text’s characters and their relationships evolve over the course of a storyline. To this end, a series of graphs representing the character constellations in several different passages of the text can be displayed in several visual forms and thus compared with each other. The text passages in turn are interleaved with these visualizations, so that the way characters are named can be considered within its immediate context. The characters’ relationships can be further characterized by means of a summary of this context. For large and densely interconnected character sets, analysts can interact with the visualizations to filter and focus the elements of the graph so that the partial structures of interest become evident.
In two usage scenarios, it was demonstrated how a literature scholar might tackle a series of typical analysis tasks by means of the approach at hand. The scenarios were based on the one hand on a novel in modern English that had been enriched with automatically extracted character annotations, and on the other hand on a Middle High German text in which the characters had been annotated manually. The tasks demonstrated by the example of these texts comprised:
Correction of mistakes resulting from the automatic character extraction.
Quick apprehension of a character’s characteristics and of its function in the storyline’s complex of actions.
Identification of groups of characters that appear predominantly within one of the selected passages and of central “bridge characters” who connect these groups.
Characterization of relationships that central characters maintain with the others.
Verification of the hypothesis that the character graph changes substantially over the course of a series of text passages.
Verification of the hypothesis that these successive constellations are only interconnected by a few central characters.
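The last three tasks can be illustrated with a minimal Python sketch. It is not the implementation from the article. It assumes that two characters are linked when they are mentioned in the same sentence, that the difference between two passages is measured by the Jaccard similarity of their edge sets, and that a "bridge character" is any character occurring in the graphs of at least two passages. The function names and the toy data are invented for illustration.

```python
from itertools import combinations


def build_graph(sentences):
    """Return the set of undirected edges for one text passage.

    `sentences` is a list of lists of character names. Assumption of this
    sketch: two characters are linked if they share a sentence.
    """
    edges = set()
    for names in sentences:
        # Sort the names so that each undirected edge has one canonical form.
        for a, b in combinations(sorted(set(names)), 2):
            edges.add((a, b))
    return edges


def nodes(edges):
    """Return all character names that take part in at least one edge."""
    return {name for edge in edges for name in edge}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def bridge_characters(graphs):
    """Characters that occur in the graphs of at least two passages."""
    counts = {}
    for edges in graphs:
        for name in nodes(edges):
            counts[name] = counts.get(name, 0) + 1
    return {name for name, n in counts.items() if n >= 2}


# Invented toy data: two passages, each a list of sentences.
passage_1 = [["Anna", "Ben"], ["Ben", "Clara"], ["Anna"]]
passage_2 = [["Clara", "David"], ["David", "Eva"], ["Clara", "Eva"]]

g1 = build_graph(passage_1)
g2 = build_graph(passage_2)

print("edge overlap:", jaccard(g1, g2))
print("bridge characters:", bridge_characters([g1, g2]))
```

In this toy example the two passages share no edges, and only one character connects them. That is the pattern the two hypotheses above describe: a substantially changing graph whose successive constellations are held together by a few central characters.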
The article by Markus John, Martin Baumann, David Schuetz, Steffen Koch, and Thomas Ertl will appear at PacificVis under the title “A Visual Approach for the Comparative Analysis of Character Networks in Narrative Texts”.
In the special issue “Digitale Mediävistik” of the journal “Das Mittelalter. Perspektiven mediävistischer Forschung”, an article about Social Network Analysis of Middle High German Arthurian romances will soon be published. The article takes up the question of the relationship between fairy tales and Arthurian romances and aims to redefine it systematically and methodically. For this purpose, we first identify properties of the European folktale, which we then operationalize for computational analysis and apply to a text corpus of classic Arthurian romances (Hartmann von Aue’s ‘Erec’ and ‘Iwein’, Wolfram von Eschenbach’s ‘Parzival’).
The investigation is carried out using data-driven methods, primarily Social Network Analysis, and focusses on various aspects of the characters. In this way, we gain a differentiated understanding of the relation between Arthurian romances and the ‘simple form’ of fairy tales on the one hand, and of the differences within the selected texts on the other. We show that the complex results of the statistical analyses resist clear-cut interpretation and thus provide new insights into these well-known texts.
The special issue “Digitale Mediävistik”, which includes this article by Manuel Braun and Nora Ketschik, is expected to be published in June 2019.
The proposal for an extension of CRETA was recently accepted. From January, the Center for Reflected Text Analytics will be funded for 2 more years by the German Federal Ministry of Education and Research.
… canceled due to organisational difficulties on the part of the conference host.
This tutorial will be given by Nils Reiter, Gerhard Kremer, and Sarah Schulz;
it takes place at EADH in Galway, Ireland on December 7 (9:00-13:00).
Registration: Please register by noon on Thursday, November 22 (11:59 AM Berlin time) using our organizers’ e-mail address <hackatorialims.uni-stuttgart.de>, letting us know which operating system you will work on (MacOS / Windows / Linux).
The aim of this tutorial is to give the participants concrete and practical insights into a standard case of automatic text analysis. Using the example of automatic recognition of entity references, we will discuss general assumptions, procedures, and methodological standards in machine learning. The participants can explore and test the scope of such procedures by editing executable program code.
There is no reason to blindly trust the results of machine learning tools in general and NLP tools in particular. Concrete insights into the “engine room” of machine learning methods allow participants to assess the potential and limitations of supervised text analysis tools more realistically. In the longer term, we hope to avoid the recurrent frustrations caused by automatic text analysis techniques and their sometimes less than satisfactory results, and thus to promote the use and interpretation of the results of machine learning models. For their adequate use in hermeneutic interpretation steps, insight into influential technical details is indispensable. In particular, the type and origin of the training data are important for the quality of the machine-annotated data, as we will make clear in the tutorial.
In addition to a Python program for automatic annotation of entity references, with which we will work during the tutorial, we provide a heterogeneous, manually annotated corpus as well as the routines for evaluating and comparing annotations.
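To give an impression of what evaluating and comparing annotations can involve, here is a minimal Python sketch. It is not the tutorial's evaluation routine. It assumes that each entity reference is a triple of start token, end token, and label, and that a predicted reference counts as correct only if it matches a manually annotated one exactly. The labels and the toy data are invented for illustration.

```python
def evaluate(gold, predicted):
    """Return (precision, recall, F1) for exact-match entity references.

    Assumption of this sketch: an annotation is a (start, end, label)
    triple, and only exact matches count as true positives.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented toy data: manual (gold) and automatic (predicted) annotations.
gold = [(0, 2, "PER"), (5, 6, "LOC"), (9, 11, "PER"), (14, 15, "ORG")]
predicted = [(0, 2, "PER"), (5, 6, "PER"), (9, 11, "PER"), (20, 21, "LOC")]

precision, recall, f1 = evaluate(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is a strict choice: the reference at tokens 5 to 6 has the right boundaries but the wrong label and therefore counts as an error. More lenient schemes, such as partial overlap or ignoring labels, would yield different scores for the same data, which is one reason to look at how an evaluation was computed before trusting its numbers.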
The Social Science working unit recently published a new methodological article on the analysis of complex theoretical concepts via the use of corpus analytic and computational linguistic methods.
We identify three fundamental challenges impeding the methodological quality and long-term reputation of these promising new technologies. First, generating and pre-processing very large text corpora is still a laborious and costly enterprise. Second, social scientists want to learn from text about societal context and reconstruct meaning along the lines of complex theoretical concepts. The semantically valid operationalization of complex social-scientific concepts, however, remains a problem. Third, scholars need flexible data output and visualization options to connect the data generated by corpus-linguistic methods with the discipline’s existing research. Many tools designed for linguistic research questions do not provide options suitable for social-scientific research. We conclude that it is possible to solve these problems; however, hermeneutically sensitive uses of computational-linguistic methods will take much more time, work and creativity than is often assumed. Moreover, there can be no one-size-fits-all solutions to these problems. Social scientists need and want to decide upon methodological questions in the light of their often highly specific research questions. The process of reflectively appropriating big-data methods in the Social Sciences has only just begun.
On October 11, CRETA member Roman Klinger gave a keynote talk titled Emotion Analysis: between Academia, Industry, Linguistics, Humanities, and Computer Science at the AI2Future unconference in Zagreb, Croatia.
Emotion analysis is an obvious extension of the popular task of sentiment analysis; however, methods and applications differ. In the first part of this talk, I provide a brief overview of the psychological background of emotions and their purpose, and of how this leads to challenges when they are estimated from text. I highlight open research questions and possible research directions. The second part is concerned with applications in a variety of areas: What can emotion analysis contribute to the humanities, for instance literary studies? Finally, I will briefly report on use cases for emotion analysis in a text analysis platform for direct use by customers.
As part of the late summer school “machine learning for language analysis”, Nils Reiter gave an introduction to machine learning for reflected text analysis at Cologne University. Part of the tutorial was a shared task for finding entity references on the CRETA data sets. The detailed agenda as well as supplementary material can be found here.
The workshop introduces the concept of reflected text analytics and covers various relevant topics to that end. The core idea is to “lift the veil”: the participants learn both theoretical concepts and their practical implementation in real code, so that they are able to apply the learned concepts to their own research questions. Topics of the workshop will be: annotation and concept development through annotation, programming with Python for text processing and machine learning, and machine learning in theory and application. Participants work on their own programs and data, write code and train models by themselves (under guidance). Previous knowledge is not required, but a laptop and an internet connection are.