In the special issue “Digitale Mediävistik” of the journal “Das Mittelalter. Perspektiven mediävistischer Forschung”, an article on Social Network Analysis of Middle High German Arthurian romances will soon be published. The article addresses the question of the relationship between fairy tales and Arthurian romances, aiming to redefine it systematically and methodologically. To this end, we first identify properties of the European folktale, which we then operationalize for computational analysis and apply to a text corpus of classic Arthurian romances (Hartmann von Aue’s ‚Erec‘ and ‚Iwein‘, Wolfram von Eschenbach’s ‚Parzival‘).
The investigation is carried out using data-driven methods, primarily Social Network Analysis, and focuses on various aspects of the characters. In this way, we gain a differentiated understanding of the relation between Arthurian romances and the ‘simple form’ of the fairy tale on the one hand, and of the differences among the selected texts on the other. We show that the complex results of the statistical analyses resist clear-cut interpretations and thus provide new insights into these well-known objects.
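Character-level network measures of the kind mentioned above can be illustrated with a small sketch: the snippet below builds a weighted co-occurrence network from segment-wise character lists and derives each character’s degree. The segments and character names here are invented for illustration; the article’s actual extraction pipeline and measures are not reproduced.

```python
from collections import Counter
from itertools import combinations

# Hypothetical segment-wise character occurrences (illustrative only).
segments = [
    {"Erec", "Enite"},
    {"Erec", "Artus", "Ginover"},
    {"Erec", "Enite", "Guivreiz"},
    {"Iwein", "Artus"},
]

# Weighted co-occurrence network: an edge links two characters whenever
# they appear in the same segment; the weight counts shared segments.
edges = Counter()
for seg in segments:
    for pair in combinations(sorted(seg), 2):
        edges[pair] += 1

# Degree (number of distinct neighbours) as a simple character-level measure.
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

print(edges.most_common(3))
print(degree.most_common(3))
```

On such a network, standard measures (degree, weighted degree, centralities) can then be computed per character and compared across texts.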
The special issue “Digitale Mediävistik”, including this article by Manuel Braun and Nora Ketschik, is expected to be published in June 2019.
The proposal for an extension of CRETA was recently accepted. From January, the Center for Reflected Text Analytics will be funded for two more years by the German Federal Ministry of Education and Research.
… canceled due to organisational difficulties on the conference host’s side.
This tutorial will be given by Nils Reiter, Gerhard Kremer, and Sarah Schulz; it takes place at EADH in Galway, Ireland, on December 7 (9:00–13:00).
Registration: Please register by Thursday, November 22nd, high noon (11:59 AM Berlin time) via our organisers’ e-mail address <hackatorial@ims.uni-stuttgart.de>, letting us know which operating system you will work on (macOS / Windows / Linux).
The aim of this tutorial is to give participants concrete, practical insights into a standard case of automatic text analysis. Using the example of automatic recognition of entity references, we will discuss general assumptions, procedures, and methodological standards in machine learning. Participants can explore and test the scope of such procedures by editing executable program code.
There is no reason to blindly trust the results of machine learning tools in general and NLP tools in particular. Concrete insights into the “engine room” of machine learning methods allow participants to assess the potential and limitations of supervised text analysis tools more realistically. In the long term, we hope to prevent the recurrent frustration caused by automatic text analysis techniques and their sometimes less than satisfactory results, and thus to promote the use and interpretation of the results of machine learning models. For their adequate use in hermeneutic interpretation steps, insight into influential technical details is indispensable. In particular, the type and origin of the training data are important for the quality of the machine-annotated data, as we will make clear in the tutorial.
In addition to a Python program for the automatic annotation of entity references, which we will work with during the tutorial, we provide a heterogeneous, manually annotated corpus as well as routines for evaluating and comparing annotations.
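Evaluating and comparing annotations, as mentioned above, is commonly done by matching predicted spans against gold-standard spans. The following sketch computes exact-match precision, recall, and F1 over (start, end) offsets; the function name and data format are illustrative assumptions, not the tutorial’s actual evaluation routines.

```python
def evaluate_spans(gold, predicted):
    """Exact-match precision, recall, and F1 for two sets of
    (start, end) annotation spans. Illustrative sketch only."""
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = {(0, 4), (10, 15), (20, 27)}
pred = {(0, 4), (10, 14), (20, 27)}  # one boundary error
p, r, f = evaluate_spans(gold, pred)
print(f"P={p:.2f} R={r:.2f} F1={f:.2f}")
```

Exact matching is deliberately strict: the predicted span (10, 14) does not count as correct even though it overlaps the gold span (10, 15); partial-match variants of the evaluation relax this constraint.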
The Social Science working unit recently published a new methodological article on the analysis of complex theoretical concepts using corpus-analytic and computational-linguistic methods.
We identify three fundamental challenges impeding the methodological quality and long-term reputation of these promising new technologies. First, generating and pre-processing very large text corpora is still a laborious and costly enterprise. Second, social scientists want to learn from text about societal context and to reconstruct meaning along the lines of complex theoretical concepts; the semantically valid operationalization of complex social-scientific concepts, however, remains a problem. Third, scholars need flexible data output and visualization options to connect the data generated by corpus-linguistic methods with the discipline’s existing research; many tools designed for linguistic research questions do not provide options suitable for social-scientific research. We conclude that it is possible to solve these problems; however, hermeneutically sensitive uses of computational-linguistic methods will take much more time, work, and creativity than often assumed. Moreover, there can be no one-size-fits-all solutions to these problems. Social scientists need and want to decide upon methodological questions in the light of their oftentimes highly specific research questions. The process of reflectively appropriating big-data methods in the Social Sciences has only just begun.
On October 11, CRETA member Roman Klinger gave a keynote talk titled “Emotion Analysis: between Academia, Industry, Linguistics, Humanities, and Computer Science” at the AI2Future unconference in Zagreb, Croatia.
Emotion analysis is an obvious extension of the popular task of sentiment analysis; however, methods and applications differ. In the first part of this talk, I provide a brief overview of the psychological background on emotions and their purpose, and of how this leads to challenges when they are estimated from text. I highlight open research questions and possible research directions. The second part is concerned with applications in a variety of areas: what can emotion analysis contribute to the humanities, for instance literary studies? Finally, I will briefly report on use cases for emotion analysis in a text analysis platform for direct use by customers.
As part of the late summer school “Machine Learning for Language Analysis”, Nils Reiter gave an introduction to machine learning for reflected text analysis at the University of Cologne. Part of the tutorial was a shared task on finding entity references in the CRETA data sets. The detailed agenda as well as supplementary material can be found here.
The workshop introduces the concept of reflected text analytics and covers various topics relevant to that end. The core idea is to “lift the veil”: participants learn both theoretical concepts and their practical implementation in real code, so that they are able to apply what they have learned to their own research questions. Topics of the workshop are: annotation and concept development through annotation, programming with Python for text processing and machine learning, and machine learning in theory and application. Participants work on their own programs and data, writing code and training models themselves (under guidance). Previous knowledge is not required, but a laptop and an internet connection are.