On 23.10.2020 the eHumanities Center CRETA entered a new phase with the foundation of an association (a Verein under German law). On the one hand, the founding of the association takes into account the fact that scientists beyond Stuttgart are now working on similar goals and CRETA is connected to them in many ways, for example through joint projects. On the other hand, some CRETA members no longer work at Stuttgart University or will soon leave it. CRETA is thus being transformed into a decentralized center, which will operate as an association in future. The association will continue many of the activities, such as coaching, workshops, and hackatorials.
… canceled due to conference-host-based organisational difficulties.
This tutorial will be given by Nils Reiter, Gerhard Kremer, and Sarah Schulz;
it takes place at EADH in Galway, Ireland on December 7 (9:00-13:00).
Registration: Please register by noon on Thursday, November 22nd (11:59 AM Berlin time) by e-mailing the organisers at <hackatorialims.uni-stuttgart.de>, and let us know which operating system you will work on (macOS / Windows / Linux).
The aim of this tutorial is to give the participants concrete and practical insights into a standard case of automatic text analysis. Using the example of automatic recognition of entity references, we will discuss general assumptions, procedures, and methodological standards in machine learning. Participants can explore and test the scope of such procedures by editing executable code.
There is no reason to blindly trust the results of machine learning tools in general and NLP tools in particular. Concrete insights into the “engine room” of machine learning methods allow participants to assess the potential and limitations of supervised text analysis tools more realistically. In the long run, we hope to prevent the recurring frustration caused by the sometimes less than satisfactory results of automatic text analysis techniques, and thus to promote the use and interpretation of the results of machine learning models. Using these models adequately in hermeneutic interpretation steps requires insight into the technical details that influence them. In particular, the type and origin of the training data are important for the quality of the machine-annotated data, as we will make clear in the tutorial.
In addition to a Python program for automatic annotation of entity references, with which we will work during the tutorial, we provide a heterogeneous, manually annotated corpus as well as the routines for evaluating and comparing annotations.
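To illustrate what a routine for evaluating and comparing annotations might look like, the following is a minimal sketch and not the tutorial's actual code. It assumes that each entity reference is stored as a token span with a class label, and that a predicted span only counts as correct if it matches a manually annotated span exactly. The `Span` type, the `evaluate` function, the labels PER, LOC and ORG, and the example spans are all hypothetical.

```python
from typing import Dict, NamedTuple, Set


class Span(NamedTuple):
    """One annotated entity reference (hypothetical representation)."""
    start: int  # index of the first token
    end: int    # index one past the last token
    label: str  # entity class; the labels used below are assumptions


def evaluate(gold: Set[Span], predicted: Set[Span]) -> Dict[str, float]:
    """Exact-match precision, recall and F1 between two annotation sets."""
    # A prediction is correct only if start, end and label all agree.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Invented example: manual (gold) annotations versus machine output.
gold = {Span(0, 2, "PER"), Span(5, 6, "LOC"), Span(9, 11, "ORG")}
predicted = {
    Span(0, 2, "PER"),    # correct
    Span(5, 6, "PER"),    # right span, wrong label
    Span(9, 11, "ORG"),   # correct
    Span(12, 13, "LOC"),  # not in the manual annotation
}

scores = evaluate(gold, predicted)
print(scores)
```

Exact matching is the strictest choice. A more lenient comparison, for example one that gives credit for overlapping spans, would produce higher scores on the same data, which is one reason to check how a reported score was computed.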
On October 11, CRETA member Roman Klinger gave a keynote talk titled “Emotion Analysis: between Academia, Industry, Linguistics, Humanities, and Computer Science” at the AI2Future unconference in Zagreb, Croatia.
Emotion analysis is an obvious extension of the popular task of sentiment analysis; however, methods and applications differ. In the first part of this talk, I provide a brief overview of the psychological background of emotions and their purpose, and of how this leads to challenges when they are estimated from text. I highlight open research questions and possible research directions. The second part is concerned with applications in a variety of areas: What can emotion analysis contribute to the humanities, for instance literary studies? Finally, I will briefly report on use cases for emotion analysis in a text analysis platform for direct use by customers.
As part of the late summer school “Machine learning for language analysis”, Nils Reiter gave an introduction to machine learning for reflected text analysis at Cologne University. Part of the tutorial was a shared task on finding entity references in the CRETA data sets. The detailed agenda as well as supplementary material can be found here.
The workshop introduces the concept of reflected text analytics and covers various relevant topics to that end. The core idea is to “lift the veil”: the participants learn both theoretical concepts and their practical implementation in real code, so that they are able to apply the learned concepts to their own research questions. Topics of the workshop will be: annotation and concept development through annotation, programming with Python for text processing and machine learning, and machine learning in theory and application. Participants work on their own programs and data, write code and train models by themselves (under guidance). Previous knowledge is not required, but participants need a laptop and an internet connection.
The CUTE evaluation data set is now available. The data set consists of additional texts in the four genres (epistolary novel, medieval Arthurian legend, parliamentary debate, philosophical text).
The publicly available data set can be downloaded here (without the Adorno text, due to copyright). If you have registered for CUTE before, you can re-use your download link to receive the full archive (incl. Adorno) with both the training and test data. You can also register anew here.
Submissions to the CUTE shared task (track 1) should be in either XMI or CoNLL format. Please send your annotated files to <cuteims.uni-stuttgart.de> by Monday, Dec. 5th and specify a) which genre you worked on (if not all of them) and b) which kind of entity you annotated (if not all of them).
CRETA will be present at the DH2016 conference with two presentations and one workshop.
In the paper titled Automatic Emotion Detection for Quantitative Literary Studies — A case study based on Franz Kafka’s “Das Schloss” and “Amerika”, Roman Klinger, Surayya Samat Suliya and Nils Reiter present a dictionary for emotions and first experiments on the automatic detection of emotions in literary texts. The paper can be downloaded here; the dictionaries as well as the code are publicly available here.
In the paper titled Authorship attribution of Mediaeval German Text: style and contents in Apollonius von Tyrland, Sarah Schulz, Jonas Kuhn and Nils Reiter present stylometric work on the authorship of Heinrich von Neustadt’s Apollonius von Tyrland. The paper is available here.
While digitisation efforts are still under way, the workshop From Digitization to Knowledge (of which Nils Reiter is one of the organisers) focuses on the question of what to do with the digitised artefacts, as digitised artefacts are not supposed to be the end result of our efforts. Instead, we are interested in analysing, reading or understanding them.
The workshop takes place on July 11th, starting at 9:30 am.