… canceled due to organisational difficulties on the part of the conference host.
This tutorial will be given by Nils Reiter, Gerhard Kremer, and Sarah Schulz;
it takes place at EADH in Galway, Ireland on December 7 (9:00-13:00).
Registration: Please register by Thursday, November 22nd, 11:59 AM (Berlin time) by e-mailing the organisers at <hackatorial@ims.uni-stuttgart.de>, and let us know which operating system you will work on (macOS / Windows / Linux).
The aim of this tutorial is to give participants concrete and practical insights into a standard case of automatic text analysis. Using the automatic recognition of entity references as an example, we will discuss general assumptions, procedures, and methodological standards in machine learning. By editing executable code, participants can explore and test the scope of such procedures.
There is no reason to blindly trust the results of machine learning tools in general and NLP tools in particular. Concrete insights into the “engine room” of machine learning methods allow participants to assess the potential and limitations of supervised text analysis tools more realistically. In the long run, we hope to reduce the recurrent frustration caused by automatic text analysis techniques and their sometimes less than satisfactory results, and thus to promote the informed use and interpretation of the output of machine learning models. Using such output adequately in hermeneutic interpretation requires insight into the technical details that influence it. In particular, the type and origin of the training data are important for the quality of the machine-annotated data, as we will make clear in the tutorial.
In addition to a Python program for the automatic annotation of entity references, which we will work with during the tutorial, we provide a heterogeneous, manually annotated corpus as well as routines for evaluating and comparing annotations.
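To give a rough idea of what comparing annotations can look like, here is a minimal sketch that scores automatically produced entity references against manual ones using precision, recall, and F1 on exact matches. It is not the tutorial's actual evaluation routine: the `Span` type, the `evaluate` function, the labels, and the example data are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, NamedTuple, Tuple


class Span(NamedTuple):
    """A hypothetical entity reference: token offsets plus a class label."""
    start: int
    end: int
    label: str


def evaluate(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) based on exact span-and-label matches."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: the second span has the right boundaries but the wrong label.
gold = [Span(0, 2, "PER"), Span(5, 6, "LOC"), Span(9, 11, "ORG")]
predicted = [Span(0, 2, "PER"), Span(5, 6, "ORG"), Span(9, 11, "ORG")]

precision, recall, f1 = evaluate(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the strictest choice; a more lenient comparison (for example, counting overlapping spans as partial matches) would yield different numbers for the same annotations, which is one reason why evaluation settings should be reported alongside the results.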