New article: The analysis of “soft” concepts with “hard” corpus-analytical methods

The Social Science working unit recently published a new methodological article on analyzing complex theoretical concepts with corpus-analytic and computational-linguistic methods.
We identify three fundamental challenges impeding the methodological quality and long-term reputation of these promising new technologies. First, generating and pre-processing very large text corpora is still a laborious and costly enterprise. Second, social scientists want to learn from text about societal context and reconstruct meaning along the lines of complex theoretical concepts. The semantically valid operationalization of complex social-scientific concepts, however, remains a problem. Third, scholars need flexible data output and visualization options to connect the data generated by corpus-linguistic methods with the discipline’s existing research. Many tools designed for linguistic research questions do not provide options suitable for social-scientific research. We conclude that it is possible to solve these problems; however, hermeneutically sensitive uses of computational-linguistic methods will take much more time, work and creativity than is often assumed. Moreover, there can be no one-size-fits-all solution to these problems. Social scientists need and want to decide methodological questions in light of their often highly specific research questions. The process of reflectively appropriating big-data methods in the social sciences has only just begun.

This research article by Cathleen Kantner and Maximilian Overbeck is part of the volume ‘Computational Social Science’, edited by Andreas Blätte, Joachim Behnke, Kai-Uwe Schnapp and Claudius Wagemann: