This cooperative project brings two methods developed in computer science to empirical communication science, enabling semi-automated content analyses of even very large amounts of data.
In this project, we apply two methods developed in computer science, Few-Shot Learning and Argument Mining, to the field of empirical communication science. Automated content analysis (ACA) should thus be able to examine even large amounts of data with little manual coding effort.
The required approach and its technical implementation will be developed in a case study of positions and argument patterns on Twitter surrounding the COVID-19 pandemic.
As part of the project, we will provide scientific publications, best practices, software, and e-learning resources that will enable communication science to tap into these new technologies from computer science and develop them further according to its own disciplinary requirements.
In disseminating these new methods, our project focuses on early-career researchers, who will have the opportunity to acquire data competencies for ACA in methodological workshops.
Cover: Arno Senoner / unsplash
Content analysis is one of the central methods of empirical communication science. The ever-increasing amount and availability of digital public communication content make the (partial) automation of content analysis (ACA) imperative.
In computer science, two areas of intensive recent research hold enormous potential for ACA, and thus for increasing data literacy in communication science as a whole. First, pre-trained language models based on Transformer neural networks (e.g. BERT), and the Few-Shot (FS) text classification built on them, make it possible to identify content categories reliably with comparatively little training data. Second, argument mining methods enable the automatic coding of argument components and stances.
This addresses two central desiderata of current content analysis research: evaluating very large volumes of text and coding semantically complex categories.
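To make the few-shot idea described above more concrete, the following is a minimal sketch of one common few-shot approach: prototype (nearest-centroid) classification over text embeddings. It is not the project's implementation. The `embed` function is a toy bag-of-words stand-in for the vectors a pre-trained Transformer encoder such as BERT would supply, and the example sentences, the category names `pro_measures` and `contra_measures`, and all function names are invented purely for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a pre-trained encoder: bag-of-words counts.

    Assumption: in practice a Transformer model would produce dense vectors here.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_prototypes(examples):
    """Build one prototype per category by summing its example embeddings.

    Cosine similarity ignores vector length, so the sum behaves like the mean.
    """
    prototypes = {}
    for text, label in examples:
        prototypes.setdefault(label, Counter()).update(embed(text))
    return prototypes


def classify(text, prototypes):
    """Assign the category whose prototype is most similar to the text."""
    vec = embed(text)
    return max(prototypes, key=lambda label: cosine(vec, prototypes[label]))


# Hypothetical hand-coded examples: only two per category ("few shots").
few_shots = [
    ("masks protect other people", "pro_measures"),
    ("vaccines protect vulnerable people", "pro_measures"),
    ("lockdowns destroy small businesses", "contra_measures"),
    ("mandates restrict personal freedom", "contra_measures"),
]

prototypes = build_prototypes(few_shots)
prediction = classify("masks and vaccines protect people", prototypes)
print(prediction)
```

With a real pre-trained encoder in place of `embed`, the same structure would let a handful of manually coded examples per category stand in for the large training sets that conventional supervised classification requires.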