SPARTA: AI-supported analysis of real-time Twitter data
SPARTA stands for “Society, Politics and Risk with Twitter Analysis”.
In this project, a team from IABG's Innovation Centre is working together with experts from the Institute of Political Science at the University of the Federal Armed Forces in Munich on a platform and various procedures for analysing social media data. The use cases "Elections" and "Violent Riots" are considered as examples.
This interdisciplinary team of political scientists and computer scientists has built a state-of-the-art infrastructure for nowcasting and ex-post analysis of Twitter data. To this end, methods for network analysis, statistical analysis and natural language processing, such as stance detection, were integrated and further developed.
The cooperation in the SPARTA project offers a unique opportunity to apply innovative data-driven approaches to questions in the social and political sciences. Its overarching goal is to make these approaches accessible to smaller companies and to researchers without IT resources, in the form of an open construction kit with a no-code/low-code modelling option.
Use Case Elections
So far, the Elections use case has accompanied the 2021 federal election and the 2022 state election in North Rhine-Westphalia (NRW) with social media analyses. Next up is the state election in Bavaria in October 2023.
Nowcasting provides insight into the digital election campaign in the weeks before an election. For this purpose, several stages were developed for a nowcasting pipeline, for example to recognise named entities in a tweet in real time (Named Entity Recognition) or to split the tweet into separate pieces of text (Chunking). A key processing stage tracks the attitude of Twitter users towards the parties and their leading candidates (stance detection). Further stages statistically analyse the level of activity and the most frequently mentioned topics and hashtags related to the election campaign.
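The hashtag statistics stage can be illustrated with a minimal Python sketch. It is not the SPARTA implementation: the function names, the regular expression and the sample tweets are assumptions made for illustration only.

```python
import re
from collections import Counter

# Assumption: a hashtag is "#" followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the hashtags of one tweet, lower-cased and without '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


def top_hashtags(tweets, n=3):
    """Count hashtags over an iterable of tweet texts and return the n most frequent."""
    counts = Counter()
    for text in tweets:
        counts.update(extract_hashtags(text))
    return counts.most_common(n)


if __name__ == "__main__":
    # Made-up sample tweets, not project data.
    sample = [
        "Vote on Sunday! #BTW21 #Wahl",
        "#btw21 debate tonight",
        "A tweet without any tags",
    ]
    print(top_hashtags(sample, n=2))
```

In a real-time setting the same counter could be updated per incoming tweet rather than per batch; that choice is left open here.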
Beyond these analyses, methods were implemented to classify toxicity in tweets using a language model (Toxicity Classification) and to detect hate speech using a lexicon-based approach (Hate Speech Detection). Various network analyses were also implemented, based on retweets, hashtags and the semantic content of the tweets.
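A lexicon approach to hate speech detection can be sketched in a few lines of Python. The lexicon entries, the tokenisation rule and the `min_hits` threshold below are placeholders chosen for illustration; the lexicon and matching rules actually used in the project are not described here.

```python
import re

# Assumption: tokens are runs of word characters; real systems may
# need language-specific tokenisation and normalisation.
TOKEN_RE = re.compile(r"\w+")


def lexicon_hits(text, lexicon):
    """Return every token of the tweet that appears in the lexicon."""
    tokens = TOKEN_RE.findall(text.lower())
    return [token for token in tokens if token in lexicon]


def flag_tweet(text, lexicon, min_hits=1):
    """Flag a tweet when it contains at least min_hits lexicon terms."""
    return len(lexicon_hits(text, lexicon)) >= min_hits


if __name__ == "__main__":
    # Placeholder lexicon, not the one used in the project.
    lexicon = {"scum", "vermin"}
    print(flag_tweet("These people are SCUM!", lexicon))
    print(flag_tweet("A calm policy debate", lexicon))
```

Exact matching of this kind is transparent and fast, but it misses spelling variants and context, which is one reason to pair it with a model-based toxicity classifier.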
Election campaigns are generally emotionalised and personalised. Social media intensifies this, as could be observed on Twitter during the 2021 Bundestag election and the 2022 NRW state election. Among other things, it became visible in the large number of attacking hashtags and the rather negative attitude towards politicians and parties.
The analyses also showed that some parties are more adept at digital campaigning than others and use social platforms in a correspondingly targeted manner.
More details about the project, along with evaluations of the 2021 federal election and the 2022 state election in NRW, can be found on the SPARTA project website.
Use Case Riots
In a second use case, violent riots are analysed on the basis of historical Twitter data. So far, the analysis has focused on the "Capitol Hill Riots" in the USA at the beginning of January 2021.
First, hashtags were selected to find the most relevant discourses in the period before and after the storming of the Capitol (November 2020 to January 2021). Hashtag networks were then built from this collection of hashtags to identify ideological communities. A descriptive analysis of the networks followed, for example to determine the most influential user in a network and the number of tweets over the time period.
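One way to picture a hashtag network is as a co-occurrence graph, where two hashtags are linked whenever they appear in the same tweet. The Python sketch below builds such a graph and groups hashtags by connected components. Connected components are a deliberately crude stand-in for community detection; the algorithm used in the project is not stated here, and all names and data are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(tweet_hashtags):
    """Map each unordered hashtag pair to the number of tweets containing both."""
    weights = defaultdict(int)
    for tags in tweet_hashtags:
        for a, b in combinations(sorted(set(tags)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def connected_groups(weights):
    """Group hashtags that are linked directly or indirectly (simplified communities)."""
    neighbours = defaultdict(set)
    for a, b in weights:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    groups = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        group = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(neighbours[node] - seen)
        groups.append(group)
    return groups


if __name__ == "__main__":
    # Made-up hashtag lists, one list per tweet.
    tweets = [["a", "b"], ["b", "c"], ["x", "y"], ["a", "b"]]
    weights = build_cooccurrence(tweets)
    print(weights)
    print(connected_groups(weights))
```

On real data almost all hashtags end up in one large component, so a modularity-based method from a graph library would usually be a better fit for separating ideological communities.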
Language analyses were then carried out for these communities to highlight the issues specific to each group. The tweets were also analysed for toxic language. In the week after the riots, more toxic tweets were found in the milieu of the right-wing communities.
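A week-by-week comparison of toxic language can be computed as the share of tweets per calendar week whose toxicity score exceeds a threshold. The following Python sketch assumes that each tweet already carries a score between 0 and 1 from some classifier; the threshold of 0.5, the function name and the sample values are illustrative assumptions, not project results.

```python
from collections import defaultdict
from datetime import date


def weekly_toxic_share(scored_tweets, threshold=0.5):
    """Return {(iso_year, iso_week): share of tweets with score >= threshold}."""
    totals = defaultdict(int)
    toxic = defaultdict(int)
    for day, score in scored_tweets:
        iso = day.isocalendar()
        key = (iso[0], iso[1])  # ISO year and ISO week number
        totals[key] += 1
        if score >= threshold:
            toxic[key] += 1
    return {key: toxic[key] / totals[key] for key in sorted(totals)}


if __name__ == "__main__":
    # Made-up dates and scores, purely to show the calculation.
    sample = [
        (date(2021, 1, 4), 0.9),
        (date(2021, 1, 6), 0.1),
        (date(2021, 1, 11), 0.8),
    ]
    print(weekly_toxic_share(sample))
```

Running the same function separately for the tweets of each community would allow the shares to be compared across groups and over time.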
In addition to quantitative analysis, the project also focuses on distinguishing facts from opinions. This matters because it can help improve sentiment analysis and lead to a better representation of online discourse.