Exploring the United States Presidential Debates via Text Mining

Thomas Schmidt
Anastasiia Mosiienko
Christian Wolff
Text Mining, Politics, Presidential Debates, Digital Humanities, Sentiment Analysis, Topic Modeling


The United States presidential debates are an important part of political discourse and have considerable influence on the United States elections every four years. In recent years, the language and discourse of these debates have been the subject of numerous studies in the humanities as well as in computational text analysis. However, existing work focuses on the debates of one specific election cycle or on a limited set of debates. To our knowledge, there is no research on the diachronic analysis of presidential debates throughout the decades with modern state-of-the-art text mining techniques.

Objectives of the Thesis

We want to examine the United States presidential debates since their initiation in the 1960s using modern methods of text mining. To this end, we first want to create a structured corpus of all debates. We are primarily interested in the following research questions:

  • How did the language and content of these debates change over time? Are there diachronic developments?
  • Can we identify general and diachronic developments and differences in the language and content of the presidential candidates of the two major parties, the Democrats and the Republicans?

To pursue these research questions, we intend to apply text mining methods such as sentiment analysis and topic modeling, among others.
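To illustrate what a party-level sentiment comparison could look like, the following is a minimal sketch of a lexicon-based approach. Everything in it is an assumption made for illustration: the tiny word lists, the record fields (`year`, `party`, `text`), the function names, and the example utterances are invented and are not part of the planned corpus or method. A real study would likely rely on an established sentiment lexicon or a trained model.

```python
from collections import defaultdict
import re

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"great", "strong", "hope", "good"}
NEGATIVE = {"bad", "weak", "fear", "crisis"}


def sentiment_score(text):
    """Return (positive hits - negative hits) / number of tokens; 0.0 for empty text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def mean_sentiment_by_group(utterances):
    """Average the utterance scores for each (year, party) pair."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for u in utterances:
        key = (u["year"], u["party"])
        sums[key] += sentiment_score(u["text"])
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Invented example utterances (assumed record layout: year, party, text).
utterances = [
    {"year": 1960, "party": "D", "text": "We have good hope"},
    {"year": 1960, "party": "R", "text": "A weak plan is bad"},
    {"year": 2016, "party": "D", "text": "Strong and good"},
]

for (year, party), score in sorted(mean_sentiment_by_group(utterances).items()):
    print(year, party, round(score, 3))
```

Normalizing by token count keeps long and short speaking turns comparable. Averaging per (year, party) gives one value per party and election year, which can then be plotted as a time series.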

Specific Tasks

  • Review of related work
  • Development of a research agenda
  • Acquisition and creation of an adequate corpus for the task at hand
  • Diachronic analysis of the data via text mining methods such as n-gram analysis, sentiment analysis, lexical categories, or topic modeling
  • Report and visualization of the results
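The diachronic n-gram analysis named in the task list can be sketched as follows. This is a minimal illustration under stated assumptions, not the intended implementation: the record layout (`year`, `text`), the function names, the grouping by decade, and the two example texts are all hypothetical.

```python
from collections import Counter, defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_share_by_decade(debates, n=2):
    """Relative frequency of each n-gram within each decade."""
    counts = defaultdict(Counter)
    for debate in debates:
        decade = debate["year"] // 10 * 10
        counts[decade].update(ngrams(tokenize(debate["text"]), n))
    shares = {}
    for decade, counter in counts.items():
        total = sum(counter.values())
        shares[decade] = {gram: c / total for gram, c in counter.items()}
    return shares


# Invented example transcripts (assumed record layout: year, text).
debates = [
    {"year": 1960, "text": "the cold war the cold war"},
    {"year": 2016, "text": "health care and health care"},
]

shares = ngram_share_by_decade(debates, n=2)
for decade in sorted(shares):
    top_gram, top_share = max(shares[decade].items(), key=lambda item: item[1])
    print(decade, " ".join(top_gram), round(top_share, 2))
```

Relative frequencies are used instead of raw counts because the amount of text per decade differs, so raw counts would not be comparable across time.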

Expected Prior Knowledge

Knowledge of Python and various text mining methods is necessary. Knowledge of web mining, corpus preparation, and the handling of large amounts of text data is helpful.

Further Reading

El-Falaky, M. S. (2015). Vote for me. A corpus linguistic analysis of American presidential debates using functional grammar. Arts and Social Sciences Journal, 6(4), 1-13.

Lukito, J., Sarma, P. K., Foley, J., & Abhishek, A. (2019, June). Using time series and natural language processing to identify viral moments in the 2016 US Presidential Debate. In Proceedings of the Third Workshop on Natural Language Processing and Computational Social Science (pp. 54-64).

Savoy, J. (2018). Trump’s and Clinton’s Style and Rhetoric during the 2016 Presidential Election. Journal of Quantitative Linguistics, 25(2), 168-189.

Vrana, L., & Schneider, G. (2017). Saying Whatever It Takes: Creating and Analyzing Corpora from US Presidential Debate Transcripts.