
Methods for Extracting Implicit and Explicit Aspects in Sentiment Analysis

Topic:
Methods for Extracting Implicit and Explicit Aspects in Sentiment Analysis
Type:
BA (bachelor's thesis)
Supervisor:
Jakob Fehle
Student:
Lucas Müller
First reviewer:
Christian Wolff
Second reviewer:
Udo Kruschwitz
Status:
completed
Keywords:
Natural Language Processing, Machine Learning, Data Science, NLP, ML, Sentiment Analysis, Text Mining
Created:
2023-03-23
Kick-off presentation:
2023-07-17

Background

In contrast to the most commonly used sentiment analysis methods, which detect the polarity of sentiment at the sentence or document level, Aspect-Based Sentiment Analysis (ABSA) takes a fine-grained approach: it determines the polarity of extracted terms that refer to domain-specific categories, known as 'aspects'. The datasets created for the ABSA tasks of the SemEval (International Workshop on Semantic Evaluation) campaigns in 2014 and 2016 are widely used as benchmarks for evaluating and comparing machine learning and deep learning approaches on the subtasks these datasets define. My thesis aims to evaluate different classification approaches with state-of-the-art models on one distinct subtask of ABSA: Aspect Term Extraction. Aspect Term Extraction has the goal of extracting the explicit and implicit aspects expressed in a sentence and is a crucial step in ABSA, since it precedes determining the polarity of the extracted aspects. As my main contribution, I curated my own dataset of German reviews of three international airlines from TripAdvisor. Students of the University of Regensburg and I labelled the reviews with their aspects, marked as implicit or explicit, and the polarity (i.e. sentiment) of each aspect. Current state-of-the-art (SOTA) models such as Transformers (e.g. BERT) will be trained on the dataset and evaluated with respect to the approaches used to classify the aspects.
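
To illustrate what extracting explicit aspect terms can look like in practice, the following is a minimal sketch that frames the task as token-level sequence labelling. The BIO tagging scheme, the tag name `ASP`, the helper `spans_to_bio`, and the example sentence are all assumptions made for illustration; this page does not specify the labelling scheme used in the thesis. Implicit aspects have no surface term to tag, so they would need a different representation, for example a sentence-level label.

```python
def spans_to_bio(tokens, aspect_spans):
    """Convert aspect spans to BIO tags.

    Each span is a (start, end) pair of token indices, with start
    inclusive and end exclusive. The tag name "ASP" is hypothetical.
    """
    tags = ["O"] * len(tokens)
    for start, end in aspect_spans:
        tags[start] = "B-ASP"          # first token of the aspect term
        for i in range(start + 1, end):
            tags[i] = "I-ASP"          # remaining tokens of the term
    return tags


# Hypothetical review sentence with one explicit aspect term, "cabin crew".
tokens = ["The", "cabin", "crew", "was", "friendly", "."]
tags = spans_to_bio(tokens, [(1, 3)])

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A token classification model such as BERT could then be trained to predict one tag per token.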

Objective of the thesis

Information extraction of implicit and explicit aspects for Aspect-Based Sentiment Analysis

Specific tasks

Dataset Curation

  • Data Scraping and Cleaning
  • Qualitative and Quantitative Data Analysis
  • Stratified Sampling of the Dataset
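
The stratified sampling step could be sketched as follows. This is only an illustration under stated assumptions: the page does not say which variable defines the strata, so the airline is used here as a hypothetical stratum, and the function name `stratified_sample`, the record layout, and the sampling fraction are invented for the example.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction of records from every stratum.

    `key` maps a record to its stratum. At least one record is drawn
    from each stratum so that small strata are not lost.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for stratum in sorted(strata):
        members = strata[stratum]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical data: three airlines with 10, 20, and 30 reviews.
reviews = (
    [{"airline": "Airline A", "id": i} for i in range(10)]
    + [{"airline": "Airline B", "id": i} for i in range(20)]
    + [{"airline": "Airline C", "id": i} for i in range(30)]
)
sample = stratified_sample(reviews, key=lambda r: r["airline"], fraction=0.2)
```

Sampling the same fraction per stratum keeps the proportions of the full dataset in the subset that is handed to the annotators.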

Annotation Process

  • Determine Labelling Scheme
  • Annotation by Students of the University of Regensburg

Evaluation (Annotation)

  • Evaluation of the Annotations
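
One common way to evaluate annotations is to measure inter-annotator agreement. The sketch below computes Cohen's kappa for two annotators. Which agreement measure the thesis actually uses is not stated on this page, so the choice of Cohen's kappa, the function name `cohens_kappa`, and the label values are assumptions for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two annotators for six candidate aspects.
annotator_1 = ["explicit", "explicit", "implicit", "none", "explicit", "implicit"]
annotator_2 = ["explicit", "implicit", "implicit", "none", "explicit", "explicit"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance. With more than two annotators, a measure such as Fleiss' kappa or Krippendorff's alpha would be needed instead.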

ML-Training

  • Training
  • Evaluation of trained Models with respect to each approach
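
Evaluating an aspect term extraction model is often done with precision, recall, and F1 over extracted spans. The following minimal sketch assumes exact-match scoring, where a predicted span counts as correct only if its sentence and both boundaries match a gold span. The metric choice, the function name `span_prf`, and the example spans are assumptions; the page does not state which metrics the thesis reports.

```python
def span_prf(gold_spans, predicted_spans):
    """Exact-match precision, recall, and F1 over aspect spans.

    Each span is a (sentence_id, start, end) tuple.
    """
    gold = set(gold_spans)
    predicted = set(predicted_spans)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold annotations and model predictions.
gold = [(0, 1, 3), (1, 0, 1), (2, 4, 5)]
predicted = [(0, 1, 3), (1, 0, 2), (2, 4, 5), (2, 7, 8)]
precision, recall, f1 = span_prf(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Exact matching is strict: the prediction `(1, 0, 2)` overlaps the gold span `(1, 0, 1)` but is still counted as an error. A relaxed, overlap-based variant would score it differently.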

Expected prior knowledge

  • Basic Knowledge of Sentiment Analysis and Text Classification
  • Machine Learning / Deep Learning (e.g. Transformers)
  • Python (libraries such as pandas, NumPy, and PyTorch)
  • Statistics (Sampling Methods, Evaluation of Annotation and ML-Models)

Further reading

  1. Maria Pontiki, Dimitris Galanis, John Pavlopoulos, Harris Papageorgiou, Ion Androutsopoulos, and Suresh Manandhar. 2014. SemEval-2014 Task 4: Aspect Based Sentiment Analysis. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages 27–35, Dublin, Ireland. Association for Computational Linguistics. https://doi.org/10.3115/v1/S14-2004
  2. J. Z. Maitama, N. Idris, A. Abdi, L. Shuib, and R. Fauzi. 2020. A Systematic Review on Implicit and Explicit Aspect Extraction in Sentiment Analysis. IEEE Access, vol. 8, pages 194166–194191. https://doi.org/10.1109/ACCESS.2020.3031217
  3. Wojatzki, M. M., Ruppert, E., Zesch, T., & Biemann, C. (Eds.). (2018). Proceedings of the GermEval 2017 – Shared Task on Aspect-based Sentiment in Social Media Customer Feedback. https://doi.org/10.17185/duepublico/46421
  4. Verma, K., & Davis, B. (2021). Implicit Aspect-Based Opinion Mining and Analysis of Airline Industry Based on User-Generated Reviews. SN Computer Science, 2, 286. https://doi.org/10.1007/s42979-021-00669-7