Towards machine learning based text categorization in the financial domain

Frederic Voigt, Jose Alcaraz Calero, Keshav Dahal, Qi Wang, Kai Von Luck

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Despite the widespread research on text categorization in various Natural Language Processing (NLP) domains, there exists a noticeable void concerning its application to financial data. This study addresses this gap by employing pre-trained Bidirectional Encoder Representations from Transformers (BERT) models, fine-tuned specifically for the financial domain, to categorize newspaper articles focusing on financial topics. This is the first time that the dataset presented in this paper has been used. Further, we evaluate the efficacy of established models in sentiment prediction using these rather long texts. Finally, we delve into the intricacies of company-specific sentiment and relevance prediction within these articles, acknowledging that a single article often mentions multiple companies, thus contributing to a more nuanced understanding of text analysis in the financial sector.
Original language: English
Title of host publication: 2024 IEEE 3rd Conference on Information Technology and Data Science (CITDS)
Publisher: IEEE
Number of pages: 6
ISBN (Electronic): 9798350387889
ISBN (Print): 9798350387896
Publication status: Published, 17 Dec 2024
