Automatic Authorship Classification for German Lyrics Using Naïve Bayes

Akshay Mendhakar; Mesian Tilmatine

doi:10.21248/jlcl.36.2023.242

Authors

Akshay Mendhakar, University of Warsaw, Faculty of Applied Linguistics, https://orcid.org/0000-0001-6772-2030
Mesian Tilmatine, Free University of Berlin, Department for Experimental and Neurocognitive Psychology, https://orcid.org/0009-0001-1232-6353

DOI:

https://doi.org/10.21248/jlcl.36.2023.242

Keywords:

German Lyrics, Text Classification, Naïve Bayes, Machine Learning

Abstract

Text classification is a prevalent and essential machine-learning task, and machine learning classifiers have developed immensely since their inception. The Naïve Bayes classifier is one of the most prominent supervised machine learning classifiers. In this experiment, we examine the performance of Naïve Bayes in classifying authors/artists on the German lyrics corpus (“Songkorpus”) and compare its classification results with those of other classifier algorithms. The corpus under investigation consists of six artists with 970 songs in total. Evaluation of the Naïve Bayes model revealed a precision of 0.91, a recall of 0.94, and an F1-measure of 0.9. Furthermore, the comparison with other classifier algorithms did not reveal any statistically significant difference in performance. The results of the study add to the large body of reports on the classification accuracy of Naïve Bayes for the task of lyrics classification.
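
The general technique named in the abstract can be illustrated with a minimal sketch. It assumes a multinomial Naïve Bayes variant with add-one smoothing over whitespace tokens, which the abstract does not specify. The class name `MultinomialNB`, the helper `precision_recall_f1`, the artist labels, and the toy lyric lines are all hypothetical and are not taken from the Songkorpus or from the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for real preprocessing.
    return text.lower().split()


class MultinomialNB:
    """Hypothetical multinomial Naive Bayes with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per artist
        self.word_counts = defaultdict(Counter)  # token counts per artist
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            total_words = sum(self.word_counts[label].values())
            # log prior plus summed log likelihoods of the tokens
            score = math.log(self.class_counts[label] / total_docs)
            for t in tokens:
                count = self.word_counts[label][t]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def precision_recall_f1(gold, predicted, label):
    # Per-class precision, recall, and F1 from paired label lists.
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented toy data, purely for illustration.
train_texts = [
    "herz schmerz nacht regen",
    "nacht regen herz allein",
    "tanzen sonne sommer strand",
    "sommer sonne party tanzen",
]
train_labels = ["artist_a", "artist_a", "artist_b", "artist_b"]
model = MultinomialNB().fit(train_texts, train_labels)
```

Calling `model.predict` on a new lyric line returns the artist label with the highest posterior score, and `precision_recall_f1` computes the three measures reported above for one artist at a time. How the per-artist values were averaged in the study is not stated in the abstract.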

Published

2023-05-15

How to Cite

Mendhakar, A., & Tilmatine, M. (2023). Automatic Authorship Classification for German Lyrics Using Naïve Bayes. Journal for Language Technology and Computational Linguistics, 36(1), 171–182. https://doi.org/10.21248/jlcl.36.2023.242

Issue

Vol. 36 No. 1 (2023): Special Issue on Challenges in Computational Linguistics, Empiric Research & Multidisciplinary Potential of German Song Lyrics

Section

Research articles

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.