Exploring the Limits of LLMs for German Text Classification: Prompting and Fine-tuning Strategies Across Small and Medium-sized Datasets
DOI:
https://doi.org/10.21248/jlcl.38.2025.277

Keywords:
LLM, text classification, German, prompting, fine-tuning, LLM fails, limitations

Abstract
Large Language Models (LLMs) are highly capable, state-of-the-art technologies and are widely used as text classifiers for various NLP tasks, including sentiment analysis, topic classification, and legal document analysis. In this paper, we present a systematic analysis of the performance of LLMs as text classifiers using five German social media datasets across 13 different tasks. We investigate zero-shot (ZSC) and few-shot classification (FSC) approaches with multiple LLMs and provide a comparative analysis with fine-tuned models based on Llama-3.2, EuroLLM, Teuken, and BübleLM. We focus on the limits of LLMs and on accurately describing our findings and the overall challenges.
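For readers unfamiliar with prompting-based classification, the following is a minimal sketch of zero-shot classification (ZSC) via an instruction-tuned open-weight model. The model name, label set, and prompt wording are illustrative assumptions, not the setup or prompts used in the paper, and the snippet assumes a recent version of the Hugging Face transformers library that supports chat-format inputs in the text-generation pipeline.

```python
# Minimal ZSC sketch: ask a chat model to assign exactly one label to a German text.
# Model, labels, and prompt are hypothetical placeholders, not the authors' configuration.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B-Instruct",  # assumed example; any instruction-tuned model works
)

LABELS = ["positiv", "negativ", "neutral"]  # illustrative sentiment labels

def classify_zero_shot(text: str) -> str:
    """Prompt the model to answer with exactly one label for the input text."""
    messages = [
        {"role": "system",
         "content": "Du bist ein Textklassifikator. Antworte nur mit einem Label: "
                    + ", ".join(LABELS) + "."},
        {"role": "user", "content": text},
    ]
    out = generator(messages, max_new_tokens=5, do_sample=False)
    # The pipeline returns the full chat history; the last message is the model's reply.
    answer = out[0]["generated_text"][-1]["content"].strip().lower()
    # Fall back to a default label if the reply is not in the label set.
    return answer if answer in LABELS else "neutral"

print(classify_zero_shot("Der Film war großartig!"))
```

A few-shot variant (FSC) would simply prepend a handful of labeled example texts as additional user/assistant turns before the target text.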
License
Copyright (c) 2025 Elena Leitner, Georg Rehm

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.