A Study of Errors in the Output of Large Language Models for Domain-Specific Few-Shot Named Entity Recognition
DOI:
https://doi.org/10.21248/jlcl.38.2025.281

Keywords:
large language models, few-shot named entity recognition, error analysis

Abstract
This paper proposes an error classification framework for a comprehensive analysis of the output that large language models (LLMs) generate in a few-shot named entity recognition (NER) task in a specialised domain. The framework is intended as an exploratory analysis complementary to established performance metrics for NER classifiers, such as the F1 score, as it accounts for the full range of outcomes possible in a few-shot, LLM-based NER task. By categorising and quantitatively assessing incorrect named entity predictions, the paper shows how the proposed error classification can support a deeper cross-model and cross-prompt performance comparison, alongside a roadmap for guided qualitative error analysis.
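The paper's concrete error categories are not reproduced on this page. As a rough illustration of how such an error classification might be operationalised, the sketch below buckets predicted entity spans against gold annotations using a generic taxonomy (exact match, wrong type, boundary error, spurious prediction, missed entity). All names and categories here are illustrative assumptions, not the taxonomy proposed in the paper.

```python
# Illustrative sketch only: a generic NER error taxonomy, not the
# classification framework proposed in the paper.
from collections import Counter
from typing import NamedTuple

class Span(NamedTuple):
    start: int   # character offset where the entity begins
    end: int     # character offset where the entity ends (exclusive)
    label: str   # entity type, e.g. "DRUG"

def classify_prediction(pred: Span, golds: list[Span]) -> str:
    """Bucket one predicted entity against the gold annotations."""
    for gold in golds:
        if (pred.start, pred.end) == (gold.start, gold.end):
            # same span: either fully correct or a type confusion
            return "correct" if pred.label == gold.label else "wrong_type"
        if pred.start < gold.end and gold.start < pred.end:
            # overlapping but not identical offsets: a boundary error
            return "boundary" if pred.label == gold.label else "boundary_and_type"
    return "spurious"  # no overlap with any gold entity

def error_profile(preds: list[Span], golds: list[Span]) -> Counter:
    """Count error categories over all predictions, plus missed golds."""
    counts = Counter(classify_prediction(p, golds) for p in preds)
    matched = {g for g in golds for p in preds
               if p.start < g.end and g.start < p.end}
    counts["missing"] = len(golds) - len(matched)
    return counts
```

For example, with one gold entity predicted under the wrong type, one hallucinated span, and one gold entity missed, `error_profile` yields counts that can be compared across models and prompts:

```python
golds = [Span(0, 5, "DRUG"), Span(20, 28, "DISEASE")]
preds = [Span(0, 5, "DISEASE"), Span(40, 44, "DRUG")]
print(error_profile(preds, golds))
# Counter({'wrong_type': 1, 'spurious': 1, 'missing': 1})
```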
License
Copyright (c) 2025 Elena Volkanovska

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.