Rehm, Georg. “Language-Independent Text Parsing of Arbitrary HTML-Documents. Towards A Foundation For Web Genre Identification”. Journal for Language Technology and Computational Linguistics 20, no. 2 (July 1, 2005): 53–74. Accessed December 30, 2024. https://jlcl.org/article/view/75.