Rehm, G. (2005). Language-Independent Text Parsing of Arbitrary HTML-Documents. Towards A Foundation For Web Genre Identification. Journal for Language Technology and Computational Linguistics, 20(2), 53–74. https://doi.org/10.21248/jlcl.20.2005.75