(1) Rehm, G. Language-Independent Text Parsing of Arbitrary HTML-Documents. Towards A Foundation For Web Genre Identification. JLCL 2005, 20, 53-74.