Language-Independent Text Parsing of Arbitrary HTML-Documents. Towards A Foundation For Web Genre Identification