cpDetector is a Java library for detecting the codepage (character encoding) of documents.
It can be extended with custom detection strategies, and it ships with two: one that parses HTML and XML for charset attributes, and one that guesses the codepage through frequency analysis of characters (a facade over jchardet). A command-line executable is also included; it detects the codepage of documents and sorts them by it.
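The pluggable-strategy idea can be illustrated with a small, self-contained sketch. This is not cpDetector's actual API: the names DetectionSketch, Strategy, DeclaredCharsetStrategy and BomStrategy are hypothetical, and the byte-order-mark check is only a simple stand-in for the frequency-based guessing that jchardet performs. The sketch assumes Java 9 or later and shows strategies being tried in order until one of them reports a charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DetectionSketch {

    /** Hypothetical strategy contract: return a charset, or empty if undecided. */
    interface Strategy {
        Optional<Charset> detect(byte[] head);
    }

    /** Looks for encoding="..." (XML) or charset=... (HTML) in the leading bytes. */
    static final class DeclaredCharsetStrategy implements Strategy {
        private static final Pattern DECLARED = Pattern.compile(
                "(?:encoding|charset)\\s*=\\s*[\"']?([A-Za-z0-9][A-Za-z0-9._:-]*)",
                Pattern.CASE_INSENSITIVE);

        @Override
        public Optional<Charset> detect(byte[] head) {
            // Decode one byte per char so ASCII declarations stay readable.
            String text = new String(head, StandardCharsets.ISO_8859_1);
            Matcher m = DECLARED.matcher(text);
            if (m.find() && Charset.isSupported(m.group(1))) {
                return Optional.of(Charset.forName(m.group(1)));
            }
            return Optional.empty();
        }
    }

    /** Stand-in for a guessing strategy: recognises common byte-order marks only. */
    static final class BomStrategy implements Strategy {
        @Override
        public Optional<Charset> detect(byte[] head) {
            int b0 = head.length > 0 ? head[0] & 0xFF : -1;
            int b1 = head.length > 1 ? head[1] & 0xFF : -1;
            int b2 = head.length > 2 ? head[2] & 0xFF : -1;
            if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) {
                return Optional.of(StandardCharsets.UTF_8);
            }
            if (b0 == 0xFE && b1 == 0xFF) {
                return Optional.of(StandardCharsets.UTF_16BE);
            }
            if (b0 == 0xFF && b1 == 0xFE) {
                return Optional.of(StandardCharsets.UTF_16LE);
            }
            return Optional.empty();
        }
    }

    /** Asks each strategy in order and returns the first definite answer. */
    static Optional<Charset> detect(List<Strategy> strategies, byte[] head) {
        for (Strategy strategy : strategies) {
            Optional<Charset> result = strategy.detect(head);
            if (result.isPresent()) {
                return result;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Strategy> chain = List.of(new BomStrategy(), new DeclaredCharsetStrategy());
        byte[] xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><doc/>"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(chain, xml).map(Charset::name).orElse("unknown"));
    }
}
```

A custom strategy in this sketch is just another implementation of the hypothetical Strategy interface added to the list; the order of the list decides which strategy gets the first say.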