People increasingly capture digital images of the world and of documents, and these images often contain Roman letters. Web pages that rely on Flash, or that have other accessibility problems, likewise present us with "pictures of letters". Such text cannot be copied and pasted into other documents, resized, or offered to Google Language Tools for translation. The free OCR software currently available offers poor recognition rates on realistic images.
RecognizeThis! will identify regions of text, deskew them, extract font metrics, isolate lines and words, recognize the text, and emit UTF-8. The initial goal is a character recognition rate above 90% for uppercase and lowercase letters. After that we can work on other characters, and later on the diacritics used by western European languages. The recognizer will be font independent and will prefer grayscale input of at least 200 dpi. Letters will be normalized to adjust for font size. Handling underlined and italicized text is not a current goal. The project will not tackle script, Hindi, Asian languages, etc.; even recognizing hand-printed text is beyond the scope. It is within scope to achieve rates that are competitive with or better than gocr and ocrad. Some healthy competition is always a good thing.
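The description does not say how the character recognition rate will be measured. One common convention, assumed here purely for illustration, is one minus the Levenshtein edit distance divided by the length of the reference text. The function names `edit_distance` and `char_accuracy` are hypothetical and not part of the project.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1 each."""
    prev = list(range(len(hyp) + 1))  # distance from empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # drop a reference character
                            curr[j - 1] + 1,      # extra character in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def char_accuracy(ref: str, hyp: str) -> float:
    """Assumed metric: 1 - edit_distance / len(ref), clamped at 0."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


if __name__ == "__main__":
    reference = "The quick brown fox"
    recognized = "The qu1ck brown f0x"  # two substituted characters
    rate = char_accuracy(reference, recognized)
    print(f"{rate:.1%}", "meets goal" if rate > 0.90 else "below goal")
```

Under this definition a recognizer may misread fewer than one character in ten and still meet the initial 90% goal. Other definitions, such as counting only exact per-position matches, would give different numbers.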
Unit tests will round-trip known ASCII input through a Ghostscript renderer followed by the recognizer. Automated unit tests will ship with releases and should pass for all users unless specially marked. Standard tools will be used for internal documentation and for code coverage measurements.
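A round-trip test of this kind might be sketched as follows. This is only an illustration under stated assumptions: the helper names (`ascii_to_postscript`, `render_with_ghostscript`, `roundtrip`) are hypothetical, the recognizer is passed in as a callable because no recognizer API has been published, and the `gs` executable is assumed to be on the PATH. The Ghostscript options used (`-sDEVICE=pnggray`, `-r`, `-sOutputFile`) are standard ones.

```python
import subprocess
from pathlib import Path
from typing import Callable


def ascii_to_postscript(text: str, font: str = "Courier", size: int = 12) -> str:
    """Build a one-page PostScript program that prints each line of `text`."""
    def escape(line: str) -> str:
        # Backslash and parentheses are special inside PostScript strings.
        return line.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

    out = ["%!PS", f"/{font} findfont {size} scalefont setfont"]
    y = 720  # one inch below the top of a US Letter page (792 pt tall)
    for line in text.splitlines():
        out.append(f"72 {y} moveto ({escape(line)}) show")
        y -= int(size * 1.5)  # simple fixed line spacing
    out.append("showpage")
    return "\n".join(out) + "\n"


def render_with_ghostscript(ps_path: Path, png_path: Path, dpi: int = 200) -> None:
    """Rasterize a PostScript file to a grayscale PNG (requires `gs`)."""
    subprocess.run(
        ["gs", "-q", "-dNOPAUSE", "-dBATCH", "-dSAFER",
         "-sDEVICE=pnggray", f"-r{dpi}",
         f"-sOutputFile={png_path}", str(ps_path)],
        check=True,
    )


def roundtrip(
    text: str,
    workdir: Path,
    recognize: Callable[[Path], str],  # hypothetical recognizer entry point
    render: Callable[[Path, Path], None] = render_with_ghostscript,
) -> str:
    """Render known ASCII text to an image, then return what was recognized."""
    ps_path = workdir / "input.ps"
    png_path = workdir / "page.png"
    ps_path.write_text(ascii_to_postscript(text), encoding="ascii")
    render(ps_path, png_path)
    return recognize(png_path)
```

A test would then compare the string returned by `roundtrip` against the original input. Making the renderer a parameter lets tests that do not need Ghostscript substitute a stub.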