Adaptive Information Extraction (ALP)
The ALP package implements an information extraction algorithm, the Learning Pattern by Language Processing, or (LP)2, algorithm, as described in F. Ciravegna, "(LP)2, an Adaptive Algorithm for Information Extraction from Web-related Texts".
Put simply, the software does its best to find rules that detect the start and the end of a piece of text (a task also known as Named Entity Recognition). For example, it should find the Person "Peter Venkman" in the text "Peter Venkman, Ph.D. is a fictional scientist and member of the Ghostbusters, appearing in the films Ghostbusters and Ghostbusters II".
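The idea of separate start and end rules can be illustrated with a small, self-contained Java sketch. It is not the ALP implementation and does not use the GATE API: the class BoundaryRuleSketch, the interface BoundaryRule and the two hand-written rules PERSON_START and PERSON_END are hypothetical names invented for this illustration, and the rules only look at capitalization and a following comma, whereas learned rules would also draw on features such as part of speech or lemma.

```java
import java.util.ArrayList;
import java.util.List;

public class BoundaryRuleSketch {

    /** Hypothetical rule type: decides whether a boundary belongs at token i. */
    public interface BoundaryRule {
        boolean matches(List<String> tokens, int i);
    }

    static boolean isCapitalized(String token) {
        return !token.isEmpty() && Character.isUpperCase(token.charAt(0));
    }

    // Hand-written stand-in for a learned start rule:
    // a capitalized token followed by another capitalized token.
    public static final BoundaryRule PERSON_START = (t, i) ->
            i + 1 < t.size() && isCapitalized(t.get(i)) && isCapitalized(t.get(i + 1));

    // Hand-written stand-in for a learned end rule:
    // a capitalized token followed by a comma.
    public static final BoundaryRule PERSON_END = (t, i) ->
            i + 1 < t.size() && isCapitalized(t.get(i)) && t.get(i + 1).equals(",");

    /** Pairs each start boundary with the nearest end boundary at or after it. */
    public static List<String> extract(List<String> tokens, BoundaryRule start, BoundaryRule end) {
        List<String> spans = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            if (start.matches(tokens, i)) {
                int j = i;
                while (j < tokens.size() && !end.matches(tokens, j)) {
                    j++;
                }
                if (j < tokens.size()) {
                    spans.add(String.join(" ", tokens.subList(i, j + 1)));
                    i = j + 1; // continue after the extracted span
                    continue;
                }
            }
            i++;
        }
        return spans;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of(
                "Peter", "Venkman", ",", "Ph.D.", "is", "a", "fictional", "scientist");
        System.out.println(extract(tokens, PERSON_START, PERSON_END));
    }
}
```

Keeping the start and end decisions independent, as in this sketch, means that each boundary can be detected and evaluated on its own before the two are paired into a span.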
What you need to have to start the rule learning process:
- A GATE document that has already been pre-processed with NLP tools (containing Part of Speech, LEMMA, or Gazetteer information)
- Some manually annotated tokens in the document as true positives (such as PERSON, ORGANIZATION)

If everything works as expected, you should get ready-to-use GATE rules for the provided examples. Reports/statistics regarding precision, recall and F-measure values are generated too.
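For readers unfamiliar with these measures, the following Java sketch shows how precision, recall and the F-measure are commonly computed from counts of true positives, false positives and false negatives. It is an illustration only: the class ExtractionScores and its methods are hypothetical and not part of the ALP package, and the beta parameter is an assumption about what the generated reports weight, since the exact report format is not described here.

```java
public class ExtractionScores {

    /** Fraction of predicted annotations that are correct. */
    public static double precision(int truePositives, int falsePositives) {
        int predicted = truePositives + falsePositives;
        return predicted == 0 ? 0.0 : (double) truePositives / predicted;
    }

    /** Fraction of manually annotated examples that were found. */
    public static double recall(int truePositives, int falseNegatives) {
        int expected = truePositives + falseNegatives;
        return expected == 0 ? 0.0 : (double) truePositives / expected;
    }

    /**
     * Weighted harmonic mean of precision and recall.
     * beta = 1 weights both equally; beta > 1 favours recall.
     */
    public static double fMeasure(double precision, double recall, double beta) {
        double b2 = beta * beta;
        double denominator = b2 * precision + recall;
        return denominator == 0.0 ? 0.0 : (1 + b2) * precision * recall / denominator;
    }

    public static void main(String[] args) {
        // Made-up counts for illustration: 8 correct, 2 spurious, 4 missed.
        double p = precision(8, 2);
        double r = recall(8, 4);
        System.out.printf("precision=%.3f recall=%.3f F1=%.3f%n", p, r, fMeasure(p, r, 1.0));
    }
}
```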
The code uses the GATE framework (http://www.gate.ac.uk) but does not depend on a separate installation of it. The framework is therefore included in binary form in the Java archive in the distribution file.
The current version is 1.0-SNAPSHOT and is available in the svn trunk. Older versions are deprecated.