Duke is a fast record linkage and deduplication engine written in Java. It provides both an API and a command-line interface, and supports incremental processing. There is also a genetic algorithm for automatically tuning configurations. Duke is based on Lucene.

Permitted: Commercial Use, Modify, Distribute, Place Warranty, Sub-License, Private Use, Use Patent Claims
Forbidden: Hold Liable, Use Trademarks
Required: Include Copyright, State Changes, Include License, Include Notice
These details are provided for information only. Nothing here is legal advice, and it should not be used as such.