Inactive
Analyzed about 11 hours ago, based on code collected about 11 hours ago.

Project Summary

Duke is a fast record linkage and deduplication engine written in Java. It provides both an API and a command-line interface, and supports incremental processing. There is also a genetic algorithm for automatically tuning configurations. Duke is based on Lucene.
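
To make the idea of deduplication concrete, here is a minimal sketch of flagging near-duplicate records by string similarity. It is not Duke's API or algorithm: the class `DedupSketch`, its methods, the sample names, and the 0.9 threshold are all hypothetical, and the comparison is a plain normalized Levenshtein distance chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: a naive pairwise deduplication sketch, not Duke's implementation.
public class DedupSketch {

    // Classic dynamic-programming Levenshtein edit distance (two-row variant).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in [0, 1] after trimming and lower-casing; 1.0 means identical.
    static double similarity(String a, String b) {
        String x = a.trim().toLowerCase(Locale.ROOT);
        String y = b.trim().toLowerCase(Locale.ROOT);
        int longest = Math.max(x.length(), y.length());
        if (longest == 0) {
            return 1.0;
        }
        return 1.0 - (double) levenshtein(x, y) / longest;
    }

    // Returns index pairs (i, j) with i < j whose similarity meets the threshold.
    static List<int[]> findDuplicates(List<String> names, double threshold) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            for (int j = i + 1; j < names.size(); j++) {
                if (similarity(names.get(i), names.get(j)) >= threshold) {
                    pairs.add(new int[] {i, j});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; the second entry contains a typo.
        List<String> names = List.of("Acme Corporation", "ACME Corporaton", "Globex Inc");
        for (int[] p : findDuplicates(names, 0.9)) {
            System.out.println(names.get(p[0]) + " <-> " + names.get(p[1]));
        }
    }
}
```

Comparing every pair is quadratic in the number of records, so this sketch only suits small inputs. An engine built on a search index such as Lucene can presumably narrow the candidates before any detailed comparison.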

Tags

dedup deduplication java recordlinkage recordlinking

In a Nutshell, Duke (Dupe Killer)...

Apache License 2.0

Permitted: Commercial Use, Modify, Distribute, Place Warranty, Sub-License, Private Use, Use Patent Claims

Forbidden: Hold Liable, Use Trademarks

Required: Include Copyright, State Changes, Include License, Include Notice

These details are provided for information only. Nothing here is legal advice, and it should not be used as such.

This project has no vulnerabilities reported against it.

Languages

Java: 94%
XML: 6%
HTML: <1%

30 Day Summary

Sep 30 2025 — Oct 30 2025

12 Month Summary

Oct 30 2024 — Oct 30 2025