We use a patched build of the deduplicator, based on the 0.3.0-20061218 beta release. The patches fix the following issues in the !NetarchiveSuite:

 * [[https://gforge.statsbiblioteket.dk/tracker/?aid=1062|Bug 1062]] Indexserver skips a lot of lines due to a threading problem with !SimpleDateFormat
 * [[https://gforge.statsbiblioteket.dk/tracker/?aid=1078|Bug 1078]] !DeDuplikator index too large
 * [[https://gforge.statsbiblioteket.dk/tracker/?aid=1248|Bug 1248]] NPE in deduplicator-0.3.0-20061218b.jar

==== Downloads ====
<>

Note that the deduplicator must be compiled against the same version of Heritrix that the !NetarchiveSuite uses; otherwise the deduplicator will fail at runtime.
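
As background for Bug 1062: `java.text.SimpleDateFormat` instances are not thread-safe, so sharing one instance between threads can produce wrong or failed parses. The sketch below shows one common way to avoid this, by giving each thread its own formatter. It is not the actual patch; the class name `ThreadSafeDateParser` and the 14-digit timestamp pattern are assumptions chosen for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

/** Hypothetical helper; not part of the deduplicator or the NetarchiveSuite. */
final class ThreadSafeDateParser {
    // One SimpleDateFormat per thread, so no instance is ever shared.
    // The pattern "yyyyMMddHHmmss" is an assumed example format.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat f = new SimpleDateFormat("yyyyMMddHHmmss");
            f.setTimeZone(TimeZone.getTimeZone("UTC"));
            f.setLenient(false);
            return f;
        });

    private ThreadSafeDateParser() { }

    /** Parses a timestamp using the calling thread's own formatter. */
    static Date parse(String timestamp) {
        try {
            return FORMAT.get().parse(timestamp);
        } catch (ParseException e) {
            // Report bad input as an unchecked exception instead of skipping it silently.
            throw new IllegalArgumentException("Unparseable timestamp: " + timestamp, e);
        }
    }

    /** Formats a date using the calling thread's own formatter. */
    static String format(Date date) {
        return FORMAT.get().format(date);
    }
}
```

Callers on any thread can use `ThreadSafeDateParser.parse(...)` without external locking. Synchronizing on a single shared formatter would also work, at the cost of contention between threads.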