This page describes how to verify that data is deduplicated.

Click on the JobID for your finished snapshot harvest in the Job status overview

Click on "Browse reports for jobs"

Click on the "processors-report" entry, e.g. "metadata://netarkivet.dk/crawl/reports/processors-report.txt?heritrixVersion=1.14.3&harvestid=1&jobid=1"

Check that the processors-report contains a deduplicator section like this one:

Total handled: 88 
Duplicates found: 20 20.0% 
Bytes total: 6391852 (6.1 MB) 
Bytes discarded: 0 (0 B) 0.0% 
New (no hits): 88 
Exact hits: 0 
Equivalent hits: 0 
......
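
If you save the report text to a file, the same check can be scripted. The following is a minimal sketch in Python and not part of NetarchiveSuite or Heritrix: the helper names parse_dedup_report and has_dedup_section are hypothetical, and the sketch assumes the deduplicator lines have the "Label: number" form shown above.

```python
import re


def parse_dedup_report(text):
    """Map each 'Label: number ...' line to its first integer value.

    Hypothetical helper: assumes the line layout shown in the sample report.
    """
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*([^:]+):\s*(\d+)", line)
        if match:
            stats[match.group(1).strip()] = int(match.group(2))
    return stats


def has_dedup_section(text):
    """Return True if the labels written by the deduplicator are present."""
    stats = parse_dedup_report(text)
    return "Total handled" in stats and "Duplicates found" in stats


# Excerpt copied from the sample report on this page.
sample = """\
Total handled: 88
Duplicates found: 20 20.0%
Bytes total: 6391852 (6.1 MB)
"""

stats = parse_dedup_report(sample)
print(has_dedup_section(sample))   # True
print(stats["Duplicates found"])   # 20
```

A report without these labels makes has_dedup_section return False, which suggests that the deduplicator did not run for that job.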