Differences between revisions 1 and 17 (spanning 16 versions)
Revision 1 as of 2007-11-21 17:03:01
Size: 525
Comment:
Revision 17 as of 2008-05-19 12:27:23
Size: 1234
Comment:
Line 1: Line 1:
We use a patched version of the deduplicator. The patch fixes the following issues:
We use a patched version of the 0.3.0-20061218 beta version of the deduplicator. The patches fix the following issues in the !NetarchiveSuite:
Line 3: Line 3:
 * ARC records of >2GB caused arithmetic overflow and could not be read.
 * ...something about skipping long records...need to check it...
 * [https://gforge.statsbiblioteket.dk/tracker/?aid=1062 Bug 1062] Indexserver skips a lot of lines due to threading problem with !SimpleDateFormat
 * [https://gforge.statsbiblioteket.dk/tracker/?aid=1078 Bug 1078] !DeDuplikator index too large
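
Two of the issues listed above belong to well-known classes of Java bugs: a record length larger than 2 GB does not fit in an `int`, and `SimpleDateFormat` instances are not safe to share between threads. The following is only a minimal sketch of those two pitfalls and one common way around each. It is not taken from the patch. The class name `DedupPatchSketch`, its method names and the 14-digit timestamp pattern are assumptions made for illustration, and `ThreadLocal.withInitial` requires Java 8 or later.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper class for illustration; not part of the deduplicator patch.
final class DedupPatchSketch {

    // SimpleDateFormat is not thread-safe, so each thread gets its own instance.
    // The 14-digit pattern is an assumption, chosen to resemble ARC timestamps.
    private static final ThreadLocal<SimpleDateFormat> TIMESTAMP =
            ThreadLocal.withInitial(() -> {
                SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmmss");
                format.setTimeZone(TimeZone.getTimeZone("UTC"));
                format.setLenient(false);
                return format;
            });

    // Parses a timestamp using the calling thread's private formatter.
    static Date parseTimestamp(String text) {
        try {
            return TIMESTAMP.get().parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Bad timestamp: " + text, e);
        }
    }

    // Formats a date using the calling thread's private formatter.
    static String formatTimestamp(Date date) {
        return TIMESTAMP.get().format(date);
    }

    // Narrowing a length above Integer.MAX_VALUE to int wraps around.
    static int narrowToInt(long recordLength) {
        return (int) recordLength;
    }

    // Keeping the arithmetic in long avoids the wrap-around.
    static long remainingBytes(long recordLength, long bytesRead) {
        return recordLength - bytesRead;
    }
}
```

With a 3 GB length, `narrowToInt` returns a negative number, while `remainingBytes` stays correct because no value is ever narrowed to `int`.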

Known bugs in 0.3.0-20061218b:
 * [https://gforge.statsbiblioteket.dk/tracker/?aid=1248 Bug 1248] NPE in deduplicator-0.3.0-20061218b.jar

==== Downloads ====
Line 8: Line 13:
[attachment:Deduplicator-0.3.0-20061218-src.zip Patched sourcecode Deduplicator 0.3.0-20061218-src.zip]
[attachment:Deduplicator-0.3.0-20061218b.diff Patch against Deduplicator 0.3.0-20061218a]
Line 10: Line 15:
[attachment:Deduplicator-0.3.0-20061218a.zip Patched binary Deduplicator 0.3.0-20061218a.zip]
[attachment:Deduplicator-0.3.0-20061218b-src.zip Patched sourcecode Deduplicator 0.3.0-20061218b-src.zip]

[attachment:Deduplicator-0.3.0-20061218a-bin.zip Patched binary Deduplicator 0.3.0-20061218a-bin.zip]

[attachment:Deduplicator-0.3.0-20061218b-bin.zip Patched binary Deduplicator 0.3.0-20061218b-bin.zip]


Note that the deduplicator must be compiled against the same version of Heritrix that the !NetarchiveSuite uses;
otherwise the deduplicator will fail at runtime.

[attachment:deduplicator-0.3.0-20061218a.diff Patch against Deduplicator 0.3.0-20061218]


DeduplicatorPatches (last edited 2010-08-16 10:24:26 by localhost)