Differences between revisions 19 and 21 (spanning 2 versions)
Revision 19 as of 2008-05-19 14:10:12
Size: 1532
Comment:
Revision 21 as of 2010-08-16 10:24:26
Size: 718
Editor: localhost
Comment: converted to 1.6 markup
Line 1: Line 1:
We use a patched version of the 0.3.0-20061218 beta version of the deduplicator (named deduplicator-0.3.0-20080502). The patches fixes the following issues in the !NetarchiveSuite:
We use a patched version of the 0.3.0-20061218 beta version of the deduplicator. The patches fixes the following issues in the !NetarchiveSuite:
Line 3: Line 3:
 * [https://gforge.statsbiblioteket.dk/tracker/?aid=1062 Bug 1062] Indexserver skips a lot of lines due to threading problem with !SimpleDateFormat
 * [https://gforge.statsbiblioteket.dk/tracker/?aid=1078 Bug 1078] !DeDuplikator index too large
 * [[https://gforge.statsbiblioteket.dk/tracker/?aid=1062|Bug 1062]] Indexserver skips a lot of lines due to threading problem with !SimpleDateFormat
 * [[https://gforge.statsbiblioteket.dk/tracker/?aid=1078|Bug 1078]] !DeDuplikator index too large
Line 6: Line 6:
 * [https://gforge.statsbiblioteket.dk/tracker/?aid=1248 Bug 1248] NPE in deduplicator-0.3.0-20061218b.jar
 * [[https://gforge.statsbiblioteket.dk/tracker/?aid=1248|Bug 1248]] NPE in deduplicator-0.3.0-20061218b.jar
Line 10: Line 10:
[attachment:deduplicator-0.3.0-20061218a.diff Patch against Deduplicator 0.3.0-20061218]

[attachment:Deduplicator-0.3.0-20061218b.diff Patch against Deduplicator 0.3.0-20061218a]

[attachment:Deduplicator-0.3.0-20061218b-src.zip Patched sourcecode Deduplicator 0.3.0-20061218b-src.zip]

[attachment:Deduplicator-0.3.0-20061218a-bin.zip Patched binary Deduplicator 0.3.0-20061218a-bin.zip]

[attachment:Deduplicator-0.3.0-20061218b-bin.zip Patched binary Deduplicator 0.3.0-20061218b-bin.zip]

[attachment:Deduplicator-0.3.0-20080502-src.zip Patched binary Deduplicator 0.3.0-20080502-src.zip]

[attachment:Deduplicator-0.3.0-20080502-bin.zip Patched binary Deduplicator 0.3.0-20080502-bin.zip]

[attachment:Deduplicator-0.3.0-20080502.diff Patch against Deduplicator 0.3.0-20061218b]
<<AttachList>>

We use a patched version of the 0.3.0-20061218 beta release of the deduplicator. The patches fix the following issues in the NetarchiveSuite:

  • Bug 1062 Indexserver skips a lot of lines due to threading problem with SimpleDateFormat

  • Bug 1078 DeDuplikator index too large

  • Bug 1248 NPE in deduplicator-0.3.0-20061218b.jar
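
Bug 1062 above is an instance of a well-known pitfall: java.text.SimpleDateFormat is not thread-safe, so a single instance shared between threads can silently produce wrong or failed parses. The sketch below shows one common remedy, giving each thread its own formatter through a ThreadLocal. It is an illustration only, not the content of the attached patch: the class name ThreadSafeDateParser, the pattern yyyyMMddHHmmss and the UTC time zone are assumptions made for the example, and ThreadLocal.withInitial requires Java 8 or later.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: one SimpleDateFormat per thread instead of a shared one. */
final class ThreadSafeDateParser {
    // Assumed pattern, chosen for illustration only.
    private static final String PATTERN = "yyyyMMddHHmmss";

    // Each thread lazily creates and then reuses its own formatter.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat f = new SimpleDateFormat(PATTERN);
            f.setTimeZone(TimeZone.getTimeZone("UTC"));
            f.setLenient(false);
            return f;
        });

    private ThreadSafeDateParser() { }

    /** Parses a timestamp; malformed input is reported as IllegalArgumentException. */
    static Date parse(String timestamp) {
        try {
            return FORMAT.get().parse(timestamp);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Bad timestamp: " + timestamp, e);
        }
    }

    /** Formats a date with the calling thread's own formatter. */
    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    public static void main(String[] args) throws InterruptedException {
        final String input = "20080519141012";
        final AtomicInteger mismatches = new AtomicInteger();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                // Round-trip the same timestamp many times per thread.
                for (int n = 0; n < 1000; n++) {
                    if (!format(parse(input)).equals(input)) {
                        mismatches.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println("mismatches: " + mismatches.get());
    }
}
```

With a per-thread formatter, no locking is needed on the parsing path. Synchronizing access to a shared instance, or creating a new formatter per call, are alternatives with different cost trade-offs.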

Downloads

  • deduplicator-0.3.0-20061218-patch-heritrix-1.12.1b.patch (2008-05-26 11:44:45, 2621.4 KB)
  • deduplicator-0.3.0-20061218-patch-index-NPE.patch (2008-05-26 11:44:23, 0.7 KB)
  • deduplicator-0.3.0-20061218-patch-local-dateformat.patch (2008-05-26 11:44:30, 2.5 KB)
  • deduplicator-0.3.0-20061218-patch-lucene-OutOfMemory-2.patch (2008-05-27 08:54:28, 235.4 KB)
  • deduplicator-0.3.0-20061218-patch-lucene-OutOfMemory.patch (2008-05-26 11:44:14, 24.0 KB)
  • deduplicator-0.3.0-20061218-patched-20080522-cumulative.patch (2008-05-26 11:44:38, 2648.3 KB)
  • deduplicator-0.3.0-20061218-patched-20080522.patch (2008-05-26 11:45:11, 0.9 KB)
  • deduplicator-0.3.0-20061218-patched-20080527-cumulative.patch (2008-05-27 08:54:35, 2859.6 KB)
  • deduplicator-0.3.0-20061218-patched-20080527.patch (2008-05-27 08:55:25, 0.9 KB)
  • deduplicator-0.3.0-20061218-src.zip (2008-05-26 11:43:51, 1929.8 KB)

Note that the deduplicator must be compiled against the same version of Heritrix that the NetarchiveSuite uses; otherwise the deduplicator will fail at runtime.

DeduplicatorPatches (last edited 2010-08-16 10:24:26 by localhost)