Release Notes for NetarchiveSuite 3.9.0

This version of NetarchiveSuite was released on 2009-08-10.

New features since NetarchiveSuite 3.8.*

Apart from general bug fixes (see below), the most important new features are:

General

Common Module

It is now possible to override the implementation of the getBytesFree() method, whose result is by default calculated using the standard Java method File.getUsableSpace().
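As a rough illustration of what such an override could look like, the Java sketch below uses a stand-in FreeSpaceProvider interface with a single getBytesFree(File) method. The method signature and the class names UsableSpaceProvider, ReservedSpaceProvider and FreeSpaceDemo are assumptions made for this example and are not taken from the NetarchiveSuite code base. The first class mirrors the documented default, and the second shows one conceivable custom policy that holds back a fixed reserve.

```java
import java.io.File;

/** Stand-in for the FreeSpaceProvider interface (signature is an assumption). */
interface FreeSpaceProvider {
    /** Returns the number of bytes considered free on the partition holding f. */
    long getBytesFree(File f);
}

/** Mirrors the documented default behaviour: delegate to File.getUsableSpace(). */
class UsableSpaceProvider implements FreeSpaceProvider {
    public long getBytesFree(File f) {
        return f.getUsableSpace();
    }
}

/** Hypothetical override: subtract a fixed reserve, never reporting less than zero. */
class ReservedSpaceProvider implements FreeSpaceProvider {
    private final FreeSpaceProvider delegate;
    private final long reservedBytes;

    ReservedSpaceProvider(FreeSpaceProvider delegate, long reservedBytes) {
        this.delegate = delegate;
        this.reservedBytes = reservedBytes;
    }

    public long getBytesFree(File f) {
        return Math.max(0L, delegate.getBytesFree(f) - reservedBytes);
    }
}

final class FreeSpaceDemo {
    public static void main(String[] args) {
        // Keep 1 GiB in reserve on the partition of the working directory.
        FreeSpaceProvider provider =
                new ReservedSpaceProvider(new UsableSpaceProvider(), 1024L * 1024L * 1024L);
        System.out.println(provider.getBytesFree(new File(".")) + " bytes reported free");
    }
}
```

In a real installation the chosen class would be named in settings.common.freespaceprovider.class, as described under New settings.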

Harvester Module

NetarchiveSuite now works properly with MySQL (see bug 1254). A deadline, by default 1 week, has also been introduced in the HarvestScheduler for jobs in status STARTED.

Archive Module

Some excessive logging has been removed, and the previously fixed indexing bugs 1078 and 1079 were reopened and fixed again.

Bugs fixed since NetarchiveSuite 3.8.*

Common Module

Bug 1694 LocalArcRepositoryClient is broken
Bug 1712 Starting multiple applications on one machine leads to potential failure of startup
FR 1654 second-level domains for .at in settings.xml
FR 1709 Module getBytesFree()

Harvester Module

Bug 928 The guess of initial size of unharvested domains is very bad on harvests with a large object limit
Bug 1254 Database connections to MySQL close down intermittently
Bug 1611 Missing space in error message in DefinitionsSiteSection.initialize
Bug 1644 On Edit Domain page, the text field only shows 21 characters of the domainname
Bug 1646 dk.netarkivet.harvester.harvesting.distribute.MetadataEntry needs toString method
Bug 1650 It is not checked when creating the Heritrix process that the JMX password file assigned to Heritrix exists
Bug 1670 Default timeout settings are set way too low in the default settings
Bug 1711 harvester that is not destroyable makes harvesterApplication take and immediately fail jobs
Bug 1718 The link to monitor Heritrix process does not necessarily give fully qualified hostnames
FR 1014 No good way to mark a non-reported-stopped job as FAILED or DONE
FR 1227 Log the Heritrix command line
FR 1628 Add custom JVM parameters to Heritrix subprocess
FR 1675 List of all Seeds of a selective Harvests
FR 1702 Value for max-trans-hops is way too high in default order templates
FR 1716 Increase size of input field, when uploading harvest templates
FR 1717 Increase crawler trap textarea size
FR 1723 Update the Heritrix templates in harvestdefinionbasedir/order_template_dist to Heritrix 1.14.3

Archive Module

Bug 1078 DeDuplikator index too large (refixed)
Bug 1079 snap shot harvest not browsable due to large index (refixed)
Bug 1547 Wrong synchronization in the IndexRequestServer and the FileBasedCache let two processes generate Index at the same time, and one of them fails
Bug 1722 Excessive logging in indexserver

Access Module

Bug 1700 The WebProxy.handle() method creates CreateErrorResponse for null Uri

Monitor Module

Deploy Module

Documentation

Bug 1636 Warnings in javadoc
Bug 1710 deploy Application seems not to support multiple FTP-servers

Upgrade instructions

Remember to stop the running installation before upgrading.

New settings

The following new settings have been introduced:

settings.common.database.validityCheckTimeout (default: 0): Timeout in seconds for checking the validity of a JDBC connection on the server. This is the time to wait for the database operation used to validate the connection to complete. If the timeout period expires before the operation completes, the connection is reported as invalid. A value of 0 indicates that no timeout is applied to the database operation.
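The semantics described for this setting match those of the standard JDBC method java.sql.Connection.isValid(int), so the check can be pictured roughly as in the following Java sketch. That NetarchiveSuite calls isValid() in exactly this way is an assumption; the class ConnectionValidityCheck and its methods isUsable and fakeConnection are hypothetical names introduced here, and the fake connection exists only so the example runs without a database.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

final class ConnectionValidityCheck {

    /**
     * Returns true if the connection answers the validity check within the
     * given number of seconds. A timeout of 0 means no timeout is applied.
     */
    static boolean isUsable(Connection connection, int timeoutSeconds) {
        if (connection == null) {
            return false;
        }
        try {
            return connection.isValid(timeoutSeconds);
        } catch (SQLException e) {
            // isValid() throws SQLException when the timeout is negative.
            return false;
        }
    }

    /** Stand-in connection whose isValid() always answers with the given value. */
    static Connection fakeConnection(boolean valid) {
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                (proxy, method, args) -> {
                    if (method.getName().equals("isValid")) {
                        return Boolean.valueOf(valid);
                    }
                    throw new UnsupportedOperationException(method.getName());
                });
    }

    public static void main(String[] args) {
        int validityCheckTimeout = 0; // the documented default: no timeout
        System.out.println(isUsable(fakeConnection(true), validityCheckTimeout));  // prints true
        System.out.println(isUsable(fakeConnection(false), validityCheckTimeout)); // prints false
    }
}
```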

settings.common.freespaceprovider.class (default: dk.netarkivet.common.utils.DefaultFreeSpaceProvider): The implementation class for the free space provider. The class must implement the FreeSpaceProvider interface.

settings.harvester.harvesting.heritrix.javaOpts (default: ""): Additional JVM options for the Heritrix sub-process.
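To make the effect of this setting concrete, the Java sketch below shows one plausible way a whitespace-separated option string could be spliced into the command line of a Java sub-process. It is not the NetarchiveSuite implementation: the class HeritrixCommandLine, the method build and the placeholder main class example.Main are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class HeritrixCommandLine {

    /**
     * Builds a "java <options> <mainClass>" command. The javaOpts string is
     * split on whitespace; an empty or null value adds no options.
     */
    static List<String> build(String javaOpts, String mainClass) {
        List<String> command = new ArrayList<>();
        command.add("java");
        String trimmed = javaOpts == null ? "" : javaOpts.trim();
        if (!trimmed.isEmpty()) {
            command.addAll(Arrays.asList(trimmed.split("\\s+")));
        }
        command.add(mainClass);
        return command;
    }

    public static void main(String[] args) {
        // example.Main is a placeholder, not the real Heritrix entry point.
        List<String> command = build("-Xmx1g -Dfoo=bar", "example.Main");
        // Logging the assembled command line makes start-up problems easier to diagnose.
        System.out.println(String.join(" ", command));
    }
}
```

Splitting on whitespace is the simplest approach but does not handle quoted options that contain spaces; a real implementation might need a more careful parser.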

settings.harvester.harvesting.heritrixControllerClass (default: dk.netarkivet.harvester.harvesting.JMXHeritrixController): The implementation of the HeritrixController interface to be used.

New translations

If you are maintaining a translation, please note that the following new keys have been added:

archive/Translations.properties

harvester/Translations.properties

harvestdefinition.linktext.seeds=Seeds
harveststatus.seeds.total=Total
harveststatus.seeds.domains=Domains

viewerproxy/Translations.properties

Version History

Version 3.8.1

2009-07-15

Fix of important bug leading to unresponsive harvesters

Version 3.8.0

2009-05-23

Java 1.6, Heritrix 1.14.1, Derby 10.4.2.0, complete rewrite of settings, new supported deploy module, GUI access to harvest logs

Version 3.7.0

2008-11-04

Development version aiming for 3.8.0

Version 3.6.0

2008-07-03

Improvement of archive component with regard to security, batch, and preservation; greater JMS stability; important bug fixes

Version 3.5.*

Development versions aiming for 3.6.0

Version 3.4.2

2008-03-14

Bug fix release, fixing JMX timeout

Version 3.4.1

2008-01-16

Bug fix release, fixing out of memory on very large indexes

Version 3.4.0

2008-01-03

Separation of Heritrix, work on developing our open source platform, two-part TLDs like co.uk, and lots of bugfixes

Version 3.3.*

Development versions aiming for 3.4.0

Version 3.2.3

2007-09-27

Bug fix of 3.2.2 with a patched deduplicator that fixes a problem in parallel indexing

Version 3.2.2

2007-08-03

Bug fix of 3.2.1 with a patched Heritrix 1.12.1 that supports ARCRecords larger than 2 GB

Version 3.2.1

2007-07-04

Bug fix of 3.2.0, fixing trouble using the quick start manual

Version 3.2.0

2007-07-04

Open source release

Version 3.1.*

Development versions. Version 3.1.7 was kindly reviewed by Internet Archive and the Norwegian national library.

Version 3.0.0

2007-02-02

Marked the naming of the NetarchiveSuite, the splitting of NetarchiveSuite into independent modules, and the licensing of NetarchiveSuite under LGPL

Version 2.*

Various features and updates

Version 2.0

2006-08-30

Marked a general restructuring of the code, in which harvest definition data became backed by a database and the viewerproxy was trimmed and rewritten

Version 1.*

Various features and updates

Version 1.0

2005-07-01

The first version of the netarchive software put in production for harvesting the entire Danish web

Version 0.*

Various pre-production development versions
