Release Notes for NetarchiveSuite 3.15.0

This development version of NetarchiveSuite was released on 2011-MM-DD

New features since NetarchiveSuite 3.14.*

The following bugs have been fixed and feature requests (FR) implemented since 3.14

Common Module

Bug 2013 Synchronizer.onMessage could give more diagnostic in case of unexpected reply received

FR 1591 Upgrade our JMS broker to latest (Open MQ 4.3)
FR 2083 Bundle PostgreSQL JDBC driver with NetarchiveSuite distribution
FR 2087 Upgrade to Apache Commons Net 2.2
FR 2109 Extend the use of remoteFile.retries to FTPRemoteFile.logOn

Harvester Module

Bug 2075 Many 'crawl is finished' logs on Job termination
Bug 2088 Removal of unused JobDBDAO.getStatusInfo methods
Bug 2089 maxObjects stored as int in table configurations: should be stored as bigint
Bug 2111 Deadlock when processing a CrawlProgressMessage
Bug 2115 Deactivate QuotaEnforcer when only queue-total-budget is used

FR 1969 Missing start and end date columns in Job status to search after
FR 2084 Allow use of queue-total-budget in place of QuotaEnforcer for crawl budget
FR 2093 Maximum time for a crawljob wanted for broad crawls
FR 2094 To flick through the pages of OAI-harvest of e-books
FR 2096 Monitor retired queues in the running jobs monitor
FR 2097 Add additional tags to harvestInfo.xml
FR 2099 Make the alias timeout a setting in HarvesterSettings instead of being hardwired to one year
FR 2101 Selective harvest details page: sort domain configurations and display the size of the list
FR 2117 Frontier reports extract: refactor filters and harmonize UI table columns

Access Module

FR 2103 Add a new netarchiveResourcestore that supports both warc and arc records

Bug 1977 Cannot browse FTP-harvested URLs with the viewerproxy

Archive Module

Bug 2008 Odd error: Can't copy file into archive
Bug 2098 Windows bitapps go unresponsive after talking with ftp-servers at statsbiblioteket.dk (Not verified)

Bug 2104 Wrong handling or too strict requirements to GetChecksumMessage results in ArgumentNotValid

Documentation Module

Deploy Module

Upgrade instructions

New settings in the common module

settings.common.database.url: Replaced by four new settings: settings.common.database.baseUrl, settings.common.database.machine, settings.common.database.port, and settings.common.database.dir. Note that, for now, username and password information must be included in the settings.common.database.dir value. An FR/bug has been created to address this.
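As an illustration of how the four new settings might combine into one JDBC URL, here is a minimal Java sketch. The class DatabaseUrlSketch, the method buildUrl, and the joining rule baseUrl[://machine][:port][/dir] are assumptions made for this example and are not taken from the NetarchiveSuite code. The host, port, database name, and credentials are placeholders.

```java
/** Hypothetical sketch: assembling a JDBC URL from the four new settings. */
class DatabaseUrlSketch {

    /**
     * Joins the parts as baseUrl[://machine][:port][/dir].
     * This joining rule is an assumption, not confirmed by the release notes.
     * Empty parts are skipped, so a base URL alone is returned unchanged.
     */
    static String buildUrl(String baseUrl, String machine, String port, String dir) {
        StringBuilder url = new StringBuilder(baseUrl);
        if (!machine.isEmpty()) {
            url.append("://").append(machine);
        }
        if (!port.isEmpty()) {
            url.append(':').append(port);
        }
        if (!dir.isEmpty()) {
            url.append('/').append(dir);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Placeholder values. The credentials ride along in the "dir" part,
        // as the note above says is currently required.
        String dir = "harvestdb?user=example&password=secret";
        System.out.println(buildUrl("jdbc:postgresql", "localhost", "5432", dir));
    }
}
```

Under these assumptions, a setup that needs no machine or port would leave those two values empty and put the complete URL in baseUrl.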

New settings and language keys in the harvester module

settings.harvester.aliases.timeout: The time in seconds before an alias times out and needs to be re-evaluated. The default value is one year, i.e. 31536000 seconds.
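The default value corresponds to a 365-day year expressed in seconds. The following Java sketch shows the arithmetic and one possible expiry check. The class AliasTimeoutSketch and the method isExpired are hypothetical names used only for illustration, and they do not reflect how the harvester module actually evaluates aliases.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch of an alias expiry check driven by a timeout in seconds. */
class AliasTimeoutSketch {

    /** One year of 365 days in seconds: 365 * 24 * 60 * 60 = 31536000. */
    static final long DEFAULT_TIMEOUT_SECONDS = 365L * 24 * 60 * 60;

    /** Returns true when more than timeoutSeconds have passed since registration. */
    static boolean isExpired(Instant registered, Instant now, long timeoutSeconds) {
        return Duration.between(registered, now).getSeconds() > timeoutSeconds;
    }

    public static void main(String[] args) {
        Instant registered = Instant.parse("2010-01-01T00:00:00Z");
        Instant now = Instant.parse("2011-02-22T00:00:00Z");
        System.out.println(DEFAULT_TIMEOUT_SECONDS);
        System.out.println(isExpired(registered, now, DEFAULT_TIMEOUT_SECONDS));
    }
}
```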

settings.harvester.harvesting.harvestReport: The implementation of the HarvestReport interface to be used. The default is dk.netarkivet.harvester.harvesting.report.LegacyHarvestReport

New language keys together with the English text:

running.job.details.retiredQueuesLink={0} retired Heritrix queues
running.job.details.exhaustedQueuesLink={0} exhausted Heritrix queues
running.job.details.display=Display
running.job.details.export=export
running.job.details.frontier.totalBudget=Total budget
harvest.configuration.count=There are {0} domain configurations in this harvest definition.
prompt;max.seconds.per.crawljob=Max number of seconds for each job
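The {0} placeholders in the keys above follow the java.text.MessageFormat pattern syntax. The Java sketch below shows how such a pattern could be filled in. It assumes the texts are rendered with MessageFormat, which the release notes do not state, and the class LanguageKeySketch and its render method are hypothetical.

```java
import java.text.MessageFormat;
import java.util.Locale;

/** Hypothetical sketch: filling the {0} placeholder of a language key text. */
class LanguageKeySketch {

    /** Formats the pattern with the given arguments using English number formatting. */
    static String render(String pattern, Object... args) {
        return new MessageFormat(pattern, Locale.ENGLISH).format(args);
    }

    public static void main(String[] args) {
        // Patterns copied from the language keys listed above; the counts are examples.
        System.out.println(render("{0} retired Heritrix queues", 12));
        System.out.println(render(
                "There are {0} domain configurations in this harvest definition.", 3));
    }
}
```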

New settings in the archive module

New settings and language keys in the viewerproxy module

settings.viewerproxy.tryLookupUriAsFtp: If this setting is true and the lookup of a URI fails, the lookup is retried with the protocol changed to ftp. The default is false.
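The fallback described above can be sketched in Java as follows. This is an illustration under assumptions, not the viewerproxy implementation: the class FtpFallbackSketch, its methods, and the Map standing in for the archive index are all hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of retrying a failed URI lookup with the ftp scheme. */
class FtpFallbackSketch {

    /** Returns a copy of the URI with its scheme replaced by "ftp". */
    static URI withFtpScheme(URI uri) {
        try {
            return new URI("ftp", uri.getUserInfo(), uri.getHost(), uri.getPort(),
                    uri.getPath(), uri.getQuery(), uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rewrite " + uri, e);
        }
    }

    /**
     * Looks the URI up in the index (a plain Map standing in for the real index).
     * If nothing is found and the flag is set, the lookup is retried as ftp.
     */
    static String lookup(Map<URI, String> index, URI uri, boolean tryLookupUriAsFtp) {
        String result = index.get(uri);
        if (result == null && tryLookupUriAsFtp) {
            result = index.get(withFtpScheme(uri));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<URI, String> index = new HashMap<>();
        index.put(URI.create("ftp://ftp.example.org/pub/file.txt"), "archived record");
        URI requested = URI.create("http://ftp.example.org/pub/file.txt");
        System.out.println(lookup(index, requested, false)); // no fallback attempted
        System.out.println(lookup(index, requested, true));  // fallback finds the ftp entry
    }
}
```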

New settings in the wayback module

Changes to tables in the harvest database

configurations: field 'maxobjects' changed from type int to type bigint.
jobs: new bigint field 'forcemaxrunningtime'
fullharvests: new bigint field 'maxjobrunningtime'
runningjobshistory: new bigint field 'retiredQueuesCount'
runningJobsMonitor: new bigint field 'retiredQueuesCount'

Note that the changes mentioned above should be applied automatically.

Version History

Version 3.14.0

2010-11-12

Added running jobs overview, and batchGUI; fixed major OOM problem in the batch monitoring code

Version 3.12.0

2010-05-03

New Bitpreservation infrastructure, and upgrade of Apache Derby to version 10.5.3.0

Version 3.11.*

Development versions aiming for 3.12.0

Version 3.10.0

2009-11-16

New deploy application; JMX stability issues fixed; JMS stability issues also fixed

Version 3.9.*

Development versions aiming for 3.10.0

Version 3.8.2

2009-09-10

Fix an important index synchronization bug

Version 3.8.1

2009-07-15

Fix of important bug leading to unresponsive harvesters

Version 3.8.0

2009-05-23

Java 1.6, Heritrix 1.14.1, Derby 10.4.2.0, complete rewrite of settings, new supported deploy module, GUI access to harvest logs

Version 3.7.0

2008-11-04

Development version aiming for 3.8.0

Version 3.6.0

2008-07-03

Improvement of archive component with regard to security, batch, and preservation; greater JMS stability; important bug fixes

Version 3.5.*

Development versions aiming for 3.6.0

Version 3.4.2

2008-03-14

Bug fix release, fixing JMX timeout

Version 3.4.1

2008-01-16

Bug fix release, fixing out of memory on very large indexes

Version 3.4.0

2008-01-03

Separation of Heritrix, work on developing our open source platform, two-part TLDs like co.uk, and lots of bugfixes

Version 3.3.*

Development versions aiming for 3.4.0

Version 3.2.3

2007-09-27

Bugfix of 3.2.2 with a patched deduplicator that fixes a problem in parallel indexing

Version 3.2.2

2007-08-03

Bugfix of 3.2.1 with a patched Heritrix 1.12.1 that supports ARCRecords larger than 2 GB

Version 3.2.1

2007-07-04

Bugfix of 3.2.0, fixing trouble using the quick start manual

Version 3.2.0

2007-07-04

Open source release

Version 3.1.*

Development versions. Version 3.1.7 was kindly reviewed by Internet Archive and the Norwegian national library.

Version 3.0.0

2007-02-02

Marked the naming of the NetarchiveSuite, the splitting of NetarchiveSuite into independent modules, and the licensing of NetarchiveSuite under LGPL

Version 2.*

Various features and updates

Version 2.0

2006-08-30

Marked a general restructuring of the code, where harvest definition data was backed by a database and the viewerproxy was trimmed and rewritten

Version 1.*

Various features and updates

Version 1.0

2005-07-01

The first version of the netarchive software put in production for harvesting the entire Danish web

Version 0.*

Various pre-production development versions

ReleaseNotes3_15_0 (last edited 2011-02-22 13:44:46 by SoerenCarlsen)