Release Notes for NetarchiveSuite 3.3.1

This version of the NetarchiveSuite was released on 2007-09-24.

Note: This is a development release; it has not been tested for production quality.

New features since NetarchiveSuite 3.2.*

General

We now have a publicly available SVN repository. See https://gforge.statsbiblioteket.dk/scm/?group_id=7.

Please note that anonymous access is not available, although the page claims that it is. You will need to create a gforge account and follow the instructions under Developer Subversion Access.

Common Module

Settings split

The settings for the monitor module have been moved to a separate settings file. We expect to restructure our settings further in a later release to make them more extensible and modular.

Harvester Module

Preliminary work has been done on communicating with the harvesters through JMX and on getting the Heritrix UI up and running.

Monitor Module

Dynamic reload of settings for deployed applications

With the new settings file for the monitor module, you can update which applications are monitored. Simply change the settings file, and it should be reloaded the next time the monitor application loads.

Bugs fixed since NetarchiveSuite 3.2.*

Common Module

1016   ExtractCDX tool does not handle arc-files with large records
1023   [javadoc] Tag @see: can't find getInstance(File) in dk.netarkivet.common.distribute.RemoteFile
1034   English NetarchiveSuite thumbnail redirects to Danish site
1057   HTTPRemoteFile breaks ?

Harvester Module

628    need a way to reset the nextdate of a HD
789    illegal regexp crashes entire job
861    No logmessage in HarvestDefinitionGUI, when submitting crawljob
915    TLDs having two parts i.e co.uk are disallowed
926    error writing crawl.log to metadata.arc
937    rescheduling of jobs is very slow and blocks normal scheduling
939    Webpages should handle the case, where no schedules or harvestdefinitions exist already properly
970    DB error on registering jobs with too many upload-errors
971    Seeds, passwords and configurations sorted without regard to locale
984    Missing headlines in "Selective Harvests" window
993    missing trim of domain strings or a normal error message
1033   Sensitive "Find Domain(s)"
1038   New title for 'Harvest status' - 'All Jobs per domain'
1039   SideKick sets it's application name to "dk.netarkivet.SideKick"
1040   sortNamedObjectList and Named should be moved to common.utils
1049   Danish translation of resubmitted is wrong
1051   Error on Definitions-create-domain.jsp when trying to create an invalid domain
1053   wrong parameter in configuration-link on job details page

Archive Module

1022    [javadoc] Parameter "data" is documented more than once in RemoveAndGetFileMessage
1024    An error message is always shown on the bitarchive checksum page
1036    missing translation
1046    Bitpreservation-filestatus-checksum.jsp: Missing whitespace between filename and Info-link
1059    Mention of locations as institutions in comments, and variable names

Viewerproxy Module

1030    new viewerproxy command URL gives strange behavior in some browsers

Monitor Module

936    no way to add a new bitarchive machine to the JMX-overview without restarting the GUI

Upgrade instructions

To upgrade from a previous version of NetarchiveSuite, you will need to update the settings files of the application that runs the monitor GUI, usually HarvestDefinitionApplication.

Specifically, move the section that looks like

<deploy>
    <jmxMonitorRolePassword>JMX_MONITOR_ROLE_PASSWORD_PLACEHOLDER</jmxMonitorRolePassword>
    <numberOfHosts>3</numberOfHosts>
    <host1>
        <name>hostname1.example.com</name>
        <jmxport>8100</jmxport>
        <jmxport>8101</jmxport>
        <jmxport>8102</jmxport>
    </host1>
    <host2>
        <name>hostname2.example.com</name>
        <jmxport>8100</jmxport>
        <jmxport>8101</jmxport>
    </host2>
    <host3>
        <name>hostname3.example.com</name>
        <jmxport>8100</jmxport>
        <jmxport>8101</jmxport>
        <jmxport>8102</jmxport>
        <jmxport>8103</jmxport>
    </host3>
</deploy>

to the file called monitor_settings.xml.

If you use -Ddk.netarkivet.settings.file=path/to/your/settings.xml, you should now also use -Ddk.netarkivet.monitorsettings.file=path/to/your/monitor_settings.xml.
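
The move described above can also be scripted. The following is a minimal sketch in Python, not part of NetarchiveSuite. It assumes that the deploy section is a plain element without XML namespaces and that it can be appended directly under the root element of the monitor settings document. The function name move_deploy_section and the settings root element used in the usage note are hypothetical; check the layout of your own files before relying on it.

```python
import xml.etree.ElementTree as ET


def move_deploy_section(settings_xml, monitor_xml):
    """Move the first <deploy> element from one settings document to another.

    Both arguments are XML text; a pair of XML texts is returned
    (updated settings, updated monitor settings).
    Assumption: no XML namespaces are in use, and <deploy> belongs
    directly under the root of the monitor settings document.
    """
    settings_root = ET.fromstring(settings_xml)
    monitor_root = ET.fromstring(monitor_xml)

    # ElementTree keeps no parent pointers, so scan every element
    # for a direct <deploy> child and detach it from that parent.
    for parent in settings_root.iter():
        deploy = parent.find("deploy")
        if deploy is not None:
            parent.remove(deploy)
            monitor_root.append(deploy)
            break
    else:
        raise ValueError("no <deploy> section found in settings")

    return (
        ET.tostring(settings_root, encoding="unicode"),
        ET.tostring(monitor_root, encoding="unicode"),
    )
```

Calling move_deploy_section(old_settings_text, "<settings/>") returns the settings text without the deploy section and a monitor settings text that contains it. Write both results to disk only after checking them, and keep a backup of the original settings file.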

Version History

Current development versions

|| Version 3.3.1 || 2007-09-24 || ||
|| Version 3.3.0 || 2007-08-06 || Mostly bugfix work, including upgradability of the monitored applications, and faster resubmitting of jobs ||

Stable versions

|| Version 3.2.3 || 2007-09-27 || Bugfix of 3.2.2 with a patched deduplicator that fixes a problem in parallel indexing ||
|| Version 3.2.2 || 2007-08-03 || Bugfix of 3.2.1 with a patched Heritrix 1.12.1 that supports ARCRecords larger than 2 GB ||
|| Version 3.2.1 || 2007-07-04 || Bugfix of 3.2.0 fixing trouble using the quick start manual ||
|| Version 3.2.0 || 2007-07-04 || Open source release ||
|| Version 3.1.* || || Development versions. Version 3.1.7 was kindly reviewed by Internet Archive and the Norwegian national library. ||
|| Version 3.0.0 || 2007-02-02 || Marked the naming of the NetarchiveSuite, the splitting of NetarchiveSuite into independent modules, and the licensing of NetarchiveSuite under LGPL ||
|| Version 2.* || || Various features and updates ||
|| Version 2.0 || 2006-08-30 || Marked a general restructuring of the code, in which harvest definition data was backed by a database and the viewerproxy was trimmed and rewritten ||
|| Version 1.* || || Various features and updates ||
|| Version 1.0 || 2005-07-01 || The first version of the netarchive software put into production for harvesting the entire Danish web ||
|| Version 0.* || || Various pre-production development versions ||

ReleaseNotes3_3_1 (last edited 2010-08-16 10:24:41 by localhost)