Differences between revisions 1 and 2
Revision 1 as of 2010-05-04 13:16:30
Size: 5690
Editor: SoerenCarlsen
Comment: Generated documentation branch for 3.14
Revision 2 as of 2010-08-16 10:24:43
Size: 5693
Editor: localhost
Comment: converted to 1.6 markup

1. Introduction

This manual describes how to install the NetarchiveSuite web archive software package.

We first describe how to use the included deploy software to configure and install a distributed NetarchiveSuite installation. The deploy software gathers the configuration in a single dedicated configuration file, which eases the configuration, installation, and starting and stopping of an entire NetarchiveSuite system.

If you are hampered by any limitations in the deploy software, it is of course possible to write custom installation scripts. Inspecting the scripts generated by the deploy software will probably help you in this respect.

For a description of the configurations used for installation, please refer to the Configuration Manual.

1.1. Contents

The first part describes the functionality of the deploy software and how it can be used. This involves a description of how to run this module with both required and optional arguments, and the functionality of the scripts generated.

The second part describes the configuration file used by the deploy software: its structure and content, with examples. It also describes the requirements and limitations of Deploy.

The third part describes the different possible installation scenarios.

The fourth part describes the means of deployment, including how to obtain and install the required libraries and how to install the software on separate machines. Lastly, the starting, stopping and monitoring of the system are described. This part is useful for those who want to go beyond the limitations inherent in the deploy software.

Some parts of NetarchiveSuite require external software to run. This is described in Appendix A.

This manual does not explain how to configure the applications themselves (see the Configuration Manual for this), how to extend the functionality of the system (see the Development tab for this) or how to use the running system (see the User Manual for this).

1.2. Audience

The intended audience of this manual is system administrators who will be responsible for the actual installation of NetarchiveSuite as well as technical personnel responsible for proper operation of NetarchiveSuite. Knowledge of Unix system administration is expected, and some familiarity with XML and Java is an advantage.

1.3. Limitations

Even though the NetarchiveSuite software is developed in Java, and is therefore mostly platform independent, it makes a couple of external calls to the Unix "sort" command. The parts of the software that use this external command therefore only run on Linux/Unix, or on Windows with Cygwin installed. The parts in question are:

  • The dk.netarkivet.common.GUIApplication, if the sitesection dk.netarkivet.viewerproxy.webinterface.QASiteSection is used

  • The dk.netarkivet.archive.indexserver.IndexServerApplication

Specifically, the following methods all make an external call to the Unix "sort" command:

  • FileUtils#sortCrawlLog(), used in
    • dk.netarkivet.archive.indexserver.CrawlLogIndexCache
    • dk.netarkivet.viewerproxy.webinterface.Reporting
  • FileUtils#sortCDX() (only used in dk.netarkivet.archive.indexserver.CrawlLogIndexCache)
  • dk.netarkivet.archive.indexserver.CDXIndexCache#sortFile()
  • dk.netarkivet.viewerproxy.LocalCDXCache#getIndex()
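To illustrate why these methods tie the software to a platform with a Unix-style "sort", here is a minimal sketch of what such an external call can look like in Java. It is not the NetarchiveSuite implementation: the class name ExternalSortSketch and the method sortFile are hypothetical, and setting LC_ALL=C (to get a locale-independent byte ordering) is an assumption made for this example only.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch, not NetarchiveSuite code.
final class ExternalSortSketch {

    /**
     * Sorts the lines of 'input' into 'output' by running the external
     * "sort" command found on the PATH.
     */
    static void sortFile(Path input, Path output)
            throws IOException, InterruptedException {
        // Equivalent to running: sort -o <output> <input>
        ProcessBuilder pb = new ProcessBuilder(
                "sort", "-o", output.toString(), input.toString());
        // Assumption for this sketch: force byte-order collation.
        pb.environment().put("LC_ALL", "C");
        // Pass any error messages from sort through to our own stderr.
        pb.inheritIO();
        // start() throws IOException if no "sort" executable can be found.
        Process process = pb.start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("sort exited with status " + exitCode);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: ExternalSortSketch <input> <output>");
            return;
        }
        sortFile(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

The sketch depends entirely on which "sort" executable the operating system resolves, which is why a Unix-like environment (or Cygwin on Windows) is needed for the parts of the system listed above.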

The software is mainly tested on a Linux platform, but with some of the BitarchiveApplication instances installed on a Windows platform.

1.4. Installation Overview

Using NetarchiveSuite's Deploy utility, the steps required to configure and start a web archive are:

  1. Determine the required architecture, i.e. how many machines you will be using, their locations, their operating systems and which applications should run on each machine
  2. Configure the required machines, the required external software (see Appendices) and any relevant firewalls
  3. Unpack NetarchiveSuite.zip in a directory on a Linux machine
  4. Create the deploy.xml file which describes the architecture and any custom settings. This will also specify your environmentName (e.g. MY_WEBARCHIVE).
  5. Modify the other configuration files (logging and security properties) if necessary.
  6. Run the Deploy utility. This will create a sub-directory MY_WEBARCHIVE with all the deploy scripts and configuration files you need.
  7. Run the install scripts, then the start scripts. You should now have a running NetarchiveSuite installation.

The remainder of this document is designed to guide you through the above process, and especially the choices involved in your architecture and the creation of the deploy.xml file:

  • Section 2 describes the choices to be made in defining the system architecture.
  • Section 3 describes the Deploy application, including a detailed discussion of exactly how it functions. This may be very useful if you need to customise the process or in the event of problems during deploy, start, and stop.
  • Section 4 describes the configuration file, deploy.xml. This is arguably the most important part of the manual.
  • Section 5 describes the manual installation of NetarchiveSuite.
  • Section 6 describes how to start and stop NetarchiveSuite when not using the scripts generated by the Deploy application.
  • Section 7 describes the monitoring of NetarchiveSuite using JMX.
  • The Appendices describe how to configure the external components used by NetarchiveSuite.

Installation Manual 3.14/Introduction (last edited 2010-08-16 10:24:43 by localhost)