Snapshot Harvests

snapshot_harvests.png

The [Snapshot Harvests] page is where new snapshot harvests are started. A snapshot harvest harvests all domains known to the system, using their default configurations. The page also provides an overview of all snapshot harvests.

[Create new harvestdefinition] opens the template below.

Creating/editing a snapshot harvest

snapshot_harvest_edit.png

This page is used to define the name and size (max. bytes per domain) of the harvest. In addition to the size in bytes, the number of objects can now be used as a harvest limit. The object limit given here is the default for harvests that use object limits rather than byte limits; -1 means unlimited.

A systematic naming scheme is recommended for clarity, e.g. 2007-1, 2007-2, etc.

The size of the harvest can be defined in two places: in the harvest definition on [Snapshot Harvests] or in the configuration of the individual domain. The lower of the two limits is always the one that stops the harvesting of a domain.
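The "lower limit wins" rule can be sketched as follows. This is only an illustration of the rule as described here, not code from the system itself: the function name `effective_limit` is hypothetical, and treating -1 as "unlimited" for byte limits as well as object limits is an assumption.

```python
UNLIMITED = -1  # the manual states that -1 means unlimited


def effective_limit(harvest_limit: int, domain_limit: int) -> int:
    """Return the limit that actually stops the harvest of one domain.

    harvest_limit: limit set in the snapshot harvest definition.
    domain_limit:  limit set in the configuration of the single domain.
    Hypothetical helper; assumes -1 marks an unlimited value on both levels.
    """
    if harvest_limit == UNLIMITED:
        return domain_limit
    if domain_limit == UNLIMITED:
        return harvest_limit
    # Both limits are set: the lower one stops the harvest.
    return min(harvest_limit, domain_limit)


# Harvest allows 500 MB per domain, the domain configuration only 100 MB.
print(effective_limit(500_000_000, 100_000_000))
# Harvest is unlimited, so the domain configuration decides.
print(effective_limit(UNLIMITED, 100_000_000))
```

In this sketch a domain is unlimited only when both levels are set to -1.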

Comments can freely be added.

A snapshot harvest can be based on a previous snapshot harvest, in the sense that it can be limited to harvesting only the domains that hit the max number of bytes limit in that previous harvest.

Domains that finished completely in the first harvest (that is, did not hit the max number of bytes limit, either on the configuration level or on the snapshot harvest level) will not be included in the second. Domains included in harvests which were aborted through the Heritrix GUI or otherwise stopped uncleanly (for example by a crash of a harvester machine) will also not be included.

All other domains will be harvested from the beginning in the second harvest.
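The selection rules above can be sketched like this. The sketch is an illustration under stated assumptions, not the system's actual implementation: `StopReason`, `DomainResult`, and `domains_for_followup` are hypothetical names, and the three stop reasons are a simplification of whatever the system records per domain.

```python
from dataclasses import dataclass
from enum import Enum


class StopReason(Enum):
    """Hypothetical outcome of harvesting one domain in the first harvest."""
    COMPLETED = "completed"    # finished without hitting any byte limit
    BYTE_LIMIT = "byte_limit"  # stopped by the max number of bytes limit
    UNCLEAN = "unclean"        # aborted via the Heritrix GUI, crash, etc.


@dataclass(frozen=True)
class DomainResult:
    domain: str
    stop_reason: StopReason


# Per the text: finished domains and uncleanly stopped domains are left out.
EXCLUDED = {StopReason.COMPLETED, StopReason.UNCLEAN}


def domains_for_followup(previous: list[DomainResult]) -> list[str]:
    """Return the domains a harvest based on `previous` would include."""
    return [r.domain for r in previous if r.stop_reason not in EXCLUDED]


first_harvest = [
    DomainResult("small.example", StopReason.COMPLETED),
    DomainResult("big.example", StopReason.BYTE_LIMIT),
    DomainResult("crashed.example", StopReason.UNCLEAN),
]
print(domains_for_followup(first_harvest))
```

With this example data, only `big.example` is carried over, and it would be harvested from the beginning again.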

[Save] saves the harvest definition and returns to [Snapshot Harvests].

After defining a snapshot harvest, activate it with the [Activate] button on the snapshot front page. The harvest will not start until you press [Activate]. The status then changes to ‘Active’.

[Deactivate] is not relevant for snapshot harvests because they only run once. [Edit] allows the snapshot definition to be changed, but only before activation. The parameters that can be changed are the size, the comments, and whether a previous harvest should be used as the starting point. The name cannot be changed.

[History] provides an overview of the specific harvest; see User Manual 3.14/Harvest History.

User Manual 3.14/Snapshot Harvests (last edited 2010-08-16 10:24:10 by localhost)