
Start two different kinds of harvests

Run both a snapshot harvest and a selective harvest.

These should be run simultaneously, so do not wait for one of them to finish before creating and starting the other.

If harvests with the same names already exist, just name them snh2 and sh2 respectively.

Start a snapshot harvest with max 100000 bytes and 100 objects

This page describes how to start a small snapshot harvest.

Note: the small size of the harvest is defined by a small number of domains. The domains can be viewed via 'Definitions'->'Domain Statistics'.

Start Program

Make a new snapshot harvest definition with a name you can remember

  • Click 'Definitions'->'Snapshot Harvests' in the left menu

  • Click 'Create new harvestdefinition' at the bottom of the main window
  • Fill in the 'Harvest name' and note the name for later use (from now on referred to as <snh. name>)

  • Set 'Max number of bytes per domain' to 100000.
  • Set 'Max number of objects per domain' to 100.
  • Click Save
  • Click 'Activate' in column 4 on the line with the <snh. name>

Check scheduling of jobs

  • Click 'Harvest status'->'All Jobs' in the left menu

  • Select to view NEW jobs
  • Check that a new snapshot harvest <snh. name> job has been generated (may take a minute before jobs appear)

  • Click 'System status' in the left menu
  • Check that one of the lines for GUIWebServer contains the message "INFO: Created X jobs for harvest definition <snh. name>" (choose the application GUIWebServer and show all lines)

  • Check that there are no warnings for any of the applications

Define and run selective harvest

This page describes how to define and run a selective harvest of netarkivet.dk.

Do the following in a browser: Start Program

Make a new selective harvest definition with a name you can remember

  • Click 'Definitions'->'Selective Harvests' in the left menu

  • Click 'Create new harvest definition' at the bottom of the main window
  • Fill in the 'Harvest name' and note the name for later use (from now on referred to as <sh. name>)

  • Choose "Once_a_week" in the drop-down list for 'Schedule'
  • Write netarkivet.dk in the 'Enter Domain...' window and click 'Add domains'
  • If netarkivet.dk is unknown (i.e. not registered in the domain table), the button "Create and add to the harvest definition" is added to the page; click this button.
  • Click 'Save'

Activate the selective harvest

  • Click 'Activate' in column 5 on the line with the <sh. name>

  • Check that the time in the 'Next Run' column on the line with the <sh. name> is now.

Check harvest status of the selective harvest

  • Click 'Harvest status'->'All Jobs' in the left menu

  • Select "All" under "Only display job status", to the right of the menu
  • Click the "Show" button repeatedly until the <sh. name> appears in a new job line (after approximately a minute)

  • Check that the job has status "NEW"; it may already have changed to "SUBMITTED" by the time you see it.

Check job creation in the system status for the selective harvest

  • Click 'Systemstate'->'Overview of the system state'

  • Find and click 'GUIWebServer' in the 'Application' column for the KB kb-test-adm-001
  • Click 'show all' in the 'Index' header

  • Check that there is a line with the message "INFO: Created 1 jobs for harvest definition '<sh. name>'"
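
If the log lines shown for GUIWebServer are copied out of the browser, the check for the job-creation message can be sketched in a few lines of Python. This is only an illustration and not part of the test procedure: the function name jobs_created and the pattern are assumptions based on the message text quoted above, and the pattern accepts the harvest name with or without surrounding single quotes.

```python
import re

# Pattern for the job-creation message quoted in the steps above.
# Assumption: the harvest name may or may not be wrapped in single quotes.
CREATED_JOBS = re.compile(
    r"INFO: Created (\d+) jobs? for harvest definition '?(?P<name>[^']+?)'?\s*$"
)


def jobs_created(log_lines, harvest_name):
    """Return the job count reported for harvest_name, or None if no line matches."""
    for line in log_lines:
        match = CREATED_JOBS.search(line)
        if match and match.group("name") == harvest_name:
            return int(match.group(1))
    return None
```

For example, jobs_created(lines, "sh2") would be expected to give 1 for the selective harvest described here, and None means the message has not appeared yet.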

Wait for both harvests to finish

  • Click 'Harvest status'->'All Jobs' in the left menu

  • Select to view ALL jobs
  • Wait for both running harvests to reach the status DONE.
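
The waiting step amounts to polling the job list until every job of both harvests reports DONE. A minimal sketch of that loop is given below, assuming a hypothetical get_statuses callable that returns a mapping from harvest name to the list of its job statuses. No such function is described on this page; it stands in for however the statuses are read off the 'All Jobs' view.

```python
import time


def wait_until_done(get_statuses, harvest_names, poll_seconds=60,
                    timeout_seconds=3600, sleep=time.sleep):
    """Poll until all jobs of every named harvest are DONE.

    get_statuses is a hypothetical callable returning
    {harvest_name: [job_status, ...]}. Returns True when everything is
    DONE, or False once timeout_seconds has been waited without success.
    """
    waited = 0
    while True:
        statuses = get_statuses()
        if all(
            statuses.get(name) and all(s == "DONE" for s in statuses[name])
            for name in harvest_names
        ):
            return True
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
```

A harvest with no jobs listed yet counts as not finished, which matches the delay of about a minute before jobs appear.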

It42RunSomeHarvests (last edited 2010-08-16 10:24:48 by localhost)