Start two different kinds of harvests
Run both a snapshot harvest and a selective harvest.
These should be run simultaneously, so do not wait for one of them to finish before creating and starting the other.
If harvests with the same names already exist, name the new ones snh2 and sh2 respectively.
Start a snapshot harvest with max 100000 bytes and 100 objects
This page describes how to start a small snapshot harvest
Note: the small size of the harvest is defined by a small number of domains. The domains can be viewed via 'Definitions'->'Domain Statistics'.
Start Program
Go to http://$GUIadminserver:$http-port/HarvestDefinition/
- where GUIadminserver and http-port are specified in the deploy configuration file under the application named dk.netarkivet.common.webinterface.GUIApplication
In the one-machine setup (deploy_example_one_machine.xml), the link will be: http://localhost:8074
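The address above is assembled from two values in the deploy configuration file. As a minimal sketch, assuming those two values have already been read out of the file by hand, the URL could be built like this (the function name `harvest_definition_url` is hypothetical and not part of NetarchiveSuite):

```python
def harvest_definition_url(gui_admin_server: str, http_port: int) -> str:
    """Build the HarvestDefinition URL from the two deploy-config values."""
    return f"http://{gui_admin_server}:{http_port}/HarvestDefinition/"


# Values from the one-machine example setup mentioned above.
print(harvest_definition_url("localhost", 8074))
```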
Make a new snapshot harvest definition with a name you can remember
Click 'Definitions'->'Snapshot Harvests' in the left menu
- Click 'Create new harvestdefinition' at the bottom of the main window
Fill in the 'Harvest name' and note the name for later use (from now on referred to as <snh. name>)
- Set 'Max number of bytes per domain' to 100000.
- Set 'Max number of objects per domain' to 100.
- Click Save
Click 'Activate' in column 4 on the line with the <snh. name>
Check scheduling of jobs
Click 'Harvest status'->'All Jobs' in the left menu
- Select to view NEW jobs
Check that a new snapshot harvest <snh. name> job has been generated (may take a minute before jobs appear)
- Click 'System status' in the left menu
Check that one of the lines for GUIWebServer contains the message "INFO: Created X jobs for harvest definition <snh. name>" (choose Application GUIWebServer and Show all lines)
- Check that there are no warnings on the different applications
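If the log lines shown for GUIWebServer are copied out of the page, the "Created X jobs" check can also be done mechanically. The sketch below is only an illustration: the helper name `created_jobs` is hypothetical, the sample lines are made up, and it assumes the message has the form quoted above, with or without single quotes around the harvest name.

```python
import re


def created_jobs(log_lines, harvest_name):
    """Return the job count from the first matching 'Created X jobs' line, or None."""
    pattern = re.compile(
        r"INFO: Created (\d+) jobs for harvest definition '?"
        + re.escape(harvest_name)
        + r"'?(?!\w)"  # do not accept 'snh2' when looking for 'snh'
    )
    for line in log_lines:
        match = pattern.search(line)
        if match:
            return int(match.group(1))
    return None


# Made-up sample lines, for illustration only.
sample = [
    "INFO: Some unrelated message",
    "INFO: Created 2 jobs for harvest definition snh2",
]
print(created_jobs(sample, "snh2"))
print(created_jobs(sample, "snh"))
```

A return value of `None` means no matching line was found, in which case wait a minute and check the log again.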
Define and run selective harvest
This page describes how to define and run a selective harvest of netarkivet.dk
Do the following in a browser: Start Program
Go to http://$GUIadminserver:$http-port/HarvestDefinition/
- where GUIadminserver and http-port are specified in the deploy configuration file under the application named dk.netarkivet.common.webinterface.GUIApplication
In the one-machine setup (deploy_example_one_machine.xml), the link will be: http://localhost:8074
Make a new selective harvest definition with a name you can remember
Click 'Definitions'->'Selective Harvests' in the left menu
- Click 'Create new harvest definition' at the bottom of the main window
Fill in the 'Harvest name' and note the name for later use (from now on referred to as <sh. name>)
- Choose "Once_a_week" in the drop down list for 'Schedule'
- Write =netarkivet.dk= in the 'Enter Domain...' window and click 'Add domains'
- If =netarkivet.dk= is unknown (i.e. not registered in the domain table), the button "Create and add to the harvest definition" is added to the page, and you then need to click this button.
- Click 'Save'
Activate the selective harvest
Click 'Activate' in column 5 on the line with the <sh. name>
Check that the time in the 'Next Run' column on the line with the <sh. name> is now.
Check harvest status of the selective harvest
Click 'Harvest status'->'All Jobs' in the left menu
- Select "All" under "Only display job status", to the right of the menu
Click the "Show" button repeatedly until the <sh. name> appears in a new job line (after approximately a minute)
- Check that the job has status "NEW"; it may already have changed to "SUBMITTED" by the time you see it.
Check job creation in the system status for the selective harvest
Click 'Systemstate'->'Overview of the system state'
- Find and click 'GUIWebServer' in the 'Application' column for the KB kb-test-adm-001
Click 'show all' in the 'Index' header
Check that there exists a line with the message "INFO: Created 1 jobs for harvest definition '<sh. name>'"
Wait for both harvests to finish
Click 'Harvest status'->'All Jobs' in the left menu
- Select to view ALL jobs
- Wait for both running harvests to reach the status DONE.
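The waiting step is a simple poll-until-done loop. The sketch below shows that loop in isolation, under stated assumptions: `get_statuses` is a hypothetical callable that you supply (for example, one that returns the statuses you read off the 'All Jobs' page), and nothing here calls a real NetarchiveSuite API.

```python
import time


def wait_until_done(get_statuses, poll_seconds=30.0, timeout_seconds=3600.0,
                    sleep=time.sleep):
    """Poll get_statuses() until every job reports DONE.

    get_statuses is a hypothetical callable returning {harvest name: status}.
    Returns True once all jobs are DONE, False if the timeout is reached first.
    """
    waited = 0.0
    while True:
        statuses = get_statuses()
        if statuses and all(status == "DONE" for status in statuses.values()):
            return True
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds


# Illustration with made-up status snapshots and no real sleeping.
snapshots = iter([
    {"snh": "SUBMITTED", "sh": "NEW"},
    {"snh": "DONE", "sh": "SUBMITTED"},
    {"snh": "DONE", "sh": "DONE"},
])
print(wait_until_done(lambda: next(snapshots), poll_seconds=1,
                      timeout_seconds=10, sleep=lambda seconds: None))
```

The timeout is there so the check does not hang forever if a job never reaches DONE; a `False` result means the jobs need to be inspected by hand.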