Differences between revisions 1 and 125 (spanning 124 versions)
Revision 1 as of 2009-09-22 08:03:42
Size: 4099
Editor: TueLarsen
Comment:
Revision 125 as of 2010-04-23 15:59:39
Size: 3674
Comment:
Line 1: Line 1:
The following needs to be merged with our internal TEST2 and migrated to an open TEST2.
'''TEST2: Std-2 (snapshot harvesting, configurations, byte limits, aliases and domain lists)'''
Line 3: Line 3:
Here are the steps on a fresh install of NAS:
Test goals: Test snapshot harvesting in detail and subsequent follow-up harvesting.
Line 5: Line 5:
Sanity test 1: snapshot harvest
'''To test writers''': This test should not contain non-standard snapshot harvesting behavior. After installation it should not be necessary to use shell scripts or command-line statements.
Line 7: Line 7:
- add a set of domains
- configure the domains' default configuration object limit. On my dev setup I added 8 domains and made two groups: one with a limit of 100 objects and one of newspaper domains with a 200 object limit, plus an "outsider" with a 100 object limit.
- define a first snapshot harvest, with no byte limit and an object limit that is lower than the smallest domain config limit you set up (I started at 50).
- activate this harvest and let it finish.
- Verify that the stop reasons for domains, once the harvest is complete, are one of:
        - "Domain completed" with a number of harvested documents that is lower than the snapshot limit (in my test < 50)
        - "Max object limit reached" with a number of harvested documents that is equal to the snapshot limit (in my test 50)
- Verify that the QuotaEnforcer parameter "group-max-fetch-success" is set to the proper limit value in the order.xml files from the metadata arc
Follow the instructions in the section [http://netarchive.dk/suite/Installation_Manual_3.10#head-eb8edd7b928a151dfbd3386f6c67caee8a3dd9a1 10. Easy Installation of NetarchiveSuite in Installation manual ]. Before starting, set <deduplication><enabled>false</enabled></deduplication> in deploy_example_one_machine.xml.
Line 16: Line 9:
Sanity test 2: incremental snapshot harvest

- define a new snapshot harvest, with no byte limit and an object limit that is higher than the highest domain config limit you set (in my case 500). Make this harvest incremental by having it harvest only domains not completed in your initial harvest
- Verify that the stop reasons for domains, once the harvest is complete, are one of:
        - "Domain completed" with a number of harvested documents that is lower than the snapshot limit (in my test < 500)
        - "Max object limit reached" with a number of harvested documents that is equal to the snapshot limit (in my test 500)
        - "Domain-config object limit reached" with a number of harvested documents that is equal to the default domain configuration limit. That stop reason might be tricky to observe because deduplication yields "Domain completed" more often on consecutive crawls.
- Verify that the QuotaEnforcer parameter "group-max-fetch-success" is set to the proper limit value in the order.xml files from the metadata arc
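
The stop-reason criteria of sanity tests 1 and 2 can be written down as a small check. This is only a sketch of the rules listed above: the function name and its parameters are made up for illustration and are not part of NetarchiveSuite.

```python
from typing import Optional


def stop_reason_is_consistent(reason: str,
                              harvested: int,
                              snapshot_limit: int,
                              domain_limit: Optional[int] = None) -> bool:
    """Return True if a reported stop reason matches the harvested count.

    Hypothetical helper: it only encodes the three criteria from the
    checklist, nothing about how NetarchiveSuite computes stop reasons.
    """
    if reason == "Domain completed":
        # The domain ran out of documents before the snapshot limit.
        return harvested < snapshot_limit
    if reason == "Max object limit reached":
        # The snapshot-wide object limit stopped the crawl.
        return harvested == snapshot_limit
    if reason == "Domain-config object limit reached":
        # The domain's own default configuration limit stopped the crawl.
        return domain_limit is not None and harvested == domain_limit
    # Any other stop reason is unexpected in these two tests.
    return False
```

For example, with the limits used in the text, `stop_reason_is_consistent("Max object limit reached", 50, 50)` holds for the first harvest, and `stop_reason_is_consistent("Domain-config object limit reached", 100, 500, domain_limit=100)` holds for the incremental one.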

Sanity test 3: selective harvest

- pick a domain and create a new configuration for it, with an object limit.
- create a new selective harvest, add the domain and select the newly created config.
- activate the harvest and let it complete
- Verify that the stop reason for the selected domain is "Domain-config object limit reached" with a number of harvested documents that is equal to the selected domain configuration limit.
- Verify that the QuotaEnforcer parameter "group-max-fetch-success" is set to the proper limit value in the order.xml file from the metadata arc

Sanity test 4: combination of object and byte limit

- pick a domain and create a new configuration for it, with an object limit and a low byte limit (for instance 100 kB and 1000 objects)
- create a new selective harvest, add the domain and select the newly created config.
- activate the harvest and let it complete
- Verify that the stop reason for the selected domain is "Domain-config byte limit reached" with a byte size that is equal to the selected domain configuration limit.
- Verify that the QuotaEnforcer "group-max-fetch-success" and "group-max-all-kb" parameters are set to the proper limit values in the order.xml file from the metadata arc
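
Checking the QuotaEnforcer values by eye can be replaced by a short script once the order.xml has been extracted from the metadata arc. The sketch below assumes the parameters appear as elements carrying a `name` attribute; the sample XML is a trimmed, hypothetical stand-in, and the element and class names in it are not copied from a real order.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an extracted order.xml. A real file contains
# far more, and its exact element layout is an assumption here.
SAMPLE_ORDER_XML = """
<crawl-order>
  <newObject name="QuotaEnforcer" class="example.QuotaEnforcer">
    <long name="group-max-fetch-success">1000</long>
    <long name="group-max-all-kb">100</long>
  </newObject>
</crawl-order>
"""


def read_named_values(xml_text, wanted):
    """Collect integer values of all elements whose name attribute is wanted."""
    root = ET.fromstring(xml_text)
    found = {}
    for element in root.iter():
        name = element.get("name")
        if name in wanted and element.text is not None:
            found[name] = int(element.text.strip())
    return found


limits = read_named_values(
    SAMPLE_ORDER_XML, {"group-max-fetch-success", "group-max-all-kb"})
print(limits)
```

The returned dictionary can then be compared against the object and byte limits entered in the domain configuration.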
If you are a netarkiv.dk tester, follow these instructions: [:NetarkivInstall :Netarkiv Installation setup ]
||<tablewidth="200px">'''Items ''' ||'''Status 1''' ||'''Status 2''' ||'''Status 3''' ||'''Notes ''' ||'''Known open bugs ''' ||'''Bugs tested ''' ||'''New bugs found ''' ||'''Previous bugs''' ||
||[:It24CheckHarvConfig :1. Check single domain creation, harvest config and domain statistics ] ||OK || || || ||1060 is still not fixed || || ||873,892, 973,1051,1033,1060 ||
||[:It42CheckGlobalCrwalerTraps :1.a Check global crawlertraps] ||OK || || || || || ||Bug 1889 || ||
||[:It17ByteLimit :2. Update bytelimits for 6 domains ] ||OK || || || || || || || ||
||[:It16AliasGui :3. Search and add alias in ADM GUI ] ||OK || || || || || || ||894,895,896 ||
||[:It16AliasNoTransitiveGui :4. Check that chains of aliases are prevented ] ||OK || || || || || || ||954 ||
||[:It10DefCrossHarv :6. Start a snapshot harvest with max 100000 bytes ] ||OK || || ||Odd warnings in GUIApplication about unknown database tables. || || || || ||
||[:It16VerificerUdenAlias :7. Verify that alias domain is not harvested ] ||OK || || || || || || || ||
||[:It17VerifyLimits :8. Check that the first snapshot harvest has reached the expected byte limits ] ||OK || || ||Some domains had a different 'stop due to' code than expected || || || ||998 ||
||[:It16AliasSulnuduGui :9. Add sulnudu-alias via ADM GUI ] ||OK|| || || || || || || ||
||[:It17ChangeLimit :10. Change byte limit on a domain ] || || || || || || || || ||
||[:It13Def5mbTvHarvestNetarkiv :11. Start a snapshot harvest with a max byte limit of 5 MB (takes at least 1 hour) ] || || || || || || || || ||
||[:It39SGotoSelHarvUsingHeritrix :12. Go to Heritrix GUI, verify the job is running and "pause" the job ] || || || || ||Bug 1741, 1791 || || || ||
||[:It31VerifySelHarvUsingADMGUI :13. Go to the System overview in ADM GUI and check the job is paused and there are no error messages ] || || || || || || || || ||
||[:It31SChangeSelHarvUsingHeritrix :14. Go to Heritrix GUI, do some overrides and resume the job ] || || || || || ||FR 1765 || || ||
||[:It31VerifySelHarvUsingADMGUIrun :15. Go to the System overview in ADM GUI and check the job is running again and there are no error messages ] || || || || || || || || ||
||[:It16VerificerUdenSulnuduAlias :16. Verify that no alias domains are harvested ] || || || || || || || || ||
||[:It16VerifyExpectedDomains :17. Check that the second snapshot harvest has reached the expected byte limits ] || || || || || || || || ||
||[:It38CheckObjectLimits :18. Check that object limits are respected] || || || || || || || || ||
||[:It38CheckHarvestNotDeduplicated :19. Check that objects are not deduplicated ] || || || || || || || || ||
||[:It39CheckFR1765 :20. Check that overrides are in the QA reports] || || || || || ||FR 1765 || || ||
Line 42: Line 33:
- pick a domain and create a new configuration for it, with a small object limit and a high byte limit (for instance 10 MB and 10 objects)
- create a new selective harvest, add the domain and select the newly created config.
- activate the harvest and let it complete
- Verify that the stop reason for the selected domain is "Domain-config object limit reached" with a number of harvested documents that is equal to the selected domain configuration limit.
- Verify that the QuotaEnforcer "group-max-fetch-success" and "group-max-all-kb" parameters are set to the proper limit values in the order.xml file from the metadata arc
If you are a netarkiv.dk tester, here are the shutdown instructions: [:It36CleanupAfterTest Shutdown the system. :Shutdown the system ]

TEST2: Std-2 (snapshot harvesting, configurations, byte limits, aliases and domain lists)

Test goals: Test snapshot harvesting in detail and subsequent follow-up harvesting.

To test writers: This test should not contain non-standard snapshot harvesting behavior. After installation it should not be necessary to use shell scripts or command-line statements.

Follow the instructions in the section [http://netarchive.dk/suite/Installation_Manual_3.10#head-eb8edd7b928a151dfbd3386f6c67caee8a3dd9a1 10. Easy Installation of NetarchiveSuite in Installation manual ]. Before starting, set <deduplication><enabled>false</enabled></deduplication> in deploy_example_one_machine.xml.
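
The deduplication switch can also be flipped with a script instead of a hand edit. This is a minimal sketch that sets every `<deduplication><enabled>` element to `false`; the surrounding element names in the sample are assumptions, since the layout of deploy_example_one_machine.xml is not shown here, and the function name is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt. Only the <deduplication><enabled> pair comes from
# the instructions; the enclosing elements are placeholders.
SAMPLE_DEPLOY_XML = """
<deployGlobal>
  <settings>
    <deduplication>
      <enabled>true</enabled>
    </deduplication>
  </settings>
</deployGlobal>
"""


def disable_deduplication(xml_text: str) -> str:
    """Return the XML with every deduplication/enabled element set to false."""
    root = ET.fromstring(xml_text)
    changed = 0
    for dedup in root.iter("deduplication"):
        enabled = dedup.find("enabled")
        if enabled is not None:
            enabled.text = "false"
            changed += 1
    if changed == 0:
        # Fail loudly rather than silently leaving deduplication on.
        raise ValueError("no <deduplication><enabled> element found")
    return ET.tostring(root, encoding="unicode")


print(disable_deduplication(SAMPLE_DEPLOY_XML))
```

Raising an error when the element is missing keeps a mistyped or restructured file from passing unnoticed before the test run starts.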

If you are a netarkiv.dk tester, follow these instructions: [:NetarkivInstall :Netarkiv Installation setup ]

||Items ||Status 1 ||Status 2 ||Status 3 ||Notes ||Known open bugs ||Bugs tested ||New bugs found ||Previous bugs ||
||[:It24CheckHarvConfig :1. Check single domain creation, harvest config and domain statistics ] ||OK || || || ||1060 is still not fixed || || ||873, 892, 973, 1051, 1033, 1060 ||
||[:It42CheckGlobalCrwalerTraps :1.a Check global crawlertraps] ||OK || || || || || ||Bug 1889 || ||
||[:It17ByteLimit :2. Update bytelimits for 6 domains ] ||OK || || || || || || || ||
||[:It16AliasGui :3. Search and add alias in ADM GUI ] ||OK || || || || || || ||894, 895, 896 ||
||[:It16AliasNoTransitiveGui :4. Check that chains of aliases are prevented ] ||OK || || || || || || ||954 ||
||[:It10DefCrossHarv :6. Start a snapshot harvest with max 100000 bytes ] ||OK || || ||Odd warnings in GUIApplication about unknown database tables. || || || || ||
||[:It16VerificerUdenAlias :7. Verify that alias domain is not harvested ] ||OK || || || || || || || ||
||[:It17VerifyLimits :8. Check that the first snapshot harvest has reached the expected byte limits ] ||OK || || ||Some domains had a different 'stop due to' code than expected || || || ||998 ||
||[:It16AliasSulnuduGui :9. Add sulnudu-alias via ADM GUI ] ||OK || || || || || || || ||
||[:It17ChangeLimit :10. Change byte limit on a domain ] || || || || || || || || ||
||[:It13Def5mbTvHarvestNetarkiv :11. Start a snapshot harvest with a max byte limit of 5 MB (takes at least 1 hour) ] || || || || || || || || ||
||[:It39SGotoSelHarvUsingHeritrix :12. Go to Heritrix GUI, verify the job is running and "pause" the job ] || || || || ||Bug 1741, 1791 || || || ||
||[:It31VerifySelHarvUsingADMGUI :13. Go to the System overview in ADM GUI and check the job is paused and there are no error messages ] || || || || || || || || ||
||[:It31SChangeSelHarvUsingHeritrix :14. Go to Heritrix GUI, do some overrides and resume the job ] || || || || || ||FR 1765 || || ||
||[:It31VerifySelHarvUsingADMGUIrun :15. Go to the System overview in ADM GUI and check the job is running again and there are no error messages ] || || || || || || || || ||
||[:It16VerificerUdenSulnuduAlias :16. Verify that no alias domains are harvested ] || || || || || || || || ||
||[:It16VerifyExpectedDomains :17. Check that the second snapshot harvest has reached the expected byte limits ] || || || || || || || || ||
||[:It38CheckObjectLimits :18. Check that object limits are respected] || || || || || || || || ||
||[:It38CheckHarvestNotDeduplicated :19. Check that objects are not deduplicated ] || || || || || || || || ||
||[:It39CheckFR1765 :20. Check that overrides are in the QA reports] || || || || || ||FR 1765 || || ||

If you are a netarkiv.dk tester, here are the shutdown instructions: [:It36CleanupAfterTest Shutdown the system. :Shutdown the system ]

TEST2 (last edited 2012-03-29 12:58:11 by ColinRosenthal)