Preliminary Agenda items (proposals) for the non-technical workshop
- Introduction
- scheduling
- responsibilities, roles of participants in a broad crawl
- defining crawl target (number of URLs, scope, seed lists, politeness, budget...)
- dealing with junk data
- sorting and splitting seed lists into different jobs
- running a test crawl
- using frontier reports
- modifying settings, creating overrides
- visual QA
- running a patch crawl
Different sets of roles using NetarchiveSuite