Preliminary Agenda items (proposals) for the non-technical workshop
- Introduction
- Scheduling
- Responsibilities and roles of participants in a broad crawl
- Defining the crawl target (number of URLs, scope, seed lists, politeness, budget...)
- Dealing with junk data
- Sorting and splitting seed lists into different jobs
- Running a test crawl
- Using frontier reports
- Modifying settings, creating overrides
- Visual QA
- Running a patch crawl
Different sets of roles using the NetarchiveSuite