Differences between revisions 1 and 2
Revision 1 as of 2010-03-15 15:19:30
Size: 736
Editor: TueLarsen
Comment:
Revision 2 as of 2010-03-18 13:51:49
Size: 388
Editor: TueLarsen
Comment:
Line 3: Line 3:
Choose 'Definitions' -> 'Global Crawler Traps' and type an existing domain e.g. netarkivet.dk
Check that you switch to the 'Edit Domain' screen with data on the netarkivet.dk domain
Click 'Show crawler traps'
Paste the following list of known crawler traps into the input box
For now, remember to update the list of crawler trap examples from our production system.
The file is now located in CVS at http://kb-prod-udv-001.kb.dk/cvsweb/cvsweb.cgi/~checkout~/projects/webarkivering/documents/internal/crawlertrapsCollection.txt?rev=1.1;content-type=text%2Fplain
Choose 'Definitions' -> 'Global Crawler Traps' and type a name e.g. crawlertraps1
Save the file http://kb-prod-udv-001.kb.dk/cvsweb/cvsweb.cgi/~checkout~/projects/webarkivering/documents/internal/crawlertrapsCollection.txt?rev=1.1;content-type=text%2Fplain on your Desktop as crawlertraps.txt and upload the file
Check that all crawler traps are uploaded
Line 10: Line 7:
Click 'Save'
Check that you do not get an error message like "The regular expression '.*www.\jettebrian\.dk\/calendarix.*' is invalid..."

Check Global Crawler traps

Choose 'Definitions' -> 'Global Crawler Traps' and type a name, e.g. crawlertraps1
Save the file http://kb-prod-udv-001.kb.dk/cvsweb/cvsweb.cgi/~checkout~/projects/webarkivering/documents/internal/crawlertrapsCollection.txt?rev=1.1;content-type=text%2Fplain on your Desktop as crawlertraps.txt and upload the file
Check that all crawler traps are uploaded
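
Before uploading, the saved file can be pre-checked locally, which gives a count to compare against the uploaded list and flags expressions likely to trigger an "is invalid" error message. The following is only a minimal sketch: the function name check_crawler_traps is hypothetical, the file is assumed to hold one expression per line in UTF-8, and Python's re syntax is close to but not identical to the Java regular expressions the application is assumed to use, so a clean result here is a hint rather than a guarantee.

```python
import re
import sys


def check_crawler_traps(lines):
    """Split trap expressions into (valid, invalid) lists.

    `invalid` holds (line_number, expression, error_message) tuples.
    Blank lines are skipped. Hypothetical helper, not part of any tool.
    """
    valid, invalid = [], []
    for number, raw in enumerate(lines, start=1):
        expression = raw.strip()
        if not expression:
            continue
        try:
            # Python's re is used as an approximation of the target syntax.
            re.compile(expression)
        except re.error as exc:
            invalid.append((number, expression, str(exc)))
        else:
            valid.append(expression)
    return valid, invalid


if __name__ == "__main__" and len(sys.argv) > 1:
    # Path to the locally saved trap list, given on the command line.
    with open(sys.argv[1], encoding="utf-8") as handle:
        valid, invalid = check_crawler_traps(handle)
    print(f"{len(valid)} valid expressions, {len(invalid)} invalid")
    for number, expression, message in invalid:
        print(f"line {number}: {expression!r}: {message}")
```

Run it with the path of the saved file as the only argument. The number of valid expressions it reports is the number to look for when checking that all crawler traps are uploaded.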

It42CheckGlobalCrwalerTraps (last edited 2011-04-13 11:47:49 by ColinRosenthal)