Differences between revisions 7 and 8
Revision 7 as of 2010-08-16 10:24:38
Size: 1020
Editor: localhost
Comment: converted to 1.6 markup
Revision 8 as of 2010-09-02 11:51:05
Size: 1198
Comment:

Check Global Crawler traps

Choose 'Definitions' -> 'Global Crawler Traps' and click 'Edit'

Type a name, e.g. crawlertraps1.

Save the file http://kb-prod-udv-001.kb.dk/cvsweb/cvsweb.cgi/~checkout~/projects/webarkivering/documents/internal/crawlertrapsCollection.txt?rev=1.1;content-type=text%2Fplain on your Desktop as crawlertraps.txt and upload the file.

Check that all crawler traps are uploaded.

The number of crawler traps stored may be lower than the number of lines in the uploaded file, because duplicates are removed during the upload to the database.
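
To know what count to expect after the upload, the unique entries in the local file can be counted first. The following is only a sketch: it assumes one crawler trap per line and that the upload removes exact duplicate lines and nothing else. The function names unique_count and list_duplicates are made up for this example.

```shell
# Sketch only: assumes one crawler trap per line in the file.
# unique_count and list_duplicates are hypothetical helper names.

# Print the number of distinct lines in the file given as $1.
unique_count() {
    # sort -u sorts and drops repeated lines; tr strips the padding
    # that some wc implementations put in front of the number.
    LC_ALL=C sort -u "$1" | wc -l | tr -d ' '
}

# Print each line that occurs more than once in the file given as $1.
list_duplicates() {
    # uniq -d prints one copy of every line that is repeated
    # in its (sorted) input.
    LC_ALL=C sort "$1" | uniq -d
}
```

For example, unique_count crawlertraps.txt gives the number of entries that should remain after the upload if only exact duplicates are dropped, and list_duplicates crawlertraps.txt shows which entries account for the difference.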

You can verify that the only difference between the uploaded and the downloaded list is the duplicates present in the uploaded (original) list:

Download the uploaded list to your local hard disk as crawlertraps.downloaded.txt, then run:
sort crawlertraps.downloaded.txt > crawlertraps.downloaded.txt.sorted
sort crawlertraps.txt > crawlertraps.txt.sorted

diff -u crawlertraps.downloaded.txt.sorted crawlertraps.txt.sorted > output

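Verify that, apart from the '---' and '+++' header lines, output contains only added lines (prefixed with '+'), and that each of them is a duplicate entry in the original file crawlertraps.txt. A removed line (prefixed with '-') would mean that the downloaded list contains an entry that is not in the original file.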
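
The manual check above could also be done automatically. The sketch below is one possible way, not part of the original procedure: it removes duplicates from both lists and then compares them, so it succeeds only if the lists hold the same set of lines. It assumes the upload changes nothing except dropping exact duplicate lines (no whitespace or line-ending changes). The function name same_after_dedup is made up for this example.

```shell
# Sketch only: same_after_dedup is a hypothetical helper name.
# Usage: same_after_dedup ORIGINAL_FILE DOWNLOADED_FILE
# Returns 0 if both files contain the same set of lines once
# duplicates are ignored, non-zero otherwise.
same_after_dedup() {
    orig_sorted=$(mktemp) || return 2
    down_sorted=$(mktemp) || return 2
    # sort -u sorts and drops repeated lines; LC_ALL=C makes the
    # ordering byte-wise, so it is the same for both files.
    LC_ALL=C sort -u "$1" > "$orig_sorted"
    LC_ALL=C sort -u "$2" > "$down_sorted"
    # cmp -s prints nothing and returns 0 only for identical files.
    cmp -s "$orig_sorted" "$down_sorted"
    status=$?
    rm -f "$orig_sorted" "$down_sorted"
    return $status
}
```

Example use: same_after_dedup crawlertraps.txt crawlertraps.downloaded.txt && echo "lists differ only by duplicates"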

It42CheckGlobalCrwalerTraps (last edited 2011-04-13 11:47:49 by ColinRosenthal)