Differences between revisions 8 and 9
Revision 8 as of 2010-09-02 11:51:05
Size: 1198
Comment:
Revision 9 as of 2011-04-13 11:47:49
Size: 1120
Comment:
Line 16: Line 16:
sort crawlertraps.downloaded.txt > crawlertraps.downloaded.txt.sorted
sort crawlertraps.txt > crawlertraps.txt.sorted
sort -u crawlertraps.downloaded.txt > crawlertraps.downloaded.txt.sorted
sort -u crawlertraps.txt > crawlertraps.txt.sorted
Line 19: Line 19:
diff -u crawlertraps.downloaded.txt.sorted crawlertraps.txt.sorted > output

Verify that output only contains additional lines that represent duplicates in the original file.
diff crawlertraps.downloaded.txt.sorted crawlertraps.txt.sorted
The output should be empty.

Check Global Crawler traps

Choose 'Definitions' -> 'Global Crawler Traps' and click 'Edit'

Type a name, e.g. crawlertraps1.

Save the file http://kb-prod-udv-001.kb.dk/cvsweb/cvsweb.cgi/~checkout~/projects/webarkivering/documents/internal/crawlertrapsCollection.txt?rev=1.1;content-type=text%2Fplain to your Desktop as crawlertraps.txt, then upload it.

Check that all crawler traps are uploaded.

The number of crawler traps uploaded may differ from the number of lines in crawlertraps.txt, as duplicates are removed during the upload to the database.
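
To get a rough idea of how many crawler traps to expect after the upload, you can count the distinct lines in the file. The following is only a sketch: count_unique_traps is a hypothetical helper name, and it assumes that the database removes exact duplicate lines and nothing else.

```shell
# Hypothetical helper: print the number of distinct lines in a crawler trap file.
# Assumption: the upload removes exact duplicate lines only.
count_unique_traps() {
    # sort -u drops repeated lines; wc -l counts what is left;
    # tr strips the padding that some wc implementations add.
    sort -u "$1" | wc -l | tr -d ' '
}

# Usage: count_unique_traps crawlertraps.txt
```

If the number shown in the user interface differs from this count, the full comparison described in the following steps shows which lines are affected.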

You can verify that the uploaded and downloaded lists differ only by the duplicates in the uploaded list:

Download the uploaded list to your local hard disk as crawlertraps.downloaded.txt
sort -u crawlertraps.downloaded.txt > crawlertraps.downloaded.txt.sorted
sort -u crawlertraps.txt > crawlertraps.txt.sorted

diff crawlertraps.downloaded.txt.sorted crawlertraps.txt.sorted
The output should be empty.
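
The sort and diff steps above can be combined into one small shell function. This is a sketch, not part of the original test description: compare_traps is a hypothetical name, and the function simply writes the sorted copies next to the input files, as the manual steps do.

```shell
# Hypothetical helper: compare the uploaded and downloaded crawler trap lists,
# ignoring line order and duplicate lines.
compare_traps() {
    uploaded="$1"
    downloaded="$2"
    # sort -u sorts each list and removes duplicate lines
    sort -u "$downloaded" > "$downloaded.sorted"
    sort -u "$uploaded" > "$uploaded.sorted"
    # diff prints nothing and exits with status 0 when the files are identical
    diff "$downloaded.sorted" "$uploaded.sorted"
}

# Usage:
#   compare_traps crawlertraps.txt crawlertraps.downloaded.txt && echo "lists match"
```

The test passes when the function prints nothing and returns exit status 0; any lines printed by diff are traps that exist in only one of the two lists.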

It42CheckGlobalCrwalerTraps (last edited 2011-04-13 11:47:49 by ColinRosenthal)