Differences between revisions 1 and 2
Revision 1 as of 2009-09-04 11:44:58
Size: 4546
Comment:
Revision 2 as of 2009-09-09 09:13:03
Size: 4453
Comment:

Review (NS-87): FR1678: Improved indexing for wayback

|| Author || Colin ||
|| Moderator || Colin ||
|| State || Closed ||

Objectives

See https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1678
and http://netarchive.dk/suite/ImprovedIndexing
The implemented code includes:
A batch job to extract wayback CDX indexes from archive ARC files
A batch job to extract wayback CDX indexes from deduplication metadata records
An application to extract wayback CDX indexes from deduplication crawl logs
Associated helper methods

Summary

follow-up: csr

Total Time Used (Coding,Documentation,Review):

CSR:4 MD
SVC:0.5 MD

General comments:

|| Description || Classification || Status ||
|| On several of the files you need to set the SVN "svn:keywords" property with value=URL Revision Author Date Id || Cosmetic || OK ||

Comments on file 'trunk/src/dk/netarkivet/wayback/DeduplicateToCDXApplication.java', revision 995

|| Lines || Description || Classification || Status ||
|| 28 || blank line || NA || NOTOK ||
|| 50 || Missing argument validation || Cosmetic || NOTOK ||
|| 55 || explicitly close stream || Minor || NOTOK ||
|| 70 || [spelling] wyaback => wayback || Cosmetic || NOTOK ||
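
The two open items for lines 50 and 55 (argument validation and explicitly closing the stream) could be addressed along the following lines. This is only a sketch: the class `CrawlLogReaderSketch` and the method `readLines` are hypothetical names, not taken from the reviewed file, and `IllegalArgumentException` stands in for whatever validation helper the project normally uses.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not the reviewed application class. */
final class CrawlLogReaderSketch {
    private CrawlLogReaderSketch() {
    }

    /**
     * Reads all lines from the given source and closes it afterwards.
     *
     * @param source the character source to read; must not be null
     * @return the lines read, in order
     * @throws IllegalArgumentException if source is null
     * @throws IOException if reading fails
     */
    static List<String> readLines(Reader source) throws IOException {
        // Validate the argument before touching it.
        if (source == null) {
            throw new IllegalArgumentException(
                    "Reader 'source' must not be null");
        }
        List<String> lines = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(source);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } finally {
            // Close explicitly, even if reading threw an exception.
            reader.close();
        }
        return lines;
    }
}
```

The `finally` block guarantees the close on both the normal and the exceptional path; on Java 7 or later a try-with-resources statement would express the same thing more compactly.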

Comments on file 'trunk/src/dk/netarkivet/wayback/batch/ExtractWaybackCDXBatchJob.java', revision 995

|| Lines || Description || Classification || Status ||
|| 43-44 || We don't include that kind of information in the NetarchiveSuite javadoc || Cosmetic || OK ||
|| 52-53 || Missing javadoc || Cosmetic || OK ||
|| 55 || Missing javadoc || Cosmetic || OK ||
|| 64 || missing javadoc || Cosmetic || OK ||
|| 84 || javadoc || Cosmetic || OK ||

Comments on file 'trunk/src/dk/netarkivet/wayback/batch/ExtractDeduplicateCDXBatchJob.java', revision 995

|| Lines || Description || Classification || Status ||
|| General || Remove underscores in variable names. This violates our coding style. || Cosmetic || OK ||
|| 30 || Unnecessary blank lines in import block || NA || OK ||
|| 42 || Remove unused line || Cosmetic || OK ||
|| 48 || Missing javadoc || Cosmetic || OK ||
|| 53 || missing javadoc || NA || OK ||
|| 63 || missing javadoc || Cosmetic || OK ||

Comments on file 'trunk/src/dk/netarkivet/wayback/batch/UrlCanonicalizerFactory.java', revision 995

|| Lines || Description || Classification || Status ||
|| 31 || Missing period in first sentence of javadoc || Cosmetic || OK ||
|| 47 || Use try/catch on SecurityException || Cosmetic || OK ||
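
The review does not say which call at line 47 can raise the `SecurityException`. Assuming, purely for illustration, that the factory instantiates a class by reflection, the requested try/catch might look like the sketch below. `ReflectiveFactorySketch` and `newInstanceOf` are hypothetical names, and wrapping the failure in `IllegalStateException` is an assumption rather than the project's convention.

```java
/** Hypothetical reflective factory; not the reviewed class. */
final class ReflectiveFactorySketch {
    private ReflectiveFactorySketch() {
    }

    /**
     * Creates an instance of the named class via its no-arg constructor.
     *
     * @param className fully qualified class name; must not be null/empty
     * @return a new instance of the class
     * @throws IllegalArgumentException if className is null or empty
     * @throws IllegalStateException if the class cannot be instantiated
     */
    static Object newInstanceOf(String className) {
        if (className == null || className.isEmpty()) {
            throw new IllegalArgumentException(
                    "String 'className' must not be null or empty");
        }
        try {
            return Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (SecurityException e) {
            // A security manager refused access to the constructor.
            throw new IllegalStateException(
                    "Not permitted to instantiate '" + className + "'", e);
        } catch (ReflectiveOperationException e) {
            // Class missing, no such constructor, or construction failed.
            throw new IllegalStateException(
                    "Could not instantiate '" + className + "'", e);
        }
    }
}
```

Catching `SecurityException` separately keeps the "not permitted" case distinguishable from an ordinary configuration error such as a misspelled class name.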

Comments on file 'trunk/src/dk/netarkivet/wayback/batch/DeduplicateToCDXAdapter.java', revision 995

|| Lines || Description || Classification || Status ||
|| General || Remove underscores in variable names. This violates our coding style. || Cosmetic || OK ||
|| General || Divide lines longer than 80 characters into two lines. || Cosmetic || OK ||
|| 41 || Missing class javadoc || NA || OK ||
|| 48-58 || Missing javadoc || Cosmetic || OK ||
|| 60 || Missing javadoc || Cosmetic || OK ||
|| 62 || Missing javadoc || Cosmetic || OK ||
|| 66 || Missing javadoc, and missing validation of argument 'line' || Cosmetic || OK ||
|| 67 || Make a constant for the "duplicate:" string || Cosmetic || OK ||
|| 108 || Missing javadoc and argument validation || Cosmetic || OK ||
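
The comments on lines 66 and 67 (validate the argument 'line', and give the "duplicate:" string a named constant) can be illustrated with a small sketch. The class `DuplicateMarkerSketch`, the constant name `DUPLICATE_MARKER`, and the method `textAfterMarker` are hypothetical; only the literal "duplicate:" comes from the review, and how the real adapter parses the rest of the line is not shown here.

```java
/** Hypothetical helper; not the reviewed adapter class. */
final class DuplicateMarkerSketch {
    /** Marker string named in the review, kept in one place. */
    static final String DUPLICATE_MARKER = "duplicate:";

    private DuplicateMarkerSketch() {
    }

    /**
     * Returns the text that follows the duplicate marker in a line.
     *
     * @param line the line to inspect; must not be null
     * @return the text after the marker, or null if there is no marker
     * @throws IllegalArgumentException if line is null
     */
    static String textAfterMarker(String line) {
        if (line == null) {
            throw new IllegalArgumentException(
                    "String 'line' must not be null");
        }
        int start = line.indexOf(DUPLICATE_MARKER);
        if (start < 0) {
            return null;
        }
        return line.substring(start + DUPLICATE_MARKER.length());
    }
}
```

With the constant in place, the marker length used for the substring offset is derived from the same definition, so the two cannot drift apart.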

Comments on file 'trunk/src/dk/netarkivet/wayback/batch/DeduplicateToCDXAdapterInterface.java', revision 995

|| Lines || Description || Classification || Status ||
|| 30 || Missing period in first sentence of javadoc || Cosmetic || OK ||
|| 37 || What type of canonicalization is done on the target URL? || Cosmetic || OK ||
|| 46 || Replace "dedup lines" with "lines containing deduplication information" or similar || Cosmetic || OK ||
|| 47 || Missing period in first sentence of javadoc || Cosmetic || OK ||

Comments on file 'trunk/src/dk/netarkivet/wayback/WaybackSettings.java', revision 995

|| Lines || Description || Classification || Status ||
|| General || Missing svn svn:keywords property with value=URL Revision Author Date Id || Cosmetic || OK ||
|| 1 || File headers/copyright missing || Cosmetic || OK ||
|| 25 || Missing javadoc || Cosmetic || OK ||

IssuesFromNs87 (last edited 2010-08-16 10:25:07 by localhost)