Task list and timetable for iteration 44

Status

OK/Not OK

1. Highlights approved

2. Assignment of tasks

3. Task list and timetable approved

4. Implementation phase started

5. Release test phase started

6. Assignment phase for next iteration started

7. Iteration 44 completed

Highlights for Iteration

Development procedure

Table of tasks

Tasks for iteration 44. Updated 14. June 2010

Estimate md

Main responsible

Reviewer

Remaining md at 9. June 2010

Comments

Status

Implementation phase (task x-n)

Open Source release + bugs and feature requests

-

-

-

Support of Open Source Release

1. [http://kb-prod-udv-001.kb.dk/twiki/bin/view/Netarkiv/SupportNetarchiveSuite Support] of released NetarchiveSuite

2

All (Google calendar)

2

Ongoing

2. Implement translation process. Adjustment to Open Source partners.

1

CSR

SVC

-

3. Maintain French Translation files.

1

Nicolas/Sara

SVC

-

4. Maintain Italian and German Translation files.

1

Andreas

SVC

-

Bugs and Feature requests

Prioritized bugs according to [https://gforge.statsbiblioteket.dk/tracker/index.php?group_id=7&atid=105 list] of priority 4 and priority 3 tasks.

-

-

SubTotal 0

..

-

Priority 5 bug

5. Module monitor?: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1895 Bug 1895] Running checksum gives Garbage Collector OutOfMemoryError and schedule stops

1

SVC

CSR

Max heap for Bitarchive monitors raised to 1936 MB in prod. Awaiting upgrade of Java in the production environment and installation of version 3.12

Priority 4 bugs

6. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1739 Feature request 1739] #1739 Missing information in System State GUI

.

7. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1970 Feature request 1970] #1970 SQLDataException: The syntax of the string representation of a datetime value is incorrect.

.

8. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1962 Feature request 1962] Batchjob fails when ArcRepository is overloaded

.

9. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1849 Feature request 1849] Invalid javadoc for DomainUtils#DOMAINNAME_CHAR_REGEX_STRING

.

10. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1961 Feature request 1961] Possible to store invalid set of crawlertraps to domain causing the domain to be unreadable

.

11. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1972 Feature request 1972] Resubmit selected failed job does not do anything and gives a webpage error

.

Priority 3 bugs

12. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=688 Feature request 688] hosts-report should be IDNA decoded when writing harvestInfo to the DB

-

13. Module Access: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=823 Bug 823] No index = Internal server error

14. Module Monitor: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1756 Bug 1756] JMX status page does not update when a new application is started on previously used JMX port

15. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1782 Bug 1782] Same datetime repeated many times, while logging batch checksum of files

16. Module Documentation: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1779 Bug 1779] Improve documentation of the additional tools

17. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1708 Bug 1708] bitpreservation logic offers "add to archive" for file that is not in either location

18. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1193 Bug 1193] Exceptions from FileBatchJob stop batch job processing

19. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1260 Bug 1260] Too much and wrong feedback information on "Missing pages"

20. Module Monitor: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1205 Bug 1205] Security policy for unit tests contains hardcoded path to development environment

..

Prioritized Feature Requests according to [:TaskTableFromMay2009Workshop:list] of priority 4 and priority 3 tasks

-

-

SubTotal 21

-

Priority 4 Feature request

21. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1696 Feature request 1696] Ingest domain seed URLs

?

Nicolas

SVC

22. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1926 Feature request 1926] Ability to disable the inactivity check

Nicolas

.

23. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1928 Feature request 1928] Ability to easily resubmit a selection of failed jobs

Nicolas/Sara

24. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1813 Feature request 1813] An extra resubmit button to make it visible which jobs have already been handled

SVC

25. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=929 Feature request 929] Documentation needed for how we split jobs (incl. maybe additional splitting modularity)

SVC

26. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1774 Feature request 1774] Stop using the JMS queues for queuing snapshot harvests

27. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1963 Feature request 1963] #1963 Make a HarvestSchedulerApplication that runs the HarvestScheduler

28. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1956 Feature request 1956] #1956 Shared dependency links harvester and wayback modules

.

-

Priority 3 Feature request

-

29. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1681 Feature request 1681] Add seed to DB via webservice (via Browser Extension/Rich Client)

Andreas

-

30. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1682 Feature request 1682] Statistics (DB access, scripts, batch jobs ....)

Andreas

-

31. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1683 Feature request 1683] Util for regenerate admin.data file

Andreas

-

32. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1684 Feature request 1684] Activity when domain is to be crawled. One table for seed

Andreas

-

33. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1743 Feature request 1743] When accessing Bitpreservation this takes really long time

Andreas

-

34. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1120 Feature request 1120] Crawlertrap info should be shareable between institutions

Andreas

SVC will add comments to this FR. An easy solution might be to share crawler traps by emailing files with crawler trap information.

Redundant (Copy of ??)

35. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1066 Feature request 1066] Show whether seed URL existed

Andreas

-

36. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1809 Feature request 1809] Write assignment for improving batchjob interface

JOLF

High priority

-

Roadmap tasks

Total 52?

-

-

Total 8,5

-

Tasks from ...

37. QA: Assignment for enhanced QA tools

2

SVC

CSR

High priority

-

38. WARC: Finalize [:AssignmentHarvester2:Assignment] for Harvester for support of WARC format

?

39. Archive: Implement [:AssignmentGroupB2:Assignment B.2.3] - Use segments in bitarchives

6

40. Archive: Implement [:AssignmentGroupB2:Assignment B.2.4] - Write BitPreservation scheduler

5

41. Archive: Implement [:AssignmentGroupB2:Assignment B.2.5] - Write BitPreservation webinterface

6

-

..

..

[http://netarkivet.dk/netarkivet/index.php?title=Kendte_problemer Crawl-problems] (Netarchive.dk).

Total x

-

-

Total x

-

Focus on following crawl-problems

42. Support to [http://netarkivet.dk/netarkivet/index.php?title=Kendte_problemer Crawl-problems]

CSR

SVC

1

High priority

..

43. [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1881 Feature Request 1881] Quality assurance through a batchjob interface

JOLF

High priority

..

Wayback/Nutchwax tasks independent of NetarchiveSuite code-freeze.

Total x

-

-

Total x

-

Tasks from ...

-

44. Wayback: Implement Index Aggregator

4

MSS

JOLF

High priority

45. Wayback: Documentation of Indexer/Aggregator

2

CSR

JOLF

1

High priority

-

46. Wayback deploy: Complete Wayback deploy

2

JOLF

CSR

2

High priority

-

..

Converting old Web collections to Netarchive.dk. See [http://udvikling.kb.dk/cvsshadow/digiliv/ProjektDokumenter/omkostninger%20ved%20indsamling%20af%20gammelt%20materiale-3.doc proposal]. These tasks will be independent of NetarchiveSuite code-freeze.

Total x

-

-

Total x

-

Tasks from ...

47. Old Web collection: Old KB Webarchive

SVC

JOLF

High priority

-

48. Old Web collection: Old Webarchive from Niels Brugger collected by HTTrack

MSS

SVC

High priority

-

49. Old Web collection: Prepare ingest of extracted data from Internet Archive into Netarkivet.dk

SVC

MSS

High priority

-

50. Old Web collection: Ingest received data from Internet Archive into Netarkivet.dk

CLO

SVC

-

Common tasks calculated as implementation tasks

Total x

-

-

Total x

-

Others

Total x

-

-

SubTotal 2

-

51. Upgrade: New KB-PROD-UDV

5

SVC

TLR

High priority

-

52. Batch: Create/execute a batch test script specified by 1 or 2 researchers

2

JOLF

TLR

High priority

-

..

..

-

..

..

Prepare release test

Total x

-

-

SubTotal 12

-

53. Prepare [http://netarchive.dk/suite/Iteration44Releasetest release test]

6

TLR

6

-

Available man-days for implementation phase

Total x

-

-

Total x

-

Release test phase (task ...)

Release test

Total x

-

-

Total 12

-

54. Execute [http://netarchive.dk/suite/Iteration44Releasetest release test].

12

TLR

All

12

-


..

Release notes

Total x

-

-

Total 0,5

-

55. Write release note

0,5

SVC

-

Available man-days for release test phase

Total x

-

-

Total 10

-

Assignment phase for next iteration (task ...)

56. Component bug/feature fix/management

QA

..

57. Define goals for [http://netarchive.dk/suite/Iteration45TaskList Iteration 45 task list]

CHH

..

58. Presentation of goals and tasks for Iteration 45. Achieve a common understanding of the purpose of the iteration and of each task at the status meeting

SVC

..

59. Assignment of tasks, bugs and feature requests

QA

..

60. Update release test procedure

TLR

..

Available man-days for assignment phase

Total x

-

-

Total 22

-

Timetable

Timetable iteration 44. Updated 14. June 2010

Start time

End time

Responsible

Baseline 9. June 2010. Start time

Baseline 9. June 2010. End time

1. Implementation of decided tasks

14. June 2010

23. August 2010

14. June 2010

30. July 2010

2. Code freeze. Create the build for release test and notify when build is ready

24. August 2010

SVC

24. August

3. Release test

24. August 2010

26. August 2010

TLR

24. August

4. August

4. Code unfreeze

27. August 2010

SVC

27. August

5. Assignments, bug components and bug fixes

25. August 2010

26. August 2010

25. August

26. August