Task list and timetable for iteration 37
Status |
OK/Not OK |
1. Highlights approved |
OK |
2. Assignment of tasks |
OK |
3. Task list and timetable approved |
OK |
4. Implementation phase started |
OK |
5. Release test phase started |
OK |
6. Assignment phase for next iteration started |
OK |
7. Iteration 37 completed |
OK |
Highlights for Iteration
Support of released NetarchiveSuite (http://netarchive.dk/suite).
Enhance NetarchiveSuite wiki according to decided structure.
Fix bugs and implement feature requests from this prioritized list
Implementation of some of the priority 4 feature requests from the task list
- Enhancement of QA
- Enhancement of Batch support
Finalize the support of Wayback in the Netarchive.dk production site. See List of tasks and Assignment for Wayback Integration
- Migration of old Web materials to Netarchive.dk
- Start of tasks according to the roadmap
Module Archive
- Enhanced support for Batch
Module Harvester
- ...
Module Access
- Support for Wayback
- Test of Nutchwax
Module Common
- ...
- Bug fixes according to updated prioritized bug list
- Iteration 37 is planned as a development release candidate.
Development procedure
Implementation according to implementation methodology
Implementation and release test mainly in intensive period
- Estimated ... for implementation.
- Estimated ... for release test.
- Estimated ... for assignment of tasks for iteration 38.
- Target release: Start of August
Table of tasks
Tasks for iteration 37. Updated 28. July 2009 |
Estimate md |
Main responsible |
Reviewer |
Remaining md at 10. June 2009 |
Comments |
Status |
||||
Implementation phase (task x-n) |
||||||||||
Open Source release + bugs and feature requests |
Total ? |
- |
- |
Total x |
|
- |
||||
Support of Open Source Release |
||||||||||
1. Support of released NetarchiveSuite |
2 |
All (Google calendar) |
|
|
|
Ongoing |
||||
2. Implement translation process. Adjustment to Open Source partners. |
1 |
KFC |
ELZI |
|
|
.. |
||||
Bugs and feature requests |
||||||||||
Prioritized bugs according to list of priority 4 and priority 3 tasks. |
Total 5,5 |
- |
- |
SubTotal x |
.. |
- |
||||
Priority 4 bugs |
||||||||||
Module Harvester: Bug 1254 Database connections to MySQL close down intermittently? |
2 |
Nicolas |
SVC |
|
.. |
Fixed |
||||
Module Archive: Bug 1694 LocalArcRepositoryClient is broken |
1 |
Nicolas |
SVC |
|
|
Fixed |
||||
Module Harvester: Bug 1628/ Bug 1695 Add custom JVM parameters to Heritrix subprocess |
1 |
Nicolas |
SVC |
|
|
Fixed |
||||
Module Common: Bug 555 JMS connections cannot reconnect. Bug 1218 Exception while adding listeners to JMSConnection. Bug 1299 Network I/O errors shuts down JMSConnection. Bug 1645 JMS connections very unstable. Bug 1275 The message limit (maxNumMsgs) of 100000 has been reached. |
? |
KFC |
SVC |
|
|
Wait for code review |
||||
Module Archive: Bug 1566 Several Deduplicating processes started by error - only one should be possible |
0 |
SVC |
KFC |
|
We consider this bug fixed, as it can now be avoided by using the dk.netarkivet.archive.tools.CreateIndex script. |
Fixed |
||||
Module Harvester: Bug 1690 Keep track of order XML changes |
? |
KFC |
SVC |
|
|
.. |
||||
Module Harvester: Bug 1172 password protected domain was not harvested |
1,5 |
CSR |
JOLF |
|
|
Sanity Tested, awaiting QA |
||||
Module Harvester: Bug 1336 Harvester job dies suddenly |
2 |
HBK |
SVC |
|
Waiting for code review #NS-51 for revision #862 |
Wait for code review |
||||
Module Harvester: Bug 928 The guess of initial size of unharvested domains is very bad on harvests with a large object limit |
1 |
HBK |
SVC |
|
Waiting for code review #NS-55+NS-58 for revision #866 |
Wait for code review |
||||
Module Harvester: Bug 1656 "WARNING: Aborting crawl because og inactivity. URLS's in queue:19". |
2 |
HBK |
SVC |
|
This bug cannot be reproduced and has therefore been closed. |
OK |
||||
Module Harvester: Bug 1174 Poor error message on dead job |
0 |
CSR |
JOLF |
|
This should be fixed by fixing bug 1188. No further work is required |
Wait for code review |
||||
Module Harvester: Bug 1188 Heritrix side exceptions on JMX calls are ignored |
3 |
CSR |
JOLF |
|
|
Wait for code review |
||||
Module Harvester: Bug 1680 Broad harvest stability (Job fail) |
? |
Andreas |
SVC |
|
|
Side effect. Possibly solved by bug 555 |
||||
Module Harvester: Bug 1650 It is not checked when creating the Heritrix process, that the JMX password file assigned to Heritrix exists |
? |
Eleonora |
SVC |
|
Code to fix this bug is implemented but not yet committed. Unit testing remains. |
Wait for code review |
||||
Priority 3 bugs |
||||||||||
Module Harvester: Bug 688 hosts-report should be IDNA decoded when writing harvestInfo to the DB |
2 |
|
|
|
We will need a domain name normalizer that both decodes IDNA names and lowercases them. This will take more than 1 md. This and 596 must be solved together. |
.. |
||||
Module Harvester: Bug 1644 On Edit Domain page, the text field only shows 21 characters of the domainname |
? |
|
|
|
|
Wait for code review |
||||
Module Harvester: Bug 1670 Default timeout settings are set way too low in the default settings |
? |
|
|
|
|
OK |
||||
Module Harvester: Bug 1069 How to setup an apache proxy used to control access to the GUI and viewerproxy servers is missing from the Installation manual |
? |
|
|
|
|
.. |
||||
Module Archive: Bug 1260 Too much and wrong feedback information on "Missing pages" |
1,5 |
|
|
|
This bug will automatically be solved if we choose to implement feature request #1380 "Avoid double initiations of commands by double click" |
.. |
||||
Module Archive: Bug 1193 Exceptions from FileBatchJob stop batch job processing |
? |
|
|
|
|
.. |
||||
Prioritized Feature Requests according to list of priority 4 and priority 3 tasks |
Total 21,5 |
- |
- |
SubTotal x |
|
- |
||||
Priority 4 feature requests |
||||||||||
Module Harvester: Feature request 1298 Set JMXConnection timeout, if possible |
2 |
CSR |
SVC |
|
... |
.. |
||||
Module Common: Feature request 1654 Second-level domains for .at in settings.xml |
? |
Andreas |
SVC |
|
|
Fixed |
||||
Module Harvester: Feature request 1678 Make CDX-entries for the deduplicate entries in the crawl.log, and append to the other CDX-entries. |
? |
CSR |
SVC |
|
|
.. |
||||
Module Harvester: Feature request 1014 No good way to mark a non-reported-stopped job as FAILED or DONE. |
2 |
HBK |
SVC |
|
|
Wait for code review |
||||
Module Harvester: Feature request 1675 List of all Seeds of a selective Harvests. |
? |
Andreas |
SVC |
|
|
Fixed |
||||
Module Common: Feature request 1687 French translation. |
2 |
Sara |
KFC |
|
BnF: Will be part of next iteration. |
.. |
||||
Module Harvester: Feature request 1678 Make CDX-entries for the deduplicate entries in the crawl.log, and append to the other CDX-entries. |
? |
CSR |
JOLF |
|
|
.. |
||||
Module Harvester: Feature request 1688 Monitoring broad crawls. |
5 |
Sara |
EZI TbC? |
|
BnF: This is just in the assignment phase. |
.. |
||||
Module Harvester: Feature request 1689 Managing crawls using object number. |
? |
Nicolas |
KFC |
|
BnF: Nicolas will work on it in August |
.. |
||||
Module Harvester: Feature request 1641 It should be possible to turn off deduplication completely. |
? |
SVC |
Nicolas? |
|
BnF: Nicolas will be back in August |
.. |
||||
Module Harvester: Feature request 1691 Configure which Heritrix reports to include in metadata ARC file. |
? |
Nicolas |
KFC |
|
Nicolas will work on it in August |
.. |
||||
Priority 3 feature requests |
||||||||||
Module Access: Feature request 623 We need to normalize URLs when browsing data |
5 |
|
|
|
Lighter solution |
.. |
||||
Module Harvester: Feature request 680 Cannot browse harvested password protected material |
10 |
|
|
|
At least partly solved by Wayback. Investigations by the collections section are ongoing. |
.. |
||||
Module Documentation: Feature request 1288 Batch and use of Tools must be described |
? |
|
|
|
|
.. |
||||
Module Harvester: Feature request 1066 Show whether seed URL existed |
2,5 |
|
|
|
|
.. |
||||
Module Harvester: Feature request 1112 Automatic checks of seeds when entered in the harvest definition interface |
? |
|
|
|
|
.. |
||||
Module Harvester: Feature request 1120 Crawlertrap info should be shareable between institutions |
? |
|
|
|
|
.. |
||||
Module Archive: Feature request 1285 Storage of processed batch classes |
? |
|
|
|
|
.. |
||||
Module Harvester: Feature request 1482 Harvest information for job must report if there are problems in getting information |
? |
|
|
|
|
.. |
||||
Module Harvester: Feature request 1511 Thousand separators requested in user interface |
? |
|
|
|
|
.. |
||||
Module Harvester: Feature request 1681 Add seed to DB via webservice (via Browser Extension/Rich Client) |
? |
Andreas |
|
|
|
.. |
||||
Module Harvester: Feature request 1682 Statistics (DB access, scripts, batch jobs ....) |
? |
Andreas |
|
|
|
.. |
||||
Module Harvester: Feature request 1683 Util for regenerate admin.data file |
? |
Andreas |
|
|
|
.. |
||||
Module Harvester: Feature request 1684 Activity when domain is to be crawled. One table for seed |
? |
Andreas |
|
|
|
.. |
||||
Module None: Feature request 1677 Enable WARC file writing and handling in the NetarchiveSuite |
? |
Soeren |
|
|
|
.. |
||||
Module None: Feature request 1116 Global crawlertraps |
? |
Soeren |
|
|
|
.. |
||||
Roadmap tasks |
Total 52? |
- |
- |
Total x |
|
- |
||||
Tasks from ... |
||||||||||
Task Access 2.1 Wayback Into Version Control |
1,5 |
CSR |
KFC |
|
|
OK |
||||
Task Access 2.2 Ant target for deployable wayback |
2 |
CSR |
JOLF |
|
|
In production, awaiting QA? |
||||
Task Access 2.3 Create a PROPER version of NetarchiveResourceStore |
5 |
HBK |
CSR |
|
Committed but not tested. No impact on NetarchiveSuite code. |
Started |
||||
Assignment for enhanced QA tools |
2 |
KFC |
SVC |
|
|
.. |
||||
Finalize Assignment for Harvester for support of WARC format |
? |
SVC |
KFC |
|
|
.. |
||||
Finalize assignment for Assignment group B.2.2 |
0,5 |
JOLF |
KFC |
|
|
.. |
||||
Implement Assignment B.2.2a - Generalise replica to include all checksum voters |
14? |
JOLF |
KFC |
|
|
Started |
||||
Implement Assignment B.2.2b - Store bit preservation information in a database |
8 |
JOLF |
KFC |
|
|
Started |
||||
Implement Assignment B.2.3 - Use segments in bitarchives |
6 |
|
|
|
|
.. |
||||
Implement Assignment B.2.4 - Write BitPreservation scheduler |
5 |
|
|
|
|
.. |
||||
Implement Assignment B.2.5 - Write BitPreservation webinterface |
6 |
|
|
|
|
.. |
||||
Finalize assignment for Assignment group B.4.4 - Yet more better infrastructure |
2 |
|
|
|
|
.. |
||||
Wayback/Nutchwax tasks independent of the NetarchiveSuite code freeze. |
Total x |
- |
- |
Total x |
|
- |
||||
Tasks from ... |
||||||||||
|
5 |
|
|
|
|
.. |
||||
Task Access 2.4 Deduplicated CDX Indexing (Technical investigation) |
1 |
|
|
|
|
.. |
||||
Evaluation of NutchWax. |
2? |
|
|
|
|
.. |
||||
Technical decision on type of production HW for Wayback and Nutchwax. |
2? |
|
|
|
|
.. |
||||
Converting old Web collections to Netarchive.dk. See proposal. These tasks will be independent of the NetarchiveSuite code freeze. |
Total x |
- |
- |
Total x |
|
- |
||||
Tasks from ... |
||||||||||
Investigation of data formats as well as methods |
? |
SVC |
HBK |
|
|
.. |
||||
Generic converter prototype |
? |
SVC |
HBK |
|
|
.. |
||||
Old KB Webarchive |
? |
SVC |
HBK |
|
|
.. |
||||
Old Webarchive harvested with ARC-Httrack |
? |
HBK |
SVC |
|
|
.. |
||||
Old Webarchive harvested with Wget |
? |
HBK |
SVC |
|
|
.. |
||||
Old Webarchive harvested with NedLib |
? |
SVC |
HBK |
|
|
.. |
||||
Old Webarchive from Niels Brugger in waf format |
? |
HBK |
JOLF |
|
|
.. |
||||
Old Webarchive from Kurt Vest Nielsen (Ingeniøren from 1995) |
? |
JOLF |
HBK |
|
|
.. |
||||
Webarchive from the library of The Danish Parliament |
? |
SVC |
HBK |
|
|
.. |
||||
Old Webarchives from Net-papers |
? |
SVC |
HBK |
|
|
.. |
||||
Digital publications of The Danish Law Gazette from the missing period |
? |
SVC |
HBK |
|
|
.. |
||||
Common tasks calculated as implementation tasks |
Total x |
- |
- |
Total x |
|
- |
||||
Others |
Total x |
- |
- |
SubTotal x |
|
- |
||||
Setup of new KB test system |
|
TLR |
|
|
|
.. |
||||
Setup open Crucible server |
|
KFC |
|
|
|
.. |
||||
Prepare release test |
Total x |
- |
- |
SubTotal x |
|
- |
||||
Prepare release test |
|
|
|
|
|
.. |
||||
Available man-days for implementation phase |
Total x |
- |
- |
Total x |
|
- |
||||
Release test phase (task ...) |
||||||||||
Release test |
Total x |
- |
- |
Total 10 |
|
- |
||||
Execute release test. |
|
|
|
|
|
.. |
||||
Release notes |
Total x |
- |
- |
Total 0,5 |
|
- |
||||
Available man-days for release test phase |
Total x |
- |
- |
Total 10 |
|
- |
||||
Assignment phase for next iteration (task ...) |
||||||||||
Component bug/feature fix/management |
|
QA |
|
|
|
.. |
||||
Define goals for Iteration 38 task list |
|
CHH |
|
|
|
.. |
||||
Presentation of goals and tasks for Iteration 37. Achieve a common understanding of the purpose of the iteration and of each task at the status meeting |
|
SVC |
|
|
|
.. |
||||
Assignment of tasks, bugs and feature requests |
|
QA |
|
|
|
.. |
||||
Update release test procedure |
|
TLR |
|
|
|
.. |
||||
Available man-days for assignment phase |
Total x |
- |
- |
Total 22 |
|
- |
Timetable
Timetable iteration 37. Updated 23. July 2009 |
Start time |
End time |
Responsible |
Baseline 2. June 2009 . Start time |
Baseline 2. June 2009 . End time |
1. Implementation of decided tasks |
2. June 2009 |
24. July 2009 |
|
2. June 2009 |
24. July 2009 |
2. Code freeze. Create the build for release test and notify when build is ready |
30. July 2009 |
|
SVC |
27. July 2009 |
|
3. Release test |
30. July 2009 |
3. August 2009 |
TLR |
27. July 2009 |
29. July 2009 |
4. Code unfreeze |
3. August 2009 |
|
SVC |
30. July 2009 |
|
5. Assignments, bug components and bug fixes |
3. August 2009 |
5. August 2009 |
|
30. July 2009 |
31. July 2009