== Task list and timetable for iteration 41 ==

||'''Status''' ||'''OK/Not Ok''' ||
||1. Highlights approved ||'''OK''' ||
||2. Assignment of tasks ||'''OK''' ||
||3. Task list and time table approved ||'''OK''' ||
||4. Implementation phase started ||'''OK''' ||
||5. Release test phase started ||'''OK''' ||
||6. Assignment phase for next iteration started || ||
||7. Iteration 41 completed || ||

=== Highlights for Iteration ===

 * [[http://kb-prod-udv-001.kb.dk/twiki/bin/edit/Netarkiv/SupportNetarchiveSuite|Support]] of released !NetarchiveSuite (http://netarchive.dk/suite).
 * Enhance !NetarchiveSuite wiki according to [[UpdateNetarchiveSuiteWiki|decided structure]].
 * Implement prioritized bugs and feature requests
 * Enhancement of Batch support
 * Support of Wayback in the Netarchive.dk production site. See [[IntegrationOfWaybck|List of tasks]] and [[AssignmentWaybackIntegration|Assignment]] for Wayback Integration
 * Migration of old Web materials to Netarchive.dk
 * Iteration 41 is planned as a development release.

=== Development procedure ===

 * Implementation according to [[http://netarchive.dk/suite/Development|implementation methodology]]
 * Implementation and release test mainly in [[http://www.google.com/calendar/render?gsessionid=tjZgbhGt6eNBB1mrlNwt3A|intensive period]]
 * Target release: February 2010

=== Table of tasks ===

||'''Tasks for iteration 41. Updated 15. February 2010''' ||'''Estimate md''' ||'''Main responsible''' ||'''Reviewer''' ||<10% bgcolor="#cccccc" style="TEXT-ALIGN: center">'''Remaining md at 9. February 2010''' ||<20% bgcolor="#cccccc" style="TEXT-ALIGN: center">'''Comments''' ||'''Status''' ||
||||||||||||||||||||||'''Implementation phase (task x-n)''' ||
||'''Open Source release + bugs and feature request''' ||'''Total 3''' ||'''-''' ||'''-''' ||'''Total 3''' || ||'''-''' ||
||||||||||||||||||||||'''Support of Open Source Release''' ||
||1. [[http://kb-prod-udv-001.kb.dk/twiki/bin/view/Netarkiv/SupportNetarchiveSuite|Support]] of released !NetarchiveSuite ||2 ||'''All (Google calendar)''' || ||2 || ||Ongoing ||
||2. Implement translateprocess. Adjustment to Open Source partners. ||1 ||CSR ||SVC || || ||- ||
||3. Maintain French translation files. ||1 ||Nicolas/Sara ||SVC || ||See also Task 22 ||- ||
||4. Maintain Italian and German translation files. ||1 ||Andreas/Eleonora ||SVC || ||See also Task 22 ||- ||
||||||||||||||||||||||'''Bugs and feature requests''' ||
||Prioritized bugs according to [[https://gforge.statsbiblioteket.dk/tracker/index.php?group_id=7&atid=105|list]] of priority 4 and priority 3 tasks. ||'''Total 5''' ||'''-''' ||'''-''' ||'''!SubTotal 0''' ||.. ||'''-''' ||
||||||||||||||||||||||'''Priority 4 bugs''' ||
||'''5. Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1832|Bug 1832]]'' 4-6 minutes delay during delete file and reply to checksumreplica ||? ||JOLF ||CSR ||0 ||Test by running setup with a checksum replica. Perform a harvest and check that the upload time to the checksum replica is less than 1 min. ||OK ||
||'''6. Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1834|Bug 1834]]'' No change in GUI checksum after remove of line in checksum_CS.md5 ||3 ||JOLF ||HBK ||0 ||Test by running Test3 without restarting the checksum application ||OK ||
||'''7. Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1836|Bug 1836]]'' The value of the variable 'File from' must not be null. ||1 ||JOLF ||HBK ||0 ||An IOFailure should be thrown instead of a null returned, which causes problems later. ||OK ||
||||||||||||||||||||||'''Priority 3 bugs''' ||
||'''8. Module harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=688|Feature request 688]]'' hosts-report should be IDNA decoded when writing harvestInfo to the DB || || || || || ||- ||
||'''9. Module Access:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=823|Bug 823]]'' No index = Internal server error || || || || || || ||
||'''10. Module Monitor:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1756|Bug 1756]]'' JMX status page does not update when a new application is started on previously used JMX port || || || || || || ||
||'''11. Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1782|Bug 1782]]'' Same datetime repeated many times, while logging batch checksum of files || || || || || || ||
||'''12. Module Documentation:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1779|Bug 1779]]'' Improve documentation of the additional tools || || || || || || ||
||'''13. Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1764|Bug 1764]]'' Poor information on failed batch job || || || || || || ||
||'''14. Module Documentation:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1732|Bug 1732]]'' LocalArcRepositoryClient not documented || || || || || || ||
||'''15. Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1727|Bug 1727]]'' Poor error message in RunBatch ||0.5 ||JOLF ||SVC ||0 ||Try running RunBatch with an unknown replica name. ||OK ||
||'''15.a Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1764|Bug 1764]]'' Poor information on failed batch job ||2 ||JOLF ||SVC ||0 ||Try running RunBatch with an unknown method (-N argument) in a jar file. ||OK ||
||'''16. Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1708|Bug 1708]]'' bitpreservation logic offers "add to archive" for file that is not in either location || || || || || || ||
|| || || || || || ||- ||
||'''17. Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1619|Bug 1619]]'' Potential NullPointer exception in RemoveAndGetFileMessage.getData() || || || || || || ||
||'''18. Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1260|Bug 1260]]'' Too much and wrong feedback information on "Missing pages" || || || || || || ||
||'''19. Module Monitor:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1205|Bug 1205]]'' Security policy for unit tests contains hardcoded path to development environment || || || || || || ||
||'''20. Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1193|Bug 1193]]'' Exceptions from FileBatchJob stop batch job processing || || || || || || ||
|| || || || || || ||.. ||
||Prioritized Feature Requests according to [[TaskTableFromMay2009Workshop|list]] of priority 4 and priority 3 tasks ||'''Total 21''' ||'''-''' ||'''-''' ||'''!SubTotal 21''' || ||'''-''' ||
||||||||||||||||||||||'''Priority 4 Feature request''' ||
||'''21. Module harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1116|Feature request 1116]]'' Global crawlertraps ||7 ||CSR ||SVC ||0 ||Implementation. See also FR 1120 ||Follow up ||
||'''22. Module harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1696|Feature request 1696]]'' Ingest domain seed URLs ||? ||Nicolas ||SVC || || ||- ||
||'''23. Module harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1688|Feature request 1688]]'' Monitoring broad crawls. ||? ||Nicolas ||SVC || ||. ||In sanity test ||
||'''24. Module Harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1134|Feature request 1134]]'' Filter job lists by category ||? ||Nicolas/Sara ||CSV || || ||'''-''' ||
||'''25. Module Harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1668|Feature request 1668]]'' Paginate and make sortable and searchable the list of jobs ||? ||Nicolas/Sara ||CSV || || ||In sanity test ||
||'''26. Module Harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1813|Feature request 1813]]'' An extra resubmit button to make it visible which jobs have already been handled ||? ||SVC ||HBK || || ||- ||
|| || || || || ||. ||- ||
||||||||||||||||||||||'''Priority 3 Feature request''' ||
||'''27. Module Harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1774|Feature request 1774]]'' Stop using the JMS queues for queuing snapshot harvests || || || || || ||- ||
||'''28. Module Harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1681|Feature request 1681]]'' Add seed to DB via webservice (via Browser Extension/Rich Client) ||? ||Andreas || || || ||Started ||
||'''29. Module Harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1682|Feature request 1682]]'' Statistics (DB access, scripts, batch jobs ....) ||? ||Andreas || || || ||? ||
||'''30. Module Harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1683|Feature request 1683]]'' Util for regenerate admin.data file ||? ||Andreas || || || ||? ||
||'''31. Module Harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1684|Feature request 1684]]'' Activity when domain is to be crawled. One table for seed ||? ||Andreas || || || ||? ||
||'''32. Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1743|Feature request 1743]]'' When accessing Bitpreservation this takes really long time ||? ||Andreas || || || ||? ||
||'''33. Module Harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1120|Feature request 1120]]'' Crawlertrap info should be shareable between institutions ||? ||Andreas || || ||SVC will add comments to this FR. An easy solution might be to share crawler traps by emailing files with crawler trap information. ||Redundant (Copy of 20) ||
||'''34. Module Harvester:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1066|Feature request 1066]]'' Show whether seed URL existed ||? ||Andreas || || || ||? ||
||'''35. Module Archive:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1809|Feature request 1809]]'' Write assignment for improving batchjob interface ||? ||JOLF || || || ||.. ||
||'''35.a Module Deploy:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1846|Feature request 1846]]'' Deploy the bitpreservation database ||0.5 ||JOLF ||HBK ||1 || ||OK ||
||'''35.b Module Deploy:''' ''[[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1790|Feature request 1790]]'' Print usage of RunNetarchiveSuite.sh ||0.5 ||HBK ||JOLF || || ||OK ||
||'''Roadmap tasks''' ||'''Total 52?''' ||'''-''' ||'''-''' ||'''Total 8.5''' || ||'''-''' ||
||||||||||||||||||||||'''Tasks from ...''' ||
|| || || || || || || ||
||36. Assignment for enhanced QA tools ||2 ||SVC ||HBK || ||High priority || ||
||37. Finalize [[AssignmentHarvester2|Assignment]] for Harvester for support of WARC format ||? || || || || || ||
||38. Finalize assignment for [[AssignmentGroupB2|Assignment group B.2.2]] ||0.5 ||JOLF ||HBK || || ||OK ||
||39. Implement [[AssignmentGroupB2|Assignment B.2.2a]] - Generalise replica to include all checksum voters ||8 ||JOLF ||CSR ||8 ||High priority ||OK ||
||40. Implement [[AssignmentGroupB2|Assignment B.2.2b]] - Store bit preservation information in a database ||8 ||JOLF ||HBK || ||Splitting in subtasks ||In progress ||
||41. Implement [[AssignmentGroupB2|Assignment B.2.3]] - Use segments in bitarchives ||6 || || || || || ||
||42. Implement [[AssignmentGroupB2|Assignment B.2.4]] - Write !BitPreservation scheduler ||5 || || || || || ||
||43. Implement [[AssignmentGroupB2|Assignment B.2.5]] - Write !BitPreservation webinterface ||6 || || || || || ||
||44. Finalize assignment for [[AssignmentGroupB4|Assignment group B.4.4]] - Yet more better infrastructure ||2 || || || || || ||
|| || || || || || ||.. ||
|| || || || || || ||.. ||
||'''[[http://netarkivet.dk/netarkivet/index.php?title=Kendte_problemer|Crawl-problems]] (Netarchive.dk)''' ||'''Total x''' ||'''-''' ||'''-''' ||'''Total x''' || ||'''-''' ||
||||||||||||||||||||||'''Focus on following crawl-problems''' ||
||45. [[http://netarkivet.dk/netarkivet/index.php?title=Dinby.dk|dinby.dk]] 2009-02-17 ||1 ||CSR ||JOLF/DLA ||1 ||High priority ||.. ||
||46. [[http://netarkivet.dk/netarkivet/index.php?title=Kino.dk|Kino.dk]] 2009-03-25 ||1 ||HBK ||SVC/LO+JM ||1 ||High priority ||Awaiting review ||
||47. [[http://netarkivet.dk/netarkivet/index.php?title=Webmuseum.re-cph.com|Webmuseum.re-cph.com]] 2009-08-04 ||1 ||CSR ||JOLF/DLA ||1 ||High priority ||In progress ||
||48. [[http://netarkivet.dk/netarkivet/index.php?title=Epn.dk|Epn.dk]] 2009-08-30 ||1 ||CSR ||SVC/SAS ||1 ||High priority ||.. ||
||48a. [[http://netarkivet.dk/netarkivet/index.php?title=statstidende.dk|Statstidende.dk]] ||0.5 ||HBK ||SVC/SAS || || ||Awaiting review ||
||48b. [[http://netarkivet.dk/netarkivet/index.php?title=seoghoer.dk|seoghoer.dk]] ||0.5 ||HBK ||JOLF/SAS || || ||Awaiting review ||
||'''Wayback/Nutchwax tasks independent of !NetarchiveSuite code-freeze.''' ||'''Total x''' ||'''-''' ||'''-''' ||'''Total x''' || ||'''-''' ||
||||||||||||||||||||||'''Tasks from ...''' ||
||49. Review of Wayback Indexing component architecture and assignment document (AutomaticIndexing) ||1 ||CSR ||SVC ||1 ||High priority ||Awaiting review ||
||50. Assignment: Integration of wayback in deploy ||3 ||JOLF ||HBK ||3 ||High priority ||- ||
|| || || || || || ||.. ||
|| || || || || || ||.. ||
||'''Converting old Web collections to Netarchive.dk. See [[http://udvikling.kb.dk/cvsshadow/digiliv/ProjektDokumenter/omkostninger%20ved%20indsamling%20af%20gammelt%20materiale-3.doc|proposal]]. These tasks will be independent of !NetarchiveSuite code-freeze.''' ||'''Total x''' ||'''-''' ||'''-''' ||'''Total x''' || ||'''-''' ||
||||||||||||||||||||||'''Tasks from ...''' ||
|| || || || || || ||- ||
|| || || || || || || ||
|| || || || || || || ||
||51. Old KB Webarchive ||? ||SVC ||HBK || ||High priority ||In progress ||
||52. Old Webarchive harvested with ARC-Httrack ||? ||HBK ||SVC || || ||.. ''Under dev.'' ||
||53. Old Webarchive harvested with Wget ||? ||HBK ||SVC || || ||In progress ||
||54. Old Webarchive harvested with !NedLib ||? ||SVC ||HBK || || ||In progress ||
|| || || || || || || ||
||55. Old Webarchive from Kurt Vest Nielsen (Ingeniøren from 1995) ||? ||JOLF ||HBK || || ||Postponed ||
||56. Webarchive from the library of The Danish Parliament ||? ||SVC ||HBK || || ||Postponed ||
||57. Old Webarchives from Net-papers ||? ||SVC ||HBK || || ||Postponed ||
||58. Digital publications of The Danish Law Gazette from the missing period ||? ||SVC ||HBK || || ||Postponed ||
||59. Old Webarchive from Niels Brugger collected by HTTrack ||? ||HBK ||SVC || || ||Postponed ||
||60. Prepare ingest of extracted data from Internet Archive into Netarkivet.dk || ||SVC ||HBK || ||High priority. Output will be a document describing the choices made and instructions to the daily manager of Netarkivet.dk on how to ingest the data. ||In progress ||
||61. Ingest received data from Internet Archive into Netarkivet.dk || ||CLO ||SVC || ||High priority ||Awaiting document from task 59. ||
||'''Common tasks calculated as implementation tasks''' ||'''Total x''' ||'''-''' ||'''-''' ||'''Total x''' || ||'''-''' ||
||'''Others''' ||'''Total x''' ||'''-''' ||'''-''' ||'''!SubTotal 2''' || ||'''-''' ||
||62. Setup of new KB test system (KB-Prod-DK) ||2 ||TLR ||SVC ||2 ||High priority ||- ||
||63. Test of 64 bit version of KB-PROD-ADM ||2 ||TLR ||SVC ||2 || ||.. ||
||[[UsingHarvestersOutsideKbandSB|64. Architectural consideration: Move harvesters close to the backbone of the research network]]. ||2 ||SVC ||HBK ||2 ||High priority ||OK ||
||65. Create/execute a batch test script specified by 1 or 2 researchers ||2 ||JOLF ||HBK ||2 || ||.. ||
||66. Prepare joint face to face meeting with UDV and Pligt/National ||1 ||CHH ||CSR ||1 ||High priority ||.. ||
||67. ||1 ||CHH ||CSR ||1 || ||.. ||
|| || || || || || ||.. ||
||'''Prepare release test''' ||'''Total x''' ||'''-''' ||'''-''' ||'''!SubTotal 12''' || ||'''-''' ||
||68. Prepare [[http://netarchive.dk/suite/Iteration41Releasetest|release test]] ||6 || || ||2 || ||In progress ||
||'''Available man-days for implementation phase''' ||'''Total x''' ||'''-''' ||'''-''' ||'''Total x''' || ||'''-''' ||
||||||||||||||||||||||'''Release test phase (task ...)''' ||
||'''Release test''' ||'''Total x''' ||'''-''' ||'''-''' ||'''Total 12''' || ||'''-''' ||
||69. Execute [[http://netarchive.dk/suite/Iteration41Releasetest|release test]]. ||12 ||TLR ||All ||12 || ||Awaiting code freeze ||
|| || || || || || ||.. ||
||'''Release notes''' ||'''Total x''' ||'''-''' ||'''-''' ||'''Total 0.5''' || ||'''-''' ||
||70. Write release note ||0.5 ||SVC || || || ||Awaiting end of code freeze ||
||'''Available man-days for release test phase''' ||'''Total x''' ||'''-''' ||'''-''' ||'''Total 10''' || ||'''-''' ||
||||||||||||||||||||||'''Assignment phase for next iteration (task ...)''' ||
||71. Component bug/feature fix/management || ||QA || || || ||.. ||
||72. Define goals for [[http://netarchive.dk/suite/Iteration42TaskList|Iteration 42 task list]] || ||CHH || || || ||.. ||
||73. Presentation of goals and tasks for Iteration 42. Achieve a common understanding of the purpose of the iteration and each task on status meeting || ||SVC || || || ||.. ||
||74. Assignment of tasks, bugs and feature request || ||QA || || || ||.. ||
||75. Update release test procedure || ||TLR || || || ||.. ||
||'''Available man-days for assignment phase''' ||'''Total x''' ||'''-''' ||'''-''' ||'''Total 22''' || ||'''-''' ||

=== Timetable ===

||'''Timetable iteration 41. Updated 17. January 2010''' ||'''Start time''' ||'''End time''' ||'''Responsible''' ||'''Baseline 13. December 2009. Start time''' ||'''Baseline 13. December 2009. End time''' ||
||1. Implementation of decided tasks ||18. December 2009 ||8. February 2010 || ||18. December 2009 ||1. February 2010 ||
||2. Code freeze. Create the build for release test and notify when build is ready ||9. February 2010 || ||SVC ||2. February 2010 || ||
||3. Release test ||9. February 2010 ||11. February 2010 ||TLR ||2. February 2010 ||4. February 2010 ||
||4. Code unfreeze ||12. February 2010 || ||SVC ||5. February 2010 || ||
||5. Assignments, bug components and bug fixes ||10. February 2010 ||11. February 2010 || ||3. February 2010 ||4. February 2010 ||