
Task list and timetable for iteration 43

Status

OK/Not Ok

1. Highlights approved

OK

2. Assignment of tasks

3. Task list and time table approved

4. Implementation phase started

5. Release test phase started

6. Assignment phase for next iteration started

7. Iteration 43 completed

Highlights for Iteration

  • [http://kb-prod-udv-001.kb.dk/twiki/bin/edit/Netarkiv/SupportNetarchiveSuite Support] of released NetarchiveSuite (http://netarchive.dk/suite).

  • Enhance NetarchiveSuite wiki according to [:UpdateNetarchiveSuiteWiki:decided structure].

  • Implement prioritized bugs and feature requests
  • Enhancement of Batch support
  • Support of Wayback in the Netarchive.dk production site. See [:IntegrationOfWaybck:List of tasks] and [:AssignmentWaybackIntegration:Assignment] for Wayback Integration

  • Migration of old Web materials to Netarchive.dk
  • Iteration 42 is planned as a stable release.

Development procedure

Table of tasks

Tasks for iteration 43. Updated 23. March 2010

Estimate md

Main responsible

Reviewer

Remaining md at 18. March 2010

Comments

Status

Implementation phase (task x-n)

Open Source release + bugs and feature request

Total 3

-

-

Total 3

-

Support of Open Source Release

1. [http://kb-prod-udv-001.kb.dk/twiki/bin/view/Netarkiv/SupportNetarchiveSuite Support] of released NetarchiveSuite

2

All (Google calendar)

2

Ongoing

2. Implement translation process. Adjustment to Open Source partners.

1

CSR

SVC

-

3. Maintain French Translation files.

1

Nicolas/Sara

SVC

See also Task 22

-

4. Maintain Italian and German translation files.

1

Andreas/Eleonora

SVC

See also Task 22

-

Bugs and Feature requests

Prioritized bugs according to [https://gforge.statsbiblioteket.dk/tracker/index.php?group_id=7&atid=105 list] of priority 4 and priority 3 tasks.

Total 5

-

-

SubTotal 0

..

-

Priority 5 bug

5. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1856 Bug 1856] Schedule problem after first start on NAS 3.10.0. No schedule started

1

SVC

CSR

The error could not be reproduced in the test system. The next attempt to solve the problem will be in patch release 3.10.2.

OK

Priority 4 bugs

6.

7.

.

Priority 3 bugs

8. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=688 Feature request 688] hosts-report should be IDNA decoded when writing harvestInfo to the DB

-

9. Module Access: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=823 Bug 823] No index = Internal server error

10. Module Monitor: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1756 Bug 1756] JMX status page does not update when a new application is started on previously used JMX port

11. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1782 Bug 1782] Same datetime repeated many times, while logging batch checksum of files

12. Module Documentation: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1779 Bug 1779] Improve documentation of the additional tools

13. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1708 Bug 1708] bitpreservation logic offers "add to archive" for file that is not in either location

14. Module Documentation: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1732 Bug 1732] LocalArcRepositoryClient not documented

15. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1619 Bug 1619] Potential NullPointer exception in RemoveAndGetFileMessage.getData()

Fixed and reviewed

OK

16. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1260 Bug 1260] Too much and wrong feedback information on "Missing pages"

17. Module Monitor: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1205 Bug 1205] Security policy for unit tests contains hardcoded path to development environment

18. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1193 Bug 1193] Exceptions from FileBatchJob stop batch job processing

..

Prioritized Feature Requests according to [:TaskTableFromMay2009Workshop:list] of priority 4 and priority 3 tasks

Total 21

-

-

SubTotal 21

-

Priority 4 Feature request

19. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1696 Feature request 1696] Ingest domain seed URLs

?

Nicolas

SVC

Postponed

20. Module harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1688 Feature request 1688] Monitoring broad crawls.

?

Nicolas

SVC

.

Postponed

21. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1134 Feature request 1134] Filter job lists by category

?

Nicolas/Sara

CSR

Postponed

22. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1668 Feature request 1668] Paginate and make sortable and searchable the list of jobs

?

Nicolas/Sara

CSR

Postponed

23. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1813 Feature request 1813] An extra resubmit button to make it visible which jobs have already been handled

?

SVC

CSR

Postponed.

23a. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=929 Feature request 929] Documentation needed for how we split jobs (incl. maybe additional splitting modularity)

?

SVC

CSR

23b. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1774 Feature request 1774] Stop using the JMS queues for queuing snapshot harvests

?

SVC

CSR

23c. Module Harvester: Feature request: Support of harvesting FTP sites

2

.

-

Priority 3 Feature request

24. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1774 Feature request 1774] Stop using the JMS queues for queuing snapshot harvests

-

25. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1681 Feature request 1681] Add seed to DB via webservice (via Browser Extension/Rich Client)

Andreas

Standby

26. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1682 Feature request 1682] Statistics (DB access, scripts, batch jobs ...)

Andreas

-

27. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1683 Feature request 1683] Util for regenerating admin.data file

Andreas

-

28. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1684 Feature request 1684] Activity when domain is to be crawled. One table for seed

Andreas

-

29. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1743 Feature request 1743] When accessing Bitpreservation this takes really long time

Andreas

-

30. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1120 Feature request 1120] Crawlertrap info should be shareable between institutions

Andreas

SVC will add comments to this FR. An easy solution might be to share crawler traps by emailing files with crawler trap information.

Redundant (Copy of 20)

31. Module Harvester: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1066 Feature request 1066] Show whether seed URL existed

Andreas

-

32. Module Archive: [https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1809 Feature request 1809] Write assignment for improving batchjob interface

JOLF

..

Roadmap tasks

Total 52?

-

-

Total 8,5

-

Tasks from ...

33. QA: Assignment for enhanced QA tools

2

SVC

CSR

High priority

In progress

34. WARC: Finalize [:AssignmentHarvester2:Assignment] for Harvester for support of WARC format

?

35. Archive: Implement [:AssignmentGroupB2:Assignment B.2.2b] - Store bit preservation information in a database

8

JOLF

SVC

High priority

OK

36. Archive: Implement [:AssignmentGroupB2:Assignment B.2.3] - Use segments in bitarchives

6

37. Archive: Implement [:AssignmentGroupB2:Assignment B.2.4] - Write BitPreservation scheduler

5

38. Archive: Implement [:AssignmentGroupB2:Assignment B.2.5] - Write BitPreservation webinterface

6

..

..

[http://netarkivet.dk/netarkivet/index.php?title=Kendte_problemer Crawl-problems] (Netarchive.dk) .

Total x

-

-

Total x

-

Focus on following crawl-problems

39. [http://netarkivet.dk/netarkivet/index.php?title=Dinby.dk dinby.dk] 2009-02-17

1

CSR

JOLF

1

High priority

..

40. [http://netarkivet.dk/netarkivet/index.php?title=Kino.dk Kino.dk] 2009-03-25

1

HBK

SVC

1

High priority

Awaiting review

41. [http://netarkivet.dk/netarkivet/index.php?title=Webmuseum.re-cph.com Webmuseum.re-cph.com] 2009-08-04

1

CSR

JOLF

1

High priority

In progress

42. [http://netarkivet.dk/netarkivet/index.php?title=Epn.dk Epn.dk] 2009-08-30

1

CSR

SVC

1

High priority

..

43. [http://netarkivet.dk/netarkivet/index.php?title=statstidende.dk Statstidende.dk]

0.5

HBK

JOLF

Awaiting review

44. [http://netarkivet.dk/netarkivet/index.php?title=seoghoer.dk seoghoer.dk]

0.5

HBK

JOLF

Awaiting review

Wayback/Nutchwax tasks independent of NetarchiveSuite code-freeze.

Total x

-

-

Total x

-

Tasks from ...

45. Wayback: Review of Wayback Indexing component architecture and assignment document (AutomaticIndexing)

1

SVC

CSR

1

OK

46. Wayback: Implementation of object store

3

CSR

SVC

1

-

47. Wayback: Implementation of Indexer

7

CSR

SVC

1

-

48. Wayback: Implementation of aggregator

5

CSR

SVC

1

-

49. Wayback: Documentation

1

CSR

SVC

1

-

50. Wayback deploy: Test of Wayback deploy using Jetty

1

SVC

CSR

1

A prototype application (based on the GUIApplication and corresponding GUIWebserver) has been committed to our SVN repository: dk.netarkivet.wayback.WaybackWebServer and dk.netarkivet.wayback.WaybackApplication.

OK

51. Wayback deploy: If not possible with Jetty, test Wayback deploy using Tomcat

2

JOLF

CSR

2

In progress

..

Converting old Web collections to Netarchive.dk. See [http://udvikling.kb.dk/cvsshadow/digiliv/ProjektDokumenter/omkostninger%20ved%20indsamling%20af%20gammelt%20materiale-3.doc proposal]. These tasks will be independent of NetarchiveSuite code-freeze.

Total x

-

-

Total x

-

Tasks from ...

52. Old Web collection: Old KB Webarchive

SVC

JOLF

High priority

In progress

53. Old Web collection: Old Webarchive from Niels Brugger collected by HTTrack

HBK

SVC

High priority

In progress

54. Old Web collection: Prepare ingest of extracted data from Internet Archive into Netarkivet.dk

SVC

HBK

Wait for IA correction

55. Old Web collection: Ingest received data from Internet Archive into Netarkivet.dk

CLO

SVC

-

Common tasks calculated as implementation tasks

Total x

-

-

Total x

-

Others

Total x

-

-

SubTotal 2

-

56. Workshop: Heritrix 3 workshop

5

SVC

Week 8

OK

57. Test environment: Test of 64 bit version of KB-PROD-ADM

2

TLR

SVC

2

High priority

Postponed

..

58. Batch: Create/execute a batch test script specified by 1 or 2 researchers

2

JOLF

HBK

2

Wait for stable production

..

59. Batch: Prepare joint face-to-face meeting with UDV and Pligt/National

1

CHH

CLO

1

High priority

OK

..

..

Prepare release test

Total x

-

-

SubTotal 12

-

60. Prepare [http://netarchive.dk/suite/Iteration42Releasetest release test]

6

6

OK

Available man-days for implementation phase

Total x

-

-

Total x

-

Release test phase (task ...)

Release test

Total x

-

-

Total 12

-

61. Execute [http://netarchive.dk/suite/Iteration42Releasetest release test].

12

TLR

All

12

Started


..

Release notes

Total x

-

-

Total 0,5

-

62. Write release note

0,5

SVC

Awaiting end of code freeze

Available man-days for release test phase

Total x

-

-

Total 10

-

Assignment phase for next iteration (task ...)

63. Component bug/feature fix/management

QA

..

64. Define goals for [http://netarchive.dk/suite/Iteration43TaskList Iteration 43 task list]

CHH

..

65. Presentation of goals and tasks for Iteration 43. Achieve a common understanding of the purpose of the iteration and of each task at the status meeting

SVC

..

66. Assignment of tasks, bugs and feature request

QA

..

67. Update release test procedure

TLR

..

Available man-days for assignment phase

Total x

-

-

Total 22

-

Timetable

Timetable iteration 42. Updated 19. January 2010

Start time

End time

Responsible

Baseline 19. January 2010. Start time

Baseline 19. January 2010. End time

1. Implementation of decided tasks

19. February 2010

18. March 2010

15. February 2010

12. March 2010

2. Code freeze. Create the build for release test and notify when build is ready

18. March 2010

SVC

15. March 2010

3. Release test

18. March 2010

19. March 2010

TLR

15. March 2010

19. March 2010

4. Code unfreeze

22. March 2010

SVC

22. March 2010

5. Assignments, bug components and bug fixes

17. March 2010

19. March 2010

17. March 2010

19. March 2010

Iteration43TaskList (last edited 2010-08-16 10:24:49 by localhost)