Differences between revisions 14 and 40 (spanning 26 versions)
Revision 14 as of 2007-06-29 16:01:20
Size: 1326
Revision 40 as of 2010-11-29 13:19:58
Size: 1538
Editor: ClausLomborg
Line 1: Line 1:
attachment:transparent_logo.png
= Welcome to the NetarchiveSuite =
The !NetarchiveSuite software is developed by the two national deposit libraries in Denmark, [[http://www.kb.dk/|The Royal Library]] and [[http://www.statsbiblioteket.dk|The State and University Library]], and has been running in production, harvesting the Danish world wide web since 2005. The Danish netarchive currently contains over 160 TB of data, mirrored at two different geographical locations.
Line 3: Line 4:
== Introduction ==
Software to harvest and preserve websites.
The !NetarchiveSuite is the complete web archiving software package developed within the netarchive.dk project from 2004 onwards. The primary function of the !NetarchiveSuite is to plan, schedule and run web harvests of parts of the Internet. It scales to a wide range of tasks, from small, thematic harvests (e.g. related to special events or special domains) to harvesting and archiving the content of an entire national domain. The software has built-in bit preservation functionality. The system architecture allows the software to be distributed among several machines, possibly at more than one geographical location. The !NetarchiveSuite is built around the Heritrix web crawler, which it uses to harvest the web. You can find more information in the [[Overview 3.10|overview]].
Line 6: Line 6:
The NetarchiveSuite software covers harvesting, preserving and making available parts of the world wide web.
<<Include(News)>>
Line 8: Line 8:
Its harvesting capabilities are built around the Heritrix web crawler from the Internet Archive, but the focus of the NetarchiveSuite is to make the power of Heritrix available to the common librarian or curator.
To get started with !NetarchiveSuite, [[Get NetarchiveSuite|download]] it and try it out with our [[Quick Start Manual|Quick Start]] installation setup, which only requires one standard Linux machine.
Line 10: Line 10:
The archiving module supports distributed storage with active bit integrity checking of large amounts of data, and supports batch runs over the data.

The access module takes a proxy-based approach: setting a proxy in your browser gives you access to the web as it looked at the time of harvest.

Everything is released with full source under the LGPL license.

== About the Netarchive/us ==
The NetarchiveSuite software was developed by the two national deposit libraries in Denmark, [http://www.kb.dk/ The Royal Library] and [http://www.statsbiblioteket.dk The State and University Library], and has been running in production, harvesting the Danish world wide web for two years. The Danish netarchive currently contains over 30 TB of data.

== News ==
[[Include(News)]]

== Getting Started ==
 * Click on the tab ["Get NetarchiveSuite"] and follow the instructions
The software is released with full source under the LGPL license.

Welcome to the NetarchiveSuite

The NetarchiveSuite software is developed by the two national deposit libraries in Denmark, The Royal Library and The State and University Library, and has been running in production, harvesting the Danish world wide web since 2005. The Danish netarchive currently contains over 160 TB of data, mirrored at two different geographical locations.

The NetarchiveSuite is the complete web archiving software package developed within the netarchive.dk project from 2004 onwards. The primary function of the NetarchiveSuite is to plan, schedule and run web harvests of parts of the Internet. It scales to a wide range of tasks, from small, thematic harvests (e.g. related to special events or special domains) to harvesting and archiving the content of an entire national domain. The software has built-in bit preservation functionality. The system architecture allows the software to be distributed among several machines, possibly at more than one geographical location. The NetarchiveSuite is built around the Heritrix web crawler, which it uses to harvest the web. You can find more information in the overview.

News about the NetarchiveSuite and this website is given below. (For earlier news, please refer to Old News.)

|| Date || News ||
|| 14/12 2011 || Stable release 3.18.0 has been released. See release notes and download page ||
|| 08/09 2011 || Development release 3.17.0 has been released. See release notes and download page ||
|| 28/06 2011 || Stable release 3.16.1 has been released. See release notes and download page ||
|| 11/05 2011 || Stable release 3.16.0 has been released. See release notes and download page ||
|| 01/03 2011 || Development release 3.15.0 has been released. See release notes and download page ||
|| 16/02 2011 || Stable release 3.14.1 has been released. See release notes and download page ||
|| 12/11 2010 || Stable release 3.14.0 has been released. See release notes and download page ||
|| 15/09 2010 || Stable release 3.12.2 has been released. See release notes and download page ||
|| 08/09 2010 || Development release 3.13.1 has been released. See release notes and download page ||
|| 06/07 2010 || Stable release 3.12.1 has been released. See release notes and download page ||
|| 15/06 2010 || Development release 3.13.0 has been released. See release notes and download page ||
|| 03/05 2010 || Stable release 3.12.0 has been released. See release notes and download page ||
|| 22/12 2009 || Development release 3.11.0 has been released. See release notes and download page ||
|| 16/11 2009 || Stable release 3.10.0 has been released. See release notes and download page ||
|| 10/9 2009 || Stable release 3.8.2 has been released. See release notes and download page ||
|| 10/8 2009 || Development release 3.9.0 has been released. See release notes and download page ||
|| 15/7 2009 || Stable release 3.8.1 has been released. This is a patch release. See release notes and download page ||

To get started with NetarchiveSuite, download it and try it out with our Quick Start installation setup, which only requires one standard Linux machine.

The software is released with full source under the LGPL license.

Welcome (last edited 2012-04-10 10:43:03 by MikisSethSorensen)