Heritrix Configurations

For configuration related to NetarchiveSuite, please refer to the section [:Configuration Manual 3.10#ConfigureHeritrixProcess:Configure Heritrix Process].

For more specific Heritrix configurations, please refer to [:Configuration Manual 3.10#ManagingHeritrixHarvestTemplates:appendix B] and [:Configuration Manual 3.10#MigrateHeritrixTemplatesTo36:appendix C] of this document.

By default, crawling in NetarchiveSuite uses deduplication. This feature, and how to disable it, is described in Section 8.1.2 of the Configuration Manual.