## page was renamed from AssignmentDeploy
= Assignment - Rewrite deploy based on settings overwrite =

== References ==
=== Reference documents ===
 * [[Glossary]]

Various mails in Danish are also relevant, but they are not included here.

=== Dependencies ===
 * All of the tasks must be done in the same iteration, although they can be implemented and tested in parallel with other developments. This is discussed in section "Order of implementation".

=== Terminology ===
 * The term 'replica' is introduced instead of the old term 'bitarchive location'. The reason for this change in terminology is that 'location' confusingly gave people the idea that it had something to do with the physical location of, for instance, a bitarchive.
 * The term 'physical location' is introduced to emphasize when we are referring to a physical location.

=== Bugs ===
Bugs that will be addressed by this assignment:
 * [[https://gforge.statsbiblioteket.dk/tracker/?group_id=7&atid=105&func=detail&aid=431|Bug 431: Settings.DIR_COMMONTEMPDIR directories should be emptied upon startup]] (done)
 * [[https://gforge.statsbiblioteket.dk/tracker/?group_id=7&atid=105&func=detail&aid=433|Bug 433: Starting the bit archives twice without killing inbetween make bitarchive immortal]] (done)
 * [[https://gforge.statsbiblioteket.dk/tracker/?group_id=7&atid=105&func=detail&aid=846|Bug 846: Make sure that install-scripts makes the necessary directories]] (done)
 * [[https://gforge.statsbiblioteket.dk/tracker/?group_id=7&atid=105&func=detail&aid=1523|Bug 1523: Windows applications don't show exceptions away during startup]] (later)

Note that deploy has not been part of !NetarchiveSuite from the start. Therefore this assignment covers a rewrite of deploy that is not directly covered by the current list of !NetarchiveSuite bugs and feature requests.

=== Feature Requests ===
Feature requests that will be addressed by this assignment:
 * [[https://gforge.statsbiblioteket.dk/tracker/?group_id=7&atid=108&func=detail&aid=281|Feature Request 281: set FTP-directory to ~/ftp (default is ~)]] (no longer attached to deploy; changed to documentation)
 * [[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=291&group_id=7&atid=108|Feature Request 291: HarvestControllerServer uses http port to set unique THIS_HACO]] (done)
 * [[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=1520&group_id=7&atid=108|Feature Request 1520: Deploy of more than one Bitapplication per server]] (done)

== Basic idea behind new deploy ==
=== History of deploy ===
The previous !NetarchiveSuite already had a deploy module. However, this module had the following inconveniences:
 * It was quite tightly coupled to the setup of the Danish installation of !NetarchiveSuite.
 * It was based on an XML configuration file in which the relation between configuration settings and !NetarchiveSuite settings was far from obvious.
 * It was based on the assumption that all settings had to be specified in a settings file for !NetarchiveSuite.

Because of recent changes, the assumption that all settings must be specified in a settings file for !NetarchiveSuite no longer holds. !NetarchiveSuite now has built-in default settings, so only local overrides of the defaults have to be set for the individual applications.

Furthermore, recent analysis and the assignment description (assignment B2) for the archive have revealed the most obvious connections to the setup of the Danish installation of !NetarchiveSuite. These are mainly based on the interpretation of Location, which previously referred to "location of bitarchive to be used" as well as "physical location of this instance of this application".
Part of assignment B2, as well as part of the current assignment, consists of replacing the old location settings with the two distinct terms 'physical location' and 'replica'. This change means that the new deploy is no longer so tightly bound to the Danish !NetarchiveSuite installation setup.

Lastly, the definitions in the new configuration file will be named exactly like their counterpart settings in !NetarchiveSuite wherever there is a direct correlation. Only a few definitions will be for deploy only, and these follow the naming convention of being prefixed with "deploy_".

=== Override settings structure ===
The idea is that the new deploy will be based on the default settings in !NetarchiveSuite. The configuration file is then used to declare where the different applications are placed and which settings overrides are needed for each application. The default settings can be overwritten at different levels in the configuration settings file. This is illustrated in the figure below:

{{attachment:layers.gif}}

== Documentation ==
=== In existing documentation ===
A deployment/configuration manual must be created with reference to existing sections in the Installation Manual, and it must explain the following steps:
 * Preparation
  * make it-config file
  * get netarchivesuite zip file
  * get and adjust configuration files
 * deploy
  * how to
  * what is happening
 * installation
  * how to (including modifications)
  * what is happening (incl. makedir)
 * start and kill

Settings documentation should consist only of references to the documentation in the subversion copy of the XML files in the repository.

The documentation must also include an explanation of the settings overwrite structure and the special deploy configuration settings, as done in section "Override settings structure" and section "New definition of IT-config".
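
As an aid for the manual, the layered overwrite structure from section "Override settings structure" could be sketched as a chain of scopes in which the innermost definition wins. This is only an illustration under assumptions: the class `SettingsScope` and its methods are hypothetical and not part of !NetarchiveSuite, and the values are merely borrowed from the it-config example.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: one level of settings (defaults, global,
 * physical location, machine or application). Not NetarchiveSuite code.
 */
public class SettingsScope {
    /** The enclosing (outer) scope, or null for the outermost level. */
    private final SettingsScope parent;
    /** Settings defined directly at this level. */
    private final Map<String, String> overrides = new HashMap<>();

    public SettingsScope(SettingsScope parent) {
        this.parent = parent;
    }

    /** Defines (or overwrites) a setting at this level. */
    public SettingsScope set(String key, String value) {
        overrides.put(key, value);
        return this;
    }

    /** Returns the innermost definition, or null if no level defines the key. */
    public String get(String key) {
        if (overrides.containsKey(key)) {
            return overrides.get(key);
        }
        return parent == null ? null : parent.get(key);
    }

    public static void main(String[] args) {
        // Illustrative values only, taken from the it-config example.
        SettingsScope global = new SettingsScope(null)
                .set("environmentName", "TEST");
        SettingsScope physicalLocation = new SettingsScope(global)
                .set("deployInstallDir", "/home/test");
        SettingsScope linuxMachine = new SettingsScope(physicalLocation);
        SettingsScope windowsMachine = new SettingsScope(physicalLocation)
                .set("deployInstallDir", "c:\\Documents and Settings\\ba-test");

        System.out.println(linuxMachine.get("deployInstallDir"));
        System.out.println(windowsMachine.get("deployInstallDir"));
        System.out.println(windowsMachine.get("environmentName"));
    }
}
```

In this sketch the Linux machine inherits the installation directory from its physical location, the Windows machine overwrites it, and both inherit environmentName from the global level.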
=== New deploy documentation ===
The following documentation must be included as part of a new deploy/configuration manual:

The new deploy works in three steps:
 1. Prepare folders for deployment
 1. Install folders on machines for all physical locations
 1. Start all installed applications

Preparation of folders for deployment is illustrated in the figure below (consider whether the !NetarchiveSuite zip file should be given as a parameter here and be placed in install-dir):

{{attachment:deploy_step1.gif}}

Installation of folders on machines for all physical locations is illustrated in the figure below (consider whether the !NetarchiveSuite zip file should be given as a parameter in the previous step, and then be taken from install-dir here):

{{attachment:deploy_step2.gif}}

Starting all installed applications is illustrated in the figure below:

{{attachment:deploy_step3.gif}}

== Changes in NetarchiveSuite apart from deploy ==
=== System state GUI ===
The current columns "Organisation" and "Port" must be replaced by
 * physicalLocation for "Location" (for all)
 * replica for "Replica" (for !BitArchiveMonitors and !BitArchives only)
 * http.port for "HTTP" (for GUIApplication and !ViewerProxyApplication only)
 * harvesting.queuePriority (for !HarvestControllerApplication only)

The final layout must be accepted by the Danish !NetarchiveSuite users. One idea is to introduce a hiding mechanism at the same level and in the same way as the "Show all" functionality.

In order to pass data to the System state GUI, the data must be made available via the !SingleMbeanObject.

=== Split of location concept ===
Today "location" covers both physical location and bitarchive replica. This must change in order to make deploy more general.
The following location-related settings are changed:
 * locations -> replicas
 * location ->
  * replicaId (added replicaType & replicaName)
  * deployApplicationInstanceId (for !BitarchiveMonitorApplications)
 * locations.batchLocation -> useReplicaId
 * thisLocation ->
  * archive.thisReplicaId for bitarchive & monitor (Channels)
  * thisPhysicalLocation for others (!SingleMbeanObject)
  * useReplicaId for getFile/get in archive.tool, viewerproxy & arcrepository?! Used for getFile/get tool

=== Channel name definitions ===
Definition of channel names must rely on
 * thisReplica instead of thisLocation (only used for BitArchives and BitArchiveMonitors)
 * applicationInstanceId instead of http.port (only used for !HarvestController and !ViewerProxyApplication)

Note that this means that [[https://gforge.statsbiblioteket.dk/tracker/index.php?func=detail&aid=291&group_id=7&atid=108|Feature Request 291: HarvestControllerServer uses http port to set unique THIS_HACO]] will automatically be solved by this change.

The call that constructs a ChannelId is currently:
{{{
constructName(String app, String locationName, boolean useNodeId, boolean useProcId)
}}}
Here
 * app is for example "ALL_BA" (should be renamed to a better name)
 * locationName is the name of the location, which in the future must be the replicaId; thus this parameter should be renamed to a better name.
 * useNodeId indicates whether the IP (LocalIP) should be used (no change, except maybe a better name)
 * useProcId indicates whether httpPortNumber should be used. Here we want to use the value of the applicationInstanceId setting instead, and the parameter name must be changed accordingly.

=== Differentiate applications on instance id ===
The new deploy will differentiate instances of applications (on the same machine) by a new setting, applicationInstanceId. This eliminates the dependencies introduced in the old deploy, which used the http.port number or thisLocation instead.
In other words, the setting applicationInstanceId identifies a single application instance and is used e.g. as suffix for application-specific scripts, as suffix for the directory to place files in, etc. This is needed in cases where several instances of the same application are placed on the same machine (e.g. BitarchiveMonitors).

The new setting will also replace port in the definition of channels (see above).

Overview of where the applicationInstanceId is introduced:
 * Replacing use of port in channel names
  * !HarvestController
  * !ViewerProxyApplication (may not be necessary if use of the IP address is enough and there is only one on each machine)
 * !BitarchiveMonitorApplication

== New definition of IT-config ==
As described in section "Override settings structure" (under "Basic idea behind new deploy"), the idea is to have different levels of settings, where lower levels can overwrite settings from higher levels.

{{attachment:layers.gif}}

The next subsections explain the new deploy settings, the additional !NetarchiveSuite settings and the indirectly set !NetarchiveSuite settings. The last subsection gives an example of a new it-config file.

Notice, for instance, that the installation directory is defined by deployInstallDir under each physical location defined by thisPhysicalLocation, and then only overwritten on the Windows machines. Another example is environmentName, which is set to TEST under "deployGlobal" and stays that way for the whole deployment (which it should).

It is also worthwhile to notice that a specific configuration can be declared in several ways. For instance, for the definition of the deployInstallDir values, we would have obtained the same result if one of the physical location definitions had been moved to the global level, though the configuration file would not be as readable.

=== New special deploy settings ===
 * . Defines a class path to be added for an application
  . Note: several additional class paths can be specified within a scope, but new definitions in inner scopes will overwrite outer scopes.
 * . Defines a deploy global 1. level scope where settings can be set to overwrite setting defaults from the !NetarchiveSuite software.
 * . Installation directory for a deployMachine
 * . Defines a java option for an application.
  . Note: several additional java options can be specified within a scope, but new definitions in inner scopes will overwrite outer scopes.
 * . Defines a deploy machine 3. level scope where common settings for the machine and the applications running on the machine can be set. These settings will overwrite 1. and 2. level settings
 * . Defines the user name for a deployMachine

=== New settings ===
See also the end of the previous main section, "Differentiate applications on instance id". The new setting is:
 * . Defines identification of a single application instance (e.g. suffix for application-specific scripts, suffix for directory to place files etc.). This is needed in cases where several instances of the same application are placed on the same machine (e.g. !BitarchiveMonitors)
  . Will for instance replace port in definition of channels.

=== Settings indirectly set ===
Named tags that result in implicit setting of settings:
 * in scope . sets settings.common.thisPhysicalLocation to PL
 * in scope . sets settings.common.applicationName to AN
  . NOTE: the applicationName is NOT USED today; it is automatically set at startup. However, this is also one of the reasons that we cannot reload new settings for an application. Thus the new deploy will also help solve the reload issue at a later stage.

=== New it-config example ===
The example below must replace the existing it-config-example file, and shows what the contents of a new it-config.xml file will look like.
{{{
lib/dk.netarkivet.archive.jar lib/dk.netarkivet.viewerproxy.jar lib/dk.netarkivet.monitor.jar -Xmx1536m TEST dk.netarkivet.common.distribute.FTPRemoteFile 21 3 dk.netarkivet.common.distribute.JMSConnectionSunMQ kb-dev-adm-001.kb.dk 7676 conf/jmxremote.password 120 43200000 SB SBB bitArchive KB KBB bitArchive KB monitorRole test bitpreservation . tmpdircommon /home/test test kb-dev-har-001.kb.dk ftptestuser ftptestpasswd examplesmtpserver.netarkivet.dk dk.netarkivet.common.utils.EMailNotifications example@netarkivet.dk example@netarkivet.dk KB lib/dk.netarkivet.harvester.jar lib/dk.netarkivet.archive.jar lib/dk.netarkivet.viewerproxy.jar lib/dk.netarkivet.monitor.jar 8076 8100 8200 8101 8201 KBBM 8102 8202 KB SBBM 8103 8203 SB ba-test c:\Documents and Settings\ba-test -Xmx1150m 8100 8200 KB q:\bitarkiv ba-test c:\Documents and Settings\ba-test -Xmx1150m 8100 8200 KB q:\bitarkiv lib/dk.netarkivet.harvester.jar lib/dk.netarkivet.archive.jar lib/dk.netarkivet.viewerproxy.jar lib/dk.netarkivet.monitor.jar 8100 8200 LOWPRIORITY 8190 8191 harvester lib/dk.netarkivet.harvester.jar lib/dk.netarkivet.archive.jar lib/dk.netarkivet.viewerproxy.jar lib/dk.netarkivet.monitor.jar 8100 8200 LOWPRIORITY 8190 8191 harvester viewerproxy 8101 8201 lib/dk.netarkivet.viewerproxy.jar lib/dk.netarkivet.archive.jar lib/dk.netarkivet.monitor.jar 8076 8100 8200 /home/netarkiv netarkiv sb-dev-bar-001.statsbiblioteket.dk ftptestuser ftptestpasswd examplesmtpserver.netarkivet.dk dk.netarkivet.common.utils.EMailNotifications example@netarkivet.dk example@netarkivet.dk SB lib/dk.netarkivet.harvester.jar lib/dk.netarkivet.archive.jar lib/dk.netarkivet.viewerproxy.jar lib/dk.netarkivet.monitor.jar 8100 8200 HIGHPRIORITY 8190 8191 harvester 8100 8200 SB /netarkiv/0001 /netarkiv/0002 8100 8200 8076 viewerproxy
}}}

TODO: Replace {{{ 8100 8200 }}} by {{{ 8100 8200 }}}

== Rewrite deploy ==
Note that the newest version of the !NetarchiveSuite code has eliminated the use of the !SideKick application.
Therefore the special handling for starting and stopping this process is no longer necessary.

Reuse deploy code from kb-doms (Royal Library Digital Object Management System code):
 * Machine definition class (for different OS)
  . Note that this must also include an internal rename of paths, e.g. if "passwordFile" is set to "conf/jmxremote.password", then the value for Windows machines is "conf\jmxremote.password"
 * make hierarchy of objects at machine and application level

Reuse (and expand) !SimpleXmlTree from the current !NetarchiveSuite software.

Parameter changes compared to the old deploy:
 * jmxremote.password must be generated from scratch instead of being read from a file
 * security.policy file must be given explicitly as a parameter
 * log.prop file must be given explicitly as a parameter
 * !NetarchiveSuite package (zip file) must be given explicitly as a new parameter
 * Settings file is no longer needed as a parameter
 * Environment is no longer needed as a parameter; it is read from the setting environmentName

Use the following design:
 * check input
 * read and set settings from global level
 * make physical location objects with settings from the global level in it-config overwritten by specific physical location settings
 * for each physical location
  * read and set settings from physical location level
  * make machine objects specified for the location, corresponding to the machine OS
  * for each machine
   * read and set settings from machine level
   * make application objects specified for the machine
   * for each application
    * read and set settings from application level
 * use methods on objects in hierarchy to produce various scripts

Remember to include bugs and feature requests in the scripts generated.

When deploy is rewritten, the scripts for making the multi-user test platform must also be updated. Note that it may be an advantage to put in special markers like "../TESTDIR/.." in the it-test-config file and then replace these markers afterwards by scripting.
Remember to uppercase "TESTDIR" specifications, since Linux and Windows differ in case sensitivity, which can cause noise where both lowercase and uppercase are used.

== Order of implementation ==
 * Location implementation and test could with great advantage be done first
 * New deploy could be started as a parallel activity, but will depend on
  * changes in location settings
  * (to some extent) introduction of the applicationInstanceId setting

== Run command (bash) ==
 * export JAVA_HOME=/usr/java/jdk1.6.0_07
  * or a newer java version
 * export PATH=$JAVA_HOME/bin:$PATH
 * java -cp deploy2.jar:lib/dk.netarkivet.archive.jar:lib/dk.netarkivet.common.jar:lib/dk.netarkivet.harvester.jar:lib/dk.netarkivet.monitor.jar:lib/dk.netarkivet.viewerproxy.jar:lib/dom4j-1.5.2.jar:lib/commons-logging-1.0.4.jar:lib/commons-cli-1.0.jar dk.netarkivet.deploy2.DeployApplication -Cit_config.xml -ZNetarchiveSuite.zip -Ssecurity.policy -Llog.prop [-Olocation] [-Ddatabase.jar || -Ddatabase.zip]
  * deploy2.jar is the file containing the compiled deploy code
  * lib/dk.netarkivet.common.jar, lib/dom4j-1.5.2.jar and lib/commons-logging-1.0.4.jar are libraries required by the application
  * dk.netarkivet.deploy2.DeployApplication is the name (and path) of the application
  * Arguments (no required order)
   * -C followed by the configuration file, e.g. it_config.xml (must end with .xml).
   * -Z followed by the package file, e.g. NetarchiveSuite.zip (must end with .zip).
   * -S followed by the security policy file, e.g. security.policy (must end with .policy).
   * -L followed by the property file for logging, e.g. log.prop (must end with .prop).
   * [OPTIONAL] -O followed by the output directory. If not given, the environment name in the it_config file is used.
   * [OPTIONAL] -D followed by the database (must end with either .zip or .jar). If not given, the database in the !NetarchiveSuite package file is used.
   * [OPTIONAL] -R followed by yes/no. It defines whether the temporary file directory should be reset/cleaned when making a reinstallation. Any input different from 'y' or 'yes' will be considered a 'no'.
   * [OPTIONAL] -T followed by the following sequence: OffsetPort, HttpPort, environmentName, mailReceivers. A new config file is created based on these inputs and the given config file.
   * [OPTIONAL] -E followed by yes/no. It defines whether the config file should be evaluated, i.e. whether the branches defined in the XML tree exist in the default settings. Any input different from 'y' or 'yes' (case-insensitive) will be considered a 'no'.
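
The argument rules above could be checked roughly as in the following sketch. It is not the actual !DeployApplication code: the class `DeployArgsSketch` and its methods are hypothetical, it handles only the -Xvalue form shown in the run command, and it assumes that the yes/no rule is case-insensitive for -R as well as for -E.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of the argument checks; not NetarchiveSuite code. */
public class DeployArgsSketch {
    /** Allowed file extensions per option letter, as listed above. */
    private static final Map<Character, String[]> EXTENSIONS = new HashMap<>();
    static {
        EXTENSIONS.put('C', new String[] {".xml"});
        EXTENSIONS.put('Z', new String[] {".zip"});
        EXTENSIONS.put('S', new String[] {".policy"});
        EXTENSIONS.put('L', new String[] {".prop"});
        EXTENSIONS.put('D', new String[] {".zip", ".jar"});
    }

    /** Splits arguments of the form -Xvalue into a map; order does not matter. */
    public static Map<Character, String> parse(String[] args) {
        Map<Character, String> result = new HashMap<>();
        for (String arg : args) {
            if (arg.length() < 3 || arg.charAt(0) != '-') {
                throw new IllegalArgumentException("Malformed argument: " + arg);
            }
            char option = arg.charAt(1);
            String value = arg.substring(2);
            String[] allowed = EXTENSIONS.get(option);
            if (allowed != null && !hasExtension(value, allowed)) {
                throw new IllegalArgumentException("Wrong extension for -" + option + ": " + value);
            }
            result.put(option, value);
        }
        // -C, -Z, -S and -L are the non-optional arguments.
        for (char required : new char[] {'C', 'Z', 'S', 'L'}) {
            if (!result.containsKey(required)) {
                throw new IllegalArgumentException("Missing required argument -" + required);
            }
        }
        return result;
    }

    private static boolean hasExtension(String value, String[] allowed) {
        for (String extension : allowed) {
            if (value.endsWith(extension)) {
                return true;
            }
        }
        return false;
    }

    /** Only 'y' or 'yes' count as yes; anything else (or no value) is a no. */
    public static boolean isYes(String value) {
        if (value == null) {
            return false;
        }
        String lower = value.toLowerCase(Locale.ROOT);
        return lower.equals("y") || lower.equals("yes");
    }

    public static void main(String[] args) {
        Map<Character, String> parsed = parse(new String[] {
            "-Cit_config.xml", "-ZNetarchiveSuite.zip",
            "-Ssecurity.policy", "-Llog.prop", "-Ryes"});
        System.out.println("config file: " + parsed.get('C'));
        System.out.println("reset temp dir: " + isYes(parsed.get('R')));
    }
}
```

The commons-cli library on the class path could presumably do the splitting instead; the sketch avoids it so that the rules themselves stay visible.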