Rose Reference Guide: Configuration

Goals

Suite configurations should be portable between users (at least at the same site). E.g.: another user should be able to run the same suite:

  • without making ANY changes to it.
  • without having to add/modify things in their $HOME/.profile.

Input configurations should be programming language neutral.

  • Any processing logic should be application/version independent, generic and future proof.
  • Data structures should be represented in formats that both humans and computers can easily read and manipulate.

The life cycles of application configurations in a suite may differ from that of the suite.

  • The configuration of an application may be independent of the suite.
  • The configuration of an application should be portable between suitable suites.

The configurations are independent of the utilities. For example, the configuration metadata for the suite and application configurations will drive the Rose config editor GUI, but will not be bound or restricted by it.

Configuration Format

A configuration in Rose is normally represented by a directory with the following:

  • a configuration file in a modified INI file format.
  • (optionally) files containing data that cannot easily be represented by the INI format.

We have added the following conventions into our INI format:

  1. The file is normally named rose*.conf, e.g. rose.conf, rose-app.conf, rose-meta.conf, etc.
  2. Only a hash # at the beginning of a line starts a comment. Empty lines and lines with only white space are ignored. There is no support for trailing comments. Comments are normally ignored when a configuration file is loaded. However, some comments are loaded if the following conditions are met:
    • Comment lines at the beginning of a configuration file up to but not including the 1st blank line or the 1st setting are comment lines associated with the file. They will re-appear at the top of the file when it is re-dumped.
    • Comment lines between a blank line and the next setting, and the comment lines between the previous setting and the next setting are comments associated with the next setting. The comment lines associated with a setting will re-appear before the setting when the file is re-dumped.
  3. Only the equal sign = is used to delimit a key-value pair - because the colon : may be used in keys of namelist declarations.
  4. A key-value pair declaration does not have to live under a section declaration. Such a declaration lives directly under the root level.
  5. Key-value pair declarations following a line with only [] are placed directly under the root level.
  6. Declarations are case sensitive. When dealing with case-insensitive inputs such as Fortran logicals or numbers in scientific notation, lowercase values should be used.
  7. When writing namelist inputs, keys should be lowercase.
  8. Declarations start at column 1. Continuations start at column >1.
    • Each line is stripped of leading and trailing spaces.
    • A newline \n character is prefixed to each continuation line.
    • If a continuation line has a leading equal sign = character, it is stripped from the line. This is useful for retaining leading white spaces in a continuation line.
  9. A single exclamation ! or a double exclamation !! in front of a section (i.e. [!SECTION]) or key=value pair (i.e. !key=value) denotes an ignored setting.
    • An ignored setting has no effect at run time but may still be used by other Rose utilities.
    • A single exclamation denotes a user-ignored setting.
    • A double exclamation denotes a program-ignored setting. E.g. rose config-edit may use a double exclamation to switch off a setting according to the setting metadata.
  10. The open square bracket ([) and close square bracket (]) characters cannot be used within a section declaration. E.g. [[hello], [hello]], [hello [world] and beyond] should all be errors on parsing.
  11. If a section is declared twice in a file, the later section will append settings to the earlier one. If the same key in the same section is declared twice, the later value will override the earlier one. This logic applies to the state of a setting as well.
  12. Once the file is parsed, declaration ordering is insignificant. N.B. Do not assume order of environment variables.
  13. Values of settings accept syntax such as $NAME or ${NAME} for environment variable substitution.

E.g.

# This is line 1 of the comment for this file.
# This is line 2 of the comment for this file.

# This comment will be ignored.

# This is a comment for section-1.
[section-1]
# This is a comment for key-1.
key-1=value 1
# This comment will be ignored.

# This is line 1 of the comment for key-2.
# This is line 2 of the comment for key-2.
key-2=value 2 line 1
      value 2 line 2
# This is a comment for key-3.
key-3=value 3 line 1
     =    value 3 line 2 has leading indentation.
     =
     =    value 3 line 3 is blank. This is line 4.

# section-2 is user-ignored.
[!section-2]
key-4=value 4
# ...

[section-3]
# key-5 is program ignored.
!!key-5=value 5

In this document, the shorthand SECTION=KEY=VALUE is used to represent a KEY=VALUE pair in a [SECTION] of an INI format file.
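
The conventions above can be illustrated with a short Python sketch. It is not the Rose parser: the function name parse_conf and the returned data layout are assumptions made for this example, comments are simply discarded rather than associated with settings, and environment variable substitution is left out.

```python
import re

# A section declaration: optional "!" or "!!", then a name without "[" or "]".
SECTION_RE = re.compile(r"^\[(!{0,2})([^\[\]]*)\]$")


def parse_conf(text):
    """Parse text in the modified INI format (illustrative sketch only).

    Returns (settings, section_states). settings maps a section name to
    {key: (state, value)}; the empty name "" stands for the root level.
    A state is "" (normal), "!" (user-ignored) or "!!" (program-ignored).
    """
    settings = {"": {}}
    section_states = {"": ""}
    section, last_key = "", None
    for raw in text.splitlines():
        if not raw.strip() or raw.startswith("#"):
            continue  # blank lines and column-1 comments are skipped here
        if raw[0].isspace():
            # Continuation line: strip it, drop one leading "=", prefix "\n".
            if last_key is None:
                raise ValueError("continuation without a setting: %r" % raw)
            part = raw.strip()
            if part.startswith("="):
                part = part[1:]
            state, value = settings[section][last_key]
            settings[section][last_key] = (state, value + "\n" + part)
            continue
        line = raw.strip()
        if line.startswith("["):
            match = SECTION_RE.match(line)
            if match is None:
                raise ValueError("bad section declaration: %r" % raw)
            state, section = match.groups()  # "[]" selects the root level
            settings.setdefault(section, {})  # a repeated section appends
            section_states[section] = state   # the later state wins
            last_key = None
            continue
        if "=" not in line:
            raise ValueError("expected KEY=VALUE: %r" % raw)
        key, value = line.split("=", 1)  # only "=" delimits key and value
        state = "!!" if key.startswith("!!") else "!" if key.startswith("!") else ""
        key = key[len(state):].strip()
        settings[section][key] = (state, value.strip())  # the later value wins
        last_key = key
    return settings, section_states
```

Applied to the example above, this sketch would store key-2 as a two-line value and record section-2 as user-ignored and key-5 as program-ignored.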

Optional Configuration

In a Rose configuration directory, we can add an opt/ sub-directory for optional configuration files. Optional configuration files contain additional configuration, which can be selected at run time to override the configuration in the main rose-${TYPE}.conf file. The name of each optional configuration should follow the syntax rose-${TYPE}-${KEY}.conf, where ${KEY} is a short name to describe the override functionality of the optional configuration file.

A root level opts=KEY ... setting in the main configuration will tell the run time program to load the relevant optional configurations in the opt/ sub-directory at run time. Individual Rose utilities may also read optional configuration keys from environment variables and/or command line options.

Where multiple $KEY settings are given, the optional configurations are applied in that order - for example, a setting:

opts=ketchup mayonnaise

implies loading the optional configuration rose-app-ketchup.conf and then the optional configuration rose-app-mayonnaise.conf, which may override the previous one.

By default, a Rose command will fail if an optional configuration file is missing. However, if you put the optional configuration key in parentheses, then the optional configuration file is allowed to be missing. E.g.:

opts=ketchup (mayonnaise)

In the above example, rose-app-mayonnaise.conf can be missing.
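
As a rough illustration of how an opts value maps to file names, the following Python sketch expands the keys in order and notes which files may be missing. The function name opt_conf_names and its conf_type parameter are hypothetical and are not part of Rose.

```python
def opt_conf_names(opts, conf_type="app"):
    """Expand an opts value into (file name, may_be_missing) pairs.

    The pairs are returned in the order in which the optional
    configurations would be applied, so later ones may override earlier ones.
    """
    names = []
    for key in opts.split():
        # A key in parentheses marks a file that is allowed to be missing.
        may_be_missing = key.startswith("(") and key.endswith(")")
        if may_be_missing:
            key = key[1:-1]
        names.append(("rose-%s-%s.conf" % (conf_type, key), may_be_missing))
    return names
```

For opts=ketchup (mayonnaise), the sketch yields rose-app-ketchup.conf as required and rose-app-mayonnaise.conf as allowed to be missing.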

Some Rose utilities (e.g. rose suite-run, rose task-run, rose app-run, etc) allow optional configurations to be selected at run time using:

  1. The ROSE_APP_OPT_CONF_KEYS environment variable.
  2. The command line options --opt-conf-key=KEY or -O KEY.

See the reference for individual commands for details.

Note that, by default, optional configurations must exist, otherwise an error will be raised. To specify an optional configuration which may be missing, write the name of the configuration inside parentheses (e.g. (foo)).

Optional Configurations and Metadata

Metadata utilities such as rose app-upgrade and rose macro treat each combination of the main configuration with one optional configuration as a separate entity to be transformed, upgraded, or validated. Use cases that combine more than one optional configuration at a time are not handled.

When transforming or upgrading, each optional configuration is treated separately and re-created after the transform as a functional difference from the main upgraded configuration.

The logic for transforming or upgrading a main configuration C with optional configurations O1 and O2 into a new main configuration Ct and new optional configurations O1t and O2t can be represented like this:

C => Ct
C + O1 => C1t
C + O2 => C2t
O1t = C1t - Ct
O2t = C2t - Ct
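
The logic above can be sketched in Python by modelling each configuration as a flat dictionary keyed by SECTION=KEY. This is a simplification for illustration only: the function names are hypothetical, and settings that a transform removes from C + O but keeps in C cannot be expressed by this simple difference.

```python
def overlay(main, opt):
    """Return main with the settings of opt applied on top (C + O)."""
    merged = dict(main)
    merged.update(opt)
    return merged


def difference(full, base):
    """Return the settings of full that are absent from, or differ in, base."""
    return {key: value for key, value in full.items()
            if key not in base or base[key] != value}


def transform_all(main, opts, transform):
    """Transform main and re-create each optional configuration.

    opts maps an optional configuration key to its settings; transform is
    any function that maps one configuration dictionary to a new one.
    """
    main_t = transform(main)  # C => Ct
    opts_t = {
        # C + On => Cnt, then Ont = Cnt - Ct
        name: difference(transform(overlay(main, opt)), main_t)
        for name, opt in opts.items()
    }
    return main_t, opts_t
```

Each optional configuration is overlaid on the untransformed main configuration on its own, which mirrors the statement that O1 and O2 are treated separately.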

Import Configuration

A root level import=PATH1 PATH2... setting in the main configuration will tell Rose utilities to search for configurations at PATH1, PATH2 (and so on) and inherit configuration and files from them if found.

At the moment, use of this is only encouraged for configuration metadata.

Re-define Configuration at Run Time

Some Rose utilities (e.g. rose suite-run, rose task-run, rose app-run, etc) allow you to re-define configuration settings at run time using the --define=[SECTION]NAME=VALUE or -D [SECTION]NAME=VALUE options on the command line. This would add new settings or override any settings defined in the main and optional configurations. E.g.:

# Set [env]FOO=foo, and [env]BAR=bar
# (Overriding any original settings of [env]FOO or [env]BAR)
rose task-run -D '[env]FOO=foo' -D '[env]BAR=bar'

# Switch off [env]BAZ
rose task-run -D '[env]!BAZ='
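
To make the [SECTION]NAME=VALUE syntax concrete, here is a small Python sketch that splits such an argument into its parts and applies it over existing settings. The names parse_define and apply_defines are hypothetical, and treating an argument without [SECTION] as a root level setting is an assumption of this sketch, not something stated above.

```python
import re

# [SECTION] (optional in this sketch), an optional "!" or "!!", NAME, "=", VALUE.
DEFINE_RE = re.compile(
    r"^(?:\[(?P<section>[^\[\]]*)\])?(?P<state>!{0,2})(?P<name>[^=!]+)=(?P<value>.*)$",
    re.DOTALL,
)


def parse_define(text):
    """Split a define argument into (section, state, name, value)."""
    match = DEFINE_RE.match(text)
    if match is None:
        raise ValueError("bad define: %r" % text)
    # Assumption: a missing [SECTION] is mapped to the root level "".
    return (match.group("section") or "", match.group("state"),
            match.group("name"), match.group("value"))


def apply_defines(settings, defines):
    """Add or override settings; settings maps section -> {name: (state, value)}."""
    for text in defines:
        section, state, name, value = parse_define(text)
        settings.setdefault(section, {})[name] = (state, value)
    return settings
```

In this model, '[env]!BAZ=' stores BAZ with the user-ignored state "!" and an empty value, which corresponds to switching the setting off.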

Site and User Configuration

Aspects of some Rose utilities can be configured per installation via the site configuration file and per user via the user configuration file. Any configuration in the site configuration overrides the default, and any configuration in the user configuration overrides the site configuration and the default. Rose expects these files to be in the modified INI format described above. Rose utilities search for their site configuration at $ROSE_HOME/etc/rose.conf where $ROSE_HOME/bin/rose is the location of the rose command, and they search for the user configuration at $HOME/.metomi/rose.conf where $HOME is the home directory of the current user.
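
The precedence described above (default, then site, then user) can be sketched in Python as a layered merge. The function name effective_settings and the nested dictionary layout are assumptions made for this example.

```python
def effective_settings(default, site, user):
    """Merge three layers of {section: {key: value}} settings.

    Later layers override earlier ones setting by setting, so a user
    setting wins over a site setting, which wins over the default.
    """
    merged = {}
    for layer in (default, site, user):
        for section, settings in layer.items():
            merged.setdefault(section, {}).update(settings)
    return merged
```

Settings that appear only in a lower layer are kept, so a user file only needs to list the settings it changes.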

Allowed settings in the site and user configuration files will be documented in a future version of this document. In the meantime, the settings are documented as comments in the etc/rose.conf.example file of each distribution of Rose.

You can also override many internal constants of rose config-edit and rosie go. To change the keyboard shortcut of the Find Next action in the config editor to F3, put the following lines in your user config file, and the setting will apply the next time you run rose config-edit:

[rose-config-edit]
accel-find-next=F3

Suite Configuration

The configuration and functionality of a suite will usually be covered by the use of cylc. In that case, most of the suite configuration will live in the cylc suite.rc file. Otherwise, a suite is just a directory of files.

A suite directory may contain the following:

  • A file called rose-suite.conf, a configuration file in the modified INI format described above. It stores the information on how to install the suite. See below for detail.
  • A file called rose-suite.info, a configuration file in the modified INI format described above. It describes the suite's purpose and identity, e.g. the title, the description, the owner, the access control list, and other information. Apart from a few standard fields, a suite is free to store any information in this file. See below for detail.
  • An app/ directory of application configurations used by the suite.
  • A bin/ directory of scripts and utilities used by the suite.
  • An etc/ directory of other configurations and resources used by the suite. E.g. fcm make configurations.
  • A meta/ directory containing the suite's configuration metadata.
  • An opt/ directory. For detail, see Optional Configuration.
  • Other items, as long as they do not clash with the scheduler's working directories. E.g. for a cylc suite, log*/, share/, state/ and work/ should be avoided.

Suite Configuration: Suite Installation

The suite install configuration file rose-suite.conf should contain the information on how to install the suite. It may have the following sections and root level options:

[env]
Specify the environment variables to export to the suite daemon. The usual $NAME or ${NAME} syntax can be used in values to reference environment variables that are already defined before the suite runner is invoked. However, it is unsafe to reference other environment variables defined in this section. If the value of an environment variable setting begins with a tilde ~, all of the characters preceding the 1st slash / are considered a tilde-prefix. Where possible, a tilde-prefix is replaced with the home directory associated with the specified login name at run time. There are 2 special environment variables which can be specified in this section:
  • ROSE_VERSION: If specified, the version of Rose that starts the suite run must match the specified version.
  • CYLC_VERSION: If specified for a cylc suite, the Rose suite runner will attempt to use this version of cylc.
[jinja2:suite.rc]
For a cylc suite, if jinja2 assignments are required for suite.rc, they may be defined as key=value pairs in the [jinja2:suite.rc] section. The assignments will be inserted after the #!jinja2 line of the installed suite.rc file.
[file:NAME]
Specify a file/directory to be installed. NAME should be a path relative to the run time $PWD.

It may have the following top level (no section) options:

meta
Specify the configuration metadata for the suite. This setting may be used by various Rose utilities, such as the config editor GUI. It can be used to specify the suite type.
root-dir=LIST
A new line delimited list of PATTERN=DIR pairs. The PATTERN should be a glob-like pattern for matching a host name. The DIR should be the root directory to install a suite run directory. E.g.:
root-dir=hpc*=$WORKDIR
        =*=$DATADIR

In this example, rose suite-run of a suite with name $NAME will create ~/cylc-run/$NAME as a symbolic link to $DATADIR/cylc-run/$NAME/ on any machine, except those with their hostnames matching hpc*. On those machines, it will create ~/cylc-run/$NAME as a symbolic link to $WORKDIR/cylc-run/$NAME/.

root-dir{share}=LIST
A new line delimited list of PATTERN=DIR pairs. The PATTERN should be a glob-like pattern for matching a host name. The DIR should be the root directory where the suite's share/ directory should be created.
root-dir{share/cycle}=LIST
A new line delimited list of PATTERN=DIR pairs. The PATTERN should be a glob-like pattern for matching a host name. The DIR should be the root directory where the suite's share/cycle/ directory should be created.
root-dir{work}=LIST
A new line delimited list of PATTERN=DIR pairs. The PATTERN should be a glob-like pattern for matching a host name. The DIR should be the root directory where the suite's work/ directory for tasks should be created.
root-dir-share=LIST
Deprecated. Same as root-dir{share}=LIST.
root-dir-work=LIST
Deprecated. Same as root-dir{work}=LIST.
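
The PATTERN=DIR lists used by the root-dir settings above can be illustrated with the following Python sketch. It assumes that the first matching pattern wins, which is consistent with the hpc* example, and it returns the DIR text without expanding environment variables. The function name select_root_dir is hypothetical.

```python
from fnmatch import fnmatchcase


def select_root_dir(value, host):
    """Return the DIR of the first PATTERN=DIR line whose pattern matches host.

    value is the setting value after parsing, i.e. one PATTERN=DIR pair per
    line. Returns None if no pattern matches.
    """
    for line in value.splitlines():
        pattern, root_dir = line.split("=", 1)
        # Glob-like match of the host name against the pattern.
        if fnmatchcase(host, pattern.strip()):
            return root_dir.strip()
    return None
```

With the value from the example, a host named hpc1 would select $WORKDIR and any other host would fall through to the * pattern and select $DATADIR.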

Suite Configuration: Suite Information

The suite information file rose-suite.info should contain information on the identity and purpose of the suite. It has no sections, only KEY=VALUE pairs. The owner, project and title settings are compulsory. Otherwise, any KEY=VALUE pairs can appear in this file. If the name of a KEY ends with -list, the value is expected to be a space-delimited list. The following keys are known to have special meanings:

owner
Specify the user ID of the owner of the suite. The owner has full commit access to the suite. Only the owner can delete the suite, pass the suite's ownership to someone else or change the access-list.
project
Specify the name of the project associated with the suite.
title
Specify a short title of the suite.
access-list
Specify a list of users with commit access to trunk of the suite. A * in the list means that anyone can commit to the trunk of the suite. Setting this blank or omitting the setting means that nobody apart from the owner can commit to the trunk. Only the suite owner can change the access list.
description
Specify a long description of the suite.
sub-project
Specify a sub-division of project, if applicable.
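
The rules for rose-suite.info can be summarised in a short Python sketch that checks the compulsory keys and splits any *-list value. The function name check_suite_info is hypothetical, and the sketch treats an empty value for a compulsory key as missing, which is an assumption.

```python
COMPULSORY_KEYS = ("owner", "project", "title")


def check_suite_info(info):
    """Validate a {key: value} mapping read from rose-suite.info.

    Raises ValueError if a compulsory key is missing or empty (the
    "or empty" part is an assumption of this sketch). Returns a copy in
    which the value of every key ending in "-list" is split on white space.
    """
    missing = [key for key in COMPULSORY_KEYS if not info.get(key)]
    if missing:
        raise ValueError("missing compulsory settings: " + ", ".join(missing))
    return {key: value.split() if key.endswith("-list") else value
            for key, value in info.items()}
```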

Application Configuration

The configuration of an application is represented by a directory. It may contain the following:

  • rose-app.conf: a compulsory configuration file in the modified INI format. See below for detail. It contains the following information:
    • the command(s) to run.
    • the metadata type for the application.
    • the list of environment variables.
    • other configurations that can be represented in un-ordered key=value pairs, e.g. Fortran namelists.
  • file/ directory: other input files, e.g.:
    • FCM configuration files (requires ordering of key=value pairs).
    • STASH files.
    • other configuration files that require more than section=key=value.

    Files in this directory are copied to the working directory at run time.

    Note: If there is a clash between a [file:*] section and a file under file/, the setting in the [file:*] section takes precedence. E.g. Suppose we have a file file/hello.txt. In the absence of [file:hello.txt], it will copy file/hello.txt to $PWD/hello.txt at run time. However, if we have a [file:hello.txt] section and a source=SOURCE setting, then it will install the file from SOURCE instead. If we have [!file:hello.txt], then the file will not be installed at all.

  • bin/ directory for e.g. scripts and executables used by the application at run time. If a bin/ exists in the application configuration, it will be prepended to the PATH environment variable at run time.
  • meta/ directory for the metadata of the application.
  • opt/ directory. For detail, see Optional Configuration.

E.g. The application configuration directory may look like:

./bin/
./rose-app.conf
./file/file1
./file/file2
./meta/rose-meta.conf
./opt/rose-app-extra1.conf
./opt/rose-app-extra2.conf
...

Application Configuration File

The rose-app.conf contains a serialised data structure that is an unordered map (sections=) of unordered maps (keys=values). There can also be keys=values without a section, at the top level. The sections and keys can be:

file-install-root
Root level setting. Specify the root directory to install file targets that are specified with a relative path.
meta
Root level setting. Specify the configuration metadata for the application. This is ignored by the application runner, but may be used by other Rose utilities, such as the config editor GUI. It can be used to specify the application type.
mode
Root level setting. Specify the name of a built-in application, instead of running a command specified in the [command] section. See also Running Tasks > rose task-run > Built-in Applications Selection.
[command]
Specify the command(s) to run. The default key can be used to define the default command to run. Other keys can be used to define alternate commands, which can be selected at run time.
[env]
Specify input environment variables to the command, in KEY=VALUE pairs. The usual $NAME or ${NAME} syntax can be used in values to reference environment variables that are already defined when the application runner is invoked. However, it is unsafe to reference other environment variables defined in this section. Note: UNDEF is a special variable that is always undefined at run time. Reference to it will cause a failure at run time. It can be used to indicate that a value must be overridden at run time. If the value of an environment variable setting begins with a tilde ~, all of the characters preceding the 1st slash / are considered a tilde-prefix. Where possible, a tilde-prefix is replaced with the home directory associated with the specified login name at run time.
[etc]
Specify misc. settings. Currently, only UM defs for science sections are thought to require this section.
[file:NAME]
Specify a file/directory to be generated by the application runner at run time. NAME should be a path relative to the run time $PWD, or STDIN.
[namelist:NAME]
Specify a namelist with the group name called NAME, which can be referred to by a source setting of a file. Each setting in a namelist:NAME section is a KEY=VALUE pair exactly like a normal Fortran namelist, but without the trailing comma.
[namelist:NAME(SORT-INDEX)]
Same as [namelist:NAME] but:
  • It allows the source setting of a file to refer to all [namelist:NAME(SORT-INDEX)] as namelist:NAME(:), and the namelist groups will be sorted alphanumerically by the SORT-INDEX.
  • It allows different namelist files to have namelists with the same group name. These will all inherit the same group configuration metadata (from [namelist:NAME]).
[namelist:NAME{CATEGORY}] or [namelist:NAME{CATEGORY}(SORT-INDEX)]
Same as [namelist:NAME(SORT-INDEX)] but:
  • An alternate way for grouping namelists. This allows the same namelist to have different usage and configuration metadata according to its category. Namelists will inherit configuration metadata from their basic group [namelist:NAME] as well as from their specific category [namelist:NAME{CATEGORY}].
[poll]
Specify prerequisites to poll for before running the actual application. 3 types of tests can be performed:

all-files: A space-delimited list of file paths. This test passes only if all file paths in the list exist.

any-files: A space-delimited list of file paths. This test passes if any file path in the list exists.

test: A shell command. This test passes if the command returns a 0 (zero) return code.

Normally, the all-files and any-files tests both test for the existence of file paths. If this is not enough, e.g. you want to test for the existence of a string in each file, you can specify a file-test to do a grep. E.g.:

all-files=file1 file2
file-test=test -e {} && grep -q 'hello' {}

At run time, any {} pattern in the above is replaced with the name of the file. The above makes sure that both file1 and file2 exist and that they both contain the string hello.

The above tests will only be performed once when the application runner starts. If a list of delays is added, the tests will be performed a number of times with delays between them. If the prerequisites are still not met after the number of delays, the application runner will fail with a time out. The list is a comma-separated list. The syntax looks like [n*][DURATION], where DURATION is an ISO 8601 duration such as PT5S (5 seconds) or PT10M (10 minutes), and n is an optional number of times to repeat it. E.g.:

# Default
delays=0

# Poll 1 minute after the runner begins, repeat every minute 10 times
delays=10*PT1M

# Poll when runner begins,
# repeat every 10 seconds 6 times,
# repeat every minute 60 times,
# repeat once after 1 hour
delays=0,6*PT10S,60*PT1M,PT1H
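
To show how a delays value expands, the following Python sketch turns it into a flat list of waits in seconds. It only understands the PTnHnMnS subset of ISO 8601 durations, and it treats a bare number such as 0 as a number of seconds, which is an assumption. The function names are hypothetical.

```python
import re

# Subset of ISO 8601 durations: hours, minutes and seconds only.
DURATION_RE = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def duration_seconds(text):
    """Convert a duration such as PT10S, PT1M or PT1H to seconds."""
    if text.isdigit():
        return int(text)  # assumption: a bare number means seconds
    match = DURATION_RE.match(text)
    if match is None or not any(match.groups()):
        raise ValueError("unsupported duration: %r" % text)
    hours, minutes, seconds = (int(group or 0) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def expand_delays(value):
    """Expand a comma-separated [n*][DURATION] list into a list of seconds."""
    delays = []
    for item in value.split(","):
        item = item.strip()
        repeat, star, duration = item.partition("*")
        if not star:
            repeat, duration = "1", item  # no "n*" prefix: use it once
        delays.extend([duration_seconds(duration)] * int(repeat))
    return delays
```

Under these assumptions, the last example above expands to 68 polls: one immediately, 6 at 10 second intervals, 60 at 1 minute intervals and a final one after a further hour.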

Application Configuration File: Built-in Application: fcm_make

See Running Tasks > rose task-run > Built-in Application: fcm_make.

Application Configuration File: Built-in Application: rose_ana

See Running Tasks > rose task-run > Built-in Application: rose_ana and rose stem > Analysing output with rose_ana.

Application Configuration File: Built-in Application: rose_arch

See Running Tasks > rose task-run > Built-in Application: rose_arch.

Application Configuration File: Built-in Application: rose_prune

See Running Tasks > rose task-run > Built-in Application: rose_prune.

Configuration Metadata

See Configuration Metadata.

Appendix: File Creation Mode

A [file:TARGET] section may have the following settings:

source
A space delimited list of sources for generating this file. A source can be the path to a regular file or directory in the file system (globbing is also supported - e.g. using "*.conf" to mean all .conf files), or it may be a URI to a resource. If a source is a URI, it may point to a section with a supported scheme in the current configuration, e.g. a namelist:NAME section. Otherwise the URI must be in a supported scheme or be given sufficient information for the system to determine its scheme, e.g. via the root level schemes setting described below.

Normally, a source that does not exist would trigger an error at run time. However, it is sometimes useful to have an optional source for a file. In that case, the syntax source=(SOURCE) can be used to specify an optional source. E.g. source=namelist:foo (namelist:bar) would allow namelist:bar to be missing or ignored without an error.

checksum
The expected MD5 checksum of the target. If specified, the file generation will fail if the actual checksum of the target does not match with this setting. This setting is only meaningful if TARGET is a regular file or a symbolic link to a regular file. N.B. An empty value for checksum tells the system to report the target checksum in verbose mode.
mode
auto (default), mkdir, symlink or symlink+. See below.

The following is a list of all the supported usages:

mode=auto,source=
Target is an empty file.
mode=auto,source=SOURCE
Target is a copy of SOURCE.
mode=auto,source=SOURCE-LIST
Target is created by concatenating the contents of SOURCE-LIST. The sources should either be all files or all directories.
mode=mkdir
Target is a directory.
mode=symlink,source=SOURCE
Target is created as a symlink of an FS SOURCE. N.B. In this mode, SOURCE must be a single source. SOURCE does not have to exist when the symbolic link is created.
mode=symlink+,source=SOURCE
Target is created as a symlink of an FS SOURCE. N.B. In this mode, SOURCE must be a single source. SOURCE must exist when the symbolic link is created.
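
The usages listed above can be sketched in Python for sources that are regular files in the file system. This is an illustration under simplifying assumptions, not the Rose implementation: directory sources, URI sources, optional sources and checksums are not handled, and the function name install_target is hypothetical.

```python
import os


def install_target(target, mode="auto", sources=()):
    """Create target according to mode, using file system sources only."""
    sources = list(sources)
    if mode == "mkdir":
        os.makedirs(target, exist_ok=True)
    elif mode in ("symlink", "symlink+"):
        if len(sources) != 1:
            raise ValueError("symlink modes need exactly one source")
        # symlink+ requires the source to exist; symlink does not.
        if mode == "symlink+" and not os.path.exists(sources[0]):
            raise FileNotFoundError(sources[0])
        os.symlink(sources[0], target)
    elif mode == "auto":
        # No source gives an empty file, one source gives a copy,
        # several sources are concatenated in order.
        with open(target, "wb") as out:
            for source in sources:
                with open(source, "rb") as handle:
                    out.write(handle.read())
    else:
        raise ValueError("unknown mode: %r" % mode)
```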

The root level schemes setting: While the system would attempt to automatically detect the scheme of a source, the name of the source can sometimes be ambiguous. E.g. A URL with an http scheme can be a path in a version control system, or a path to a plain file. The root level schemes setting can be used to help the system to do the right thing. The general syntax of the value of the root level schemes setting looks like:

schemes=PATTERN-1=SCHEME-1
       =PATTERN-2=SCHEME-2

E.g.:

schemes=hpc*:*=rsync
       =http://host/svn-repos/*=svn

[file:foo.txt]
source=hpc1:/path/to/foo.txt

[file:bar.txt]
source=http://host/svn-repos/path/to/bar.txt

In this example, a URI matching the pattern hpc*:* would use the rsync scheme to pull the source to the current host, and a URI matching the pattern http://host/svn-repos/* would use the svn scheme. For all other URIs, the system will try to make an intelligent guess.

The system will always match a URI against the patterns in the order specified by the setting, to avoid ambiguity.

The system has built-in support for the following schemes:

fs
The file system scheme. If a URI looks like an existing path in the file system, this scheme will be used.
namelist
The namelist scheme. Refer to namelist:NAME sections in the configuration file.
svn
The Subversion scheme. The location is a Subversion URL or an FCM location keyword. A URI with the svn, svn+ssh or fcm scheme is automatically recognised.
rsync
This scheme is useful for pulling a file or directory from a remote host using rsync via ssh. A URI should have the form HOST:PATH. (Note: If required, you can use the User setting in ~/.ssh/config to specify the user ID for logging into HOST.)
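
The interaction between the schemes setting and the built-in detection can be sketched in Python as follows. The patterns are tried first, in the order given. The order of the fallback checks, and returning None when nothing applies, are assumptions of this sketch, and the function name guess_scheme is hypothetical.

```python
import os
from fnmatch import fnmatchcase


def guess_scheme(uri, schemes_value=""):
    """Return the scheme name for uri, or None if it cannot be determined.

    schemes_value is the parsed value of the root level schemes setting,
    i.e. one PATTERN=SCHEME pair per line.
    """
    for line in schemes_value.splitlines():
        # Split on the last "=" in case the pattern itself contains one.
        pattern, scheme = line.rsplit("=", 1)
        if fnmatchcase(uri, pattern):
            return scheme
    # Fallback checks; their order is an assumption of this sketch.
    if uri.startswith("namelist:"):
        return "namelist"
    if uri.split(":", 1)[0] in ("svn", "svn+ssh", "fcm"):
        return "svn"
    if os.path.exists(uri):
        return "fs"
    return None
```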

The application launcher will use the following logic to determine the root directory to install file targets with a relative path:

  1. If the setting file-install-root=PATH is specified (at the root level) in the application configuration file, its value will be used.
  2. If the environment variable ROSE_FILE_INSTALL_ROOT is specified, its value will be used.
  3. Otherwise, the working directory of the task will be used.
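
The three steps above amount to a simple fallback chain, sketched here in Python. The function name and its parameters are hypothetical. root_settings stands for the root level settings of the application configuration.

```python
def file_install_root(root_settings, environ, work_dir):
    """Pick the root directory for file targets with a relative path."""
    # 1. The root level file-install-root setting, if specified.
    if root_settings.get("file-install-root"):
        return root_settings["file-install-root"]
    # 2. The ROSE_FILE_INSTALL_ROOT environment variable, if specified.
    if environ.get("ROSE_FILE_INSTALL_ROOT"):
        return environ["ROSE_FILE_INSTALL_ROOT"]
    # 3. Otherwise, the working directory of the task.
    return work_dir
```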

Appendix: rose-ana configuration format

The rose-ana builtin application reads information about which analysis steps it should perform from the rose-app.conf file for that task. Each of the section names (in square brackets) which describe an analysis step must follow a particular format:

  • the name must begin with ana:. This is required for rose-ana to recognise it as a valid section.
  • the next part gives the name of the class within one of the analysis modules, including namespace information; for example to use the built-in FilePattern class from the grepper module you would provide the name grepper.FilePattern.
  • finally an expression within parentheses which may contain any string; this should be used to make comparisons using the same class unique, but can otherwise simply act as a description or note.
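
The section name format described above can be illustrated with a Python regular expression. This sketch is not taken from rose-ana itself: the function name and the exact set of characters allowed in the class path are assumptions.

```python
import re

# "ana:", a dotted module and class path, then any text in parentheses.
ANA_SECTION_RE = re.compile(r"^ana:(?P<path>[A-Za-z_][\w.]*)\((?P<note>.*)\)$")


def parse_ana_section(name):
    """Split an analysis section name into (module, class name, note)."""
    match = ANA_SECTION_RE.match(name)
    if match is None:
        raise ValueError("not an analysis section: %r" % name)
    # The last dotted component is the class; the rest is the module.
    module, _, class_name = match.group("path").rpartition(".")
    return module, class_name, match.group("note")
```

For a section named ana:grepper.FilePattern(check the log), the sketch returns the module grepper, the class FilePattern and the note text.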

The content within each of these sections consists of a series of key-value option pairs, just like other standard Rose apps. However, the availability of options for a given section is specified and controlled by the class rather than the metadata. This makes it easy to provide your own analysis modules without requiring changes to Rose itself.

Therefore you should consult the documentation or source code of the analysis module you wish to use for details of which options it supports. Additionally, some special treatment is applied to all options depending on what they contain:

  • Environment vars: If the option contains any words prefixed by $ they will be substituted for the equivalent environment variable, if one is available.
  • Lists: If the option contains newlines it will be returned as a list of strings automatically.
  • Argument substitution: If the option contains one or more pairs of empty curly braces {}, the option will be returned multiple times, with the braces substituted once for each argument passed to rose task-run.

The app may also define a configuration section, [ana:config], whose key-value pairs define app-wide settings that are passed through to the analysis classes. Just as the task options depend on the class definition, the config options are interpreted by the class(es), so their documentation or source code should be consulted for details.