Consolidating Configuration

In the last section we wrote out the following code in the suite.rc file:

[runtime]
    [[get_observations_heathrow]]
        script = get-observations
        [[[environment]]]
            SITE_ID = 3772
            API_KEY = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    [[get_observations_camborne]]
        script = get-observations
        [[[environment]]]
            SITE_ID = 3808
            API_KEY = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    [[get_observations_shetland]]
        script = get-observations
        [[[environment]]]
            SITE_ID = 3005
            API_KEY = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    [[get_observations_aldergrove]]
        script = get-observations
        [[[environment]]]
            SITE_ID = 3917
            API_KEY = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

In this code the script item and the API_KEY environment variable are repeated for each task. This is bad practice: it makes the configuration lengthy, and any change (a new API key, for example) has to be made in every task.

Likewise, the graphing relating to the get_observations tasks is highly repetitive:

[scheduling]
    [[dependencies]]
        [[[T00/PT3H]]]
            graph = """
                get_observations_aldergrove => consolidate_observations
                get_observations_camborne => consolidate_observations
                get_observations_heathrow => consolidate_observations
                get_observations_shetland => consolidate_observations
            """

Cylc offers three ways of consolidating configuration (families, Jinja2 and parameterisation) to help improve the structure of a suite and avoid duplication.

The cylc get-config Command

The cylc get-config command reads in the suite.rc file, then prints it out to the terminal.

Throughout this section we will be introducing methods for consolidating the suite.rc file. The cylc get-config command can be used to “expand” the suite.rc file back to its full form.

Note

The main use of cylc get-config is inspecting the [runtime] section of a suite. The cylc get-config command does not expand parameterisations and families in the suite’s graph. To inspect the graphing use the cylc graph command.

Call cylc get-config with the path of the suite (use . if you are already in the suite directory) and the --sparse option, which hides default values.

cylc get-config <path> --sparse

To view the configuration of a particular section or setting, refer to it by name using the -i option (see The suite.rc File Format for details), e.g.:

# Print the contents of the [scheduling] section.
cylc get-config <path> --sparse -i '[scheduling]'
# Print the contents of the get_observations_heathrow task.
cylc get-config <path> --sparse -i '[runtime][get_observations_heathrow]'
# Print the value of the script setting in the get_observations_heathrow task.
cylc get-config <path> --sparse -i '[runtime][get_observations_heathrow]script'

The Three Approaches

The next three sections cover the three consolidation approaches and how we could use them to simplify the suite from the previous tutorial. Work through them in order!

Which Approach To Use

Each approach has its uses. Cylc permits mixing approaches, allowing us to use what works best for us. As a rule of thumb:

  • Families work best for consolidating runtime configuration by collecting tasks into broad groups, e.g. groups of tasks which run on a particular machine or groups of tasks belonging to a particular system.
  • Jinja2 is good at configuring settings which apply to the entire suite rather than just a single task, as we can define variables then use them throughout the suite.
  • Parameterisation works best for describing tasks which are very similar but which have subtly different configurations (e.g. different arguments or environment variables).
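
As a rough preview of the first rule of thumb, here is a minimal sketch of how a family might remove the repeated script and API_KEY settings from the [runtime] section shown at the start of this page. The family name GET_OBSERVATIONS is an illustrative choice, not something defined earlier, and only two of the four tasks are shown; the following sections cover each approach properly.

```
[runtime]
    # Hypothetical family holding the settings shared by every task.
    [[GET_OBSERVATIONS]]
        script = get-observations
        [[[environment]]]
            API_KEY = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    # Each task inherits the shared settings and adds only what differs.
    [[get_observations_heathrow]]
        inherit = GET_OBSERVATIONS
        [[[environment]]]
            SITE_ID = 3772
    [[get_observations_camborne]]
        inherit = GET_OBSERVATIONS
        [[[environment]]]
            SITE_ID = 3808
```

With a layout like this, running cylc get-config with -i '[runtime][get_observations_heathrow]' should print the task with its inherited script and API_KEY filled in, which is a convenient way to check that the consolidated form still expands to the original configuration.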