Graphing¶
In this section we will cover writing basic workflows in Cylc.
The suite.rc File Format¶
We refer to a Cylc workflow as a Cylc suite. A Cylc suite is a directory containing a suite.rc file. This configuration file is where we define our workflow. The suite.rc file uses a nested INI-based format:
- Comments start with a # character.
- Settings are written as key = value pairs.
- Settings can be contained within sections.
- Sections are written inside square brackets, i.e. [section-name].
- Sections can be nested by adding an extra square bracket with each level, so a sub-section would be written [[sub-section]], a sub-sub-section [[[sub-sub-section]]], and so on.
# Comment
[section]
    key = value
    [[sub-section]]
        another-key = another-value  # Inline comment
        yet-another-key = """
            A
            Multi-line
            String
        """
Throughout this tutorial we will refer to settings in the following format:
- [section] refers to the entire section.
- [section]key refers to a setting within the section.
- [section]key=value expresses the value of the setting.
- [section][sub-section]another-key refers to a setting within a sub-section. Note we only use one set of square brackets with nested sections.
Tip
It is advisable to indent suite.rc files. This indentation, however, is ignored when the file is parsed, so settings must appear before sub-sections.
[section]
    key = value  # This setting belongs to the section.
    [[sub-section]]
        key = value  # This setting belongs to the sub-section.

    # This setting belongs to the sub-section as indentation is ignored.
    # Always write settings before defining any sub-sections!
    key = value
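The effect of ignoring indentation can be sketched with a toy parser. This is a minimal illustration in Python, not Cylc's actual parser: the function name parse_nested_ini is hypothetical, multi-line strings are not handled, and anything after a # is treated as a comment. Section depth is taken from the number of square brackets alone, so a setting written after a sub-section header lands in that sub-section however it is indented.

```python
import re


def parse_nested_ini(text):
    """Parse a nested INI-style string into nested dicts (illustrative only)."""
    root = {}
    stack = [root]  # stack[i] is the dict for the section open at depth i
    for raw_line in text.splitlines():
        # Drop comments, then strip whitespace: indentation carries no meaning.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        header = re.fullmatch(r"(\[+)([^\[\]]+)(\]+)", line)
        if header:
            depth = len(header.group(1))  # number of opening brackets
            name = header.group(2).strip()
            # Close any sections at this depth or deeper, then open the new one.
            del stack[depth:]
            section = stack[-1].setdefault(name, {})
            stack.append(section)
        else:
            key, _, value = line.partition("=")
            stack[-1][key.strip()] = value.strip()
    return root


text = """
[section]
    key = value
    [[sub-section]]
        key = sub-value
    late-key = late-value  # indented like a [section] setting
"""
config = parse_nested_ini(text)
print(config)
```

Here late-key ends up inside sub-section, not section, because the most recently opened section is still the current one when the setting is read.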
Note
In the suite.rc file format duplicate sections are additive; that is to say, the following two examples are equivalent:
[a]
    c = C
[b]
    d = D
[a]
    e = E

[a]
    c = C
    e = E
[b]
    d = D
Settings, however, are not additive, meaning that a duplicate setting will override an earlier value. The following two examples are also equivalent:
a = foo
a = bar

a = bar
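The two rules above (sections merge, settings override) can be sketched in a few lines of Python. This is only an illustration of the semantics, not Cylc code: the load function and the (section, key, value) tuples are hypothetical stand-ins for a parsed file, and the section name "top" is an invented placeholder for settings that sit outside any section.

```python
def load(entries):
    """Fold (section, key, value) entries, in file order, into a dict of dicts."""
    config = {}
    for section, key, value in entries:
        # setdefault re-opens an existing section instead of replacing it,
        # so duplicate sections are additive; the plain assignment lets a
        # later value for the same key override an earlier one.
        config.setdefault(section, {})[key] = value
    return config


# [a] appears twice, with [b] in between.
split_form = load([("a", "c", "C"), ("b", "d", "D"), ("a", "e", "E")])
# [a] appears once, with both settings.
merged_form = load([("a", "c", "C"), ("a", "e", "E"), ("b", "d", "D")])
# The same setting is given twice; "top" is a hypothetical section name.
overridden = load([("top", "a", "foo"), ("top", "a", "bar")])

print(split_form == merged_form)
print(overridden)
```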
Graph Strings¶
In Cylc we consider workflows in terms of tasks and dependencies.
Tasks are represented as words and dependencies as arrows (=>), so the following text defines two tasks where make_dough is dependent on purchase_ingredients:
purchase_ingredients => make_dough
In a Cylc workflow this would mean that make_dough would only run when purchase_ingredients has succeeded. These dependencies can be chained together:
purchase_ingredients => make_dough => bake_bread => sell_bread
This line of text is referred to as a graph string. These graph strings can be combined to form more complex workflows:
purchase_ingredients => make_dough => bake_bread => sell_bread
pre_heat_oven => bake_bread
bake_bread => clean_oven
Graph strings can also contain “and” (&) and “or” (|) operators; for instance, the following lines are equivalent to the ones just above:
purchase_ingredients => make_dough
pre_heat_oven & make_dough => bake_bread => sell_bread & clean_oven
Collectively these graph strings are referred to as a graph.
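One way to check that two sets of graph strings describe the same graph is to expand each into its individual dependencies and compare. The sketch below is a simplified illustration in Python, not Cylc's graph parser: the edges function is hypothetical and it understands only the => and & operators, not |.

```python
def edges(graph_text):
    """Return the set of (upstream, downstream) pairs in a block of graph strings."""
    result = set()
    for line in graph_text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        # Each "=>"-separated step is a group of "&"-joined task names.
        steps = [[task.strip() for task in step.split("&")]
                 for step in line.split("=>")]
        # Every task in one step is upstream of every task in the next step.
        for left, right in zip(steps, steps[1:]):
            result.update((up, down) for up in left for down in right)
    return result


long_form = """
purchase_ingredients => make_dough => bake_bread => sell_bread
pre_heat_oven => bake_bread
bake_bread => clean_oven
"""
short_form = """
purchase_ingredients => make_dough
pre_heat_oven & make_dough => bake_bread => sell_bread & clean_oven
"""
print(edges(long_form) == edges(short_form))
```

Because the result is a set, the order of the lines in the input has no effect on it.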
Note
The order in which lines appear in the graph section doesn’t matter; for instance, the following examples are the same as each other:
foo => bar
bar => baz

bar => baz
foo => bar
Cylc Graphs¶
In a Cylc suite the graph is stored under the [scheduling][dependencies]graph setting, i.e.:
[scheduling]
    [[dependencies]]
        graph = """
            purchase_ingredients => make_dough
            pre_heat_oven & make_dough => bake_bread => sell_bread & clean_oven
        """
This is a minimal Cylc suite, in which we have defined a graph representing a workflow for Cylc to run. We have not yet provided Cylc with the scripts or binaries to run for each task. This will be covered later in the runtime tutorial.
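To get a feel for what the graph implies about run order, the dependencies can be written out by hand and sorted so that every task comes after the tasks it depends on. The sketch below uses Python's standard graphlib module (Python 3.9 or later) and is only an illustration: the dependencies dictionary is a manual transcription of the graph above, and the single ordering it produces is just one valid order, not a description of how Cylc schedules tasks.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on (transcribed by hand).
dependencies = {
    "make_dough": {"purchase_ingredients"},
    "bake_bread": {"pre_heat_oven", "make_dough"},
    "sell_bread": {"bake_bread"},
    "clean_oven": {"bake_bread"},
}

# static_order() yields tasks so that dependencies always come first.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Tasks with no dependency between them, such as purchase_ingredients and pre_heat_oven, may appear in either order.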
Cylc provides a GUI for visualising graphs. It is run on the command line using the cylc graph <path> command, where <path> is the path to the suite.rc file you wish to visualise.
When run, cylc graph will display a diagram similar to the ones you have seen so far. The number 1 which appears below each task is the cycle point. We will explain what this means in the next section.
Hint
A graph can be drawn in multiple ways, for instance the following two examples are equivalent:
The graph drawn by cylc graph may vary slightly from one run to another but the tasks and dependencies will always be the same.
Practical
In this practical we will create a new Cylc suite and write a graph for it to use.
Create a Cylc suite.
A Cylc suite is just a directory containing a suite.rc file.
If you don’t have one already, create a cylc-run directory in your user space, i.e.: ~/cylc-run
Within this directory create a new folder called graph-introduction, which is to be our suite directory. Move into it:
mkdir ~/cylc-run/graph-introduction
cd ~/cylc-run/graph-introduction
Inside this directory create a suite.rc file and paste in the following text:
[scheduling]
    [[dependencies]]
        graph = """
            # Write graph strings here!
        """
Write a graph.
We now have a blank Cylc suite; next we need to define a workflow.
Edit your suite.rc file to add graph strings representing the following graph:
Use cylc graph to visualise the workflow.
Once you have written some graph strings, try using cylc graph to display the workflow. Run the following command:
cylc graph .
Note
cylc graph takes the path to the suite as an argument. As we are inside the suite directory we can run cylc graph .
If the results don’t match the diagram above, try going back to the suite.rc file and making changes.
Tip
In the top right-hand corner of the cylc graph window there is a refresh button which will reload the GUI with any changes you have made.
Solution
There are multiple correct ways to write this graph. So long as what you see in cylc graph matches the above diagram then you have a correct solution.
Two valid examples:
foo & pub => bar => baz & wop
baz => qux

foo => bar => baz => qux
pub => bar => wop

The whole suite should look something like this:
[scheduling]
    [[dependencies]]
        graph = """
            foo & pub => bar => baz & wop
            baz => qux
        """