This chapter of the user guide discusses the Rose suite tools and utilities that can be used to simplify the logic of a suite and/or provide a common approach to doing the same things.
The rose task-run command selects and launches an application (according to an application configuration) under the environment of a job (of a task) in a suite.
See Rose Reference Guide: CLI > rose task-run for a full command reference.
See Rose Reference Guide: Configuration > Configuration Format for more information on the Rose configuration format.
See Rose Reference Guide: Configuration > Application Configuration for more information on how to set up a Rose application configuration.
The working directory of a task depends on the suite engine. For a task running under cylc, the working directory is normally $ROSE_SUITE_DIR/work/$ROSE_TASK_CYCLE_TIME/$ROSE_TASK_NAME/ (or $ROSE_SUITE_DIR/work/1/$ROSE_TASK_NAME/ for a non-cycling task).
The rose task-run (or the rose app-run) command also dumps out a rose-app-run.conf file in the working directory. The file contains the original rose-app.conf in the application configuration with any added optional configurations and command line settings set via --define=[SECTION]NAME=VALUE.
The rose task-run command selects its application configuration in the following order:
1. If the --config=DIR option is specified, it uses the value of DIR as the path to the application configuration directory.
2. If the --app-key=KEY option is specified, it uses the KEY sub-directory under the suite's app/ directory as the application configuration directory.
The rose task-run command exports a set of ROSE_* environment variables to its environment before doing anything else. The list is documented in Rose Reference Guide: CLI > rose task-env.
The rose task-run command also sets up the PATH environment variable and, if relevant, other PATH-like environment variables (e.g. PYTHONPATH). The following logic is applied in order:
1. ...
    [rose-task-run]
    # Prepend /opt/hello/bin /opt/greeting/bin to $PATH
    path-prepend=/opt/hello/bin /opt/greeting/bin
    # Prepend /opt/stuff/lib/perl to $PERL5LIB
    path-prepend.PERL5LIB=/opt/stuff/lib/perl
2. ... share/fcm[_-]make*/*/bin, work/fcm[_-]make*/*/bin. Matched directories are added to the PATH environment variable.
3. ... --path=[NAME=]PATTERN command line option. Each of these specifies a glob pattern for paths to prepend to an environment variable called NAME (or PATH if NAME is not specified). If a relative path is given, it is relative to $ROSE_SUITE_DIR. An empty value resets anything prepended in (1.) and (2.) above and any previous --path=[NAME=]PATTERN settings for the relevant environment variable.
The rose task-run
command (or the rose app-run command) also exports each environment variable in the [env] section in the application configuration file. N.B.: The $NAME and ${NAME} syntax in values of the settings is substituted by the value of the environment variable NAME. The command does not support any other Unix shell variable substitution/manipulation syntax, nor does it support any Unix shell syntax for sub-shell command substitution. The command fails if NAME is not an environment variable. You can escape a substitution by adding a backslash in front of the syntax, e.g. \$NAME or \${NAME}.
The following is an example of using environment variables:
    [command]
    default=echo "$HELLO $WORLDS"
    [env]
    HELLO=Greeting
    WORLDS=Mars Jupiter Saturn
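The substitution rules described above can be illustrated with a short Python sketch. This is not the Rose implementation: the function name substitute_env is hypothetical, and dropping the backslash from an escaped reference is an assumption made for the illustration.

```python
import re

# An optional backslash, followed by $NAME or ${NAME}.
_VAR_RE = re.compile(r"(\\?)\$(?:\{(\w+)\}|(\w+))")


def substitute_env(value, env):
    """Replace $NAME and ${NAME} in value with env[NAME].

    Raises KeyError if NAME is not defined in env.
    No other shell syntax (defaults, command substitution) is handled.
    """
    def _replace(match):
        escaped, braced, bare = match.groups()
        if escaped:
            # Escaped reference: keep the text literally, minus the
            # backslash (assumption, see the lead-in).
            return match.group(0)[1:]
        return env[braced or bare]

    return _VAR_RE.sub(_replace, value)


# Values taken from the [env] example above.
env = {"HELLO": "Greeting", "WORLDS": "Mars Jupiter Saturn"}
command = substitute_env('echo "$HELLO $WORLDS"', env)
```

Raising an error for an undefined NAME mirrors the documented behaviour that the command fails if NAME is not an environment variable.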
Most other sections in the application configuration file, (e.g. the file installation sections and the application command) can reference the exported environment variables described above.
Finally, if the application configuration has a bin/ sub-directory, the rose task-run command (or the rose app-run command) will prepend its location to the PATH environment variable.
The rose task-run command (or the rose app-run command) can be configured to install files to the working directory or other locations in the suite. It does the following:
Note: If there is a clash between a [file:*] section and a file under file/, the setting in the [file:*] section takes precedence. E.g. suppose we have a file file/hello.txt. In the absence of [file:hello.txt], it will be copied to $PWD/hello.txt at run time. However, if we have a [file:hello.txt] section with a source=SOURCE setting, then the file will be installed from SOURCE instead. If we have [!file:hello.txt], then the file will not be installed at all.
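The precedence described in the note can be sketched in Python as below. This is only an illustration of the three cases in the note, not how Rose is implemented; the function resolve_install, its arguments and its return values are all hypothetical.

```python
def resolve_install(name, sections, files_in_file_dir):
    """Decide how the file `name` would be installed.

    sections: mapping of section name (e.g. "file:hello.txt") to a dict
        of its settings.
    files_in_file_dir: names of the files found under file/.
    Returns a (kind, source) tuple.
    """
    if "!file:" + name in sections:
        # [!file:NAME]: the file is not installed at all.
        return ("skip", None)
    if "file:" + name in sections:
        # A [file:NAME] section takes precedence over file/NAME.
        return ("section", sections["file:" + name].get("source"))
    if name in files_in_file_dir:
        # No section: copy file/NAME to the working directory.
        return ("file-dir", "file/" + name)
    return ("none", None)
```

For example, resolve_install("hello.txt", {}, ["hello.txt"]) selects the copy under file/, whereas adding a "file:hello.txt" section with a source setting selects that source instead.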
If the command for the application requires some pre-defined standard input, a file called STDIN can be installed in the working directory. When the command is invoked, the contents of the file will be piped to the standard input of the command.
The rose task-run command (or the rose app-run command) caches file installation settings in the .rose-config_processors-file.db file in the working directory. This allows file installation to be incremental. If the task is run again, identical files will not have to be reinstalled.
By default, the rose task-run command (or the rose app-run command) invokes a shell command. (Alternatively, the rose task-run command invokes a built-in application. See Built-in Applications Selection for detail.) It uses the following logic to select a shell command to run:
1. If arguments are given after rose task-run, the arguments are run as a shell command.
2. If the --command-key=KEY option is specified, the command specified in the [command]KEY setting in the application configuration file is used.
3. If the ROSE_APP_COMMAND_KEY environment variable is set, the command specified in the [command]KEY setting in the application configuration whose KEY matches it is used.
This mechanism allows, for example, similar tasks to share the same application configuration, if most of their differences can be defined at the command line.
E.g.
    [command]
    default=echo Hello World
    hello_earth=echo Hello Earth
    greet_martians=echo Greeting Martians
In the above example, if the command key is hello_earth, the application will echo Hello Earth. If the command key is not defined, but the task name is greet_martians, the application will echo Greeting Martians. For any other tasks using this application configuration and if the command key is not defined, it will echo Hello World.
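The selection order described in this section can be summarised with the following Python sketch. It is an illustration only, not the Rose source code: the function select_command and its parameters are hypothetical, and joining the arguments into a single string is a simplification.

```python
import shlex


def select_command(commands, args=(), command_key=None, env=None,
                   task_name=None):
    """Pick the shell command to run from the [command] settings.

    commands: mapping of KEY to command, as in the [command] section.
    """
    env = env or {}
    if args:
        # Arguments given on the command line are run as the command.
        return shlex.join(args)
    if command_key is not None:
        # --command-key=KEY
        return commands[command_key]
    if env.get("ROSE_APP_COMMAND_KEY"):
        return commands[env["ROSE_APP_COMMAND_KEY"]]
    if task_name in commands:
        # A setting named after the task, e.g. greet_martians.
        return commands[task_name]
    return commands["default"]


# The [command] section from the example above.
commands = {
    "default": "echo Hello World",
    "hello_earth": "echo Hello Earth",
    "greet_martians": "echo Greeting Martians",
}
```

With these settings, select_command(commands, command_key="hello_earth") picks echo Hello Earth, and a task called greet_martians with no command key picks echo Greeting Martians.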
Apart from running a shell command, the rose task-run command (or the rose app-run command) may be configured to call a built-in application. To use a built-in application, add the mode=KEY top level setting in the application configuration, where KEY is the name of a built-in application. (Each built-in application is discussed individually later in this chapter.)
A built-in application would normally behave very much like running an external command. The key differences are normally that:
Prerequisites of a task should normally be defined at the suite level. However, there are times when it is more efficient to poll for a simple prerequisite before running a command. The rose task-run command (or the rose app-run command) provides a facility for tasks to poll for some prerequisites before running the application command. The facility supports 3 types of tests:
Normally, both all-files and any-files test for the existence of file paths. If this is not enough, you can specify a [poll]file-test setting to run a command on each file. E.g. if you want to test for the existence of a string in each file, you can do:
    all-files=file1 file2
    file-test=test -e {} && grep -q 'hello' {}
At runtime, any {} pattern in the above is replaced with the name of the file. The above example checks that both file1 and file2 exist and that they both contain the string hello.
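As a rough illustration of the {} replacement, the Python sketch below builds the command that would be run for each file. The function names are hypothetical, and quoting the file name with shlex.quote is an assumption; the text above only states that {} is replaced with the name of the file.

```python
import shlex
import subprocess


def file_test_command(file_test, path):
    """Return file_test with every {} replaced by the (quoted) path."""
    return file_test.replace("{}", shlex.quote(path))


def all_files_ok(paths, file_test="test -e {}"):
    """Return True if the test command succeeds for every path."""
    return all(
        subprocess.call(file_test_command(file_test, path), shell=True) == 0
        for path in paths
    )


# The command built for file1 in the example above.
command = file_test_command("test -e {} && grep -q 'hello' {}", "file1")
```

An any-files test would follow the same pattern with any() in place of all().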
By default, tests will only be performed once. If a list of delays is added, the tests will be performed a number of times with delays between them. If the prerequisites are still not met after the number of delays, rose task-run will fail with a time out. The delays list is a comma-separated list. The syntax of each item looks like [R*][P], where R is the number of repeats and P is a duration in the ISO8601 format; see the Wikipedia entry on ISO 8601 (last checked on 2015-05-29). E.g.:
    # Default
    delays=0
    # Poll 1 minute after the runner begins, repeat every minute 10 times
    delays=10*PT1M
    # Poll when runner begins,
    # repeat every 10 seconds 6 times,
    # repeat every minute 60 times,
    # repeat once after 1 hour,
    # repeat once after 1 week, 2 days, 6 hours and 30 seconds
    delays=0,6*PT10S,60*PT1M,PT1H,P1W2DT6H30S
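To make the [R*][P] syntax concrete, the following Python sketch expands a delays value into a flat list of delays in seconds. It is not the parser used by Rose: the function names are hypothetical, only weeks, days, hours, minutes and seconds are supported, and a bare number such as 0 is assumed to mean seconds.

```python
import re

# A small subset of ISO8601 durations: weeks, days, hours, minutes, seconds.
_DURATION_RE = re.compile(
    r"^P(?:(?P<weeks>\d+)W)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)
_UNIT_SECONDS = {
    "weeks": 604800, "days": 86400, "hours": 3600, "minutes": 60, "seconds": 1,
}


def duration_to_seconds(text):
    """Convert a duration such as PT1M to a number of seconds."""
    if text.isdigit():
        return int(text)  # assumption: a bare number is in seconds
    match = _DURATION_RE.match(text)
    if match is None or not any(match.groupdict().values()):
        raise ValueError("unsupported duration: %r" % text)
    return sum(
        int(value) * _UNIT_SECONDS[unit]
        for unit, value in match.groupdict().items() if value
    )


def parse_delays(setting):
    """Expand a delays setting such as 0,6*PT10S into a list of seconds."""
    delays = []
    for item in setting.split(","):
        repeats, sep, duration = item.strip().partition("*")
        if not sep:
            # No "R*" prefix: the item is a single delay.
            repeats, duration = "1", repeats
        delays.extend([duration_to_seconds(duration)] * int(repeats))
    return delays
```

For instance, parse_delays("10*PT1M") yields ten delays of 60 seconds each, matching the second example above.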
The fcm_make built-in application is provided for running fcm make.
N.B.:
- ... rose task-run will run this built-in application automatically.
- ... rose task-run will attempt to associate it with the corresponding fcm_make* application configuration.
- ... rose task-env and/or rose task-run commands run by subsequent tasks in the suite.
The fcm_make application expects a file file/fcm-make.cfg in its application configuration. It runs fcm make using this configuration file.
You can configure these applications with environment variables or settings in rose-app.conf. (Settings in rose-app.conf override their equivalent environment variables.)
- ... fcm make command.
- ... rose task-run is invoked in --new mode, the application will remove this directory before running fcm make --new.
- ... rose task-run is invoked in --new mode, the application will remove this directory before running fcm make --new. (If this location is in the same physical location as the destination of the original make, you should only invoke rose task-run --new on the original make. Otherwise, contents generated by the original make will be wiped clean before the continuation make begins.)
- ... fcm make command will be invoked in a temporary directory under this location before being copied back to the actual destination.
- ... fcm make command will be invoked in a temporary directory under this location before being copied back to the actual destination.
- ... fcm make command will be invoked with the --name=NAME option of fcm make.
- ... fcm_make → fcm_make2 mapping is used, the context name of the continuation make will be set to 2. You can specify an alternate context name if this is undesirable. The continuation command will be invoked with the --name=NAME option of fcm make.
- ... fcm_make → fcm_make2) which will continue the fcm make command at a remote HOST. If such a task is found, it will add the configuration mirror.target=HOST:cylc-run/$ROSE_SUITE_NAME/share/$ROSE_TASK_NAME as an argument to the fcm make command to substitute the mirror target. To switch off this feature, set STEP-NAME to a null string, i.e. mirror-step=.
- ... fcm make would use in parallel. (default=4)
- ... fcm_make → fcm_make2 mapping between the names of the original and the continuation tasks in the suite.
E.g.:
    meta=fcm_make
    mode=fcm_make
    opt.jobs=8
This built-in application runs rose-ana, the Rose analysis engine. This performs various configurable analysis steps; for example, a common usage is to compare 2 files and report whether they differ or not. It can write the details of any comparisons it runs to a database in the suite's log directory to assist with any automated updating of control data (see the guide linked below for more details).
In automatic selection mode, this built-in application will be invoked automatically if a task has a name that starts with rose_ana*.
The built-in application will search for suitable analysis modules to load firstly in the ana subdirectory of the rose-ana app, then in the ana subdirectory of the top-most suite directory. Any additional directories to search (for example a site-wide central directory) may be specified in the rose.conf file using the method-path variable in the [rose-ana] section. Finally the ana_builtins subdirectory of the Rose installation itself contains any built-in comparisons.
See also rose stem > Comparing output with rose_ana.
This built-in application provides a generic solution to configure site specific archiving of suite files. It is designed to work under rose task-run.
In automatic selection mode, this built-in application will be invoked automatically if a task has a name that starts with rose_arch*.
The application is normally configured in a rose-app.conf. Global settings may be specified in an [arch] section. Each archiving target will have its own [arch:TARGET] section for specific settings, where TARGET would be a URI to the archiving location on your site specific archiving system. Settings in a [arch:TARGET] section would override those in the global [arch] section for the given TARGET.
A target is considered compulsory, i.e. it must have at least one source, unless it is specified with the syntax [arch:(TARGET)], in which case TARGET is considered optional. The application will skip an optional target that has no actual source.
The application provides some useful functionalities:
The following settings are accepted in [arch] and [arch:TARGET] sections:
- ... printf style format string to construct the archive command. It must contain the placeholders %(sources)s and %(target)s for substitution of the sources and the target respectively.
- ... printf style format string. It may contain the placeholder %(cycle)s (for the current $ROSE_TASK_CYCLE_TIME), the placeholder %(name)s for the name of the file, and/or named placeholders that are generated by rename-parser.
- ... (hello-world.*), the source is considered optional and will not cause a failure if it does not match any source file names. However, a compulsory target that ends up with no matching source file will be considered a failure.
- ... printf style format string to construct a command to edit or modify the content of source files before archiving them. It must contain the placeholders %(in)s and %(out)s for substitution of the path to the source file and the path to the modified source file (which will be created in a temporary working directory).
E.g.:
    # General settings
    [arch]
    command-format=foo put %(target)s %(sources)s
    source-prefix=$ROSE_DATAC/
    target-prefix=foo://hello/
    # Archive a file to a file
    [arch:world.out]
    source=hello/world.out
    # Auto gzip
    [arch:planet.out.gz]
    source=hello/planet.out
    # Archive files matched by a glob to a directory
    [arch:worlds/]
    source=hello/worlds/*
    # Archive multiple files matched by globs or names to a directory
    [arch:worlds/]
    source=hello/worlds/* greeting/worlds/* hi/worlds/*
    # As above, but "greeting/worlds/*" may return an empty list
    [arch:worlds/]
    source=hello/worlds/* (greeting/worlds/*) hi/worlds/*
    # Target is optional, implied that sources may all be missing
    [arch:(black-box/)]
    source=cats.txt dogs.txt
    # Auto tar-gzip
    [arch:galaxies.tar.gz]
    source-prefix=hello/
    source=galaxies/*
    # File with multiple galaxies may be large, don't do its checksum
    update-check=mtime+size
    # Force gzip each source file
    [arch:stars/]
    source=stars/*
    compress=gzip
    # Source name transformation
    [arch:moons.tar.gz]
    source=moons/*
    rename-format=%(cycle)s-%(name)s
    source-edit-format=sed 's/Hello/Greet/g' %(in)s >%(out)s
    # Source name transformation with a rename-parser
    [arch:unknown/stuff.pax]
    rename-format=hello/%(cycle)s-%(name_head)s%(name_tail)s
    rename-parser=^(?P<name_head>stuff)ing(?P<name_tail>-.*)$
    source=stuffing-*.txt
    # ...
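The interplay between rename-format and rename-parser in the last two sections of the example can be sketched in Python, whose % operator and named regular expression groups use the same notation. This is an illustration under assumptions, not the rose_arch code: the function rename_source is hypothetical, and the file names and cycle used with it are made up.

```python
import re


def rename_source(name, cycle, rename_format, rename_parser=None):
    """Apply a rename-format string to a source name.

    Named groups matched by rename_parser become extra placeholders.
    """
    values = {"cycle": cycle, "name": name}
    if rename_parser:
        match = re.search(rename_parser, name)
        if match:
            values.update(match.groupdict())
    # Raises KeyError if the format uses a placeholder with no value.
    return rename_format % values


# Hypothetical source name and cycle, with the settings of the
# [arch:unknown/stuff.pax] section above.
new_name = rename_source(
    "stuffing-1.txt",
    "20141225T1200Z",
    "hello/%(cycle)s-%(name_head)s%(name_tail)s",
    r"^(?P<name_head>stuff)ing(?P<name_tail>-.*)$",
)
```

Here the parser splits stuffing-1.txt into name_head=stuff and name_tail=-1.txt, so the new name is hello/20141225T1200Z-stuff-1.txt.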
On completion, rose_arch writes a status summary for each target to the standard output, which looks like this:
    0 foo:///fred/my-su173/output0.tar.gz [compress=tar.gz]
    + foo:///fred/my-su173/output1.tar.gz [compress=tar.gz, t(init)=2012-12-02T20:02:20Z, dt(tran)=5s, dt(arch)=10s, ret-code=0]
    +	output1/earth.txt (output1/human.txt)
    +	output1/venus.txt (output1/woman.txt)
    +	output1/mars.txt (output1/man.txt)
    = foo:///fred/my-su173/output2.tar.gz [compress=tar.gz]
    ! foo:///fred/my-su173/output3.tar.gz [compress=tar.gz]
The 1st column is a status symbol, where:
If the 1st column and the 2nd column are separated by a space character, the 2nd column is a target. If the 1st column and the 2nd column are separated by a tab character, the 2nd column is a source in the target above.
For a target line, the 3rd column contains the compress scheme, the initial time, the duration taken to transform the sources, the duration taken to run the archive command and the return code of the archive command. For a source line, the 3rd column contains the original name of the source.
This built-in application allows running of multiple command variants in parallel under a single job, as defined by the application configuration.
The application is normally configured in the [bunch] and [bunch-args] sections in rose-app.conf.
Each variant of the command is run in the same working directory with its output directed to separate .out and .err files of the form bunch.<name>.out. Should you need separate working directories you should configure your command to create the appropriate subdirectory for working in.
Note that, under load balancing systems such as PBS or Slurm, you will need to set resource requests to reflect the resources required by running multiple commands at once. E.g. if one command requires 1GB of memory and you have configured your app to run up to 4 commands at once, then you will need to request 4GB of memory.
The following [bunch] settings are accepted:
- ... printf style format string to construct the commands to run. Insert placeholders %(argname)s for substitution of the arguments specified under [bunch-args] to the invoked command. The placeholder %(command-instances)s is reserved for inserting an automatically generated index for the command invocation when using the command-instances setting.
The [bunch-args] section is used to specify the various combinations of args to be passed to the command specified under [bunch]command-format=:
E.g.:
    meta=rose_bunch
    mode=rose_bunch
    [bunch]
    command-format=echo arg1: %(arg1)s, arg2: %(arg2)s, command-instance: %(command-instances)s
    command-instances = 4
    fail-handle = abort
    incremental = True
    names = foo1 bar2 baz3 qux4
    pool-size=2
    [bunch-args]
    arg1=1 2 3 4
    arg2=foo bar baz qux
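The Python sketch below shows one way the [bunch-args] values could be combined with command-format to build the individual commands. It assumes that the values are paired by position (the first value of each argument goes to the first command, and so on), which the text above does not state explicitly. The function build_bunch_commands is hypothetical, and the %(command-instances)s placeholder is left out.

```python
def build_bunch_commands(command_format, bunch_args, names):
    """Build one command per name from space-separated argument lists.

    Assumption: values are paired positionally across arguments.
    """
    columns = {key: value.split() for key, value in bunch_args.items()}
    for key, values in columns.items():
        if len(values) != len(names):
            raise ValueError("%s: expected %d values" % (key, len(names)))
    commands = {}
    for index, name in enumerate(names):
        values = {key: column[index] for key, column in columns.items()}
        commands[name] = command_format % values
    return commands


# Settings taken from the example above.
commands = build_bunch_commands(
    "echo arg1: %(arg1)s, arg2: %(arg2)s",
    {"arg1": "1 2 3 4", "arg2": "foo bar baz qux"},
    "foo1 bar2 baz3 qux4".split(),
)
```

Under this assumption, the command named foo1 would be echo arg1: 1, arg2: foo, with its output going to bunch.foo1.out as described earlier.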
This built-in application offers a way to housekeep a cycling suite. It prunes files and directories generated by suite tasks. It is designed to work under rose task-run on the host that runs the suite daemon.
In automatic selection mode, this built-in application will be invoked automatically if a task has a name that starts with rose_prune*.
The application is normally configured in the [prune] section in a rose-app.conf.
All settings are expressed as a space delimited list of cycles, normally as cycle points or offsets relative to the current cycle. For date-time cycles, the format of a cycle point should be an ISO8601 date-time, and an offset should be an ISO8601 duration. E.g. -P1DT6H is 1 day and 6 hours before the current cycle point.
The cycles of some settings may also be followed by a colon and an optional argument. In these, the argument should be globs for matching items in the directory. If two or more globs are required, they should be separated by a space, in which case either the argument should be quoted or the space should be escaped by a backslash.
The following settings are accepted:
The key can be any string that can be used in a %(key)s substitution, and format should be a valid rose date print format.
If globs are specified for a cycle, it will attempt to prune only items matching CYCLE/GLOBS under item-root. E.g. In cylc, if current cycle is 20141225T1200Z, then prune{share/cycle}=-PT12H:wild* will clear out all items matching share/cycle/20141225T0000Z/wild*.
A glob can also be specified as a formatting string containing a single substitution %(cycle)s. In this mode, the cycle string will not be added as a sub-directory of the item-root. E.g. In cylc, if current cycle is 20141225T1200Z, then prune{share}=-PT12H:hello-*-at-%(cycle)s.txt will clear out all items matching share/hello-*-at-20141225T0000Z.txt.
A glob can also be specified as a formatting string containing a substitution %(key)s, if a cycle-format{key}=format setting is specified. See above for detail.
E.g.:
    meta=rose_prune
    mode=rose_prune
    [prune]
    cycle-format{cycle_year_month}=CCYYMM
    prune-remote-logs-at=-PT6H
    archive-logs-at=-P1D
    prune-server-logs-at=-P7D
    prune{work}=-PT6H:task_x* -PT12H:*/other*.dat -PT18H:task_y* -PT24H
    prune{share}=-P1D:hello-*-at-%(cycle)s.txt -P3M:monthly/%(cycle_year_month)s/
    prune{share/cycle}=-PT6H:foo* -PT12H:'bar* *.baz*' -P1D
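The way a prune{item-root} value maps to paths can be sketched in Python as follows. This is a simplified illustration, not the rose_prune implementation: the function prune_globs is hypothetical, only negative offsets in days, hours and minutes are handled, %(key)s substitutions from cycle-format are not covered, and pruning the whole cycle directory when no glob is given is an assumption.

```python
import re
from datetime import datetime, timedelta

# Negative ISO8601 offsets in days, hours and minutes only.
_OFFSET_RE = re.compile(
    r"^-P(?:(?P<days>\d+)D)?(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?)?$"
)
_CYCLE_FORMAT = "%Y%m%dT%H%MZ"


def prune_globs(item_root, item, current_cycle):
    """Return the glob patterns selected by one OFFSET[:GLOBS] item."""
    offset, _, globs = item.partition(":")
    match = _OFFSET_RE.match(offset)
    if match is None:
        raise ValueError("unsupported offset: %r" % offset)
    delta = timedelta(
        **{unit: int(value) for unit, value in match.groupdict().items() if value}
    )
    point = datetime.strptime(current_cycle, _CYCLE_FORMAT) - delta
    cycle = point.strftime(_CYCLE_FORMAT)
    if not globs:
        # Assumption: with no glob, the whole cycle directory is selected.
        return [item_root + "/" + cycle]
    paths = []
    for glob in globs.split():
        if "%(cycle)s" in glob:
            # Formatting string: the cycle is not added as a sub-directory.
            paths.append(item_root + "/" + glob % {"cycle": cycle})
        else:
            paths.append(item_root + "/" + cycle + "/" + glob)
    return paths
```

With the current cycle 20141225T1200Z, the item -PT12H:wild* under share/cycle gives share/cycle/20141225T0000Z/wild*, as in the example earlier in this section.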
There are times when extra environment needs to be defined before launching rose task-run. This is where rose task-env may come in handy. The command prints to STDOUT the standard Rose task environment variables (which are normally provided by rose task-run) in a syntax compatible with bash / ksh. This means that the output of this command can be passed to the shell eval command to load the variables into the current environment.
E.g.
eval $(rose task-env)
See Rose Reference Guide: CLI > rose task-env for a full list of environment variables provided by this command.
Run an application according to its configuration, outside of a suite task environment. Although you will normally launch a Rose application using rose task-run, there are situations when you may have a standalone Rose application configuration that you just want to run outside of a suite. This is where rose app-run may come in handy.
See Rose Reference Guide: CLI > rose app-run for a full list of environment variables provided by this command.