How red, yellow, and green percentages are calculated
Each I/O call has a duration measured in microseconds. Once the call is
categorized under bad, medium, or good I/O, we accumulate the call duration to
get the time spent in red, yellow, and green I/O operations. In addition, we
need to measure the total time the application spent doing I/O. The
percentages are then simply calculated as:
- % Red time = (Time spent in bad I/O ops) / (Total time spent in I/O ops)
- % Yellow time = (Time spent in medium I/O ops) / (Total time spent in I/O ops)
- % Green time = (Time spent in good I/O ops) / (Total time spent in I/O ops)
We don't calculate the percentages against the total wallclock runtime,
because the application also spends time doing CPU-intensive tasks, memory
I/O, synchronization (locks), sleeping, and so on.
In a similar fashion, we calculate the percentages using call counts:
- % Red calls = (Number of bad I/O calls) / (Total I/O calls)
- % Yellow calls = (Number of medium I/O calls) / (Total I/O calls)
- % Green calls = (Number of good I/O calls) / (Total I/O calls)
We log the total time spent in I/O ops, which is:
- Total time spent in I/O ops = Red time + Yellow time + Green time
and similarly for the total number of I/O calls:
- Total number of I/O calls = Red calls + Yellow calls + Green calls
We also log how much of the total running time was spent in I/O:
- % I/O Time = (Time spent in I/O ops) / (Total wallclock runtime)
For multi-threaded processes, the times and call counts are accumulated from
each thread. Therefore the total time spent in I/O may be greater than the
total wallclock runtime, and equally % I/O Time may be greater than 100%.
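The arithmetic can be sketched as follows; this is purely illustrative (the function and sample values are not part of Mistral), and it assumes the red/yellow/green categorisation has already been made by the rules described in the next sections:
# Illustrative only: not Mistral's internal code. Times are in microseconds,
# accumulated across all threads of the process.
def io_percentages(red_us, yellow_us, green_us, wallclock_us):
    total_io_us = red_us + yellow_us + green_us
    return {
        "% Red time": 100.0 * red_us / total_io_us,
        "% Yellow time": 100.0 * yellow_us / total_io_us,
        "% Green time": 100.0 * green_us / total_io_us,
        # May exceed 100% for multi-threaded processes.
        "% I/O time": 100.0 * total_io_us / wallclock_us,
    }

print(io_percentages(red_us=2_000_000, yellow_us=3_000_000,
                     green_us=5_000_000, wallclock_us=20_000_000))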
Rules for Bad I/O
Definition of bad I/O:
- Small reads or writes.
- Opens for files where nothing was written or read.
- Stats that succeeded on files that were not used.
- Failed I/O.
- Backward seeks.
- Trawls of failed I/O where we include the whole time from the first fail to the last fail or the first success of the same type.
- Zero seeks, reads, and writes.
- Failed network I/O.
Rules for Medium I/O
Definition of medium I/O:
- Opens for files from which less than N bytes were read or written.
- Stats of files that were used later.
- Forward seeks.
Rules for Good I/O
Definition of good I/O:
- Reads and writes greater than MISTRAL_PROFILE_SMALL_IO.
- Opens for files from which at least MISTRAL_PROFILE_SMALL_IO bytes were read or written.
- Successful network I/O.
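As an illustration of how two of these rule families combine, here is a minimal sketch; it is not Mistral's implementation, the 32 KiB threshold is only an assumed stand-in for the configurable MISTRAL_PROFILE_SMALL_IO value, and the exact boundary handling for "small" transfers is also an assumption:
# Illustrative only: an assumed stand-in for the MISTRAL_PROFILE_SMALL_IO setting.
SMALL_IO_BYTES = 32 * 1024

def classify_read_write(size_bytes, succeeded):
    # Failed I/O, zero-byte transfers, and small reads/writes count as bad (red);
    # reads/writes greater than the small-I/O threshold count as good (green).
    if not succeeded or size_bytes == 0 or size_bytes <= SMALL_IO_BYTES:
        return "red"
    return "green"

def classify_open(bytes_used):
    # Opens of files where nothing was read or written are bad (red); opens of
    # files where less than the threshold was transferred are medium (yellow).
    if bytes_used == 0:
        return "red"
    if bytes_used < SMALL_IO_BYTES:
        return "yellow"
    return "green"

print(classify_read_write(4096, True), classify_open(1024))  # red yellow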
Monitoring an application
Once Mistral has been configured it can be run using the mistral script
available at the top level of the installation. To monitor an application you
just type mistral followed by your command and arguments. For example:
$ ./mistral ls -l $HOME
By default, any error messages produced by Mistral will be written to a file
named mistral.log in the current working directory.
Any errors that prevent the job from running as expected, such as a malformed
command line, will also be output to stderr.
This behaviour can be changed by the following command line options.
--log=<filename>
-l=<filename>
Record Mistral error messages in the specified file. If this option is not
set, errors will be written to a file named mistral.log in the current working
directory.
-q
Quiet mode. Send all error messages, regardless of severity, to the error log.
Command line options are processed in order, therefore this option must be
specified first to ensure any errors parsing command line options are sent to
the error log.
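For example, the following invocation (the application name and log path are hypothetical) sends all messages, regardless of severity, to a custom error log:
$ ./mistral -q --log=/tmp/my_job.mistral.log ./my_app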
Example contracts
Monitoring Contract
Consider the following contract:
2, monitortimeframe, 1s
LABEL | PATH | CALL-TYPE | SIZE-RANGE | MEASUREMENT | THRESHOLD
---|---|---|---|---|---
High_reads | /usr/ | read | all | bandwidth | 1MB
Higher_reads | /usr/ | read | all | bandwidth | 5MB
Even_higher_reads | /usr/ | read | all | bandwidth | 50MB
High_create_lat | /tmp/ | create | all | mean-latency | 10ms
High_num_w | /home/ | write | all | count | 750
Examining each line individually:
2, monitortimeframe, 1s
This line identifies the contract as containing monitoring rules that are
applied over a time frame of 1 second.
High_reads, /usr/, read, all, bandwidth, 1MB
Assuming that /usr/ is a mount point, this line defines a rule named “
High_reads” and tells Mistral to generate an alert when the total amount of
data read from /usr/ exceeds 1 MB within the one-second time frame.
If a monitored process were to read a 2 MB file in /usr/share/doc/ in less
than a second, for example, this rule would be violated and a log message of
the following form would be output:
2020-07-30T14:30.108355,High_reads,/usr,ext4,/dev/nvme0n1p5,,read,all,
bandwidth,2MB/1s,1MB/1s,foo.bar.com,15392,0,/mnt/tool/bin/python script.py,
,3,6,,0
Higher_reads, /usr/, read, all, bandwidth, 5MB
Even_higher_reads, /usr/, read, all, bandwidth, 50MB
These two lines define two additional rules named Higher_reads and
Even_higher_reads respectively.
All reads in /usr/ will be tested against all three rules.
If a process read 60MB of data in less than 1 second, all three currently
defined rules would be violated, but only the third rule would be logged. This
is because Mistral only logs the largest threshold violated when multiple
rules are defined on the same path, call-type, and measurement, as is the case
with the High_reads, Higher_reads, and Even_higher_reads rules:
2020-07-30T14:30.108529,Even_higher_reads,/usr,ext4,/dev/nvme0n1p5,,read,all,
bandwidth,60MB/1s,50MB/1s,foo.bar.com,15392,0,/bin/bash script.sh, ,3,6,,0
High_create_lat, /tmp/, create, all, mean-latency, 10ms
The rule labeled High_create_lat is only concerned with function calls that
create file system objects (create) under /tmp/, which is assumed to be a
mount point. In this case the latency of each call made during the time frame
is accumulated and averaged over the total number of these calls, provided the
number of calls within the time frame is higher than the value of
MISTRAL_MONITOR_LATENCY_MIN_IO.
If at the end of the time frame this mean-latency is higher than 10ms then a
log message will be output, for example:
2020-07-30T15:10.108650,High_create_lat,/tmp,,,,create,all,mean-latency,
22ms,10ms,foo.bar.com,15537,1,/bin/bash script.sh,,3,6,,0
High_num_w, /home/, write, all, count, 750
The rule labeled High_num_w is violated if the number of write calls in a time
frame exceeds 750.
2020-07-30T15:10.108669,High_num_w,/home,nfs4,server25.local:/nfs/home,
server25.local,write,all,count,863,750,foo.bar.com,15537,1,
/bin/bash script.sh,,3,6,,0
Throttling Contract
Consider the following contract:
2, throttletimeframe, 1s
LABEL | PATH | CALL-TYPE | SIZE-RANGE | MEASUREMENT | ALLOWED
---|---|---|---|---|---
High_reads | /usr/ | read | all | bandwidth | 5MB
Moderate_reads | /usr/ | read | all | bandwidth | 1MB
High_num_r | /home/ | read | all | count | 750
Examining each line individually:
2, throttletimeframe, 1s
This line identifies the contract as containing throttling rules that are
applied over a time frame of 1 second.
High_reads, /usr/, read, all, bandwidth, 5MB
If a monitored job were to try and read a 6MB file in /usr/share/doc/ in less
than a second, for example, this rule would be violated. When Mistral
identifies an I/O operation that would violate a throttling rule it will
introduce a sleep long enough to bring the observed I/O back down to the
configured limit and a log message of the following form will be output:
2020-07-30T14:30.108355,High_reads,/usr,ext4,/dev/nvme0n1p5,,read,all,
bandwidth,1MB/1s,1MB/1s,foo.bar.com,15392,0,/mnt/tool/bin/python script.py,
,3,6,,0
Moderate_reads, /usr/, read, all, bandwidth, 1MB
The second rule in this contract is very similar to the first. Again it is
monitoring read bandwidth but this time will allow up to 1MB of data to be
read before the rule is violated.
In this case, all reads in /usr/ will be tested against both the High_reads
and Moderate_reads rules.
If the process attempted to read 6MB of data in less than 1 second both the
currently defined rules would be violated. In this case, the most restrictive
rule applies and the process will be throttled to 1MB/1s, and up to 6 log
messages generated by violations of the Moderate_reads rule will be logged.
High_num_r, /home/, read, all, count, 750
The third rule does not care about how large each operation is; it is simply
interested in the total number of times a read operation is called. If a total
of more than 750 read operations are performed within the 1-second time frame
on the device mounted at /home/, then on the 751st read Mistral would
introduce a sleep long enough to bring the rate back under 750/1s and a log
message of the following form would be logged:
2020-07-30T16:45.108469,High_num_r,/home,nfs4,server25.local:/nfs/home,
750/1s,750/1s,foo.bar.com,16601,1,/usr/lib64/firefox/firefox, ,1,1,,0
Plug-ins
Currently, two different plug-ins are supported.
Update Plug-in
The update plug-in is used to modify local Mistral configuration contracts
dynamically during a job execution run according to conditions on the node
and/or cluster. Using an update plug-in is the only way to modify the local
contracts in use by a running job.
Global contracts are assumed to be configured with high, "system threatening"
rules that should not be frequently changed.
These contracts are intended to be maintained by system administrators and
will be polled periodically for changes on disk as described above. Global
contracts cannot be modified by the update plug-in in any way.
If an update plug-in configuration is not defined Mistral will use the same
local configuration contracts throughout the life of the job.
Output Plug-in
The output plug-in is used to record alerts generated by the Mistral
application. All event alerts raised against any contract (local or global,
monitoring or throttling) are sent to the output plug-in.
If an output plug-in configuration is not defined Mistral will default to
recording alerts to disk as described above. In addition, if an output plug-in
performs an unclean exit during a job Mistral will revert to recording alerts
to a log file. This log file will use the log record format expected by the
plug-in to allow for simpler recovery of the data at a later date.
Data rate
When setting up an output plug-in it makes sense to consider the rate at which
Mistral can be configured to output data. The amount of data output is
dependent on your configuration, for sizing the database we recommend the
following calculation:
Each record has a maximum size of 4kB; this can be reduced by excluding
fields.
Most rules are applied per mount point and can output data once per time
frame.
(4kB × Active Mount Points × Number of Rules) / Time frame = Data per second
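As an illustrative worked example only: with 10 active mount points, 5 rules, and a 1-second time frame, the worst case is (4kB × 10 × 5) / 1s = 200kB/s, or roughly 17GB per day if every rule fires in every time frame.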
Plug-in Configuration
On start up Mistral will check the environment variable MISTRAL_PLUGIN_CONFIG.
If this environment variable is defined it must point to a file that the user
running the application can read. If the environment variable is not defined
Mistral will assume that no plug-ins are required and will use the default
behaviors as described above.
When using plug-ins, at the end of a job Mistral will wait for a short time,
by default 30 seconds, for all plug-ins in use to exit in order to help
prevent data loss. If any plug-in processes are still active at the end of
this timeout they will be killed. The timeout can be altered by setting the
environment variable MISTRAL_PLUGIN_EXIT_TIMEOUT to an integer value between
0 and 86400 that specifies the required time in seconds.
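For example, to allow plug-ins up to 60 seconds (an illustrative value) to exit:
export MISTRAL_PLUGIN_EXIT_TIMEOUT=60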
The expected format of the configuration file consists of one block of
configuration lines for each configured plug-in. Each line is a comma-
separated pair of a single configuration option directive and its value.
Whitespace is treated as significant in this file. The full specification for
a plug-in configuration block is as follows:
PLUGIN,<OUTPUT|UPDATE>
INTERVAL,<Calling interval in seconds>
PLUGIN_PATH,<Fully qualified path to the plug-in>
[PLUGIN_OPTION,<Option to pass to plug-in>]
...
END
PLUGIN directive
The PLUGIN directive can take one of only two values, UPDATE or OUTPUT which
indicates the type of plug-in being configured.
If multiple configuration blocks are defined for the same plug-in the values
specified in the later block will take precedence.
INTERVAL directive
The INTERVAL directive takes a single integer value parameter. This value
represents the time in seconds the Mistral application will wait between calls
to the specified plug-in.
PLUGIN_PATH directive
The PLUGIN_PATH directive value must be the fully qualified path to the
plug-in to be run, e.g. /home/ellexus/bin/output_plugin.sh.
This plug-in must be executable by the user that starts the Mistral
application. The plug-in must also be available in the same location on all
possible execution host nodes where Mistral is expected to run.
The PLUGIN_PATH value will be passed to /bin/sh for environment variable
expansion at the start of each execution host job.
PLUGIN_OPTION directive
The PLUGIN_OPTION directive is optional and can occur multiple times. Each
PLUGIN_OPTION directive is treated as a separate command line argument to the
plug-in. Whitespace is respected in these values.
As whitespace is respected command line options that take parameters must be
specified as separate PLUGIN_OPTION values.
For example, if the plug-in uses the option "--output /dir/name/" to specify
where to store its output then this must be specified in the plug-in
configuration file as:
PLUGIN_OPTION,--output
PLUGIN_OPTION,/dir/name/
Options will be passed to the plug-in in the order in which they are defined.
Each PLUGIN_OPTION value will be passed to /bin/sh for environment variable
expansion at the start of each execution host job.
END Directive
The END directive indicates the end of a configuration block and does not take
any values.
Invalid Configuration
Blank lines and lines starting with “#” are silently ignored. All other lines
that do not begin with one of the configuration directives defined above cause
a warning to be raised.
Example Configuration
Consider the following configuration file; line numbers have been added for
clarity:
1  # File version: 2.9.3.2, modification date: 2016-06-17
2  #
3  PLUGIN,OUTPUT
4  INTERVAL,300
5  PLUGIN_PATH,/home/ellexus/bin/output_plugin.sh
6  PLUGIN_OPTION,--output
7  PLUGIN_OPTION,/home/ellexus/log files
8  END
9
10 PLUGIN,UPDATE
11 INTERVAL,60
12 PLUGIN_PATH,$HOME/bin/update_plugin
13 END
The configuration file above sets up both update and output plug-ins.
Lines 1-2 are ignored as comments. The first configuration block (lines 3-8)
defines an output plug-in (line 3) that will be called every 300 seconds (line
4) using the command line /home/ellexus/bin/output_plugin.sh --output
"/home/ellexus/log files" (lines 5-7). The configuration block is terminated
on line 8.
The blank line is ignored (line 9).
The second configuration block (lines 10-13) defines an update plug-in (line
10) that will be called every 60 seconds (line 11) using the command line
/home/ellexus/bin/update_plugin (line 12), assuming $HOME is set to
/home/ellexus. The configuration block is terminated on line 13.
Scheduler Integration
IBM Spectrum LSF
Launcher script
Create a script that defines the required environment variables and any
default settings, for example:
#!/bin/bash
INSTALL=/apps/ellexus
export MISTRAL_INSTALL_DIRECTORY=${INSTALL}/mistral_latest_x86_64
export MISTRAL_LICENSE=${ALTAIR_LICENSE_SERVER}:6200
# This script hard codes a simple global contract but the
# following lines can be replaced with whatever business
# logic is required to set up an appropriate contract for
# the submitted job.
export MISTRAL_CONTRACT_MONITOR_GLOBAL=${INSTALL}/global.contract
export MISTRAL_LOG_MONITOR_GLOBAL=${INSTALL}/global-${HOSTNAME}.log
# Set up the Mistral environment. As we are doing this
# automatically on LSF queues set Mistral to only manually
# insert itself in rsh and ssh commands to other nodes.
source ${MISTRAL_INSTALL_DIRECTORY}/mistral --remote=rsh,ssh
This script should be saved in an area accessible to all execution nodes.
Define a Job Starter
For each queue that is required to automatically wrap jobs with Mistral add a
JOB_STARTER setting that re-writes the command to launch the submitted job
using the script created above. For example, if the script above has been
saved in /apps/ellexus/mistral_launcher.sh the following code defines a simple
queue that will use it to wrap all jobs with Mistral:
# Mistral job starter queue
Begin Queue
QUEUE_NAME = mistral
PRIORITY = 30
INTERACTIVE = NO
TASKLIMIT = 5
JOB_STARTER = . /apps/ellexus/mistral_launcher.sh ; %USRCMD
DESCRIPTION = For mistral demo
End Queue
Once the job starter configuration has been added the queues must be
reconfigured by running the command: $ badmin reconfig
To check if the configuration has been successfully applied to the queue, the
bqueues command can be used with the "-l" long format option, which will list
any job starter configured, e.g. $ bqueues -l mistral
QUEUE: mistral
— For mistral demo
PARAMETERS/STATISTICS
PRIO NICE STATUS MAX JL/U JL/P JL/H NJOBS PEND RUN SSUSP USUSP RSV
30 0 Open:Active - - - - 0 0 0 0 0 0
Interval for a host to accept two jobs is 0 seconds
TASKLIMIT
 5
SCHEDULING PARAMETERS
r15s r1m r15m ut pg io ls it tmp swp mem
loadSched - - - - - - - - - - -
loadStop - - - - - - - - - - -
SCHEDULING POLICIES: NO_INTERACTIVE
USERS: all
HOSTS: all
JOB_STARTER: . /apps/ellexus/mistral_launcher.sh ; %USRCMD
OpenLava
Launcher script
Create a script that defines the required environment variables and any
default settings, for example:
#!/bin/bash
INSTALL=/apps/ellexus
export MISTRAL_INSTALL_DIRECTORY=${INSTALL}/mistral_latest_x86_64
export MISTRAL_LICENSE=${ALTAIR_LICENSE_SERVER}:6200
# This script hard codes a simple global contract but the
# following lines can be replaced with whatever business
# logic is required to set up an appropriate contract for
# the submitted job.
export MISTRAL_CONTRACT_MONITOR_GLOBAL=${INSTALL}/global.contract
export MISTRAL_LOG_MONITOR_GLOBAL=${INSTALL}/global-${HOSTNAME}.log
# Set up the Mistral environment. As we are doing this automatically
# on OpenLava queues set Mistral to only manually insert itself
# in rsh and ssh commands to other nodes.
source ${MISTRAL_INSTALL_DIRECTORY}/mistral --remote=rsh,ssh
This script should be saved in an area accessible to all execution nodes.
Define a Job Starter
For each queue that is required to automatically wrap jobs with Mistral add a
JOB_STARTER setting that re-writes the command to launch the submitted job
using the script created above.
For example, if the script above has been saved in
/apps/ellexus/mistral_launcher.sh the following code defines a simple queue
that will use it to wrap all jobs with Mistral:
# Mistral job starter queue
Begin Queue
QUEUE_NAME = mistral
PRIORITY = 30
INTERACTIVE = NO
JOB_STARTER = . /apps/ellexus/mistral_launcher.sh ; %USRCMD
DESCRIPTION = For mistral demo
End Queue
Once the job starter configuration has been added the queues must be
reconfigured by running the command: $ badmin reconfig
To check if the configuration has been successfully applied to the queue, the
bqueues command can be used with the "-l" long format option, which will list
any job starter configured, e.g. $ bqueues -l mistral
QUEUE: mistral
— For mistral demo
PARAMETERS/STATISTICS
PRIO NICE STATUS MAX JL/U JL/P JL/H NJOBS PEND RUN SSUSP USUSP RSV
30 0 Open:Active - - - - 0 0 0 0 0 0
Interval for a host to accept two jobs is 0 seconds
SCHEDULING PARAMETERS
r15s r1m r15m ut pg io ls it tmp swp mem
loadSched - - - - - - - - - - -
loadStop - - - - - - - - - - -
SCHEDULING POLICIES: NO_INTERACTIVE
USERS: all users
HOSTS: all hosts used by the OpenLava system
JOB_STARTER: . /apps/ellexus/mistral_launcher.sh ; %USRCMD
Univa Grid Engine
Launcher script
Create a script that defines the required environment variables and any
default settings, for example:
#!/bin/bash
# This script should be saved in an area accessible to all
# execution nodes and added as a starter_method to each
# queue that requires Mistral.
INSTALL=/apps/ellexus
export MISTRAL_INSTALL_DIRECTORY=${INSTALL}/mistral_latest_x86_64
export MISTRAL_LICENSE=${ALTAIR_LICENSE_SERVER}:6200
# This script hard codes a simple global contract but the
# following lines can be replaced with whatever business
# logic is required to set up an appropriate contract
# for the submitted job.
export MISTRAL_CONTRACT_MONITOR_GLOBAL=${INSTALL}/global.contract
export MISTRAL_LOG_MONITOR_GLOBAL=${INSTALL}/global-%h.log
# Set the shell we need to use to invoke the submitted command
shell=${SGE_STARTER_SHELL_PATH:-/bin/sh}
if [ ! -x $shell ]; then
    # Assume that if the check failed $shell was not set to /bin/sh
    shell=/bin/sh
fi
shell_name=$(basename $shell)
if [ "${shell_name: -3}" = "csh" ]; then
    suffix=.csh
fi
# Check if a login shell is required
if [ "$SGE_STARTER_USE_LOGIN_SHELL" = "true" ]; then
    logopt="-l"
else
    logopt=""
fi
# Wrap the job with Mistral. As we are doing this automatically
# on UGE queues set Mistral to only manually insert itself in
# rsh and ssh commands to other nodes.
exec ${logopt} ${shell} "${MISTRAL_INSTALL_DIRECTORY}/mistral${suffix}" --remote=rsh,ssh "$@"
Modify the configuration of each queue that requires Mistral by running
qconf -mq <queue name>. This will launch the default editor (either vi or the
editor indicated by the EDITOR environment variable). Find the setting for
starter_method and replace the current value, typically "NONE", with the path
to the launcher script. Save the configuration and exit the editor. For
example, the following snippet of queue configuration shows the appropriate
setting to use the file described above.
epilog NONE
shell_start_mode unix_behavior
starter_method /home/ellexus/ugedemo/launch.sh
suspend_method NONE
resume_method NONE
It is important to note that a starter_method will not be invoked for qsh,
login, or qrsh acting as login, and as a result, these jobs will not be
wrapped by Mistral.
To check if the configuration has been successfully applied to the queue, the
qconf command can be used with the -sq option to show the full queue
configuration, which will list any starter method configured, e.g.
$ qconf -sq mistral.q
qname mistral.q
hostlist @allhosts
seq_no 0
load_thresholds np_load_avg=1.75
suspend_thresholds NONE
nsuspend 1
suspend_interval 00:05:00
priority 0
min_cpu_interval 00:05:00
qtype BATCH INTERACTIVE
ckpt_list NONE
pe_list make
jc_list NO_JC,ANY_JC
rerun FALSE
slots 1
tmpdir /tmp
shell /bin/bash
prolog NONE
epilog NONE
shell_start_mode unix_behavior
starter_method /home/ellexus/ugedemo/launch.sh
suspend_method NONE
resume_method NONE
terminate_method NONE
notify 00:00:60
owner_list NONE
user_lists NONE
xuser_lists NONE
subordinate_list NONE
complex_values NONE
projects NONE
xprojects NONE
calendar NONE
initial_state default
s_rt INFINITY
h_rt INFINITY
d_rt INFINITY
s_cpu INFINITY
h_cpu INFINITY
s_fsize INFINITY
h_fsize INFINITY
s_data INFINITY
h_data INFINITY
s_stack INFINITY
h_stack INFINITY
s_core INFINITY
h_core INFINITY
s_rss INFINITY
h_rss INFINITY
s_vmem INFINITY
h_vmem INFINITY
Slurm
TaskProlog script
Create a Slurm TaskProlog script that prints out the required environment
variables and any default settings, for example:
#!/bin/bash
INSTALL=/apps/ellexus
MISTRAL_INSTALL_DIRECTORY=${INSTALL}/mistral_latest_x86_64
echo "export MISTRAL_INSTALL_DIRECTORY=$MISTRAL_INSTALL_DIRECTORY"
# Set up the license
echo "export MISTRAL_LICENSE=${ALTAIR_LICENSE_SERVER}:6200"
# Disable remote tracing; Singularity is always monitored
echo "export ELLEXUS_REMOTE=singularity"
# Slurm has a mechanism which sends the environment variables from
# the submission node to the execution nodes. We want Mistral to have
# a fresh start on each execution node.
echo "unset ELLEXUS_ONETIME_SETUP_DONE"
echo "unset ELLEXUS_OUTPUT_DIRECTORY"
echo "unset ELLEXUS_ROOT_OUTPUT_DIRECTORY"
# This script hard codes a simple global contract but the following
# lines can be replaced with whatever business logic is required to
# set up an appropriate contract for the submitted job.
echo "export MISTRAL_CONTRACT_MONITOR_GLOBAL=${INSTALL}/global.contract"
echo "export MISTRAL_LOG_MONITOR_GLOBAL=${INSTALL}/global-%h.log"
# This script sets the Mistral temporary directory. This only needs
# to be set if the Slurm installation uses cgroups.
# This should be the same path as in the TaskEpilog script.
ELLEXUS_OUTPUT_DIRECTORY="/tmp/mistral.${USER}.${SLURM_JOB_ID}"
if [[ -n "${SLURM_ARRAY_TASK_ID}" ]]; then
    ELLEXUS_OUTPUT_DIRECTORY="${ELLEXUS_OUTPUT_DIRECTORY}.${SLURM_ARRAY_TASK_ID}"
fi
if [[ -n "${SLURM_STEP_ID}" ]]; then
    ELLEXUS_OUTPUT_DIRECTORY="${ELLEXUS_OUTPUT_DIRECTORY}.${SLURM_STEP_ID}"
fi
mkdir "${ELLEXUS_OUTPUT_DIRECTORY}"
echo "export ELLEXUS_OUTPUT_DIRECTORY=${ELLEXUS_OUTPUT_DIRECTORY}"
# Finally, set LD_PRELOAD
echo "export LD_PRELOAD=${MISTRAL_INSTALL_DIRECTORY}/dryrun/\$LIB/libdryrun.so"
This script should be saved in an area accessible to all execution nodes.
TaskEpilog Script
If Slurm is set to use cgroups, it is necessary to create a Slurm TaskEpilog
script that signals to Mistral that the job is finished before the cgroup
kills the task. For example:
#!/bin/bash
# This script should be saved in an area accessible to all
# execution nodes and set as the TaskEpilog script in the
# Slurm configuration when Slurm is configured to use cgroups
# to track processes.
# If Mistral is still running there will be a PID identifier file.
# This path must match ELLEXUS_OUTPUT_DIRECTORY set in the TaskProlog.
ELLEXUS_OUTPUT_DIRECTORY="/tmp/mistral.${USER}.${SLURM_JOB_ID}"
if [[ -n "${SLURM_ARRAY_TASK_ID}" ]]; then
    ELLEXUS_OUTPUT_DIRECTORY="${ELLEXUS_OUTPUT_DIRECTORY}.${SLURM_ARRAY_TASK_ID}"
fi
if [[ -n "${SLURM_STEP_ID}" ]]; then
    ELLEXUS_OUTPUT_DIRECTORY="${ELLEXUS_OUTPUT_DIRECTORY}.${SLURM_STEP_ID}"
fi
MONITOR_PID_FILE=$(ls "${ELLEXUS_OUTPUT_DIRECTORY}"/tmp/monitor_pid_* 2> /dev/null)
if [[ -f "$MONITOR_PID_FILE" ]]; then
    # File exists, get the PID from the end of the file name
    MONITOR_PID=${MONITOR_PID_FILE##*_}
    # Send SIGTERM to Mistral, so that the final time frame of data
    # is written before the cgroup is killed by SIGKILL
    kill -TERM $MONITOR_PID 2> /dev/null
    while kill -0 $MONITOR_PID 2> /dev/null; do
        # Wait until the monitor has actually finished
        sleep 0.3
    done
fi
Update Slurm configuration
Configure Slurm to use the above TaskProlog and TaskEpilog scripts by adding
the following lines to your slurm.conf file:
TaskProlog=/path/to/mistral/taskprolog.sh
TaskEpilog=/path/to/mistral/taskepilog.sh
Each execution host requires the same TaskProlog setting. Finally, instruct
all Slurm daemons to re-read the configuration file: $ scontrol reconfigure
Now all jobs submitted with the sbatch, srun, and salloc commands use Mistral.
Running Mistral on a Specific Partition
Rather than running Mistral on all jobs, Mistral can be configured to run only
on specific partitions. Simply surround the examples in the TaskProlog and
TaskEpilog scripts with an if statement comparing the $SLURM_JOB_PARTITION
variable, for example:
#!/bin/bash
if [ "$SLURM_JOB_PARTITION" == "mistral" ]; then
    INSTALL=/apps/ellexus
    MISTRAL_INSTALL_DIRECTORY=${INSTALL}/mistral_latest_x86_64
    ...
The Slurm configuration should then be updated as described in the Update
Slurm configuration section above.
Any jobs submitted to the 'mistral' partition will now run under Mistral.
PBS Professional
Hook script
Create a PBS hook script (Python) that inserts the required environment
variables and any default settings into the job's environment.
For example, create a script called hook.py that contains:
import socket
import pbs

pbsevent = pbs.event()
jobname = pbsevent.job.queue.name
if jobname == "demo":
    install_dir = "/home/users/ellexus/mistral_latest_x86_64/"
    config_dir = "/home/users/ellexus/pbsconfig/"
    pbsevent.env["MISTRAL_INSTALL_DIRECTORY"] = install_dir
    pbsevent.env["MISTRAL_LICENSE"] = "<server>:<port>"
    pbsevent.env["MISTRAL_CONTRACT_MONITOR_GLOBAL"] = config_dir + "global.contract"
    host = socket.gethostname()
    pbsevent.env["MISTRAL_LOG_MONITOR_GLOBAL"] = config_dir + "global-" + host + ".log"
    pbsevent.env["MISTRAL_PLUGIN_CONFIG"] = config_dir + "output_plugin.conf"
    pbsevent.env["LD_PRELOAD"] = install_dir + "dryrun/$LIB/libdryrun.so"
This script should be saved in an area accessible to all execution nodes.
Now the hook needs to be set up. Create a hook named "job_starter" (any name
can be used) and import it:
$ qmgr -c “create hook job_starter event=execjob_launch”
$ qmgr -c “import hook job_starter application/x-python default
/path/to/hook.py”
Now all jobs submitted with qsub use Mistral.
Note: Every time the hook script is modified, it needs to be "imported" again
using the $ qmgr -c "import hook …" command above.
Altair Accelerator & Flowtracer
The MISTRAL_BYPASS_PROGRAMS environment variable can be used to avoid tracing
I/O which originates in Altair Accelerator & Flowtracer while still recording
I/O caused by Accelerator & Flowtracer jobs.
If the MISTRAL_BYPASS_PROGRAMS environment variable is set to a list of
programs and directories, then any program which matches an entry in the list
will be run in bypass mode. For example,
export MISTRAL_BYPASS_PROGRAMS="emacs,/usr/local/" would run emacs and any
program in /usr/local/, or a sub-directory such as /usr/local/bin, in bypass
mode. If MISTRAL_BYPASS_PROGRAMS is set to the parent directory of VOVDIR, all
programs in the Accelerator or Flowtracer installation will run in bypass mode.
If more control over the process is needed, you can take advantage of the fact
that both Accelerator and Flowtracer can set user-specified environment
variables when they run jobs. Since Breeze and Mistral can be started by
setting appropriate environment variables, this provides a basic mechanism for
tracing such jobs without having to trace the Accelerator or Flowtracer
infrastructure.
Accelerator
Following a similar approach to the PBS hook, you can create an environment
for Mistral which just sets the version. Then create another special
environment file, called MISTRAL.pre.tcl, that actually sets all the relevant
variables for the job.
$ cat MISTRAL.pre.tcl
setenv MISTRAL_LICENSE <server>:<port>
setenv MISTRAL_PLUGIN_CONFIG /path/to/mistral_tests/elastic_plugin.conf
setenv MISTRAL_CONTRACT_MONITOR_GLOBAL /path/to/ellexus/pbsconfig/monitoring.kitchensink.contract
setenv MISTRAL_LOG_MONITOR_GLOBAL /path/to/ellexus/pbsconfig/monitoring.kitchensink.contract%h.log
setenv LD_PRELOAD /path/to/ellexus/ellexus/mistral_2.13.6_x86_64/dryrun/\$LIB/libdryrun.so
$
To use this, you submit the job with a command such as:
nc run -e SNAPSHOT+MISTRAL -- myJob
You could put all the Mistral environment variables in SNAPSHOT.pre.tcl, but
having a separate environment is slightly clearer.
Flowtracer
If the environment variable VOV_ENV_DIR is set to
$VOVDIR/local/mistral/environments, then any SNAPSHOT/SNAPPROP non-interactive
job can automatically activate Mistral monitoring.
This can be done at job submission or by the administrator by setting the
variable dynamically on a compute host (vovslavemgr config -setenv
VOV_ENV_DIR=…).
Host-based enablement is likely to be the lowest impact. It can also be done
centrally (setup.tcl). It also takes effect immediately, whereas anything that
is job-submission based will only impact new jobs; conversely, rollback is
difficult because queued jobs carry the variable.
A typical calling sequence for a single compute host, here running 4 jobs,
looks something like this:
The first vovslaveroot runs as root. The second is the forked version that has
setuid to the user and can be thought of as the entry point to the user’s job.
Vw is a binary wrapper that does many magic things.
The triggering script is called in the first 'vw': it runs a Tcl interpreter
on the code below and then captures the resulting environment array. This
environment is then used in the exec of the job. This approach requires 'vw',
which means that interactive jobs cannot be monitored in this way.
The triggering script should be something like:
if [info exists env(VOV_JOBID)] {
    if { [info exists env(ELLEXUS_ONETIME_SETUP_DONE)] == 0 } {
        setenv ELLEXUS_JOB_ID $env(VOV_JOBID)
        catch {setenv ELLEXUS_JOB_GROUP_ID [exec vovselect jobclass from $env(VOV_JOBID)]}
        # add/change this list for alternative locations
        foreach d [list $env(VOVDIR)/local/mistral/current] {
            if {[file isfile $d/mistral] == 1} {
                setenv MISTRAL_HAVE $d/mistral
                setenv MISTRAL_INSTALL_DIRECTORY [file normalize $d]
                # arbitrary but the way we have it for the evaluation
                setenv MISTRAL_LICENSE "<server>:<port>"
                set INSTALL "[file dirname $env(MISTRAL_INSTALL_DIRECTORY)]"
                vovenvDebug "Mistral is in $env(MISTRAL_INSTALL_DIRECTORY)"
                # possibly from the user's environment or passed in as argument to MISTRAL env
                if [info exists MISTRAL_CONTRACT] {
                    setenv MISTRAL_CONTRACT_MONITOR_GLOBAL $env(MISTRAL_INSTALL_DIRECTORY)/monitoring.${MISTRAL_CONTRACT}.contract
                } else {
                    setenv MISTRAL_CONTRACT_MONITOR_GLOBAL $env(MISTRAL_INSTALL_DIRECTORY)/samples/monitoring.kitchensink.contract
                }
                setenv MISTRAL_LOG_MONITOR_GLOBAL $env(VNCSWD)/$env(VOV_PROJECT_NAME).swd/logs/mistral/global%h.log
                setenv MISTRAL_PLUGIN_CONFIG ${INSTALL}/elastic_plugin.config
                vovenvDebug "MISTRAL: Contract: $env(MISTRAL_CONTRACT_MONITOR_GLOBAL) with plugin $env(MISTRAL_PLUGIN_CONFIG)"
                # Finally, set LD_PRELOAD
                setenv LD_PRELOAD $env(MISTRAL_INSTALL_DIRECTORY)/dryrun/\$LIB/libdryrun.so
                vovenvDebug "MISTRAL: LD preloader: $env(LD_PRELOAD)"
            }
        }
    } else {
        vovenvDebug "MISTRAL: Already have Mistral monitoring"
    }
}
Container Support
Singularity
Mistral will monitor workloads in Singularity containers by default. This will
add a number of bind paths to each Singularity container so that Mistral is
able to read the configuration files and run the executables that it normally
would. If these files are all in one area of your filesystem you can minimize
the number of paths that are bound by setting the following environment
variable to that path:
MISTRAL_SINGULARITY_BIND_PATH
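For example, if all of the Mistral files live under a single directory (the path below is hypothetical):
export MISTRAL_SINGULARITY_BIND_PATH=/apps/ellexus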
Docker
Mistral does not currently monitor workloads in Docker containers by default;
this feature is planned for a future release.
Mistral Healthcheck
If you are running Mistral on a small scale, for instance, to test the
functionality, it can sometimes be useful to log data to disk and then process
the log file(s) that it produces.
There are scripts and tools for doing this in the tools directory. There is a
master script in this directory, mistral_report.sh, which creates separate CSV
files for the different rules, GNUplot graphs, and an HTML report.
mistral_report.sh
This script expects the path (or paths) to Mistral log files. Optionally you
can also specify an output directory with the -o argument.
e.g. $ tools/mistral_report.sh -o /tmp/mistral.out /tmp/job1.mistral.log
This will generate the HTML report, CSV files and GNUPlot graphs. To omit the
CSV files and GNUPlot graphs supply the -n option.
Mistral Healthcheck Reports
The Mistral Healthcheck report works best with the supplied
monitoring.kitchensink.contract, as this contains rules that populate specific
sections of the report.
When the tools/mistral_report.sh script is run it will create the Healthcheck
HTML file mistral_report.html and output the location of the file.
This is the main report file and has links to all the other data. The other
data is split by rule type into different HTML files.