A tasklet represents a small piece of work that runs in combination with other tasklets to perform a task. For example, a tasklet might calculate how many CPU hours a pipeline will take and then resize the cluster to handle the workload. Individual tasklets should generally be viewed like UNIX utilities such as ls or grep, which perform a single function well and are designed to be composed with other tools through stdin and stdout.
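To make the UNIX-utility analogy concrete, here is a minimal sketch of what a single-purpose tasklet could look like in Python. It assumes that tasklets exchange plain key=value lines over stdin and stdout, which the vp-run-metrics result output suggests but this page does not confirm. The key example.HAS_SEQUENCES and all function names are hypothetical and are not part of the real tasklet set.

```python
import sys


def parse_kv(lines):
    """Parse key=value lines into a dict, ignoring lines without '='."""
    pairs = {}
    for line in lines:
        line = line.strip()
        if "=" in line:
            key, value = line.split("=", 1)
            pairs[key] = value
    return pairs


def add_sequence_flag(pairs):
    """Hypothetical single-purpose step: derive one new key from an existing one."""
    out = dict(pairs)
    if "params.NUM_QUERY_SEQ" in out:
        has_sequences = int(out["params.NUM_QUERY_SEQ"]) > 0
        out["example.HAS_SEQUENCES"] = str(has_sequences)
    return out


def format_kv(pairs):
    """Serialize a dict back to sorted key=value lines."""
    return "".join("%s=%s\n" % (key, value) for key, value in sorted(pairs.items()))


def run(stdin, stdout):
    """Read key=value input, apply the step, and write key=value output."""
    stdout.write(format_kv(add_sequence_flag(parse_kv(stdin))))
```

A script wrapping this would call run(sys.stdin, sys.stdout), so that it can sit in the middle of a pipe-separated chain in the same way grep does.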
Runs a tasklet on a cluster.
Usage: vp-run-metrics --name=cluster [options] "metric1 | metric2 | .. | metricn"
Quotes are important
Options:
-h, --help show this help message and exit
--host=HOST Host of web services to connect to, defaults to local
host
--name=NAME Name of cluster to run on
-b, --block Block on task name
-t, --print-task-name
Print the name of the task at the end
--pipeline-name=PIPELINE
Name of pipeline to run against
-c CONFIG Add config options, multiple allowed in style -c
key=value -c key=value
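The quoting requirement in the usage line exists because an unquoted | would be interpreted by the shell as a pipe between separate commands, rather than being passed to vp-run-metrics as part of one argument. The sketch below shows one way to assemble the command from Python so that the tasklet chain stays a single argument. The helper build_command is hypothetical, and it uses only the --name and -c options listed above.

```python
import shlex


def build_command(cluster, tasklets, config=None):
    """Build an argv list in which the tasklet chain is one single argument."""
    argv = ["vp-run-metrics", "--name=" + cluster]
    for key, value in sorted((config or {}).items()):
        argv += ["-c", "%s=%s" % (key, value)]
    # The whole chain, including the | separators, is one argv entry.
    argv.append(" | ".join(tasklets))
    return argv


argv = build_command(
    "local",
    ["tag-is-fasta", "sequence-stats"],
    {"cluster.CLUSTER_NAME": "local"},
)

# shlex.quote shows how the command must be written when typed into a shell.
print(" ".join(shlex.quote(arg) for arg in argv))
```

Passing the argv list directly to subprocess.run (without shell=True) avoids shell quoting altogether, since no shell parses the | characters.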
Run a tasklet to calculate the number of CPU hours a BLAST run will take on the pipeline named clovr_search_12-01-2010-15:07:00. This example shows that a pipeline name can be specified, although it is optional. The -c option can also be used to add config options to the run. --pipeline-name specifies that the configuration from a particular pipeline should be passed as the initial input to the tasklet run, and that any output variables should be added to or updated in the pipeline configuration.
vp-run-metrics --pipeline-name=clovr_search_12-01-2010-15:07:00 -c cluster.CLUSTER_NAME=local "translate-keys input.REF_DB_TAG=misc.REP_DB | filter-keys input.INPUT_TAG cluster.CLUSTER_NAME misc.PROGRAM misc.REP_DB | tag-is-fasta | sequence-stats | cunningham_calc_cpu_hours"
Output:
Task: runMetric-1291216072.42 Type: runMetric State: completed Num: 7/7 (100%) LastUpdated: 2010/12/01 15:08:10 UTC
Debug - 2010/12/01 15:07:55 UTC: Starting to run /opt/vappio-metrics/get-pipeline-conf clovr_search_12-01-2010-15:07:00 |
/opt/vappio-metrics/translate-keys input.REF_DB_TAG=misc.REP_DB | /opt/vappio-metrics/filter-keys input.INPUT_TAG
cluster.CLUSTER_NAME misc.PROGRAM misc.REP_DB | /opt/vappio-metrics/tag-is-fasta | /opt/vappio-metrics/sequence-stats |
/opt/vappio-metrics/cunningham_calc_cpu_hours | /opt/vappio-metrics/set-pipeline-conf clovr_search_12-01-2010-15:07:00
Notification - 2010/12/01 15:08:10 UTC: Completed
Result - 2010/12/01 15:08:10 UTC: {u'mtype': u'result', u'timestamp': 1291216090.175765, u'result': u'kv
\ncluster.CLUSTER_NAME=local\ninput.INPUT_TAG=NC_000964_peps\nmisc.PROGRAM=blastp
\nmisc.REP_DB=NC_000964_blastpdb\nparams.MAX_QUERY_SEQ_LEN=5488\nparams.MIN_QUERY_SEQ_LEN=20
\nparams.AVG_QUERY_SEQ_LEN=297.614859927\nparams.NUM_QUERY_SEQ=4105
\npipeline.COMPUTED_CPU_HOURS=1.00539440771\n'}
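The result field in the output above is a single string of newline-separated key=value pairs. A minimal sketch of turning such a string into a dictionary follows. It assumes that the leading kv line is a format marker that can be skipped, which is an inference from this one example and not documented behaviour. The function name parse_result is hypothetical, and the sample string is a shortened copy of the result shown above.

```python
def parse_result(result):
    """Convert a 'kv' result string into a dict of configuration values."""
    lines = result.splitlines()
    # Assumption: a first line of "kv" only identifies the format.
    if lines and lines[0].strip() == "kv":
        lines = lines[1:]
    conf = {}
    for line in lines:
        if "=" in line:
            key, value = line.split("=", 1)
            conf[key.strip()] = value.strip()
    return conf


# Shortened copy of the result string from the example run.
sample = (
    "kv\n"
    "cluster.CLUSTER_NAME=local\n"
    "input.INPUT_TAG=NC_000964_peps\n"
    "misc.PROGRAM=blastp\n"
    "pipeline.COMPUTED_CPU_HOURS=1.00539440771\n"
)

conf = parse_result(sample)
cpu_hours = float(conf["pipeline.COMPUTED_CPU_HOURS"])
print("Estimated CPU hours:", cpu_hours)
```

All values arrive as strings, so numeric entries such as pipeline.COMPUTED_CPU_HOURS need an explicit conversion before use.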
/vappio/runMetrics_ws.py
Parameter | Required | Type | Meaning |
---|---|---|---|
cluster | Yes | String | Name of cluster to run tasklet on. |
conf | Yes | Dictionary | Dictionary of key-value pairs to pass as initial input to first tasklet. |
metrics | Yes | String | A string of tasklets to run, separated by |. |
The name of the task associated with the tasklet run.
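The three parameters in the table can be collected into a single request structure, as in the sketch below. This page does not specify how the request is encoded or submitted, so the JSON serialization is an assumption, and the helper build_request is hypothetical. Only the parameter names cluster, conf, and metrics come from the table.

```python
import json


def build_request(cluster, conf, tasklets):
    """Assemble the parameters expected by /vappio/runMetrics_ws.py."""
    return {
        "cluster": cluster,                 # name of cluster to run on
        "conf": dict(conf),                 # initial input to the first tasklet
        "metrics": " | ".join(tasklets),    # tasklets separated by |
    }


request = build_request(
    "local",
    {"cluster.CLUSTER_NAME": "local"},
    ["tag-is-fasta", "sequence-stats", "cunningham_calc_cpu_hours"],
)

# Assumption: the body is JSON encoded; the real encoding is not documented here.
body = json.dumps(request, sort_keys=True)
print(body)
```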