Python API to run workflows on REANA

Hi all,

In ATLAS we’d like to integrate REANA with RECAST (naturally), and we would like to launch REANA workflows from the recast command-line client (a Python client). Ideally we’d use the Python API to do so. Is there some documentation on how to use that API, e.g. for the reana-atlas-recast example?

Thanks,
Lukas

Hello @lheinric, welcome to the REANA forum :confetti_ball:

Here is a step-by-step guide to reproducing what reana-client run does using the Python client. It is a simple example for illustrative purposes, using the serial workflow engine and a simple analysis. Please let us know if you run into any trouble!


Install reana-client:

$ pip install reana-client==0.6.0

And now, inside a Python interpreter, or a Python file:

# Title: Submit workflows to REANA using the REANA Python client
# Author: Diego Rodriguez - https://forum.reana.io/u/diegodelemos
# Date: 22-01-2020
# REANA version: 0.6.0 

# Let us use the REANA example - ROOT6/RooFit analysis
# (https://github.com/reanahub/reana-demo-root6-roofit/)
# with the serial workflow engine as a workflow example. We will
# create it, run it and retrieve its results using the
# REANA Python client.

# First we need to provide the minimum configuration:
# - Point to the REANA server URL (with `reana-client` it would be
#   `export REANA_SERVER_URL=https://reana.cern.ch`)
import os
os.environ['REANA_SERVER_URL'] = 'https://reana.cern.ch'
# - Provide our REANA access token (with `reana-client` it
#   would be `export REANA_ACCESS_TOKEN=XXXXXXX`)
from getpass import getpass
my_reana_token = \
  os.getenv("REANA_ACCESS_TOKEN") or getpass('Enter your REANA token: ')

# Now we will choose a name for the workflow and the kind of
# workflow specification we will use:
my_workflow_name = 'root6-roofit'
workflow_type = 'serial'

# As a workflow, we will take the REANA example - ROOT6/RooFit.
# Let us specify the inputs and the workflow specification:
# - First we will provide its inputs (files and parameters)
#   (see the reana.yaml code that corresponds to it
#   https://github.com/reanahub/reana-demo-root6-roofit/blob/f29e98b482fe8cb801735ac2fa48bc01e6cc05b7/reana.yaml#L2-L9)
my_inputs = {
    'files': [
        'code/gendata.C',
        'code/fitdata.C'
    ],  # A list of files your analysis will be using
    'parameters': {
        'events': '20000',
        'data': 'results/data.root',
        'plot': 'results/plot.png',
    }  # Parameters your workflow takes
}

# - Then we will provide the actual workflow specification,
#   in this case for the serial workflow engine:
#   (see the reana.yaml code that corresponds to it
#   https://github.com/reanahub/reana-demo-root6-roofit/blob/f29e98b482fe8cb801735ac2fa48bc01e6cc05b7/reana.yaml#L13-L22)
workflow_json = {
    'steps': [
        {'name': 'gendata',
         'environment': 'reanahub/reana-env-root6:6.18.04',
         'commands': [
           'mkdir -p results',
           'root -b -q \'code/gendata.C(${events},"${data}")\' | tee gendata.log']},
        {'name': 'fitdata',
         'environment': 'reanahub/reana-env-root6:6.18.04',
         'commands': [
           'root -b -q \'code/fitdata.C("${data}","${plot}")\' | tee fitdata.log']}]
}

# We are now ready to create our workflow in REANA
# (`reana-client create -n $my_workflow_name`)
from reana_client.api.client import create_workflow_from_json
create_workflow_from_json(
    workflow_json=workflow_json,
    name=my_workflow_name,
    access_token=my_reana_token,
    parameters=my_inputs,
    workflow_engine=workflow_type)

# Upload files to the workflow workspace
# (`reana-client upload -w $my_workflow_name`)
from reana_client.api.client import upload_to_server

abs_path_to_input_files = \
  [os.path.abspath(f) for f in my_inputs['files']]
upload_to_server(my_workflow_name,
                 abs_path_to_input_files,
                 my_reana_token)

# And now we can start our workflow
# (`reana-client start -w $my_workflow_name`)
from reana_client.api.client import start_workflow

start_workflow(my_workflow_name, my_reana_token, {})

# We can query its status as follows:
# (`reana-client status -w $my_workflow_name`)
from reana_client.api.client import get_workflow_status
status_details = get_workflow_status(my_workflow_name,
                                     my_reana_token)
print(status_details['status'])
# Once the status is `finished`, the workflow is done

# Once finished, we can list the output files and download
# the ones we are interested in:
# - Listing files
#   (`reana-client ls -w $my_workflow_name`)
from reana_client.api.client import list_files
print(list_files(my_workflow_name, my_reana_token))
# - Downloading the result file
#   (`reana-client download results/plot.png -w $my_workflow_name`)
output_filename = 'results/plot.png'
from reana_client.api.client import download_file
file_binary_blob = download_file(
  my_workflow_name, output_filename, my_reana_token)
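
The last step leaves the result in memory only. To write it to disk, something like the following minimal sketch could be used. It assumes, as the variable name `file_binary_blob` suggests, that `download_file` returns the raw bytes of the file. The helper name `save_blob` is made up for this illustration and is not part of reana-client.

```python
import os


def save_blob(blob, destination):
    """Write downloaded bytes to `destination` and return its absolute path.

    Hypothetical helper (not part of reana-client); assumes `blob` is a
    bytes object such as the value returned by `download_file` above.
    """
    parent = os.path.dirname(destination)
    if parent:
        # Create e.g. `results/` locally if it does not exist yet
        os.makedirs(parent, exist_ok=True)
    with open(destination, "wb") as handle:
        handle.write(blob)
    return os.path.abspath(destination)


# Usage with the variables defined in the script above:
# save_blob(file_binary_blob, output_filename)
```

Opening the file in binary mode (`"wb"`) matters here, since the plot is a PNG image and not text.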

Hi @diegodelemos, I get this output when I execute this from the root of the reana-demo-root6-roofit repo:

Traceback (most recent call last):
  File "/Users/lukasheinrich/Code/pyhfdev/dev/pyhfdevenv/lib/python3.6/site-packages/reana_client/api/client.py", line 331, in download_file
    access_token=access_token).result()
  File "/Users/lukasheinrich/Code/pyhfdev/dev/pyhfdevenv/lib/python3.6/site-packages/bravado/http_future.py", line 271, in result
    swagger_result = self._get_swagger_result(incoming_response)
  File "/Users/lukasheinrich/Code/pyhfdev/dev/pyhfdevenv/lib/python3.6/site-packages/bravado/http_future.py", line 124, in wrapper
    return func(self, *args, **kwargs)
  File "/Users/lukasheinrich/Code/pyhfdev/dev/pyhfdevenv/lib/python3.6/site-packages/bravado/http_future.py", line 302, in _get_swagger_result
    self.request_config.response_callbacks,
  File "/Users/lukasheinrich/Code/pyhfdev/dev/pyhfdevenv/lib/python3.6/site-packages/bravado/http_future.py", line 352, in unmarshal_response
    raise_on_expected(incoming_response)
  File "/Users/lukasheinrich/Code/pyhfdev/dev/pyhfdevenv/lib/python3.6/site-packages/bravado/http_future.py", line 418, in raise_on_expected
    swagger_result=http_response.swagger_result)
bravado.exception.HTTPNotFound: 404 Not Found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "../reaaaanaaaa.py", line 107, in <module>
    my_workflow_name, output_filename, my_reana_token)
  File "/Users/lukasheinrich/Code/pyhfdev/dev/pyhfdevenv/lib/python3.6/site-packages/reana_client/api/client.py", line 348, in download_file
    raise Exception(e.response.json()['message'])
Exception: results/plot.png does not exist.

It just needs a loop that waits until the workflow is finished:


import time
from reana_client.api.client import get_workflow_status

# Poll the workflow status until it is `finished`
# (`reana-client status -w $my_workflow_name`)
# Note: this loops forever if the workflow never reaches `finished`
while True:
    status_details = get_workflow_status(my_workflow_name,
                                         my_reana_token)
    if status_details['status'] == 'finished':
        break
    print('sleep', status_details['status'])
    time.sleep(2)
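
A slightly more defensive variant of this waiting loop is sketched below. It stops on any terminal status and gives up after a timeout, so that a workflow that never reaches `finished` does not block the script forever. The function name `wait_for_workflow`, its parameters and the default timeout are made up for this illustration. The status names other than `finished` (`failed`, `stopped`, `deleted`) are an assumption about what REANA reports, so please adjust them to your deployment.

```python
import time


def wait_for_workflow(get_status, poll_seconds=2.0, timeout_seconds=3600.0,
                      terminal=("finished", "failed", "stopped", "deleted")):
    """Poll `get_status()` until it returns a terminal status.

    Hypothetical helper (not part of reana-client). `get_status` is any
    zero-argument callable returning the current status string. The
    `terminal` names besides 'finished' are assumptions.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "workflow still '%s' after %s seconds"
                % (status, timeout_seconds))
        time.sleep(poll_seconds)


# Usage with the names from the script above:
# final_status = wait_for_workflow(
#     lambda: get_workflow_status(my_workflow_name,
#                                 my_reana_token)['status'])
# if final_status != 'finished':
#     raise RuntimeError('workflow ended as ' + final_status)
```

Passing the status lookup in as a callable keeps the loop independent of the client, which also makes it easy to try out without a REANA server.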



See also our brand new blog post and the corresponding Jupyter notebook, which give a detailed step-by-step guide on how to run REANA workflows using the Python API.