Reana-client download all files

See the issue in
https://atlas-talk.web.cern.ch/t/download-files-from-reana-client/740/2

Hi @eballabe

I cannot access the ATLAS talk web site, so I cannot see exactly what the issue at hand is. But assuming that you would like to download all files from the workspace, there are several techniques that can help:

  1. Download all files from a certain directory (and its subdirectories) of the workspace as a single zip archive, by using:

    $ reana-client download -w myanalysisrun.42 mydir

  2. If the files are too big to be zipped and sent efficiently in one single archive, you can write a shell loop such as:

    $ for file in $(reana-client ls -w myanalysisrun.42 | \
                         grep -v NAME | awk '{print $1}'); \
      do \
          reana-client download -w myanalysisrun.42 "$file"; \
      done

This is just an example to customise, e.g. you could grep only the filenames that you are really interested in transferring, etc.
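As a rough sketch of such a customisation, the filtering step could be pulled out into a small helper. The function names `select_files` and `download_matching` are made up for this example, and the sketch assumes that the listing has a `NAME` header row, that the file name is the first column, and that file names contain no spaces:

```shell
# Hypothetical helper: read a `reana-client ls` listing on stdin and print
# the first column (assumed to be the file name) of every row except the
# NAME header, keeping only names that match the extended regex in $1.
select_files() {
    awk '$1 != "NAME" { print $1 }' | grep -E -- "$1"
}

# Hypothetical wrapper: download, one by one, the files of workflow $1
# whose names match the extended regex $2.
download_matching() {
    workflow="$1"
    pattern="$2"
    reana-client ls -w "$workflow" | select_files "$pattern" |
        while read -r file; do
            reana-client download -w "$workflow" "$file"
        done
}
```

With these definitions, something like `download_matching myanalysisrun.42 '\.png$'` would fetch only the PNG files, one at a time.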

Does this help to address the problem?

Hi @tiborsimko

thank you for the reply! Sorry that you couldn’t see the issue. It was about downloading all files from a workspace, without downloading directory by directory or file by file. According to the reana-client docs, if no folder or file is specified, reana-client should download all files:

reana-client download # download all output files

However, in my case only one file is downloaded. In any case, I can download all files with the proposed solutions, thanks.

Ah, I see, thanks. The issue is that the reana-client download command downloads by default all output files (note the word), i.e. the files that are declared in reana.yaml as the workflow outputs, via the outputs clause. For example, in reana-demo-atlas-recast:

version: 0.6.0
inputs:
  parameters:
    did: 404958
    xsec_in_pb: 0.00122
    dxaod_file: https://recastwww.web.cern.ch/recastwww/data/reana-recast-demo/mc15_13TeV.123456.cap_recast_demo_signal_one.root
  directories:
    - workflow
workflow:
  type: yadage
  file: workflow/workflow.yml
outputs:
  files:
    - statanalysis/fitresults/limit.png

Only limit.png is declared as a workflow output, so this is the only file that gets downloaded when one runs reana-client download without any argument.

(We do the inputs/temporary-files/outputs distinction so as not to unnecessarily download any big temporary files once the workflow finishes.)

If you enrich your reana.yaml and list all the interesting outputs in that clause, then this would also solve the issue.
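For instance, the outputs clause of the example above could be extended along these lines. This is only a sketch: the extra file and the directory are hypothetical names, and I am assuming that your REANA version accepts a `directories` key under `outputs` alongside `files`.

```yaml
outputs:
  files:
    - statanalysis/fitresults/limit.png
    - statanalysis/fitresults/limit_data.json   # hypothetical extra output file
  directories:
    - statanalysis/plots                        # hypothetical output directory
```

After that, a plain `reana-client download -w myanalysisrun.42` should pick up everything declared there.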
