  1. Pegasus
  2. PM-698

Support bypass of first-level staging for input files


    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major
    • master, 4.3
    • Affects Version/s: master
    • Component/s: Pegasus Planner
    • None

      We need to add support for users to set up their workflows so that input datasets are staged directly to the worker node instead of passing through the staging site.

      This is enabled by a boolean property:

      pegasus.transfer.bypass.input.staging

      This is useful in the S3 case, where data may already reside in an S3 bucket, or in any PegasusLite mode where the data is directly accessible on the worker nodes.

      We need to make sure that this is supported in both PegasusLite modes:

      • condorio (using Condor to transfer the required files)
        For the condorio mode, we can only bypass those files whose URLs end with the LFN. For example, with executable staging turned on, the executables themselves cannot be bypassed most of the time. In Montage, for example, the FITS files don't have the same naming scheme as the LFNs in the DAX. Only file URLs that exist on the submit host with the pool attribute can be staged directly in the condorio mode.
      • nonsharedfs (using pegasus-transfer to transfer the files)
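
The condorio restriction above can be sketched as a small eligibility check. This is only an illustration, not the planner's code: the function name `can_bypass_condorio` and its parameters are hypothetical, and it assumes that the submit host corresponds to a site handle of `local` in the replica catalog. It also does not check that the file actually exists on disk.

```python
from urllib.parse import urlparse


def can_bypass_condorio(lfn, pfn, site, submit_site="local"):
    """Return True if a replica could bypass the staging site in condorio mode.

    Hypothetical helper. Assumes the submit host is identified by the
    site handle `submit_site` (assumed here to be "local").
    """
    parsed = urlparse(pfn)
    # Only file URLs can be handed to Condor file transfer directly.
    if parsed.scheme != "file":
        return False
    # The replica must be registered against the submit host.
    if site != submit_site:
        return False
    # The URL must end with the LFN, since the transferred file keeps
    # the name of the source path.
    return parsed.path.endswith("/" + lfn)
```

Under these assumptions, a replica such as `file:///data/f.a` registered for LFN `f.a` would qualify, while a FITS file whose physical name differs from its LFN, or a replica behind an `s3://` URL, would still go through the staging site in this mode.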

      Also, in this case we need to make sure that the cleanup algorithm does not delete the original input files mentioned in the replica catalog.
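
One way to picture the cleanup requirement is as a filter over the URLs that the cleanup algorithm wants to delete. This is a rough sketch under stated assumptions: `cleanup_targets` is a hypothetical name, and the replica catalog is modelled as a plain dictionary from LFN to a list of original URLs, which is not how the planner stores it.

```python
def cleanup_targets(candidate_urls, replica_catalog):
    """Drop any candidate URL that is an original replica catalog entry.

    Hypothetical helper. `replica_catalog` maps each LFN to the list of
    URLs originally registered for it.
    """
    # Collect every original input location registered by the user.
    originals = {pfn for pfns in replica_catalog.values() for pfn in pfns}
    # Keep only URLs that are not original inputs, preserving order.
    return [url for url in candidate_urls if url not in originals]
```

With bypass enabled, the URL a job reads from can be the original replica itself, so a check of this kind keeps the user's source data out of the deletion list.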

            Assignee:
            vahi Karan Vahi
            Reporter:
            vahi Karan Vahi