support bypass of first level staging for input files



    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: master, 4.3
    • Affects Version/s: master
    • Component/s: Pegasus Planner
    • Labels: None

      We need to add support so that users can set up their workflows in such a way that the input datasets are staged directly to the worker nodes, rather than being passed through the staging site.

      This behavior is enabled by a boolean property called

      pegasus.transfer.bypass.input.staging

      This is useful in the S3 case, where data may already reside in an S3 bucket, or in any PegasusLite mode where the data is directly accessible on the worker nodes.
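      For example, a minimal configuration sketch (the file names, bucket, and replica catalog entry below are illustrative assumptions; only the property name itself is taken from this issue):

          # pegasus.properties (sketch): let inputs bypass the staging site
          pegasus.transfer.bypass.input.staging = true

          # File-based replica catalog entry (hypothetical): the input already lives in S3,
          # so with bypass enabled the job can fetch it directly on the worker node
          # instead of going through the staging site
          f.input    s3://user@amazon/my-bucket/f.input    pool="condorpool"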

      We need to make sure that this is supported in both PegasusLite modes:

    • condorio (using Condor to transfer the required files)
      For the condorio mode, we can only bypass those files whose URLs end with the LFN name. So, for example, with executable staging turned on, the executables themselves cannot be bypassed most of the time. In Montage, for example, the fits files don't follow the same naming scheme as the LFNs in the DAX. Only file URLs that exist on the submit host with the pool attribute can be staged directly in the Condor IO mode (see the replica catalog sketch after this list).
    • nonsharedfs (using pegasus-transfer to transfer the files)
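      To illustrate the condorio constraint, here is a sketch of two file-based replica catalog entries (the LFNs and paths are made-up examples; the pool attribute is the one referred to above):

          # Bypass possible in condorio: a file URL on the submit host whose basename
          # matches the LFN "f.input"
          f.input        file:///data/inputs/f.input                            pool="local"

          # Bypass not possible in condorio: the PFN basename does not match the LFN
          # used in the DAX (as with the Montage fits files)
          j1420198.fits  file:///data/2mass/2mass-atlas-990502s-j1420198.fits   pool="local"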

      Also, when bypass is enabled, we need to make sure that the cleanup algorithm does not delete the original input files mentioned in the replica catalog.
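      The rule the cleanup algorithm must follow can be sketched as below (a minimal illustration assuming simple set-like inputs; these names are not the planner's actual data structures):

          # Illustrative Python sketch of the cleanup rule, not the planner implementation.
          def files_safe_to_clean(staged_lfns, replica_catalog_lfns, bypassed_lfns):
              """Return LFNs whose staged copies may be removed by cleanup jobs.

              staged_lfns          -- LFNs the workflow placed on the staging site or worker node
              replica_catalog_lfns -- LFNs whose PFNs are the original, user-provided inputs
              bypassed_lfns        -- LFNs that were never copied to the staging site
              """
              safe = set()
              for lfn in staged_lfns:
                  # A bypassed input has no intermediate copy to remove, and its original
                  # replica catalog location must never be deleted.
                  if lfn in bypassed_lfns and lfn in replica_catalog_lfns:
                      continue
                  safe.add(lfn)
              return safe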

            Assignee:
            Karan Vahi
            Reporter:
            Karan Vahi
            Archiver:
            Rajiv Mayani

              Created:
              Updated:
              Resolved:
              Archived: