  1. Pegasus
  2. PM-669 support for site catalog version 4 schema
  3. PM-677

The planner-generated cache file should refer only to GET URLs


    • Type: Sub-task
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: master, 4.2
    • Affects Version/s: master
    • Component/s: Pegasus Planner
    • Labels: None

      The cache file tracks all the files on the staging site (staged inputs and created outputs).

      The planner-generated cache file in the submit directory should refer only to the GET URLs.
      Currently, the planner-generated cache file logs the PUT URLs for these files.

      This can create performance issues for hierarchical workflows where the sub-workflows have data dependencies between them.

      For example, the user specifies an HTTP server as the GET interface and SCP as the PUT interface.

      When a child sub-workflow executes, it would expect to retrieve the files from the GET interface, not the PUT interface.

      The PUT URLs are still required by the planner at planning time.
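
The requested behavior can be illustrated with a minimal Python sketch. It is not the planner's actual code: the `FileServer` class, the `url_for` and `cache_lines` helpers, the host names, and the cache line layout (logical file name, URL, site attribute) are all assumptions made for illustration. It only shows the idea that both URLs stay available during planning while the cache file lists the GET URL alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileServer:
    """One file server of a staging site (hypothetical model, not a Pegasus class)."""
    operation: str   # "get" or "put"
    url_prefix: str  # e.g. "http://staging.example.org/scratch"


def url_for(servers, operation, lfn):
    """Return the URL of lfn on the first server that supports the operation."""
    for server in servers:
        if server.operation == operation:
            return f"{server.url_prefix.rstrip('/')}/{lfn}"
    raise LookupError(f"no file server for operation {operation!r}")


def cache_lines(servers, site, lfns):
    """Build cache file entries that refer only to GET URLs."""
    return [f'{lfn} {url_for(servers, "get", lfn)} site="{site}"' for lfn in lfns]


# Example from the description: HTTP for GET, SCP for PUT (hosts are made up).
staging = [
    FileServer("put", "scp://staging.example.org/scratch"),
    FileServer("get", "http://staging.example.org/scratch"),
]

# The planner still needs the PUT URL, e.g. as a stage-out destination,
put_url = url_for(staging, "put", "f.b")
# but the cache file read by a child sub-workflow lists only GET URLs.
entries = cache_lines(staging, "staging", ["f.a", "f.b"])

print(put_url)
print("\n".join(entries))
```

With this split, a child sub-workflow that reads the cache file never sees an SCP URL, while the planner can still look up the PUT URL when it needs one.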

            Assignee: Karan Vahi (vahi)
            Reporter: Karan Vahi (vahi)