Pegasus / PM-1608

Data dependencies between DAX jobs and compute jobs in a workflow


      One of the current disconnects for hierarchical workflows is that Pegasus does not automatically handle data dependencies between a DAX job (which results in a new workflow being run) and a compute job in the same DAX.

      For example, consider a workflow W with jobs D -> C, where D is a DAX job and C is a compute job, and the sub-workflow invoked by D creates a file that job C requires. Currently, the planner fails in this case.

      The proposed fix is to allow users to designate output files on the DAX job, as for any other job in the workflow. Pegasus will then ensure that, when the sub-workflow runs, the output files required by job C are transferred to the scratch directory of workflow W, of which C is a part.
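
The dependency gap and the proposed fix can be illustrated with a small toy model. This is only a sketch and does not use the Pegasus API: the `Job` class, the helper functions, and the file name `f.out` are all hypothetical, and the real planner's checks are certainly more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Toy stand-in for a workflow job (hypothetical, not a Pegasus class)."""
    name: str
    kind: str  # "dax" for a sub-workflow job, "compute" otherwise
    inputs: set = field(default_factory=set)
    outputs: set = field(default_factory=set)


def unresolved_inputs(jobs):
    """Map each job name to the inputs that no job in the workflow produces."""
    produced = set()
    for job in jobs:
        produced |= job.outputs
    return {j.name: j.inputs - produced for j in jobs if j.inputs - produced}


def data_edges(jobs):
    """Derive parent -> child edges from matching output and input files."""
    return {
        (p.name, c.name)
        for p in jobs
        for c in jobs
        if p is not c and p.outputs & c.inputs
    }


def files_to_stage(jobs):
    """Files a DAX job's sub-workflow must deliver to the parent's scratch dir."""
    staged = set()
    for dax in (j for j in jobs if j.kind == "dax"):
        for other in jobs:
            if other is not dax:
                staged |= dax.outputs & other.inputs
    return staged


c = Job("C", "compute", inputs={"f.out"})

# Today: D cannot say what its sub-workflow produces, so C's input is unresolved.
d_now = Job("D", "dax")
before = unresolved_inputs([d_now, c])

# Proposed: D declares f.out as an output, like any other job in W.
d_proposed = Job("D", "dax", outputs={"f.out"})
after = unresolved_inputs([d_proposed, c])
edges = data_edges([d_proposed, c])
staged = files_to_stage([d_proposed, c])
```

In this model, declaring the output on D both resolves C's input and yields the set of files that would need to be transferred into W's scratch directory once the sub-workflow finishes.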

            vahi Karan Vahi