Pegasus / PM-1918

Inplace cleanup broken when a sub workflow job and a parent compute job have a data dependency


      As part of PM-1766, support was added for a data dependency between a sub workflow and a parent compute job. However, as implemented, the sub workflow retrieves the file from the staging site (scratch dir) where the parent job runs. When inplace cleanup is enabled, a dependency is missing between the sub workflow job and the cleanup job that removes the file after the parent compute job has completed.

      Hence there is a race condition, and the sub workflow fails at runtime (not at planning time) when it tries to stage the file created by the parent job.

      The image PM-1918-bug.png highlights the bug.

      The job pre-preprocess_ID0000001 generates a file, f.a, that a job in the sub workflow corresponding to job pegasus-plan_ID0000002 requires as input. The cleanup job clean_up_CCG_level_4_0 is responsible for removing f.a. However, the dependency from the sub workflow job pegasus-plan_ID0000002 to clean_up_CCG_level_4_0 is missing, so the cleanup job often runs before the sub workflow has completed, causing the stage-in job of the sub workflow to fail.
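      The missing edge can be illustrated with a minimal Python sketch. It is not Pegasus planner code. The helper missing_cleanup_edges and the dictionaries uses and removes are hypothetical, and the two existing edges are assumed from the description above. Only the job and file names come from this issue. The check is that every job using a file must be an ancestor of the cleanup job that removes it.

```python
from collections import defaultdict


def missing_cleanup_edges(edges, uses, removes):
    """Return (consumer, cleanup_job) pairs where the cleanup job is not
    ordered after a job that uses the file it removes."""
    children = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)

    def reachable(src, dst):
        # Iterative depth-first search from src, looking for dst.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(children[node])
        return False

    missing = []
    for cleanup_job, files in removes.items():
        for name in files:
            for consumer in uses.get(name, ()):
                if not reachable(consumer, cleanup_job):
                    missing.append((consumer, cleanup_job))
    return sorted(missing)


# Assumed edges of the buggy graph: the parent job precedes both the
# sub workflow job and the cleanup job, with no edge between those two.
edges = [
    ("pre-preprocess_ID0000001", "pegasus-plan_ID0000002"),
    ("pre-preprocess_ID0000001", "clean_up_CCG_level_4_0"),
]
uses = {"f.a": ["pre-preprocess_ID0000001", "pegasus-plan_ID0000002"]}
removes = {"clean_up_CCG_level_4_0": ["f.a"]}

buggy = missing_cleanup_edges(edges, uses, removes)

# Adding the edge from the sub workflow job to the cleanup job
# removes the race in this toy graph.
fixed_edges = edges + [("pegasus-plan_ID0000002", "clean_up_CCG_level_4_0")]
fixed = missing_cleanup_edges(fixed_edges, uses, removes)
```

      In this toy graph, buggy holds the single pair (pegasus-plan_ID0000002, clean_up_CCG_level_4_0), and fixed is empty once the edge is added.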

      Another side effect is that an incorrect cleanup job is created that tries to delete the file from site local, not from site CCG where the file is created:
      corbusier:bug-run vahi$ more 045-hierarchy-sharedfs-e/work/local-hierarchy-sharedfs-1686265806/00/00/clean_up_local_level_4_0.in
      [
        {
          "id": 1,
          "type": "remove",
          "target": {
            "site_label": "local",
            "url": "file:///scitech/nas/home/bamboo/test/045-hierarchy-sharedfs-e/work/local-site/scratch/local-hierarchy-sharedfs-1686265806/00/00/f.a",
            "recursive": "False"
          }
        }
      ]
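      A quick way to spot such a misdirected cleanup job is to parse its .in file and compare each target's site_label with the site where the file is expected to live. The sketch below is not a Pegasus tool, and the helper targets_on_unexpected_site is hypothetical. It assumes the .in file has the JSON layout shown above, and it takes the expected site CCG from this issue.

```python
import json

# Contents of clean_up_local_level_4_0.in as shown in this issue.
CLEANUP_IN = """
[
  {
    "id": 1,
    "type": "remove",
    "target": {
      "site_label": "local",
      "url": "file:///scitech/nas/home/bamboo/test/045-hierarchy-sharedfs-e/work/local-site/scratch/local-hierarchy-sharedfs-1686265806/00/00/f.a",
      "recursive": "False"
    }
  }
]
"""


def targets_on_unexpected_site(cleanup_json, expected_site):
    """Return URLs of remove targets whose site_label differs from expected_site."""
    entries = json.loads(cleanup_json)
    return [
        entry["target"]["url"]
        for entry in entries
        if entry.get("type") == "remove"
        and entry["target"]["site_label"] != expected_site
    ]


# f.a is created on site CCG, so a remove target on any other site is suspect.
wrong = targets_on_unexpected_site(CLEANUP_IN, "CCG")
for url in wrong:
    print("unexpected cleanup target:", url)
```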

        1. PM-1918.tgz (162 kB)
        2. PM-1918-bug.png (120 kB)
        3. PM-1918-fix.png (127 kB)

            Assignee: Karan Vahi (vahi)
            Reporter: Karan Vahi (vahi)