  1. Pegasus
  2. PM-1327

bypass input file staging broken for container execution


      Hi Karan,

      Bypass input staging seems to be broken in Pegasus 4.9. If I plan a DAX with 4.8.4, the planner generates stage-in jobs that symlink the frame files. An example generated with 4.8.4 is on sugwg-condor in

      /usr1/dbrown/pycbc-tmp.jQPabttzrE/work/o1-analysis-test-latest-LOSC_16_V1-main_ID0000001

      with the workflow in

      /home/dbrown/projects/o1-open-catalog/analysis/analysis-test/o1-analysis-test-latest-LOSC_16_V1/output

      (pycbc-opengw) [dbrown@sugwg-condor o1-analysis-test-latest-LOSC_16_V1-main_ID0000001]$ cat stage_in_remote_local_2_0.in
      [
      { "type": "transfer",
      "lfn": "112848/L-L1_LOSC_16_V1-1128480768-4096.gwf",
      "id": 1,
      "src_urls": [

      { "site_label": "local", "url": "file:///cvmfs/gwosc.osgstorage.org/gwdata/O1/strain.16k/frame.v1/L1/1128267776/L-L1_LOSC_16_V1-1128480768-4096.gwf", "priority": 400 }

      ],
      "dest_urls": [

      { "site_label": "local", "url": "symlink:///home/dbrown/projects/o1-open-catalog/analysis/analysis-test/o1-analysis-test-latest-LOSC_16_V1/output/local-site-scratch/work/./o1-analysis-test-latest-LOSC_16_V1-main_ID0000001/./112848/L-L1_LOSC_16_V1-1128480768-4096.gwf" }

      ] }
      ]

      However, planning exactly the same workflow with 4.9.1dev gives me a gsiftp destination URL instead of a symlink. An example is on sugwg-scitokens in

      /home/dbrown/projects/o1-open-catalog/analysis/analysis-test/o1-analysis-test-latest-LOSC_16_V1/output/submitdir/work/o1-analysis-test-latest-LOSC_16_V1-main_ID0000001

      with the workflow in

      /home/dbrown/projects/o1-open-catalog/analysis/analysis-test-broken/o1-analysis-test-latest-LOSC_16_V1/output

      (pycbc-opengw) [dbrown@sugwg-scitokens o1-analysis-test-latest-LOSC_16_V1-main_ID0000001]$ cat stage_in_remote_local_2_0.in
      [
      { "type": "transfer",
      "linkage": "input",
      "lfn": "112848/L-L1_LOSC_16_V1-1128480768-4096.gwf",
      "id": 1,
      "src_urls": [

      { "site_label": "local", "url": "file:///cvmfs/gwosc.osgstorage.org/gwdata/O1/strain.16k/frame.v1/L1/1128267776/L-L1_LOSC_16_V1-1128480768-4096.gwf", "priority": 400 }

      ],
      "dest_urls": [

      { "site_label": "local", "url": "gsiftp://sugwg-scitokens.phy.syr.edu/home/dbrown/projects/o1-open-catalog/analysis/analysis-test/o1-analysis-test-latest-LOSC_16_V1/output/local-site-scratch/work/./o1-analysis-test-latest-LOSC_16_V1-main_ID0000001/./112848/L-L1_LOSC_16_V1-1128480768-4096.gwf" }

      ] }
      ]
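
      The difference between the two files comes down to the scheme of the destination URL (symlink:// from 4.8.4, gsiftp:// from 4.9.1dev). A minimal sketch for checking this across many stage-in files is below. It assumes only the JSON layout visible in the two listings above (a list of entries with "type", "lfn" and "dest_urls"); the function names and the sample data are hypothetical and not part of Pegasus.

```python
import json
from urllib.parse import urlsplit


def dest_schemes(transfer_spec_text):
    """Map each lfn in a transfer spec to the URL schemes of its destinations."""
    schemes = {}
    for entry in json.loads(transfer_spec_text):
        # Only look at transfer entries; skip anything else in the list.
        if entry.get("type") != "transfer":
            continue
        schemes[entry["lfn"]] = [
            urlsplit(dest["url"]).scheme for dest in entry.get("dest_urls", [])
        ]
    return schemes


def non_symlinked(transfer_spec_text):
    """Return the lfns that have at least one non-symlink destination."""
    return sorted(
        lfn
        for lfn, found in dest_schemes(transfer_spec_text).items()
        if any(scheme != "symlink" for scheme in found)
    )


# Hypothetical sample data shaped like the listings above.
sample = json.dumps([
    {"type": "transfer", "lfn": "a.gwf", "id": 1,
     "src_urls": [{"site_label": "local", "url": "file:///cvmfs/example/a.gwf"}],
     "dest_urls": [{"site_label": "local", "url": "symlink:///scratch/a.gwf"}]},
    {"type": "transfer", "lfn": "b.gwf", "id": 2,
     "src_urls": [{"site_label": "local", "url": "file:///cvmfs/example/b.gwf"}],
     "dest_urls": [{"site_label": "local", "url": "gsiftp://host.example.org/scratch/b.gwf"}]},
])
```

      Passing the contents of a stage_in_remote_local_2_0.in file to non_symlinked() would list the frames that are being copied rather than symlinked.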

      There are no differences between the properties files (the diff below prints nothing)

      (pycbc-opengw) [dbrown@sugwg-condor analysis]$ diff -aru analysis-test*/*/output/pegasus-properties.conf

      and both have bypass turned on

      (pycbc-opengw) [dbrown@sugwg-condor analysis]$ grep bypass analysis-test*/*/output/pegasus-properties.conf
      analysis-test-broken/o1-analysis-test-latest-LOSC_16_V1/output/pegasus-properties.conf:pegasus.transfer.bypass.input.staging=true
      analysis-test/o1-analysis-test-latest-LOSC_16_V1/output/pegasus-properties.conf:pegasus.transfer.bypass.input.staging=true
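
      The same two checks can be scripted. The sketch below is a deliberately simplified reader that handles only plain key=value lines and comments. It does not implement the full Java properties syntax (no ':' separators, no line continuations), and the function names are hypothetical.

```python
def read_properties(text):
    """Parse simple key=value lines, ignoring blank lines and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an '=' separator
            props[key.strip()] = value.strip()
    return props


def differing_keys(left_text, right_text):
    """Return the keys whose values differ, or that exist in only one file."""
    left = read_properties(left_text)
    right = read_properties(right_text)
    return sorted(
        key for key in left.keys() | right.keys()
        if left.get(key) != right.get(key)
    )


# The property name comes from the grep output above; the file contents are made up.
good = "# planner settings\npegasus.transfer.bypass.input.staging=true\n"
broken = "pegasus.transfer.bypass.input.staging = true\n"
```

      An empty result from differing_keys() corresponds to the empty diff output, and read_properties() can confirm that pegasus.transfer.bypass.input.staging is set to true in each file.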

      There also seems to have been an earlier regression that introduced the symlinking, as older versions didn't even create a symlink from /cvmfs to the local site scratch. I didn't notice it because the symlink is a trivial operation and the scratch directory gets cleaned up when the workflow finishes. Looking at an even older workflow generated with 4.7.4 on sugwg-osg in

      /home/dbrown/projects/aligo/o2/analysis-21-extended/o2-c00-analysis-21-extended-v1.7.9-vdf-5c09540-auto-dch-gates-4e17592/output/submitdir/work

      there are no stage-in jobs for the frames at all.

      Cheers,
      Duncan.

            Assignee: Karan Vahi (vahi)
            Reporter: Duncan Brown (dbrown)