bypass input file staging broken for container execution

      Hi Karan,

      Bypass input staging seems to be broken in Pegasus 4.9. If I plan a DAX with 4.8.4, I get stage-in jobs that symlink the frame files. An example generated with 4.8.4 is on sugwg-condor in

      /usr1/dbrown/pycbc-tmp.jQPabttzrE/work/o1-analysis-test-latest-LOSC_16_V1-main_ID0000001

      with the workflow in

      /home/dbrown/projects/o1-open-catalog/analysis/analysis-test/o1-analysis-test-latest-LOSC_16_V1/output

      (pycbc-opengw) [dbrown@sugwg-condor o1-analysis-test-latest-LOSC_16_V1-main_ID0000001]$ cat stage_in_remote_local_2_0.in
      [
      { "type": "transfer",
      "lfn": "112848/L-L1_LOSC_16_V1-1128480768-4096.gwf",
      "id": 1,
      "src_urls": [

      { "site_label": "local", "url": "file:///cvmfs/gwosc.osgstorage.org/gwdata/O1/strain.16k/frame.v1/L1/1128267776/L-L1_LOSC_16_V1-1128480768-4096.gwf", "priority": 400 }

      ],
      "dest_urls": [

      { "site_label": "local", "url": "symlink:///home/dbrown/projects/o1-open-catalog/analysis/analysis-test/o1-analysis-test-latest-LOSC_16_V1/output/local-site-scratch/work/./o1-analysis-test-latest-LOSC_16_V1-main_ID0000001/./112848/L-L1_LOSC_16_V1-1128480768-4096.gwf" }

      ] }
      ]

      However, planning exactly the same workflow with 4.9.1dev gives me a gsiftp destination URL instead of a symlink URL. An example is on sugwg-scitokens in

      /home/dbrown/projects/o1-open-catalog/analysis/analysis-test/o1-analysis-test-latest-LOSC_16_V1/output/submitdir/work/o1-analysis-test-latest-LOSC_16_V1-main_ID0000001

      with the workflow in

      /home/dbrown/projects/o1-open-catalog/analysis/analysis-test-broken/o1-analysis-test-latest-LOSC_16_V1/output

      (pycbc-opengw) [dbrown@sugwg-scitokens o1-analysis-test-latest-LOSC_16_V1-main_ID0000001]$ cat stage_in_remote_local_2_0.in
      [
      { "type": "transfer",
      "linkage": "input",
      "lfn": "112848/L-L1_LOSC_16_V1-1128480768-4096.gwf",
      "id": 1,
      "src_urls": [

      { "site_label": "local", "url": "file:///cvmfs/gwosc.osgstorage.org/gwdata/O1/strain.16k/frame.v1/L1/1128267776/L-L1_LOSC_16_V1-1128480768-4096.gwf", "priority": 400 }

      ],
      "dest_urls": [

      { "site_label": "local", "url": "gsiftp://sugwg-scitokens.phy.syr.edu/home/dbrown/projects/o1-open-catalog/analysis/analysis-test/o1-analysis-test-latest-LOSC_16_V1/output/local-site-scratch/work/./o1-analysis-test-latest-LOSC_16_V1-main_ID0000001/./112848/L-L1_LOSC_16_V1-1128480768-4096.gwf" }

      ] }
      ]
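A quick way to spot this difference without reading each file by eye is to list the URL scheme of every destination in a stage-in file. The following is only a minimal sketch: it assumes the `.in` file is a JSON list of transfer entries with `lfn` and `dest_urls` keys, as in the two listings above, and the function names `dest_schemes` and `non_symlink_lfns` are hypothetical helpers, not part of Pegasus.

```python
import json
from urllib.parse import urlparse


def dest_schemes(spec_text):
    """Map each LFN to the list of URL schemes used by its destinations.

    Assumes spec_text is a JSON list of transfer entries shaped like the
    stage_in_remote_local_2_0.in listings in this report.
    """
    entries = json.loads(spec_text)
    return {
        entry["lfn"]: [
            urlparse(dest["url"]).scheme for dest in entry.get("dest_urls", [])
        ]
        for entry in entries
    }


def non_symlink_lfns(spec_text):
    """Return the sorted LFNs that have any destination that is not a symlink URL."""
    return sorted(
        lfn
        for lfn, schemes in dest_schemes(spec_text).items()
        if any(scheme != "symlink" for scheme in schemes)
    )
```

Passing the contents of `stage_in_remote_local_2_0.in` to `non_symlink_lfns` should give an empty list for the 4.8.4 plan and list the frame file for the 4.9.1dev plan, given the listings above.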

      There are no differences between the properties files

      (pycbc-opengw) [dbrown@sugwg-condor analysis]$ diff -aru analysis-test*/*/output/pegasus-properties.conf

      and both have bypass turned on

      (pycbc-opengw) [dbrown@sugwg-condor analysis]$ grep bypass analysis-test*/*/output/pegasus-properties.conf
      analysis-test-broken/o1-analysis-test-latest-LOSC_16_V1/output/pegasus-properties.conf:pegasus.transfer.bypass.input.staging=true
      analysis-test/o1-analysis-test-latest-LOSC_16_V1/output/pegasus-properties.conf:pegasus.transfer.bypass.input.staging=true
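The same check as the grep above can be scripted when comparing many planned workflows. This is a rough sketch under stated assumptions: it treats the properties file as simple `key=value` lines (it does not handle the full Java properties syntax such as line continuations or `:` separators), and `read_property` and `bypass_enabled` are hypothetical helper names.

```python
BYPASS_KEY = "pegasus.transfer.bypass.input.staging"


def read_property(conf_text, key):
    """Return the value of key from simple key=value text, or None if absent.

    If the key appears more than once, the last occurrence wins.
    Blank lines and lines starting with '#' or '!' are skipped.
    """
    found = None
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            found = value.strip()
    return found


def bypass_enabled(conf_text):
    """True if the bypass input staging property is set to 'true'."""
    value = read_property(conf_text, BYPASS_KEY)
    return value is not None and value.lower() == "true"
```

Reading each `pegasus-properties.conf` and calling `bypass_enabled` on its contents should return `True` for both workflows here, matching the grep output.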

      There also seems to have been an earlier change in the symlinking behavior: older versions didn't even create a symlink from /cvmfs to the local site scratch. I didn't notice that change because the symlink is a trivial operation and the scratch directory gets cleaned up when the workflow finishes. Looking at an even older workflow generated with 4.7.4 on sugwg-osg in

      /home/dbrown/projects/aligo/o2/analysis-21-extended/o2-c00-analysis-21-extended-v1.7.9-vdf-5c09540-auto-dch-gates-4e17592/output/submitdir/work

      there are no stage-in jobs for the frames at all.

      Cheers,
      Duncan.

            Assignee:
            Karan Vahi
            Reporter:
            Duncan Brown
            Archiver:
            Rajiv Mayani
