Pegasus / PM-1539

Support PANDA GAHP - allow condorio for glite style


      For PanDA integration, a PanDA GAHP that relies on the glite style is under development. This GAHP handles Condor file transfers for jobs launched using PanDA.

      The PanDA-provisioned site entry in the site catalog looks like this:
      <site handle="titan" arch="x86_64" os="linux" osrelease="" osversion="" glibc="">
        <directory type="shared-scratch">
          <file-server protocol="file" url="file://" mount-point="/gpfs/alpine/scratch/psvirin/csc343/pegasus/run" operation="all"/>
          <internal-mount-point mount-point="/gpfs/alpine/scratch/psvirin/csc343/pegasus/run" free-size="" total-size=""/>
        </directory>
        <directory type="shared-storage">
          <file-server protocol="file" url="file://" mount-point="/gpfs/alpine/scratch/psvirin/csc343/pegasus/output" operation="all"/>
          <internal-mount-point mount-point="/gpfs/alpine/scratch/psvirin/csc343/pegasus/output" free-size="" total-size=""/>
        </directory>
        <profile namespace="env" key="PEGASUS_HOME">/gpfs/alpine/csc343/proj-shared/psvirin/pegasus/pegasus-4.9.3</profile>
        <profile namespace="condor" key="grid_resource">batch pbs</profile>
        <profile namespace="condor" key="universe">grid</profile>
        <profile namespace="pegasus" key="queue">normal</profile>
        <profile namespace="pegasus" key="runtime">30000</profile>
        <profile namespace="pegasus" key="style">glite</profile>
      </site>
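
      For context, the condorio mode named in the title is selected through the planner's data configuration property. The ticket does not include the properties file, so the fragment below is only a sketch of the setting that is assumed to be in effect when the site entry above is planned:

      ```properties
      # Assumed pegasus.properties entry (not shown in this ticket).
      # condorio asks the planner to stage job inputs and outputs
      # via Condor file transfers instead of a shared filesystem.
      pegasus.data.configuration = condorio
      ```

      With this setting, jobs on a site whose pegasus style profile is glite would need Condor file transfers to the worker node, which is the combination this ticket asks the planner to allow.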

      Currently, the planner fails with the following error:

      Invalid style glite for the job split_ID0000001
      2020.04.30 16:35:18.384 PDT: [DEBUG] Adding Edge wc_ID0000004 -> stage_out_local_local_1_0
      2020.04.30 16:35:18.384 PDT: [DEBUG] Adding relations for job wc_ID0000005
      2020.04.30 16:35:18.440 PDT: [FATAL ERROR] java.lang.RuntimeException: Unable to generate code
      at edu.isi.pegasus.planner.client.CPlanner.executeCommand(CPlanner.java:627)
      at edu.isi.pegasus.planner.client.CPlanner.executeCommand(CPlanner.java:312)
      at edu.isi.pegasus.planner.client.CPlanner.main(CPlanner.java:199)
      2020.04.30 16:35:18.384 PDT: [DEBUG] Adding Edge wc_ID0000005 -> stage_out_local_local_1_0
      2020.04.30 16:35:18.384 PDT: [DEBUG] Adding relations for job stage_out_local_local_1_0
      Caused by: java.lang.RuntimeException: Unable to modify job split_ID0000001 for worker node execution by SLS backend using Condor File Transfers to the worker node
      at edu.isi.pegasus.planner.code.gridstart.PegasusLite.wrapJobWithPegasusLite(PegasusLite.java:

            Assignee: Karan Vahi (vahi)
            Reporter: Karan Vahi (vahi)