Pegasus / PM-588

PegasusLite seems to think binaries are in a "bin" sub-directory when they are not


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical
    • master, 4.0, 4.1
    • Affects Version/s: master
    • Component/s: Pegasus Planner
    • None

      pegasus-lite in condorio mode appears to do something unexpected. The remote startd reports Condor failures saying that it cannot transfer files. These failures, however, stem from the fact that nothing was executed. When I look at the job's execute directory on the worker node:

      vm-8.sdsc.futuregrid.org:/var/lib/condor # dir execute/dir_5412
      total 22804
      drwxr-xr-x 2 nobody nobody 4096 Mar 12 15:00 ./
      drwxr-xr-x 4 condor condor 4096 Mar 12 15:00 ../
      -rwxr-xr-x 1 nobody nobody 760 Mar 12 15:00 condor_exec.exe*
      -rw-r--r-- 1 nobody nobody 438 Mar 12 15:00 _condor_stderr
      -rw-r--r-- 1 nobody nobody 0 Mar 12 15:00 _condor_stdout
      -rw-r--r-- 1 condor condor 5512 Mar 12 15:00 .job.ad
      -rw-r--r-- 1 condor condor 3319 Mar 12 15:00 .machine.ad
      -rwxr-xr-x 1 nobody nobody 3531450 Mar 12 15:00 mDiff*
      -rwxr-xr-x 1 nobody nobody 50429 Mar 12 15:00 mDiffFit-3.0*
      -rwxr-xr-x 1 nobody nobody 3093104 Mar 12 15:00 mFitplane*
      -rw-r--r-- 1 nobody nobody 4141440 Mar 12 15:00 p2mass-atlas-990502s-j1430092_area.fits
      -rw-r--r-- 1 nobody nobody 4141440 Mar 12 15:00 p2mass-atlas-990502s-j1430092.fits
      -rw-r--r-- 1 nobody nobody 4152960 Mar 12 15:00 p2mass-atlas-990502s-j1440186_area.fits
      -rw-r--r-- 1 nobody nobody 4152960 Mar 12 15:00 p2mass-atlas-990502s-j1440186.fits
      -rwxr-xr-x 1 nobody nobody 7258 Mar 12 15:00 pegasus-lite-common.sh*
      -rw-r--r-- 1 nobody nobody 304 Mar 12 15:00 region_20080505_143233_14944.hdr

      However, the script seems to expect the binaries in a "bin" sub-directory, although they were staged directly into the job directory. The following can be found in _condor_stderr:

      vm-8.sdsc.futuregrid.org:/var/lib/condor # cat execute/dir_5412/_condor_stderr
      PegasusLite: version 4.1.0cvs
      2012-03-12 15:00:08: Not creating a new work directory as it is already set to /var/lib/condor/execute/dir_5412
      2012-03-12 15:00:09: Using existing Pegasus binaries in /opt/pegasus/default/bin
      /bin/chmod: cannot access `bin/mFitplane': No such file or directory
      /bin/chmod: cannot access `bin/mDiff': No such file or directory
      2012-03-12 15:00:09: FAILURE: Last command exited with 1
      PegasusLite: exitcode 1
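
The two chmod errors suggest that the wrapper addresses the executables as bin/mFitplane and bin/mDiff, while the listing shows them staged flat into the job directory. As a rough sketch only, and not PegasusLite's actual code, a more defensive lookup could check both locations before calling chmod. The function name find_staged_exe is hypothetical, and the choice to skip missing files instead of failing is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of PegasusLite: locate a staged executable
# either in the job directory itself or in a "bin" sub-directory.
find_staged_exe() {
    name=$1
    for candidate in "./$name" "bin/$name"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "executable $name not staged in . or bin/" >&2
    return 1
}

# Mark each executable named in the report as runnable wherever it landed.
# Missing files are reported on stderr and skipped (an assumption; a real
# wrapper would probably want to fail the job instead).
for exe in mFitplane mDiff; do
    if path=$(find_staged_exe "$exe"); then
        chmod +x "$path"
    fi
done
```

Run from the execute directory shown above, this would resolve both names to ./mFitplane and ./mDiff. It only works around the symptom; whether the planner or the transfer setup should produce the bin/ layout is the question this report raises.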

            Assignee:
            vahi Karan Vahi
            Reporter:
            voeckler Jens Voeckler