- Type: Bug
- Resolution: Cannot Reproduce
- Priority: Critical
- Affects Version/s: master
- Component/s: Pegasus Planner
pegasus-lite in condorio mode appears to do something unexpected. I get Condor failures from the remote startd saying that it cannot transfer files. These failures, however, stem from the fact that nothing was executed, so presumably there were no output files to transfer back. When I look at the job's execute directory on the worker node:
vm-8.sdsc.futuregrid.org:/var/lib/condor # dir execute/dir_5412
total 22804
drwxr-xr-x 2 nobody nobody 4096 Mar 12 15:00 ./
drwxr-xr-x 4 condor condor 4096 Mar 12 15:00 ../
-rwxr-xr-x 1 nobody nobody 760 Mar 12 15:00 condor_exec.exe*
-rw-r--r-- 1 nobody nobody 438 Mar 12 15:00 _condor_stderr
-rw-r--r-- 1 nobody nobody 0 Mar 12 15:00 _condor_stdout
-rw-r--r-- 1 condor condor 5512 Mar 12 15:00 .job.ad
-rw-r--r-- 1 condor condor 3319 Mar 12 15:00 .machine.ad
-rwxr-xr-x 1 nobody nobody 3531450 Mar 12 15:00 mDiff*
-rwxr-xr-x 1 nobody nobody 50429 Mar 12 15:00 mDiffFit-3.0*
-rwxr-xr-x 1 nobody nobody 3093104 Mar 12 15:00 mFitplane*
-rw-r--r-- 1 nobody nobody 4141440 Mar 12 15:00 p2mass-atlas-990502s-j1430092_area.fits
-rw-r--r-- 1 nobody nobody 4141440 Mar 12 15:00 p2mass-atlas-990502s-j1430092.fits
-rw-r--r-- 1 nobody nobody 4152960 Mar 12 15:00 p2mass-atlas-990502s-j1440186_area.fits
-rw-r--r-- 1 nobody nobody 4152960 Mar 12 15:00 p2mass-atlas-990502s-j1440186.fits
-rwxr-xr-x 1 nobody nobody 7258 Mar 12 15:00 pegasus-lite-common.sh*
-rw-r--r-- 1 nobody nobody 304 Mar 12 15:00 region_20080505_143233_14944.hdr
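However, the script seems to expect the binaries in a "bin" subdirectory, whereas the listing shows mDiff and mFitplane staged into the top level of the execute directory and no "bin" directory at all. The following can be found in _condor_stderr: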
vm-8.sdsc.futuregrid.org:/var/lib/condor # cat execute/dir_5412/_condor_stderr
PegasusLite: version 4.1.0cvs
2012-03-12 15:00:08: Not creating a new work directory as it is already set to /var/lib/condor/execute/dir_5412
2012-03-12 15:00:09: Using existing Pegasus binaries in /opt/pegasus/default/bin
/bin/chmod: cannot access `bin/mFitplane': No such file or directory
/bin/chmod: cannot access `bin/mDiff': No such file or directory
2012-03-12 15:00:09: FAILURE: Last command exited with 1
PegasusLite: exitcode 1
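
To check where a staged executable actually landed, something like the following could be run on the worker node. This is only a diagnostic sketch: `find_staged` is a hypothetical helper written for this report, not part of PegasusLite, and it assumes the only two candidate locations are the top level of the execute directory and a "bin" subdirectory beneath it.

```shell
#!/bin/sh
# Hypothetical diagnostic helper, NOT part of PegasusLite.
# Reports where a staged executable sits relative to the given directory:
# prints "bin/<name>", "<name>", or "missing" (and returns 1 if missing).
find_staged() {
    dir=$1
    name=$2
    if [ -e "$dir/bin/$name" ]; then
        # Location the generated script appears to expect.
        echo "bin/$name"
    elif [ -e "$dir/$name" ]; then
        # Location where the listing shows the files were transferred.
        echo "$name"
    else
        echo "missing"
        return 1
    fi
}
```

For example, `find_staged /var/lib/condor/execute/dir_5412 mFitplane` would be expected to print `mFitplane` rather than `bin/mFitplane` for the directory listed above, which matches the chmod errors in _condor_stderr.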