- Type: Bug
- Resolution: Fixed
- Priority: Major
- Affects Version/s: master, 4.8.1
- Component/s: Pegasus Planner
For 4.9, we have introduced integrity checking.
When we turned it on for our internal test cases, an old Montage test case failed.
[bamboo@colo-vm63 20180228T083022-0800]$ pwd
/lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT49-T27B/test/core/027-montage-bypass-staging-site-condorio/work/2018-02-28_082934/work/bamboo/pegasus/montage/20180228T083022-0800
[bamboo@colo-vm63 20180228T083022-0800]$ more 00/00/mJPEG_ID000232.err.000
2018-02-28 08:46:02: PegasusLite: version 4.9.0dev
2018-02-28 08:46:02: Executing on host compute-1.isi.edu
########################[Pegasus Lite] Setting up workdir ########################
2018-02-28 08:46:02: Not creating a new work directory as it is already set to /var/lib/condor/execute/dir_7084
##############[Pegasus Lite] Figuring out the worker package to use ##############
2018-02-28 08:46:02: Warning: Pegasus binaries in /usr/bin do not match Pegasus version used for current workflow
2018-02-28 08:46:02: Downloading Pegasus worker package from http://download.pegasus.isi.edu/pegasus/4.9.0dev/pegasus-worker-4.9.0dev-x86_64_rhel_7.tar.gz
##############[Pegasus Lite] Setting the xbit for executables staged ##############
##############[Pegasus Lite] Checking file integrity for input files ##############
Integrity check: images_20180228_082935_1150930.tbl: Expected checksum (c7901af8d99dd8910c803e31d2a7178ed52e13c53fed38356a6e489b88775a1b) does not match the calculated checksum (e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855) (timing: 0.037)
2018-02-28 08:46:04: Last command exited with 1
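The calculated checksum in the log above is a recognizable value: it is the SHA-256 digest of empty input, which is what a zero-byte file would produce. A minimal check in Python, assuming the integrity check reports SHA-256 digests (the 64-character hex strings suggest so); the variable names are illustrative only:

```python
import hashlib

# Calculated checksum as reported in the PegasusLite log above
# (the line wrap in the pasted log has been removed).
calculated = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

# SHA-256 digest of zero bytes of input, i.e. of an empty file.
empty_digest = hashlib.sha256(b"").hexdigest()

# True when the checksum in the log is that of an empty file.
is_empty_file_checksum = calculated == empty_digest
print(is_empty_file_checksum)
```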
The file in question was being overwritten with a zero-byte file by Condor I/O.
This was traced to a faulty job description in the DAX:
<job id="ID000232" name="mJPEG" version="3.0" level="1" dv-name="mJPEG1" dv-version="1.0">
<argument>
-ct 1
-gray <filename file="shrunken_20180228_082935_1150930.fits"/>
min max gaussianlog
-out <filename file="shrunken_20180228_082935_1150930.jpg"/>
</argument>
<uses file="shrunken_20180228_082935_1150930.fits" link="input" transfer="true"/>
<uses file="shrunken_20180228_082935_1150930.jpg" link="output" register="true" transfer="true"/>
<uses file="dag_20180228_082935_1150930.xml" link="input" transfer="true"/>
<uses file="dag_20180228_082935_1150930.xml" link="output" register="false" transfer="true"/>
<uses file="images_20180228_082935_1150930.tbl" link="input" transfer="true"/>
<uses file="images_20180228_082935_1150930.tbl" link="output" register="false" transfer="true"/>
</job>
images_20180228_082935_1150930.tbl is listed as both an input and an output of the job. The same is true of dag_20180228_082935_1150930.xml.
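A rough sketch of how such a job description could be flagged before planning: collect the `link` values declared for each file in a job's `<uses>` elements and report any file that appears as both input and output. This is not the planner's actual check; the helper names `local_name` and `files_with_conflicting_links` are hypothetical, and the embedded XML is a trimmed copy of the job above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed copy of the <uses> entries from the mJPEG job above.
JOB_XML = """
<job id="ID000232" name="mJPEG">
  <uses file="shrunken_20180228_082935_1150930.fits" link="input" transfer="true"/>
  <uses file="shrunken_20180228_082935_1150930.jpg" link="output" register="true" transfer="true"/>
  <uses file="dag_20180228_082935_1150930.xml" link="input" transfer="true"/>
  <uses file="dag_20180228_082935_1150930.xml" link="output" register="false" transfer="true"/>
  <uses file="images_20180228_082935_1150930.tbl" link="input" transfer="true"/>
  <uses file="images_20180228_082935_1150930.tbl" link="output" register="false" transfer="true"/>
</job>
"""

def local_name(tag):
    """Drop an XML namespace prefix such as '{uri}uses', if present."""
    return tag.rsplit("}", 1)[-1]

def files_with_conflicting_links(job):
    """Return the sorted names of files the job lists as both input and output."""
    links = defaultdict(set)
    for child in job:
        if local_name(child.tag) == "uses":
            links[child.get("file")].add(child.get("link"))
    return sorted(name for name, seen in links.items()
                  if {"input", "output"} <= seen)

job = ET.fromstring(JOB_XML)
conflicts = files_with_conflicting_links(job)
print(conflicts)
```

For the job in this ticket, the check reports both the `.tbl` and the `.xml` file.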