- Type: Sub-task
- Resolution: Fixed
- Priority: Major
- Affects Version/s: master
- Component/s: Integrity Checking, Pegasus Planner
- None
Pegasus has a notion of a checkpoint file that users can designate for their jobs. For example:
<job id="j1" namespace="pegasus" name="checkpoint" version="4.0">
  <argument>-o <file name="f.b1"/> -o <file name="f.b2"/></argument>
  <uses name="f.a" link="input"/>
  <uses name="f.b1" link="output" transfer="true" register="true"/>
  <uses name="f.b2" link="output" transfer="true" register="true"/>
  <uses name="test.checkpoint" link="checkpoint" transfer="true" register="true"/>
</job>
The semantics of this file are that whenever a job is retried, the most recently updated copy of the checkpoint file is made available to it.
When the application code succeeds, it is expected to delete the checkpoint file.
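To illustrate the application side of this contract, here is a minimal Python sketch of a job that resumes from the checkpoint file on retry and deletes it on success. This is not Pegasus code: the function names, the fail_at parameter, and the JSON layout of the checkpoint are assumptions made for illustration. Only the file name test.checkpoint comes from the job description above.

```python
import json
import os

CHECKPOINT = "test.checkpoint"  # file name taken from the job description above


def load_progress(path=CHECKPOINT):
    """Return the last completed step, or 0 if no checkpoint exists."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["step"]  # JSON layout is an assumption


def save_progress(step, path=CHECKPOINT):
    """Write the checkpoint via a temp file so a crash never leaves a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step}, f)
    os.replace(tmp, path)  # atomic rename over the previous checkpoint


def run(total_steps, path=CHECKPOINT, fail_at=None):
    """Run the remaining steps; return the step the run resumed from.

    fail_at is a hypothetical hook that simulates a job failure.
    """
    start = load_progress(path)
    for step in range(start + 1, total_steps + 1):
        if step == fail_at:
            raise RuntimeError(f"simulated failure at step {step}")
        # ... real work for this step would go here ...
        save_progress(step, path)
    # On success the application deletes its own checkpoint file.
    if os.path.exists(path):
        os.remove(path)
    return start
```

If a first attempt fails partway through, the checkpoint file left behind records the last completed step. A retry that finds that file in place picks up from the following step and removes the file once all steps finish.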