- Type: Bug
- Resolution: Fixed
- Priority: Major
- Affects Version/s: master, 4.5.0
- Component/s: Pegasus Planner
- None
Hi Larne,
The dag file:
/local/user/lppekows/pycbc-tmp.dRq2USWe1Y/work/main_ID0000001/main-0.dag
on atlas2 is invalid. If you look at the end of the file you can see a
partial entry, which suggests that the process writing the dag
terminated partway through. As the PARENT...CHILD entries are written
last, this file has none of them. If you look in:
/local/user/lppekows/pycbc-tmp.dRq2USWe1Y/work/subdax_main_ID0000001.pre.log.000
this theory seems to be confirmed: a failure message for the
dag-writing process is given (out of memory!).
However, when this runs a second time in:
/local/user/lppekows/pycbc-tmp.dRq2USWe1Y/work/subdax_main_ID0000001.pre.log.001
it sees the existing .dag file and just tries to submit it. That looks
like a bug in Pegasus.
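
As a rough illustration of the kind of sanity check that seems to be
missing before resubmitting an existing .dag file, here is a minimal
Python sketch. It is not Pegasus code: the function name
dag_looks_complete is hypothetical, and the heuristic assumes a
workflow with more than one job, so that a complete file ends with a
newline and contains at least one PARENT ... CHILD line.

```python
from pathlib import Path


def dag_looks_complete(dag_path):
    """Heuristic only (hypothetical helper, not part of Pegasus).

    Returns True if the file is non-empty, its last line is terminated
    by a newline, and it has at least one PARENT ... CHILD line.
    """
    data = Path(dag_path).read_bytes()
    if not data or not data.endswith(b"\n"):
        # Empty file, or last entry cut off mid-write: the writer
        # probably died before finishing.
        return False

    for line in data.decode("utf-8", errors="replace").splitlines():
        tokens = line.split()
        # DAGMan dependency lines have the form: PARENT a b CHILD c d
        if tokens[:1] == ["PARENT"] and "CHILD" in tokens:
            return True

    # No dependency lines at all: for a multi-job workflow this
    # suggests the file was truncated before they were written.
    return False
```

A retry could call this on the existing .dag file and regenerate it
rather than submit it when the check fails.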
Cheers
Ian
On 24 July 2015 at 04:06, Larne Pekowsky <lppekows@syr.edu> wrote:
Hi all,
I have a workflow on atlas, started from
/home/lppekows/projects/cbc/pycbc1.1_review/analysis8_ahope-same-harm-exact-nomax-nosubbank/962582415-963187215
and running in
/local/user/lppekows/pycbc-tmp.dRq2USWe1Y/work
It looks like none of the inspiral jobs were scheduled. They’re in
main-0.dag, but there are no inspiral*out* or inspiral*err* files; the
workflow seems to have jumped directly to the llwadd jobs.
Has anyone seen anything like this before?
Thanks,
- Larne