- Type: Bug
- Resolution: Fixed
- Priority: Major
- Affects Version/s: master, 5.1.0, 5.0.5
- Component/s: Pegasus Planner, Planner: Code Generators
ames reported a Pegasus/HTCondor interaction issue on Slack. I am
moving it to email so that the right people can join the
discussion. The issue is described below.
As far as I know, the Pegasus team has not seen this before, and
Pegasus has been adding that attribute since 2007. Could somebody add
the HTCondor and Pegasus versions to this thread, and let us know
whether you have made any recent config changes to the access point?
The problem is that condor_submit is complaining about an extra
attribute added by Pegasus:
05/02/23 11:18:07 From submit: condor_submit: invalid attribute name
'+DAGNodeRetry' for attrib=value assigment
05/02/23 11:18:07 failed while reading from pipe.
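To check whether other attributes are being rejected in the same way, the log can be scanned for this message. The following is a minimal sketch, not part of Pegasus or HTCondor: the function name `rejected_attributes` is hypothetical, and the pattern assumes the message wording matches the excerpt above.

```python
import re

# Pattern for the condor_submit complaint quoted above. The wording is
# taken from the log excerpt; other HTCondor versions may phrase it differently.
INVALID_ATTR = re.compile(r"invalid attribute name\s+'([^']+)'")


def rejected_attributes(log_text):
    """Return the attribute names condor_submit rejected, in first-seen order.

    Hypothetical helper for illustration only.
    """
    # Log lines may be wrapped (as in the email), so collapse whitespace first.
    flat = " ".join(log_text.split())
    seen = []
    for name in INVALID_ATTR.findall(flat):
        if name not in seen:
            seen.append(name)
    return seen


sample = """05/02/23 11:18:07 From submit: condor_submit: invalid attribute name
'+DAGNodeRetry' for attrib=value assigment
05/02/23 11:18:07 failed while reading from pipe."""

print(rejected_attributes(sample))
```

Running it against the full dagman.out would show whether `+DAGNodeRetry` is the only attribute affected.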
In the dagman.out:
05/02/23 11:18:38 submit command was: /usr/bin/condor_submit -a
dag_node_name=create_dir_o3_sbbh2_0p985.dax_0_local -a
My.DAGManJobId=68812570 -a DAGManJobId=68812570 -batch-name
o3_sbbh2_0p985.dax-0.dag+68812570 -batch-id 68812570.0 -a
submit_event_notes' '=' 'DAG' 'Node:'
'create_dir_o3_sbbh2_0p985.dax_0_local -a
dagman_log=/home/praveen.kumar/focused_search/pipeline/hlv/sbbh2_0p985/a1_2/output/pycbc-tmp_i_a0uf2l/work/./o3_sbbh2_0p985.dax-0.dag.nodes.log
-a My.DAGManNodesMask="0,1,2,4,5,7,9,10,11,12,13,16,17,24,27,35,36"
priority=800 JOB=create_dir_o3_sbbh2_0p985.dax_0_local +DAGNodeRetry=0
DAG_STATUS=0 FAILED_COUNT=0 My.KeepClaimIdle=20 -a notification=never
My.DAGParentNodeNames="" ./create_dir_o3_sbbh2_0p985.dax_0_local.sub
And in the DAG itself:
VARS create_dir_o3_sbbh2_0p985.dax_0_local +DAGNodeRetry="$(RETRY)"
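To see how many nodes carry a `+`-prefixed macro like the one above, the generated DAG file can be scanned for such VARS entries. This is a rough sketch under stated assumptions: `plus_prefixed_vars` is a hypothetical helper, and the regular expressions are a simple heuristic that does not parse quoted values, so a `+name=` sequence inside a quoted value would also be reported.

```python
import re

# "VARS <node> <macro>=<value> ...": capture the node name and the remainder.
VARS_LINE = re.compile(r"^\s*VARS\s+(\S+)\s+(.*)$", re.IGNORECASE)
# A macro name that starts with '+' and is followed by '='.
PLUS_MACRO = re.compile(r"(?<!\S)(\+[A-Za-z_]\w*)\s*=")


def plus_prefixed_vars(dag_text):
    """Return (node, macro_name) pairs for VARS macros whose name starts with '+'.

    Hypothetical helper for illustration only.
    """
    hits = []
    for line in dag_text.splitlines():
        match = VARS_LINE.match(line)
        if not match:
            continue
        node, rest = match.groups()
        for name in PLUS_MACRO.findall(rest):
            hits.append((node, name))
    return hits


dag = 'VARS create_dir_o3_sbbh2_0p985.dax_0_local +DAGNodeRetry="$(RETRY)"\n'
print(plus_prefixed_vars(dag))
```

Note that the submit command in the dagman.out excerpt passes other custom attributes with a `My.` prefix (for example `My.DAGManJobId`), while this VARS entry uses the `+` prefix. Whether that difference explains the error is not established in this thread.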