- Type: Bug
- Resolution: Fixed
- Priority: Major
- Affects Version/s: master
- Component/s: Monitord
- None
- Environment: https://bamboo.isi.edu/browse/PEGASUS-WT-T19A-588/log
9
06-Apr-2015 20:10:59 1428375871 - 2015-04-06T20:04:31-0700 - MONITORD_STARTED - 9f1b37a0-e733-40c2-9d46-7519d20e00b7 - blackdiamond-0
06-Apr-2015 20:10:59 2015-04-06 20:05:57,999:Pegasus.monitoring.workflow:103: WARNING: NL-LOAD-ERROR --> 9f1b37a0-e733-40c2-9d46-7519d20e00b7 - blackdiamond-0
06-Apr-2015 20:10:59 NL-LOAD-ERROR --> 9f1b37a0-e733-40c2-9d46-7519d20e00b7 - blackdiamond-0
06-Apr-2015 20:10:59 2015-04-06 20:05:58,000:Pegasus.monitoring.workflow:104: WARNING: error sending event: inv.end --> {'xwf__id': '9f1b37a0-e733-40c2-9d46-7519d20e00b7', 'executable': '', 'job_inst__id': 7, 'job__id': 'stage_in_remote_condorpool_0_0', 'ts': None, 'inv__id': 1, 'level': 'Error'}
06-Apr-2015 20:10:59 error sending event: inv.end --> {'xwf__id': '9f1b37a0-e733-40c2-9d46-7519d20e00b7', 'executable': '', 'job_inst__id': 7, 'job__id': 'stage_in_remote_condorpool_0_0', 'ts': None, 'inv__id': 1, 'level': 'Error'}
06-Apr-2015 20:10:59 2015-04-06 20:05:58,003:Pegasus.monitoring.workflow:105: WARNING: Traceback (most recent call last):
06-Apr-2015 20:10:59 File "/lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/pegasus/lib/python2.7/dist-packages/Pegasus/monitoring/workflow.py", line 98, in output_to_db
06-Apr-2015 20:10:59 self._sink.send(event, kwargs)
06-Apr-2015 20:10:59 File "/lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/pegasus/lib/python2.7/dist-packages/Pegasus/monitoring/event_output.py", line 187, in send
06-Apr-2015 20:10:59 self._db.notify(d)
06-Apr-2015 20:10:59 File "/lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/pegasus/lib/python2.7/dist-packages/Pegasus/db/modules/__init__.py", line 146, in notify
04/06/15 20:05:04 Currently monitoring 1 Condor log file(s)
04/06/15 20:05:04 Event: ULOG_POST_SCRIPT_TERMINATED for Condor Node create_dir_blackdiamond_0_condorpool (2083632.0.0)
04/06/15 20:05:04 POST Script of Node create_dir_blackdiamond_0_condorpool completed successfully.
04/06/15 20:05:04 Of 26 nodes total:
04/06/15 20:05:04  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
04/06/15 20:05:04   ===     ===      ===     ===     ===        ===      ===
04/06/15 20:05:04     1       0        0       0       5         20        0
04/06/15 20:05:04 0 job proc(s) currently held
04/06/15 20:05:09 Submitting Condor Node stage_in_remote_condorpool_0_0 job(s)...
04/06/15 20:05:09 Adding a DAGMan auxiliary log /lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/test/core/019-black-label/work/bamboo/pegasus/blackdiamond/20150406T200413-0700/./blackdiamond-0.dag.nodes.log
04/06/15 20:05:09 Masking the events recorded in the DAGMAN auxiliary log
04/06/15 20:05:09 Mask for auxiliary log is 0,1,2,4,5,7,9,10,11,12,13,16,17,24,27
04/06/15 20:05:09 submitting: condor_submit -a dag_node_name' '=' 'stage_in_remote_condorpool_0_0 -a +DAGManJobId' '=' '2083629 -a DAGManJobId' '=' '2083629 -a submit_event_notes' '=' 'DAG' 'Node:' 'stage_in_remote_condorpool_0_0 -a dagman_log' '=' '/lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/test/core/019-black-label/work/bamboo/pegasus/blackdiamond/20150406T200413-0700/./blackdiamond-0.dag.nodes.log -a +DAGManNodesMask' '=' '"0,1,2,4,5,7,9,10,11,12,13,16,17,24,27" -a DAG_STATUS' '=' '0 -a FAILED_COUNT' '=' '0 -a +DAGParentNodeNames' '=' '"create_dir_blackdiamond_0_condorpool" -a +KeepClaimIdle' '=' '20 -a notification' '=' 'never stage_in_remote_condorpool_0_0.sub
04/06/15 20:05:09 From submit: Submitting job(s)
04/06/15 20:05:09 From submit: ERROR: Can't open "/tmp/x509up_u550" with flags 00 (No such file or directory)
04/06/15 20:05:09 failed while reading from pipe.
04/06/15 20:05:09 Read so far: Submitting job(s)ERROR: Can't open "/tmp/x509up_u550" with flags 00 (No such file or directory)
04/06/15 20:05:09 ERROR: submit attempt failed
04/06/15 20:05:09 submit command was: condor_submit -a dag_node_name' '=' 'stage_in_remote_condorpool_0_0 -a +DAGManJobId' '=' '2083629 -a DAGManJobId' '=' '2083629 -a submit_event_notes' '=' 'DAG' 'Node:' 'stage_in_remote_condorpool_0_0 -a dagman_log' '=' '/lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/test/core/019-black-label/work/bamboo/pegasus/blackdiamond/20150406T200413-0700/./blackdiamond-0.dag.nodes.log -a +DAGManNodesMask' '=' '"0,1,2,4,5,7,9,10,11,12,13,16,17,24,27" -a DAG_STATUS' '=' '0 -a FAILED_COUNT' '=' '0 -a +DAGParentNodeNames' '=' '"create_dir_blackdiamond_0_condorpool" -a +KeepClaimIdle' '=' '20 -a notification' '=' 'never stage_in_remote_condorpool_0_0.sub
04/06/15 20:05:09 Job submit try 1/6 failed, will try again in >= 1 second.
04/06/15 20:05:14 Submitting Condor Node stage_in_remote_condorpool_0_0 job(s)...
04/06/15 20:05:14 Adding a DAGMan auxiliary log /lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/test/core/019-black-label/work/bamboo/pegasus/blackdiamond/20150406T200413-0700/./blackdiamond-0.dag.nodes.log
04/06/15 20:05:14 Masking the events recorded in the DAGMAN auxiliary log
04/06/15 20:05:14 Mask for auxiliary log is 0,1,2,4,5,7,9,10,11,12,13,16,17,24,27
04/06/15 20:05:14 submitting: condor_submit -a dag_node_name' '=' 'stage_in_remote_condorpool_0_0 -a +DAGManJobId' '=' '2083629 -a DAGManJobId' '=' '2083629 -a submit_event_notes' '=' 'DAG' 'Node:' 'stage_in_remote_condorpool_0_0 -a dagman_log' '=' '/lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/test/core/019-black-label/work/bamboo/pegasus/blackdiamond/20150406T200413-0700/./blackdiamond-0.dag.nodes.log -a +DAGManNodesMask' '=' '"0,1,2,4,5,7,9,10,11,12,13,16,17,24,27" -a DAG_STATUS' '=' '0 -a FAILED_COUNT' '=' '0 -a +DAGParentNodeNames' '=' '"create_dir_blackdiamond_0_condorpool" -a +KeepClaimIdle' '=' '20 -a notification' '=' 'never stage_in_remote_condorpool_0_0.sub
04/06/15 20:05:14 From submit: Submitting job(s)
04/06/15 20:05:14 From submit: ERROR: Can't open "/tmp/x509up_u550" with flags 00 (No such file or directory)
04/06/15 20:05:14 failed while reading from pipe.
04/06/15 20:05:14 Read so far: Submitting job(s)ERROR: Can't open "/tmp/x509up_u550" with flags 00 (No such file or directory)
04/06/15 20:05:14 ERROR: submit attempt failed
04/06/15 20:05:14 submit command was: condor_submit -a dag_node_name' '=' 'stage_in_remote_condorpool_0_0 -a +DAGManJobId' '=' '2083629 -a DAGManJobId' '=' '2083629 -a submit_event_notes' '=' 'DAG' 'Node:' 'stage_in_remote_condorpool_0_0 -a dagman_log' '=' '/lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/test/core/019-black-label/work/bamboo/pegasus/blackdiamond/20150406T200413-0700/./blackdiamond-0.dag.nodes.log -a +DAGManNodesMask' '=' '"0,1,2,4,5,7,9,10,11,12,13,16,17,24,27" -a DAG_STATUS' '=' '0 -a FAILED_COUNT' '=' '0 -a +DAGParentNodeNames' '=' '"create_dir_blackdiamond_0_condorpool" -a +KeepClaimIdle' '=' '20 -a notification' '=' 'never stage_in_remote_condorpool_0_0.sub
04/06/15 20:05:14 Job submit try 2/6 failed, will try again in >= 2 seconds.
04/06/15 20:05:19 Submitting Condor Node stage_in_remote_condorpool_0_0 job(s)...
04/06/15 20:05:19 Adding a DAGMan auxiliary log /lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/test/core/019-black-label/work/bamboo/pegasus/blackdiamond/20150406T200413-0700/./blackdiamond-0.dag.nodes.log
04/06/15 20:05:19 Masking the events recorded in the DAGMAN auxiliary log
04/06/15 20:05:19 Mask for auxiliary log is 0,1,2,4,5,7,9,10,11,12,13,16,17,24,27
04/06/15 20:05:19 submitting: condor_submit -a dag_node_name' '=' 'stage_in_remote_condorpool_0_0 -a +DAGManJobId' '=' '2083629 -a DAGManJobId' '=' '2083629 -a submit_event_notes' '=' 'DAG' 'Node:' 'stage_in_remote_condorpool_0_0 -a dagman_log' '=' '/lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/test/core/019-black-label/work/bamboo/pegasus/blackdiamond/20150406T200413-0700/./blackdiamond-0.dag.nodes.log -a +DAGManNodesMask' '=' '"0,1,2,4,5,7,9,10,11,12,13,16,17,24,27" -a DAG_STATUS' '=' '0 -a FAILED_COUNT' '=' '0 -a +DAGParentNodeNames' '=' '"create_dir_blackdiamond_0_condorpool" -a +KeepClaimIdle' '=' '20 -a notification' '=' 'never stage_in_remote_condorpool_0_0.sub
04/06/15 20:05:19 From submit: Submitting job(s)
04/06/15 20:05:19 From submit: ERROR: Can't open "/tmp/x509up_u550" with flags 00 (No such file or directory)
04/06/15 20:05:19 failed while reading from pipe.
04/06/15 20:05:19 Read so far: Submitting job(s)ERROR: Can't open "/tmp/x509up_u550" with flags 00 (No such file or directory)
04/06/15 20:05:19 ERROR: submit attempt failed
04/06/15 20:05:19 submit command was: condor_submit -a dag_node_name' '=' 'stage_in_remote_condorpool_0_0 -a +DAGManJobId' '=' '2083629 -a DAGManJobId' '=' '2083629 -a submit_event_notes' '=' 'DAG' 'Node:' 'stage_in_remote_condorpool_0_0 -a dagman_log' '=' '/lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/test/core/019-black-label/work/bamboo/pegasus/blackdiamond/20150406T200413-0700/./blackdiamond-0.dag.nodes.log -a +DAGManNodesMask' '=' '"0,1,2,4,5,7,9,10,11,12,13,16,17,24,27" -a DAG_STATUS' '=' '0 -a FAILED_COUNT' '=' '0 -a +DAGParentNodeNames' '=' '"create_dir_blackdiamond_0_condorpool" -a +KeepClaimIdle' '=' '20 -a notification' '=' 'never stage_in_remote_condorpool_0_0.sub
04/06/15 20:05:19 Job submit try 3/6 failed, will try again in >= 4 seconds.
04/06/15 20:05:24 Submitting Condor Node stage_in_remote_condorpool_0_0 job(s)...
04/06/15 20:05:24 Adding a DAGMan auxiliary log /lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/test/core/019-black-label/work/bamboo/pegasus/blackdiamond/20150406T200413-0700/./blackdiamond-0.dag.nodes.log
04/06/15 20:05:24 Masking the events recorded in the DAGMAN auxiliary log
04/06/15 20:05:24 Mask for auxiliary log is 0,1,2,4,5,7,9,10,11,12,13,16,17,24,27
04/06/15 20:05:24 submitting: condor_submit -a dag_node_name' '=' 'stage_in_remote_condorpool_0_0 -a +DAGManJobId' '=' '2083629 -a DAGManJobId' '=' '2083629 -a submit_event_notes' '=' 'DAG' 'Node:' 'stage_in_remote_condorpool_0_0 -a dagman_log' '=' '/lfs1/software/bamboo/data/xml-data/build-dir/PEGASUS-WT-T19A/test/core/019-black-label/work/bamboo/pegasus/blackdiamond/20150406T200413-0700/./blackdiamond-0.dag.nodes.log -a +DAGManNodesMask' '=' '"0,1,2,4,5,7,9,10,11,12,13,16,17,24,27" -a DAG_STATUS' '=' '0 -a FAILED_COUNT' '=' '0 -a +DAGParentNodeNames' '=' '"create_dir_blackdiamond_0_condorpool" -a +KeepClaimIdle' '=' '20 -a notification' '=' 'never stage_in_remote_condorpool_0_0.sub
04/06/15 20:05:24 From submit: Submitting job(s)
04/06/15 20:05:24 From submit: ERROR: Can't open "/tmp/x509up_u550" with flags 00 (No such file or directory)
04/06/15 20:05:24 failed while reading from pipe.
04/06/15 20:05:24 Read so far: Submitting job(s)ERROR: Can't open "/tmp/x509up_u550" with flags 00 (No such file or directory)
04/06/15 20:05:24 ERROR: submit attempt failed
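For triage, the repeated submit failures in the DAGMan log above can be condensed with a small script. The following is only a sketch: the function name `summarize_submit_failures` and both regular expressions are hypothetical helpers written against the line formats quoted in this report, not part of Pegasus or HTCondor, and other DAGMan versions may word these lines differently.

```python
import re
from collections import Counter

# Patterns derived from the dagman.out lines quoted in this report (assumed format).
RETRY_RE = re.compile(
    r"Job submit try (\d+)/(\d+) failed, will try again in >= (\d+) seconds?\."
)
ERROR_RE = re.compile(r"From submit: ERROR: (.+)$")


def summarize_submit_failures(lines):
    """Return (retries, errors).

    retries: list of (attempt, max_attempts, min_delay_seconds) tuples.
    errors:  Counter of distinct condor_submit error messages.
    """
    retries = []
    errors = Counter()
    for line in lines:
        line = line.rstrip("\n")
        match = RETRY_RE.search(line)
        if match:
            retries.append(tuple(int(g) for g in match.groups()))
            continue
        match = ERROR_RE.search(line)
        if match:
            errors[match.group(1).strip()] += 1
    return retries, errors


if __name__ == "__main__":
    sample = [
        '04/06/15 20:05:09 From submit: ERROR: Can\'t open "/tmp/x509up_u550" '
        "with flags 00 (No such file or directory)",
        "04/06/15 20:05:09 Job submit try 1/6 failed, will try again in >= 1 second.",
        "04/06/15 20:05:14 Job submit try 2/6 failed, will try again in >= 2 seconds.",
    ]
    retries, errors = summarize_submit_failures(sample)
    for attempt, limit, delay in retries:
        print("attempt %d/%d failed, next retry in >= %ds" % (attempt, limit, delay))
    for message, count in errors.items():
        print("%dx %s" % (count, message))
```

Run against the full log, this would list each failed attempt for stage_in_remote_condorpool_0_0 together with the single underlying error, the missing /tmp/x509up_u550 file.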