Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: 3.0
Affects Version/s: master
Component/s: Monitord
Labels:
None

I ran an FG periodogram workflow, which consists entirely of "vanilla" jobs, although most resources are remote (Condor-I/O). Long after the workflow has finished, this process is still running:

26025 ? S 0:44 python /home/voeckler/src/svn/pegasus/trunk/bin/pegasus-monitord periodogram-0.dag.dagman.out

which, according to "strace -p 26025", is doing nothing but sleeping in 100 ms intervals:

select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)

Here are some files:

$ cat monitord.log
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "/home/voeckler/src/svn/pegasus/trunk/lib/python/netlogger/analysis/modules/_base.py", line 282, in run
    self.queue.task_done()
AttributeError: Queue instance has no attribute 'task_done'

The "monitord.done" file was written, but the pegasus-monitord process is still there! Maybe something is wrong with your thread handling? Or maybe your final condition didn't match properly:

$ tail periodogram-0.dag.dagman.out
11/04/10 19:32:37 1599 0 0 0 0 0 0
11/04/10 19:32:37 0 job proc(s) currently held
11/04/10 19:32:37 Note: 176726 total job deferrals because of -MaxIdle limit (100)
11/04/10 19:32:37 All jobs Completed!
11/04/10 19:32:37 Note: 0 total job deferrals because of -MaxJobs limit (0)
11/04/10 19:32:37 Note: 176726 total job deferrals because of -MaxIdle limit (100)
11/04/10 19:32:37 Note: 0 total job deferrals because of node category throttles
11/04/10 19:32:37 Note: 0 total PRE script deferrals because of -MaxPre limit (20)
11/04/10 19:32:37 Note: 0 total POST script deferrals because of -MaxPost limit (100)
11/04/10 19:32:37 **** condor_scheduniv_exec.12.0 (condor_DAGMAN) pid 26022 EXITING WITH STATUS 0
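The AttributeError in monitord.log is consistent with the Python version shown in the traceback: Queue.task_done() and Queue.join() were only added in Python 2.5, while the failing thread runs under /usr/lib64/python2.4. Below is a minimal sketch of one way a worker thread could tolerate a queue without task_done() and still shut down cleanly. It is not the fix that went into 3.0; the names mark_done, worker, run, and _STOP are hypothetical, and the sketch is written for a current Python 3 interpreter rather than 2.4.

```python
import queue
import threading

_STOP = object()  # hypothetical sentinel telling the worker to exit


def mark_done(q):
    """Call q.task_done() only if this queue implementation provides it."""
    task_done = getattr(q, "task_done", None)
    if task_done is not None:
        task_done()


def worker(q, results):
    """Consume items until the sentinel arrives, then return."""
    while True:
        item = q.get()
        try:
            if item is _STOP:
                return
            results.append(item * 2)  # stand-in for real event processing
        finally:
            # Runs for every item, including the sentinel.
            mark_done(q)


def run(items):
    q = queue.Queue()
    results = []
    t = threading.Thread(target=worker, args=(q, results))
    t.daemon = True  # a stuck worker cannot keep the process alive
    t.start()
    for item in items:
        q.put(item)
    q.put(_STOP)
    t.join(timeout=5)  # bounded wait instead of polling forever
    return results, t.is_alive()
```

With this shape, an exception in the worker cannot leave the main thread waiting indefinitely: the join is bounded and the worker is a daemon thread, so the process can exit once the main thread decides the workflow is over.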

Assignee: Unassigned
Reporter: Jens Voeckler

Created: 04/Nov/10 8:43 PM
Updated: 08/Jun/12 11:21 AM
Resolved: 16/Nov/10 2:21 PM
Archived: 14/Dec/24 10:43 PM