Type: Improvement
Resolution: Fixed
Priority: Major
Affects Version/s: master, 4.5.2
Component/s: Pegasus Planner
Environment: HTCondor with partitionable slots
For the 4.5 release, we started associating concurrency limits with jobs by default. However, this seems to have a side effect when there are partitionable slots.
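A possible mitigation sketch, assuming the planner exposes a Boolean property to control this behavior (pegasus.condor.concurrency.limits is the assumed name here and may differ by release), would be to turn the default labeling off in the Pegasus properties file:

# assumed property: disable automatic concurrency limit labeling on jobs
pegasus.condor.concurrency.limits = false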
Email from Mats below:
Pegasus has recently started to add concurrency limits on certain jobs by default. The idea was to always label our jobs, and then a user could set the limits later if they felt they needed to control the jobs. A side effect seems to be that scheduling on partitionable slots has slowed down. We did a simple test on a single machine with 15 jobs:
universe = vanilla
requirements = Machine == "workflow.isi.edu"
concurrency_limits = peg.foo
executable = test.sh
output = outputs/$(Cluster).$(Process).out
error = outputs/$(Cluster).$(Process).err
log = outputs/$(Cluster).$(Process).log
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
Notification = never
queue 15
With the concurrency limit, we get one job started per negotiation cycle, which means ~15 minutes for all the jobs to start. Commenting out the concurrency_limits line makes all 15 jobs start almost instantaneously. Is this expected behavior? Is there something wrong in our configuration?
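The timing is consistent with one match being granted per negotiation cycle: 15 jobs at the default 60-second NEGOTIATOR_INTERVAL works out to roughly 15 minutes. As a stopgap only (it does not address the matchmaking behavior itself), the cycle length could be shortened on the central manager; the value below is just an example:

# condor_config on the central manager; the default is 60 seconds
NEGOTIATOR_INTERVAL = 20

followed by condor_reconfig for the change to take effect.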
We have no limits configured for peg.foo, and CONCURRENCY_LIMIT_DEFAULT is at its default:
$ condor_config_val CONCURRENCY_LIMIT_DEFAULT
2308032
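That confirms no explicit limit is defined. For reference, named limits are set in the negotiator's configuration with the <NAME>_LIMIT pattern; the name and value below are hypothetical:

# condor_config: cap jobs submitted with "concurrency_limits = xfer" at 10 concurrent
XFER_LIMIT = 10

Newer HTCondor versions also accept a per-group default (e.g. CONCURRENCY_LIMIT_DEFAULT_PEG for dotted names such as peg.foo), if supported by the version in use.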
The HTCondor version is 8.2.9. I have attached our config dump.
Thanks,