It is common to see bag-of-tasks workflows, where the jobs share a few common inputs (executable, common db, container, ...) and each job also has one or more unique inputs. With the current refiner, we get a bunch of stage_in jobs that all have the same priority. I think it would be nice to increase the priority of the stage_in jobs that carry common files (these can probably be detected just by looking at the fan-out of the stage_in job). That way the common inputs are staged first, which gives the workflow a better chance to start jobs when only part of the unique inputs have been staged, and so gives better overlap of input transfers and compute jobs.
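To make the fan-out idea concrete, here is a minimal sketch in Python. It is not the refiner's actual code: the function name `stage_in_priorities`, the `children` mapping, and the job names are all hypothetical, and it assumes that a higher number means a higher priority and that the number of distinct child jobs is a good enough proxy for "common file".

```python
from typing import Dict, List


def stage_in_priorities(children: Dict[str, List[str]],
                        base_priority: int = 0) -> Dict[str, int]:
    """Give each stage_in job a priority that grows with its fan-out.

    `children` maps a stage_in job name to the compute jobs that depend
    on it (a hypothetical representation of the workflow DAG).
    Assumption: a larger number means the job is scheduled earlier.
    """
    priorities = {}
    for job, dependents in children.items():
        # Fan-out = number of distinct jobs waiting on this transfer.
        fan_out = len(set(dependents))
        priorities[job] = base_priority + fan_out
    return priorities


# Hypothetical bag-of-tasks workflow: one shared input, three unique ones.
children = {
    "stage_in_common": ["compute_1", "compute_2", "compute_3"],
    "stage_in_unique_1": ["compute_1"],
    "stage_in_unique_2": ["compute_2"],
    "stage_in_unique_3": ["compute_3"],
}

priorities = stage_in_priorities(children)
# Highest priority first; ties keep their original order (stable sort).
order = sorted(priorities, key=lambda job: -priorities[job])
print(order)
```

With this ordering the common transfer runs first, so `compute_1` can start as soon as `stage_in_unique_1` finishes instead of waiting behind unrelated unique transfers.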
Prioritize transfers based on dependencies
- Assignee:
- Karan Vahi
- Reporter:
- Mats Rynge
- Archiver:
- Rajiv Mayani
- Created:
- Updated:
- Resolved:
- Archived: