By default, when monitord starts up tracking a live dagman.out file, it sleeps intermittently, waiting for new lines to be logged in the dagman.out file.
This behavior, however, causes monitord to lag considerably:
- when starting up on large workflows
- when monitord is restarted by pegasus-dagman after a failure, or when we submit a rescue DAG.
For the new LIGO ahope workflows (about 190K jobs in a single DAX), this creates a problem. One way to alleviate this is for monitord to not sleep intermittently until it has caught up with the dagman.out file.
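
The proposed behavior can be sketched roughly as follows. This is only an illustration of the idea, not monitord's actual code: the names follow, read_available_lines, process_line, should_stop and poll_interval are all hypothetical. The loop drains every complete line already in the file and sleeps only when a pass finds nothing new, so a large backlog is consumed without any intermittent sleeps.

```python
import time


def read_available_lines(handle):
    """Return every complete line currently readable, without sleeping."""
    lines = []
    while True:
        pos = handle.tell()
        raw = handle.readline()
        if not raw:
            break  # reached the current end of the file
        if not raw.endswith(b"\n"):
            # Partial line still being written: rewind and retry later.
            handle.seek(pos)
            break
        lines.append(raw.decode("utf-8", errors="replace"))
    return lines


def follow(path, process_line, poll_interval=1.0,
           should_stop=lambda: False, sleep=time.sleep):
    """Hypothetical tail loop: sleep only once caught up with the file."""
    with open(path, "rb") as handle:
        while True:
            lines = read_available_lines(handle)
            for line in lines:
                process_line(line)
            if should_stop():
                break
            if not lines:
                # Caught up with the writer: now it is safe to wait.
                sleep(poll_interval)
```

In this sketch should_stop stands in for whatever condition ends tracking (for example, DAGMan having exited), and sleep is injectable only so the loop can be exercised without real delays.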