  1. Pegasus
  2. PM-947

fast start mode for monitord

    • Type: New Feature
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: master, 4.6.0, 4.5.1
    • Affects Version/s: master, 4.5.0
    • Component/s: Monitord
    • Labels: None

      By default, when monitord starts up tracking a live dagman.out file, it sleeps intermittently, waiting for new lines to be logged in the dagman.out file.

      This behavior, however, causes monitord to lag considerably:

      • when starting up for large workflows
      • when monitord is restarted by pegasus-dagman after some failure, or when we submit a rescue DAG.

      For the new LIGO ahope workflows (about 190K jobs in a single DAX), this creates a problem. One way to alleviate it is for monitord not to sleep intermittently until it has caught up with the dagman.out file.
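
      The proposed behavior can be illustrated with a minimal Python sketch. It is not monitord's actual code: the function name follow_dagman_out and the parameters poll_interval, should_stop and sleep are all hypothetical, chosen only for illustration. The idea shown is that existing backlog is read back to back, and the reader sleeps only once it has reached the current end of the file.

```python
import time


def follow_dagman_out(path, poll_interval=1.0,
                      should_stop=lambda: False, sleep=time.sleep):
    """Yield complete lines from a growing log file (hypothetical helper).

    Lines already present in the file are yielded without any sleeping.
    sleep() is called only when no complete new line is available.
    """
    with open(path, "r") as log:
        while True:
            position = log.tell()
            line = log.readline()
            if line.endswith("\n"):
                # A complete line is available: hand it over immediately.
                yield line.rstrip("\n")
                continue
            # End of file, or a partially written line: rewind so the
            # partial line is re-read once the writer has finished it.
            log.seek(position)
            if should_stop():
                return
            # Caught up with the file: only now wait for new lines.
            sleep(poll_interval)
```

      With this structure, a restart against a dagman.out that already holds a large backlog costs only the time needed to read the file, since no sleep happens until the first end-of-file is hit.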

            Assignee: Karan Vahi (vahi)
            Reporter: Duncan Brown (dbrown)
