Pegasus / PM-947

fast start mode for monitord

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: master, 4.5.0
    • Fix Version/s: master, 4.6.0, 4.5.1
    • Component/s: Monitord
    • Labels:
      None

      Description

      By default, when monitord starts up tracking a live dagman.out file, it sleeps intermittently, waiting for new lines to be logged in the dagman.out file.

      This behavior, however, causes monitord to lag considerably
      - when starting up on large workflows
      - when monitord gets restarted by pegasus-dagman due to some failure, or when we submit a rescue DAG.

      For the new LIGO ahope workflows (there are about 190K jobs in a single DAX), this creates a problem. One way to alleviate it is for monitord to not sleep intermittently until it has caught up with the dagman.out file.
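
      The idea can be sketched roughly as below. This is only an illustration of the requested behavior, not monitord's actual code: the names follow, is_finished, poll_interval and sleep are hypothetical. The loop keeps reading while lines are available and sleeps only once it has reached the current end of the file.

```python
import io
import time


def follow(stream, is_finished, poll_interval=1.0, sleep=time.sleep):
    """Yield complete lines from stream, sleeping only when caught up.

    stream        -- text file object opened on the log (e.g. dagman.out)
    is_finished   -- hypothetical callback; returns True once the writer is done
    poll_interval -- seconds to wait when no new data is available
    sleep         -- injectable so the waiting behavior can be tested
    """
    pending = ""  # holds a partially written last line, if any
    while True:
        chunk = stream.readline()
        if chunk:
            pending += chunk
            if pending.endswith("\n"):
                yield pending
                pending = ""
            # Data was available, so more may follow: loop again without sleeping.
            continue
        # readline() returned "", so we are at the current end of the file.
        if is_finished():
            if pending:
                yield pending  # flush a final line that has no newline
            return
        sleep(poll_interval)


if __name__ == "__main__":
    # Simulated backlog: every line is consumed without a single sleep.
    backlog = io.StringIO("line 1\nline 2\nline 3\n")
    for line in follow(backlog, is_finished=lambda: True):
        print(line, end="")
```

      With a loop of this shape, a backlog of any size is drained at read speed, and the periodic sleep applies only once monitord is waiting on new lines.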

            People

            • Assignee:
              vahi Karan Vahi
            • Reporter:
              dbrown Duncan Brown