  1. Pegasus
  2. PM-947

fast start mode for monitord

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: master, 4.5.0
    • Fix Version/s: master, 4.6.0, 4.5.1
    • Component/s: Monitord
    • Labels: None

    Description

      By default, when monitord starts up tracking a live dagman.out file, it sleeps intermittently, waiting for new lines to be logged in the dagman.out file.

      This behavior, however, causes monitord to lag considerably
      - when starting up for large workflows
      - when monitord is restarted by pegasus-dagman after a failure, or when we submit a rescue DAG.

      For the new LIGO ahope workflows (about 190K jobs in a single DAX), this creates a problem. One way to alleviate it is for monitord not to sleep intermittently until it has caught up with the dagman.out file.
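
The proposed behavior can be sketched roughly as follows. This is only an illustration of the idea, not monitord's actual code: the function name `follow_log` and the parameters `on_line`, `poll_interval` and `max_idle_polls` are hypothetical. The loop keeps reading while lines are available and sleeps only once it has reached the end of the file, so a large existing dagman.out is consumed without any pauses.

```python
import io
import time


def follow_log(fh, on_line, sleep=time.sleep, poll_interval=1.0,
               max_idle_polls=None):
    """Pass each complete line from fh to on_line, sleeping only at EOF.

    Hypothetical sketch of a "fast start" tail loop. Returns the number of
    times the loop slept. If max_idle_polls is None the loop never returns;
    otherwise it stops after that many consecutive empty polls.
    """
    sleeps = 0
    idle = 0
    partial = ""  # holds a line that has been written only in part
    while True:
        chunk = fh.readline()
        if chunk:
            idle = 0
            partial += chunk
            if partial.endswith("\n"):
                on_line(partial.rstrip("\n"))
                partial = ""
            # Backlog may remain: read again immediately, without sleeping.
            continue
        # Reached end of file: we have caught up with the writer.
        if max_idle_polls is not None and idle >= max_idle_polls:
            return sleeps
        sleep(poll_interval)
        sleeps += 1
        idle += 1


if __name__ == "__main__":
    # Stand-in for a dagman.out file that already has a backlog of lines.
    backlog = io.StringIO("line 1\nline 2\nline 3\n")
    follow_log(backlog, print, poll_interval=0.1, max_idle_polls=1)
```

With `max_idle_polls=None` the loop would follow a live file indefinitely; the bounded form is there only so the sketch can be run and checked on a fixed input.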

          People

            Karan Vahi (vahi)
            Duncan Brown (dbrown)