Pegasus / PM-723

monitord to handle missing DAGMAN finished events

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: master, 4.3
    • Affects Version/s: master, 4.2.2
    • Component/s: Monitord

      If DAGMAN goes away suddenly on a running workflow (say, because of a power failure), the DAGMAN finished event is not logged in the dagman.out file. When Condor comes back up, a new DAGMAN instance is started.

      In this case monitord should log a warning and add a DAGMAN finished event when it sees DAGMAN starting again without having ended earlier. This ensures that the workflow states are matched up in the workflow state table.
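
The requested behaviour could be sketched roughly as follows. This is only an illustration, not monitord's actual code: the `Event` type, the `STARTED`/`FINISHED` constants and the `reconcile` function are hypothetical names, and reusing the timestamp of the last event seen for the synthetic finished event is an assumption about what a reasonable end time would be.

```python
import logging
from typing import Iterable, Iterator, NamedTuple, Optional

log = logging.getLogger("monitord-sketch")

# Hypothetical event kinds; the real monitord state names may differ.
STARTED = "DAGMAN_STARTED"
FINISHED = "DAGMAN_FINISHED"


class Event(NamedTuple):
    timestamp: int
    kind: str
    synthetic: bool = False  # True when the event was added by reconcile()


def reconcile(events: Iterable[Event]) -> Iterator[Event]:
    """Yield events, adding a finished event before any start that
    follows another start without a finished event in between."""
    running = False
    last_ts: Optional[int] = None
    for ev in events:
        if ev.kind == STARTED and running:
            # DAGMAN restarted without logging that it finished.
            log.warning(
                "DAGMAN started at %s without a prior finished event; "
                "adding one at %s", ev.timestamp, last_ts)
            # Assumption: the last timestamp seen approximates the end time.
            yield Event(last_ts, FINISHED, synthetic=True)
        if ev.kind == STARTED:
            running = True
        elif ev.kind == FINISHED:
            running = False
        last_ts = ev.timestamp
        yield ev


if __name__ == "__main__":
    # First DAGMAN instance dies without logging a finished event.
    history = [Event(100, STARTED), Event(250, STARTED), Event(400, FINISHED)]
    for e in reconcile(history):
        print(e)
```

With this approach every start in the reconciled stream is paired with a finished event, which is what keeps the rows in the workflow state table balanced.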

            Assignee:
            Karan Vahi (vahi)
            Reporter:
            Karan Vahi (vahi)
            Watchers:
            1

              Created:
              Updated:
              Resolved: