  1. Pegasus
  2. PM-806

Ability to show failing workflows in the dashboard

      We would like the dashboard to highlight failing jobs, i.e., jobs that have failed at least once and are being retried.

      Currently, the stampede layer focuses on the last job instance of each Condor job. The idea was that a job that fails and then succeeds on a retry is shown as successful.

      However, as a result, users looking at the dashboard cannot tell when something is wrong. If a job fails and has retries left, DAGMan retries it almost immediately, so the dashboard shows the job as running rather than as failed.

      We want an extra tab in the dashboard that shows jobs that have failed at least once but still have retries left. When a job fails for good, it should move to the failed tab and no longer appear in the failing tab.
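
      The requested behaviour can be sketched roughly as follows. This is only an illustration, not the stampede schema or the dashboard code: `JobInstance`, `last_instance_status`, `classify`, and `max_retries` are hypothetical names, and it assumes each job exposes its attempts in order, with an exit code of None while an attempt is still running and `max_retries` counting retries after the first attempt.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JobInstance:
    """One attempt of a job (hypothetical model, not the real schema)."""
    exitcode: Optional[int]  # None while the attempt is still running


def last_instance_status(instances: List[JobInstance]) -> str:
    """Status derived from the last attempt only, as described above."""
    if not instances:
        return "unsubmitted"
    code = instances[-1].exitcode
    if code is None:
        return "running"
    return "successful" if code == 0 else "failed"


def classify(instances: List[JobInstance], max_retries: int) -> str:
    """Status that also takes earlier failed attempts into account."""
    if not instances:
        return "unsubmitted"
    last = instances[-1].exitcode
    if last == 0:
        # A job that eventually succeeds is still shown as successful.
        return "successful"
    earlier_failure = any(i.exitcode not in (None, 0) for i in instances)
    if last is None:
        # A retry is in flight: "failing" if any earlier attempt failed.
        return "failing" if earlier_failure else "running"
    # The last attempt failed; it is final only when no retries remain.
    attempts_allowed = max_retries + 1
    return "failed" if len(instances) >= attempts_allowed else "failing"


if __name__ == "__main__":
    retried = [JobInstance(1), JobInstance(None)]
    print(last_instance_status(retried))  # hides the earlier failure
    print(classify(retried, max_retries=3))
```

      With one failed attempt followed by a retry in progress, the last-instance view reports the job as running, while the sketch places it in the failing group until it either succeeds or exhausts its retries.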

            Assignee: Rajiv Mayani (mayani)
            Reporter: Karan Vahi (vahi)