Pegasus / PM-1070

monitord should handle case where jobs have missing JOB_FAILURE/JOB_TERMINATED events


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: master, 4.7.0, 4.6.1
    • Affects Version/s: master, 4.6.0
    • Component/s: Monitord
    • Labels: None

      When a job is missing its job terminated and job failure events (JOB_TERMINATED, JOB_FAILURE) but does have post script terminated and post script failure events, monitord fails with a NOT NULL integrity constraint violation on invocation.exitcode while populating the invocation end event (stampede.inv.end) for the task that makes up the job.

      2016-03-02 07:04:30,348:ERROR:Pegasus.db.modules.stampede_loader.Analyzer(426): (IntegrityError) NOT NULL constraint failed: invocation.exitcode u'INSERT INTO invocation (job_instance_id, task_submit_seq, start_time, remote_duration, remote_cpu_time, exitcode, transformation, executable, argv, abs_task_id, wf_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)' (182208, 1, 1455253988.0, 957.0, None, None, 'inspiral-BBH02_INJ-H1_ID156', '/usr1/amber.lenon/pycbc-tmp.cqBQirspEl/work/./main_ID0000001.000/inspiral-BBH02_INJ-H1_ID156_ID0030745.sh', None, 'ID0030745', 1)
      Traceback (most recent call last):
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/python/Pegasus/db/modules/stampede_loader.py", line 423, in individual_commit
      event.commit_to_db(self.session)
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/python/Pegasus/db/schema.py", line 113, in commit_to_db
      self._commit(session, batch)
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/python/Pegasus/db/schema.py", line 103, in _commit
      session.flush()
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/externals/python/sqlalchemy/orm/scoping.py", line 149, in do
      return getattr(self.registry(), name)(*args, **kwargs)
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/externals/python/sqlalchemy/orm/session.py", line 1814, in flush
      self._flush(objects)
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/externals/python/sqlalchemy/orm/session.py", line 1896, in _flush
      flush_context.execute()
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/externals/python/sqlalchemy/orm/unitofwork.py", line 372, in execute
      rec.execute(self)
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/externals/python/sqlalchemy/orm/unitofwork.py", line 525, in execute
      uow
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/externals/python/sqlalchemy/orm/persistence.py", line 63, in save_obj
      table, insert)
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/externals/python/sqlalchemy/orm/persistence.py", line 565, in _emit_insert_statements
      execute(statement, params)
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/externals/python/sqlalchemy/engine/base.py", line 664, in execute
      params)
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/externals/python/sqlalchemy/engine/base.py", line 764, in _execute_clauseelement
      compiled_sql, distilled_params
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/externals/python/sqlalchemy/engine/base.py", line 878, in _execute_context
      context)
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/externals/python/sqlalchemy/engine/base.py", line 871, in _execute_context
      context)
      File "/Volumes/Work/lfs1/devel/Pegasus/git/pegasus/lib/pegasus/externals/python/sqlalchemy/engine/default.py", line 320, in do_execute
      cursor.execute(statement, parameters)
      IntegrityError: (IntegrityError) NOT NULL constraint failed: invocation.exitcode u'INSERT INTO invocation (job_instance_id, task_submit_seq, start_time, remote_duration, remote_cpu_time, exitcode, transformation, executable, argv, abs_task_id, wf_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)' (182208, 1, 1455253988.0, 957.0, None, None, 'inspiral-BBH02_INJ-H1_ID156', '/usr1/amber.lenon/pycbc-tmp.cqBQirspEl/work/./main_ID0000001.000/inspiral-BBH02_INJ-H1_ID156_ID0030745.sh', None, 'ID0030745', 1)
      2016-03-02 07:04:30,349:ERROR:Pegasus.db.modules.stampede_loader.Analyzer(427): Insert failed for event <class 'Pegasus.db.schema.Invocation'>:

      • wf_id : 1
      • job_instance_id : 182208
      • executable : /usr1/amber.lenon/pycbc-tmp.cqBQirspEl/work/./main_ID0000001.000/inspiral-BBH02_INJ-H1_ID156_ID0030745.sh
      • job_submit_seq : 182208
      • abs_task_id : ID0030745
      • wf_uuid : ec1f3107-74d1-40cb-b4d9-4653aabf0591
      • start_time : 1455253988.0
      • ts : 1455254945.0
      • event : stampede.inv.end
      • task_submit_seq : 1
      • remote_duration : 957
      • invocation_id : None
      • exec_job_id : inspiral-BBH02_INJ-H1_ID156_ID0030745
      • transformation : inspiral-BBH02_INJ-H1_ID156
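      The failure above can be reproduced in isolation. The following is a minimal sketch, not the fix that went into Pegasus: it uses the standard-library sqlite3 module with a cut-down stand-in for the invocation table, and the helper with_fallback_exitcode and the post script exit code of 1 are hypothetical names and values chosen only for illustration. It shows that inserting an invocation without an exitcode trips the NOT NULL constraint, and that one possible guard is to fill the missing exitcode from another source before the insert.

```python
import sqlite3

# Simplified stand-in for the invocation table (assumption: the real schema
# has more columns; only exitcode's NOT NULL constraint matters here).
SCHEMA = """
CREATE TABLE invocation (
    invocation_id INTEGER PRIMARY KEY,
    job_instance_id INTEGER NOT NULL,
    abs_task_id TEXT,
    exitcode INTEGER NOT NULL
)
"""


def insert_invocation(conn, event):
    """Insert one invocation row; a missing exitcode is passed as NULL."""
    conn.execute(
        "INSERT INTO invocation (job_instance_id, abs_task_id, exitcode) "
        "VALUES (?, ?, ?)",
        (event["job_instance_id"], event["abs_task_id"], event.get("exitcode")),
    )


def with_fallback_exitcode(event, postscript_exitcode):
    """Hypothetical guard: return a copy of the event whose exitcode is
    filled from the post script exit code when the job's own is missing."""
    patched = dict(event)
    if patched.get("exitcode") is None:
        patched["exitcode"] = postscript_exitcode
    return patched


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Event with no exitcode, as in the report (values taken from the log above).
event = {"job_instance_id": 182208, "abs_task_id": "ID0030745"}

try:
    insert_invocation(conn, event)
    failed = False
except sqlite3.IntegrityError as exc:
    failed = True
    error_message = str(exc)

# Assumed post script exit code of 1; the report does not state the real one.
insert_invocation(conn, with_fallback_exitcode(event, postscript_exitcode=1))
rows = conn.execute("SELECT job_instance_id, exitcode FROM invocation").fetchall()
```

      Whether the post script exit code is the right fallback is a design decision for monitord; the sketch only shows that the insert succeeds once exitcode is no longer None.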

            Assignee:
            vahi Karan Vahi
            Reporter:
            dbrown Duncan Brown
            Watchers:
            1

              Created:
              Updated:
              Resolved: