[Pegasus PM-757] Quoting breaks for non-ASCII characters in utils.py

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: master, 4.3.2
    • Fix Version/s: master, 4.4.0, 4.3.3
    • Component/s: Monitord
    • Labels:
      None
    • Environment:
      LIGO run on sugar.phy.syr.edu

      Description

      monitord is tripping over the function it uses to quote the stdout section of the kickstart output. Gideon, can you take a look?

      Recently I have noticed that, when running a number of my
      workflows, pegasus-analyzer starts reporting incorrect information
      after some amount of time. This has the knock-on effect that
      pegasus-plots and the dashboard also report incorrect information.
      Typically the analyzer reports that my job is only (say) 10%
      complete when it is really much further along.

      Two examples of this on sugar-dev3:

      /usr1/spxiwh/log/spxiwh/pegasus/weekly_ahope/run0002/
      /usr1/spxiwh/log/spxiwh/pegasus/weekly_ahope/run0005/

      One recommendation from pegasus-plots was to run pegasus-monitord in
      replay mode. So I tried this:

      pegasus-monitord --verbose --replay \
          /usr1/spxiwh/log/spxiwh/pegasus/weekly_ahope/run0002/*dag.dagman.out

      After about 10 minutes that command failed with:

      2014-05-22 09:42:58,265:workflow.py:parse_job_output:1705: INFO: Starting extraction of job_info from job output file /usr1/spxiwh/log/spxiwh/pegasus/weekly_ahope/run0002/pycbc_inspiral_ID022937.out.000
      /usr/lib64/pegasus/python/Pegasus/tools/utils.py:75: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
        return ''.join(map(_mapping.__getitem__, s))
      Traceback (most recent call last):
        File "/usr/bin/pegasus-monitord", line 1349, in <module>
          process_output = process_dagman_out(workflow_entry.wf, workflow_entry.ml_buffer[0:ml_pos])
        File "/usr/bin/pegasus-monitord", line 758, in process_dagman_out
          add(wf, my_jobid, "JOB_SUCCESS", sched_id=my_sched_id, status=0)
        File "/usr/bin/pegasus-monitord", line 589, in add
          wf.update_job_state(jobid, sched_id, my_job_submit_seq, event, status, my_time)
        File "/usr/lib64/pegasus/python/Pegasus/monitoring/workflow.py", line 1981, in update_job_state
          self.parse_job_output(my_job, job_state)
        File "/usr/lib64/pegasus/python/Pegasus/monitoring/workflow.py", line 1706, in parse_job_output
          my_invocation_found = my_job.extract_job_info(self._run_dir, my_output)
        File "/usr/lib64/pegasus/python/Pegasus/monitoring/job.py", line 331, in extract_job_info
          stdout_text_list.append(utils.quote(my_record["stdout"]))
        File "/usr/lib64/pegasus/python/Pegasus/tools/utils.py", line 75, in quote
          return ''.join(map(_mapping.__getitem__, s))
      KeyError: u'\xe2'

      The only strange thing I notice about the listed .out file is that it
      contains some warning messages from scipy compiling a function. These
      messages contain non-ASCII characters:

      /home/spxiwh/.python26_compiled/sc_fb2424d04c9b3822b33b4d49e59ccce70.cpp:670: warning: unused variable ‘Narr’

      (specifically the quotes around Narr).

      Is it obvious what the problem is here? Is there anything I can do to
      fix it? The workflows in question are being used to profile different
      stages of ahope, and pegasus-plots is extremely useful in clearly
      displaying which jobs are running the longest and where we need to
      optimize.

            People

            • Assignee: Gideon Juve (gideon)
            • Reporter: Duncan Brown (dbrown)