Pegasus / PM-848

MPI_ERR_TRUNCATE: message truncated in PMC


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 4.5.0, 4.4.2
    • Affects Version/s: master, 4.5.0, 4.4.2
    • Component/s: PMC
    • Labels: None

      Dear Pegasus support,
      I am testing your pegasus-mpi-cluster tool, which is of great interest to me.

      On a test case, I encountered a reproducible error.
      The bug occurs when the mpirun -n value is >= 10.

      $ mpirun -n 11 pegasus-mpi-cluster test.dag2C
      Version: 4.5.0cvs
      Compiled: Feb 24 2015 15:12:55
      Compiler: 4.4.7 20120313 (Red Hat 4.4.7-11)
      MPI: 3.0
      OpenMPI: 1.8.1
      [info] Setting max cached files = 256
      [info] Master starting with 10 workers
      [info] Starting workflow
      [etna0:8937] *** An error occurred in MPI_Recv
      [etna0:8937] *** reported by process [139861257027585,18446603344811130880]
      [etna0:8937] *** on communicator MPI_COMM_WORLD
      [etna0:8937] *** MPI_ERR_TRUNCATE: message truncated
      [etna0:8937] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
      [etna0:8937] *** and potentially your MPI job)

      Using this dag file content:
      $ cat test.dag2C
      TASK 1 -c 2 sh -c "uname -n && echo 1 && sleep 30"
      TASK 2 -c 2 sh -c "uname -n && echo 2 && sleep 30"
      TASK 3 -c 2 sh -c "uname -n && echo 3 && sleep 30"
      TASK 4 -c 2 sh -c "uname -n && echo 4 && sleep 30"
      TASK 5 -c 2 sh -c "uname -n && echo 5 && sleep 30"
      TASK 6 -c 2 sh -c "uname -n && echo 6 && sleep 30"
      TASK 7 -c 2 sh -c "uname -n && echo 7 && sleep 30"
      TASK 8 -c 2 sh -c "uname -n && echo 8 && sleep 30"
      TASK 9 -c 2 sh -c "uname -n && echo 9 && sleep 30"
      TASK 10 -c 2 sh -c "uname -n && echo 10 && sleep 30"
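
To vary the number of tasks while reproducing the problem, the DAG file above can be generated rather than typed by hand. The following is a minimal sketch in Python that only reproduces the TASK line pattern shown in the report. The function name `make_dag` and its parameters are hypothetical and not part of Pegasus.

```python
def make_dag(n_tasks, c_value=2, sleep_s=30):
    """Return DAG text with n_tasks independent TASK lines.

    The line layout is copied from the test.dag2C file in the report;
    c_value is passed through unchanged as the argument to -c.
    """
    lines = []
    for i in range(1, n_tasks + 1):
        cmd = f'sh -c "uname -n && echo {i} && sleep {sleep_s}"'
        lines.append(f"TASK {i} -c {c_value} {cmd}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the same ten-task DAG used in the report.
    print(make_dag(10), end="")
```

Redirecting the output to a file (for example test.dag2C) gives an input that can be passed to pegasus-mpi-cluster in the same way as in the mpirun command above.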

      I am on CentOS 6.6 with Open MPI 1.8.1.

      Do you have any idea what causes this error?
      Which version of MPI is recommended?

      Thanks for your help,

      David

            Assignee: Gideon Juve (Inactive)
            Reporter: Gideon Juve (Inactive)