  1. Pegasus
  2. PM-735

CPU affinity and pegasus-mpi-cluster

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.4.0, 4.3.2
    • Component/s: None
    • Labels:
      None
    • Environment:
      Observed on TACC Stampede

      Description

      When running multi-threaded codes under pegasus-mpi-cluster on Stampede, I observed all threads of a task being bunched together on a single core. I believe this is caused by Linux CPU affinity, because when I widened the task's affinity mask to cores 0-15 with:

      taskset -pc 0-15 99657

      the task spread out and its 4 threads used 4 cores.

      My guess is that the affinity is inherited from the MPI launcher. I think we should reset the affinity with sched_setaffinity() inside pegasus-mpi-cluster. The easiest approach is probably to pass 0 as the pid to sched_setaffinity(), which targets the caller, together with a mask covering all CPUs, so that the affinity is reset for pegasus-mpi-cluster itself and inherited by the processes it forks.
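
      The proposal above could be sketched roughly as follows. This is only an illustration, not the code that went into pegasus-mpi-cluster: the helper name clear_cpu_affinity() is hypothetical, and the sketch assumes Linux with glibc, where a pid of 0 refers to the caller and the resulting mask is inherited across fork().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for cpu_set_t and the CPU_* macros in glibc */
#endif
#include <sched.h>
#include <unistd.h>

/* Hypothetical helper: allow the caller to run on every configured CPU.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
static int clear_cpu_affinity(void)
{
    long ncpus = sysconf(_SC_NPROCESSORS_CONF);
    if (ncpus < 1)
        return -1;

    cpu_set_t mask;
    CPU_ZERO(&mask);
    /* Set one bit per configured CPU, up to the fixed cpu_set_t capacity. */
    for (long i = 0; i < ncpus && i < CPU_SETSIZE; i++)
        CPU_SET((int)i, &mask);

    /* pid 0 means "the caller"; children created later by fork()
     * inherit the resulting mask. */
    return sched_setaffinity(0, sizeof(mask), &mask);
}
```

      Calling such a helper once at startup, before any tasks are forked, would presumably be enough, since the forked tasks inherit the mask. The kernel still restricts the result to the CPUs the process is actually allowed to use (for example by a cpuset), so the call widens the mask only as far as the system permits.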

            People

            • Assignee:
              Gideon Juve (Inactive)
            • Reporter:
              Mats Rynge [X] (Inactive)
            • Watchers:
              2
