Pegasus / PM-735

CPU affinity and pegasus-mpi-cluster

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 4.4.0, 4.3.2
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Environment:
      Observed on TACC Stampede

      When running multi-threaded codes under pegasus-mpi-cluster on Stampede, I observed all threads for a task being bunched together on a single core. I believe this is due to Linux CPU affinity, because when I reset the affinity mask to all 16 cores with:

      taskset -pc 0-15 99657

      The task spread out and the 4 threads used 4 cores.

      My guess is that the affinity mask is inherited from the MPI launcher. I think we should reset it with sched_setaffinity() inside pegasus-mpi-cluster. The easiest approach is probably to pass 0 as the pid (meaning the calling process) together with a mask that allows every CPU, so that the affinity is cleared for pegasus-mpi-cluster itself and for the processes it forks, which inherit the mask.
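
      A minimal sketch of what such a reset might look like, assuming Linux with glibc. The helper name clear_cpu_affinity() is hypothetical and is not taken from the pegasus-mpi-cluster source. It widens the mask to every configured CPU, and the kernel then narrows it to the CPUs the process is actually permitted to use:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the pegasus-mpi-cluster source):
 * allow the calling process to run on every configured CPU.
 * Returns 0 on success, -1 on failure. */
int clear_cpu_affinity(void)
{
    /* Number of CPUs configured on this node; fall back to the
     * fixed cpu_set_t capacity if the value is unusable. */
    long ncpus = sysconf(_SC_NPROCESSORS_CONF);
    if (ncpus < 1 || ncpus > CPU_SETSIZE)
        ncpus = CPU_SETSIZE;

    /* Build a mask with one bit set per CPU. */
    cpu_set_t mask;
    CPU_ZERO(&mask);
    for (long cpu = 0; cpu < ncpus; cpu++)
        CPU_SET(cpu, &mask);

    /* pid 0 means "the calling thread"; children created later
     * with fork() inherit the resulting mask. */
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return -1;
    }
    return 0;
}
```

      Calling this once at startup, before any tasks are forked, should be enough, since the mask is inherited across fork() and preserved across execve().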

            Assignee: Gideon Juve (Inactive)
            Reporter: Mats Rynge [X] (Inactive)
            Watchers: 2
