Running Hybrid MPI/OpenMP Simulations

From Micro and Nano Mechanics Group

This tutorial explains how to run a multi-threaded hybrid MPI/OpenMP simulation using the mvapich2 library. mvapich2 is currently installed on both the MC2 and WCR clusters. This page assumes the code is already written in hybrid fashion, i.e. both the MPI and OpenMP headers are included and you are able to compile and run your code.

Assigning the number of CPUs

Suppose you have 16 CPUs and want to run two MPI processes, each with 8 OpenMP threads.

  • Assigning OpenMP threads:
export OMP_NUM_THREADS=8
  • Running:
mpirun -np 2 a.out

where a.out is the name of the executable.
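
The arithmetic behind the two settings above can be sketched as a small script. This is only an illustration under stated assumptions: TOTAL_CORES, NUM_PROCS and LAUNCH_CMD are hypothetical names, not variables defined by mvapich2 or PBS, and the launch command is printed rather than executed so the sketch runs on a machine without MPI.

```shell
#!/bin/sh
# Hypothetical inputs: total cores available and desired number of MPI processes.
TOTAL_CORES=16
NUM_PROCS=2

# Threads per MPI process = cores / processes (integer division).
OMP_NUM_THREADS=$((TOTAL_CORES / NUM_PROCS))
export OMP_NUM_THREADS

# Warn if the cores do not divide evenly among the processes.
if [ $((OMP_NUM_THREADS * NUM_PROCS)) -ne "$TOTAL_CORES" ]; then
    echo "warning: $TOTAL_CORES cores do not divide evenly among $NUM_PROCS processes" >&2
fi

# Build and print the launch command instead of running it.
LAUNCH_CMD="mpirun -np $NUM_PROCS a.out"
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
echo "$LAUNCH_CMD"
```

In a real job you would replace the final echo with the mpirun call itself.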

OpenMP setting

One crucial setting to be careful about is processor affinity. By default, mvapich2 enables CPU affinity for MPI processes (MV2_ENABLE_AFFINITY=1) and pins each process to a single core. The OpenMP threads of a process inherit that binding, so all of them run on one core even if other cores are idle. This applies if you compiled your code through the mvapich2 wrappers such as mpicc, mpic++ or mpif90. The behavior is controlled by the MV2_ENABLE_AFFINITY variable. To let the OpenMP threads run on different cores, disable it by typing the following command in your terminal or PBS file:

export MV2_ENABLE_AFFINITY=0
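
One way to check whether a running process is still pinned is to look at the CPUs it is allowed to run on. The sketch below is Linux-specific and assumes the /proc filesystem is available; allowed_cpus is a hypothetical helper name, not part of mvapich2. It is demonstrated on the current shell, and for a real job you would pass the PID of one of the running MPI processes instead.

```shell
#!/bin/sh
# Hypothetical helper: print the CPUs a process is allowed to run on.
# Linux-specific: reads the Cpus_allowed_list field of /proc/<pid>/status.
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ {print $2}' "/proc/$1/status"
}

# Demonstrate on the current shell; for a hybrid run, pass the PID of a running rank.
SELF_CPUS=$(allowed_cpus $$)
echo "PID $$ may run on CPUs: $SELF_CPUS"
```

A single CPU number for an MPI process suggests it is still bound to one core, while a range such as 0-7 suggests its threads are free to spread out.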

Binding MPI runs to different nodes

By default, mvapich2 places MPI processes on adjacent cores, filling one node before moving to the next. For example, if you request 2 nodes with 8 cores each through your PBS file and run "mpirun -np 2 a.out", both MPI processes are assigned to the first node and the second node stays empty. To reserve the cores for OpenMP threads and put the MPI processes on different nodes, use a modified nodefile that lists each node only once, as follows:

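# Keep every PBS_NUM_PPN-th line of the node file: one entry per node (GNU sed)
sed -n "1~${PBS_NUM_PPN}p" "$PBS_NODEFILE" > my_nodefile
# Launch one MPI process per node listed in my_nodefile
mpirun -f my_nodefile -np "$PBS_NUM_NODES" my_program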
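
To see what the sed filter does without submitting a job, the sketch below builds a simulated node file and thins it. Everything here is an assumption for illustration: the node names node01 and node02 are made up, the PBS variables are set by hand rather than by the scheduler, and temporary files stand in for the real node file. The first~step address is a GNU sed extension.

```shell
#!/bin/sh
# Simulated PBS environment (hypothetical values): 2 nodes, 8 cores per node.
PBS_NUM_NODES=2
PBS_NUM_PPN=8
PBS_NODEFILE=$(mktemp)
MY_NODEFILE=$(mktemp)

# A PBS node file lists each node once per requested core.
for node in node01 node02; do
    i=0
    while [ "$i" -lt "$PBS_NUM_PPN" ]; do
        echo "$node"
        i=$((i + 1))
    done
done > "$PBS_NODEFILE"

# Keep lines 1, 1+PPN, 1+2*PPN, ... so that each node appears exactly once.
sed -n "1~${PBS_NUM_PPN}p" "$PBS_NODEFILE" > "$MY_NODEFILE"

TOTAL_LINES=$(wc -l < "$PBS_NODEFILE" | tr -d ' ')
KEPT_LINES=$(wc -l < "$MY_NODEFILE" | tr -d ' ')
KEPT=$(cat "$MY_NODEFILE")
echo "original entries: $TOTAL_LINES, kept: $KEPT_LINES"
echo "$KEPT"

# Clean up the temporary files.
rm -f "$PBS_NODEFILE" "$MY_NODEFILE"
```

With one entry per node, "mpirun -np $PBS_NUM_NODES" starts a single MPI process on each node and leaves the remaining cores free for its OpenMP threads.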
