DIRSIG5 supports the automated processing of single/multi-frame simulations across multiple processes/nodes using either the OpenMPI or MPICH implementation of the Message Passing Interface (MPI). Separate builds for MPI and non-MPI environments are available from myDIRSIG. Non-MPI builds use conventional file I/O to write output files, while the MPI builds leverage MPI-IO. It is recommended that end-users running on a single workstation with a single CPU socket use non-MPI builds to reduce the number of run-time dependencies. Workstations featuring more than one CPU socket can gain a significant run-time performance benefit by using the MPI build with a proper process topology.
The OpenMPI and MPICH run-time libraries are not distributed with DIRSIG5 because they are assumed to be installed on the host. The releases are built on Red Hat Enterprise Linux (RHEL) 7.
For OpenMPI, DIRSIG5 is compiled against OpenMPI v1.10.7.
For MPICH, DIRSIG5 is compiled against MPICH v3.2.
Due to NUMA considerations, it is recommended to run one bound process per CPU socket. Each process should then spawn a number of threads equal to the number of CPU cores available on that socket. In general, it is also recommended to disable simultaneous multi-threading (SMT, e.g. Hyper-Threading) to better control NUMA characteristics. However, the performance impact of disabling SMT varies between system configurations, and doing so may not be optimal in all situations.
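The socket and core counts that drive this topology can be read from lscpu. The following is a minimal sketch, assuming a Linux host whose lscpu output contains the usual "Socket(s):" and "Core(s) per socket:" fields; the function name topology_from_lscpu is hypothetical and is not part of DIRSIG.

```shell
# Hypothetical helper (not a DIRSIG utility): reads lscpu-style text on
# stdin and prints "<sockets> <cores_per_socket>".
topology_from_lscpu() {
    awk -F: '
        # Leading whitespace is tolerated because newer lscpu versions indent fields.
        /^[ \t]*Socket\(s\):/          { gsub(/[ \t]/, "", $2); sockets = $2 }
        /^[ \t]*Core\(s\) per socket:/ { gsub(/[ \t]/, "", $2); cores = $2 }
        END { print sockets, cores }
    '
}
```

Running "lscpu | topology_from_lscpu" prints the number of sockets (the suggested process count) followed by the cores per socket (the suggested thread count per process).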
DIRSIG supports two scheduling modes:
- All Processes per-Frame (APF)
This is the default behavior. APF distributes the processing of each frame (capture) in a simulation across all processes. It is used to maximize complete frame throughput.
- Unique Frame per-Process (UFP)
UFP assigns one unique frame (capture) to each process. Under certain circumstances it can achieve a lower overall simulation run-time than APF for multi-frame simulations (e.g. when the filesystem or MPI-IO implementation has poor support for writing simultaneously to the same file, and/or when the scene changes significantly between frames). It is only available for multi-frame simulations with a platform truth/image schedule of capture (i.e. one image and/or truth file per frame).
Note: Currently, DIRSIG doesn’t transition from UFP to APF scheduling mode for remainder frames, so efficiency peaks when the number of frames is evenly divisible by the number of processes.
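The cost of remainder frames can be estimated with simple arithmetic. The sketch below assumes every frame takes roughly the same time to render, so that UFP proceeds in rounds of one frame per process; the function name ufp_utilization is illustrative only and is not a DIRSIG utility.

```shell
# Hypothetical helper: prints the number of UFP rounds and the percentage
# of process slots doing useful work, for FRAMES frames on PROCS processes.
# Assumes all frames take about the same time to render.
ufp_utilization() {
    frames=$1
    procs=$2
    rounds=$(( (frames + procs - 1) / procs ))      # ceiling division
    percent=$(( 100 * frames / (rounds * procs) ))  # integer percent, rounded down
    echo "$rounds $percent"
}
```

Under that assumption, "ufp_utilization 8 4" prints "2 100" (two full rounds), while "ufp_utilization 10 4" prints "3 83" because two of the four processes sit idle during the third round.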
A user can easily load the OpenMPI utilities/libraries into their shell using the MPI environment module. This environment module is installed with the openmpi package in RHEL-like Linux distributions when the package is deployed via the distribution's package manager.
Load the OpenMPI environment
Load the environment module for OpenMPI into the user’s shell environment:
$ module load mpi/openmpi-x86_64
Run DIRSIG with 1 process per socket (default APF schedule)
Use the --npersocket option to mpirun to have OpenMPI run a single process per socket:
$ mpirun --npersocket 1 dirsig5 MY_SIM.sim
Same as above, but with UFP schedule
Use the DIRSIG5 --mpi_one_event_per_node option to switch the internal scheduler into the unique frame per-process (UFP) mode:
$ mpirun --npersocket 1 dirsig5 --mpi_one_event_per_node MY_SIM.sim
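One possible way to choose between the two command lines above is to request UFP only when the frames divide evenly across the processes. The following is a sketch of that heuristic, not a DIRSIG recommendation: the function name dirsig_mpirun_cmd is hypothetical, the frame and process counts must be supplied by the user, and the helper does not check that the simulation uses a capture truth/image schedule as UFP requires. It only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper: prints an mpirun command line, adding the UFP option
# only for multi-frame runs whose FRAMES divide evenly across PROCS processes.
# Usage: dirsig_mpirun_cmd SIM_FILE FRAMES PROCS
dirsig_mpirun_cmd() {
    sim=$1
    frames=$2
    procs=$3
    opts=""
    if [ "$frames" -gt 1 ] && [ $(( frames % procs )) -eq 0 ]; then
        opts=" --mpi_one_event_per_node"
    fi
    echo "mpirun --npersocket 1 dirsig5${opts} $sim"
}
```

For example, "dirsig_mpirun_cmd MY_SIM.sim 8 2" prints the UFP form of the command, while "dirsig_mpirun_cmd MY_SIM.sim 5 2" prints the default APF form.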