OpenMP (Open Multi-Processing) is an application programming interface (compiler directives, library routines, and environment variables — not just a library) that supports multi-platform shared-memory parallel programming in C, C++, and Fortran on many platforms, including Unix and Windows. It is designed for developing parallel applications in a shared-memory environment, where multiple threads share the same address space and data.

MPI, or the Message Passing Interface, is a standardized and portable message-passing system designed to function on a wide variety of parallel computing architectures. Note that MPI itself is a specification, not a library: implementations that follow the MPI standard include Open MPI, MPICH, and Microsoft MPI. It is one of the most widely used models for parallel computing and enables efficient communication among the processes of a computing system.

Let's explore the differences between OpenMP and MPI (as implemented by MPICH or Open MPI).

1. Memory Model

OpenMP operates in a shared memory model where multiple threads have access to the same memory space. It’s well-suited for multicore and multiprocessor systems within a single node.

MPI is designed for a distributed-memory model, where each process has its own private memory space. It's used in systems where memory is not shared between processors, such as clusters of compute nodes, though it also runs on a single multi-CPU machine.

2. Programming Model

OpenMP uses compiler directives (pragmas) to parallelize code, making it relatively easy to convert existing serial code to parallel. It's also known for enabling incremental parallelism, where developers can parallelize their code gradually, one loop at a time.

Here is a piece of LightGBM source code (C++) that demonstrates how OpenMP is used to count the positive and negative samples in parallel.

// Split the loop iterations statically across threads; each thread keeps
// private copies of cnt_positive / cnt_negative, which OpenMP sums into
// the shared counters at the end via the reduction clause.
#pragma omp parallel for num_threads(OMP_NUM_THREADS()) schedule(static) reduction(+:cnt_positive, cnt_negative)
for (data_size_t i = 0; i < num_data_; ++i) {
  if (is_pos_(label_[i])) {
    ++cnt_positive;
  } else {
    ++cnt_negative;
  }
}

MPI requires explicit programming of the message passing between processes, which can be more complex and requires a good understanding of the parallel algorithm's communication pattern.
