MPI

What is MPI?

MPI stands for Message Passing Interface. The MPI standard is defined by the Message Passing Interface Forum. The homepage for the current version of MPI can be found at http://www-unix.mcs.anl.gov/mpi. The standard defines the interface for a set of functions that can be used to pass messages between processes on the same computer or on different computers. MPI can be used to program shared memory or distributed memory computers. There are a large number of implementations of MPI from various computer vendors and academic groups; two open-source versions are MPICH and LAM.

The MPI interface consists of nearly two hundred functions, but in general most codes use only a small subset of them. Below are a few small example MPI programs to illustrate how MPI can be used. Systems supporting MPI at UVa are described below.

The first code prints a simple message to the standard output.

#include <stdio.h>
#include "mpi.h"

int main(int argc,char* argv[])
{
  MPI_Init(&argc, &argv);

  printf("hello world\n");

  MPI_Finalize();
  return 0;
}

When the code is executed with four processes the following output is produced.

hello world
hello world
hello world
hello world

The file mpi.h holds all the declarations for the MPI functions and must be included. The function MPI_Init(&argc, &argv) must be called before any other MPI functions and the function MPI_Finalize() should be called after the last MPI function has been called. In the above example these are the only two MPI functions used. MPI_Init is called with the same arguments as the main function to allow the system to perform any special initializations for the MPI library.

When executed using four processes, four copies of the code are executed. Each process has an integer associated with it, called its rank, which can be used to identify it. The processes are numbered sequentially from zero. In the next code we use an MPI function call to find out the rank of each process.

#include <stdio.h>
#include "mpi.h"

int main(int argc,char* argv[])
{
  int my_rank;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

  printf("hello world from processes %d \n",my_rank);

  MPI_Finalize();
  return 0;
}

We have added two lines to the code. The integer my_rank is used to store the rank of each process. The function MPI_Comm_rank(MPI_COMM_WORLD, &my_rank) retrieves the rank of the calling process and stores it in my_rank. MPI_COMM_WORLD is called a communicator and defines a collection of processes that can send messages to each other. The user can define different communicators to hold separate groups of processes, but in general only one communicator is needed, and it is already defined for the user as MPI_COMM_WORLD in mpi.h. The output from the above code when run across four processes is:

hello world from processes 0 
hello world from processes 2 
hello world from processes 1 
hello world from processes 3 

When the above code is executed, the processes act independently of one another. The order in which they complete is not deterministic; executing the code a number of times will demonstrate this. If we wish to order the output, it is necessary to synchronize the separate processes. The next example shows how to use MPI functions to order the output. Each process sends its rank to the process with rank zero, which prints the ranks in turn. This is an example of message passing: each process sends a message containing an integer representing its rank to the 'root' process.

#include <stdio.h>
#include "mpi.h"

int main(int argc,char* argv[])
{
  int my_rank;
  int num_proc;
  int dest= 0;
  int tag= 0;
  int tmp,i;
  MPI_Status status;

  MPI_Init(&argc, &argv);
  /* get my_rank */
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
  /* find out how many processes there are  */
  MPI_Comm_size(MPI_COMM_WORLD, &num_proc);

  if( my_rank != 0 )    
  {  
    MPI_Send(&my_rank,1,MPI_INT,dest,tag,MPI_COMM_WORLD);
  }
  else
  {
     for(i=1; i < num_proc; i++)
     {
       MPI_Recv(&tmp,1,MPI_INT,i,tag,MPI_COMM_WORLD,&status);
       printf("hello world from processes %d \n",tmp);
     }
  }
  MPI_Finalize();
  return 0;
}

MPI_Status is a struct defined in mpi.h that MPI uses to record information about a received message, such as its source, tag and error code; it is rarely used directly by the programmer. The function MPI_Comm_size(MPI_COMM_WORLD, &num_proc) finds the total number of processes running in the communicator MPI_COMM_WORLD and stores it in num_proc. The if block can be interpreted as follows:

  if( my_rank != 0 )
  {
    I am not 'root' process::  send my_rank to root
  }
  else
  {
    I am 'root' process::  receive the rank of each
    process in turn and print it
  }

The MPI_Send(&my_rank,1,MPI_INT,dest,tag,MPI_COMM_WORLD) function can be understood as follows:

&my_rank          address of the value to be sent
1                 number of values to be sent; setting this
                  greater than one allows arrays to be sent
MPI_INT           type of the values to be sent
dest              process the data is to be sent to, in this case process 0
tag               identifier for the message; a process may be
                  expecting several messages from another process,
                  and the tag is simply an integer used to
                  differentiate them (in this case the tag is 0)
MPI_COMM_WORLD    communicator the processes belong to

The function takes the address of the variable to be sent; effectively, it takes a chunk of memory starting at that address and sends it. The size of the chunk is determined by the type of the variable and, for an array, the number of items it contains. The constant MPI_INT indicates that an integer is being sent. MPI supports all the common data types, e.g. MPI_FLOAT for float, MPI_DOUBLE for double and MPI_CHAR for char. It is also possible to create your own MPI data type in order to send a C struct.
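
As a sketch of how an array might be sent, the count argument can be set to the array length. The program below is an illustration rather than one of the original examples: the array name values, its length N, and the choice of ranks 0 and 1 as sender and receiver are all assumptions, and it needs at least two processes to do anything.

```c
#include <stdio.h>
#include "mpi.h"

#define N 5   /* hypothetical array length */

int main(int argc, char* argv[])
{
  int my_rank, num_proc, i;
  int tag = 0;
  double values[N];
  MPI_Status status;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
  MPI_Comm_size(MPI_COMM_WORLD, &num_proc);

  if (num_proc >= 2)
  {
    if (my_rank == 0)
    {
      /* fill the array and send all N values to process 1 in one message */
      for (i = 0; i < N; i++)
        values[i] = 0.5 * i;
      MPI_Send(values, N, MPI_DOUBLE, 1, tag, MPI_COMM_WORLD);
    }
    else if (my_rank == 1)
    {
      /* receive up to N doubles from process 0 */
      MPI_Recv(values, N, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD, &status);
      for (i = 0; i < N; i++)
        printf("values[%d] = %f\n", i, values[i]);
    }
  }

  MPI_Finalize();
  return 0;
}
```

Because an array name already evaluates to the address of its first element, values is passed without the & used for the single integer my_rank.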

The corresponding receive function MPI_Recv(&tmp,1,MPI_INT,i,tag,MPI_COMM_WORLD,&status) can be understood as follows:

&tmp              address where the incoming value will be stored
1                 maximum number of values to receive
MPI_INT           type of the values to be received
i                 process that is sending the data
tag               identifier for the message
MPI_COMM_WORLD    communicator for the processes
&status           structure in which MPI records the source, tag
                  and error code of the received message

The output from the code when run with four processes is:

hello world from processes 1 
hello world from processes 2 
hello world from processes 3 
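
The status argument becomes useful when the receiver does not fix the sender in advance. The following variation is a sketch, not part of the original example: it assumes the standard wildcard MPI_ANY_SOURCE in place of i, and reads the sender's rank back from status.MPI_SOURCE. Messages are then handled in arrival order, so the output is no longer guaranteed to be sorted by rank.

```c
#include <stdio.h>
#include "mpi.h"

int main(int argc, char* argv[])
{
  int my_rank, num_proc, tmp, i;
  int dest = 0, tag = 0;
  MPI_Status status;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
  MPI_Comm_size(MPI_COMM_WORLD, &num_proc);

  if (my_rank != 0)
  {
    MPI_Send(&my_rank, 1, MPI_INT, dest, tag, MPI_COMM_WORLD);
  }
  else
  {
    for (i = 1; i < num_proc; i++)
    {
      /* accept the next message from whichever process sent it */
      MPI_Recv(&tmp, 1, MPI_INT, MPI_ANY_SOURCE, tag, MPI_COMM_WORLD, &status);
      /* status records who the sender was and which tag was used */
      printf("received %d from process %d (tag %d)\n",
             tmp, status.MPI_SOURCE, status.MPI_TAG);
    }
  }

  MPI_Finalize();
  return 0;
}
```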

The above simple examples illustrate how MPI can be used to pass data between processes. Perhaps the simplest use of MPI is for Monte Carlo simulations, where the same code is executed a large number of times with slightly different parameters, perhaps generated from random numbers. MPI can be used to start a number of processes on different CPUs, each executing the same code but seeded with a different value. For example, if the main function is called Monte_Carlo() and the random number generator is seeded with the function random_seed(int), the code would have the outline:

#include "mpi.h"

/* application routines, defined elsewhere (signatures assumed) */
void random_seed(int seed);
void Monte_Carlo(void);

int main(int argc,char* argv[])
{
  int my_rank;
  MPI_Init(&argc, &argv);
/* Get the rank of each process  */
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

/* seed the random number generator with the rank of the process */
  random_seed(my_rank);

/* execute the code */
  Monte_Carlo();

  MPI_Finalize();
  return 0;
}

When executed across multiple CPUs, each process has its random number generator seeded by an integer corresponding to its rank. In the above case there are only three MPI function calls and no communication between the processes. However, there is a problem with output: if all processes write to the same file at the same time, the output will be mangled. One solution is to open a file for each process and name the file according to the rank of the process. For example, the following code creates the string "Output." with the two-digit rank of the process appended to it. The string can then be used to open a file for that process. In this way up to 100 files can be opened.
 FILE *fp;
 char file[100];
 char c1,c2;
 
/* c1 and c2 are two characters used to represent my_rank */
 c1=my_rank/10 +'0';
 c2=my_rank%10 +'0';
 
/* copy "Output." to the character array file */
 strcpy(file,"Output.");

/* add c1 and c2 to the end of the character array file */
 strncat(file,&c1,1);
 strncat(file,&c2,1);
 
/* open a file with the name held in the character array file */ 
 fp=fopen(file,"w");

The above piece of code will open files Output.00, Output.01, Output.02, etc., depending on the number of processes, and each process will have its own file, with pointer fp, for output.
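
The same file name can also be built in a single call with snprintf from the C standard library. This is an alternative sketch, not the code used above, and the helper name make_output_name is hypothetical. The %02d format pads the rank to two digits, and snprintf never writes more than the buffer size it is given.

```c
#include <stdio.h>

/* Hypothetical helper: write "Output.NN" for the given rank into buf.
   Returns 0 on success, -1 if the name did not fit in len bytes. */
int make_output_name(char *buf, size_t len, int rank)
{
  int n = snprintf(buf, len, "Output.%02d", rank);
  if (n < 0 || (size_t)n >= len)
    return -1;
  return 0;
}
```

A process would call make_output_name(file, sizeof file, my_rank) and then fopen(file, "w") as before. Ranks of 100 or more simply produce longer names such as Output.100 instead of an invalid character.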

Putting the two pieces of code together:


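#include <stdio.h>
#include <string.h>
#include "mpi.h"

/* application routines, defined elsewhere (signatures assumed) */
void random_seed(int seed);
void Monte_Carlo(void);

/* output file for this process */
FILE *fp;

int main(int argc,char* argv[])
{
  int my_rank;
  char file[100];
  char c1,c2;

  MPI_Init(&argc, &argv);
/* Get the rank of each process  */
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

/* c1 and c2 are two characters used to represent my_rank */
  c1=my_rank/10 +'0';
  c2=my_rank%10 +'0';

/* copy "Output." to the character array file */
  strcpy(file,"Output.");

/* add c1 and c2 to the end of the character array file */
  strncat(file,&c1,1);
  strncat(file,&c2,1);

/* open a file with the name held in the character array file */
  fp=fopen(file,"w");
  if( fp == NULL )
  {
    /* stop all processes if the file cannot be opened */
    fprintf(stderr,"process %d could not open %s\n",my_rank,file);
    MPI_Abort(MPI_COMM_WORLD, 1);
  }

/* seed the random number generator with the rank of the process */
  random_seed(my_rank);

/* execute the code */
  Monte_Carlo();

/* close the output file before shutting down MPI */
  fclose(fp);

  MPI_Finalize();
  return 0;
}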
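
The program above leaves random_seed and Monte_Carlo undefined. As an illustration only, here is one way they might look for a toy problem: estimating pi by sampling points in the unit square. Everything in this sketch is an assumption rather than the original author's code: the bodies of both functions, the helper estimate_pi, the sample count NUM_SAMPLES, and the use of the standard rand generator, which is adequate for a demonstration but not for serious simulations.

```c
#include <stdio.h>
#include <stdlib.h>

#define NUM_SAMPLES 100000L   /* hypothetical number of samples per process */

/* Seed the generator from the rank. The +1 offset is deliberate: some C
   libraries treat a seed of 0 like a seed of 1, which would give
   ranks 0 and 1 identical random streams. */
void random_seed(int rank)
{
  srand((unsigned int)rank + 1u);
}

/* Estimate pi: the fraction of random points in the unit square that
   fall inside the quarter circle of radius 1 approaches pi/4. */
double estimate_pi(long num_samples)
{
  long i, inside = 0;
  for (i = 0; i < num_samples; i++)
  {
    double x = rand() / (RAND_MAX + 1.0);
    double y = rand() / (RAND_MAX + 1.0);
    if (x * x + y * y <= 1.0)
      inside++;
  }
  return 4.0 * (double)inside / (double)num_samples;
}

/* Stand-in for the application routine called from main */
void Monte_Carlo(void)
{
  printf("estimate of pi: %f\n", estimate_pi(NUM_SAMPLES));
}
```

In the combined program, the printf call would become fprintf(fp, ...) so that each process writes its estimate to its own file. Because each rank uses a different seed, the processes produce different estimates that can later be averaged.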

Systems Supporting MPI at UVa

MPI is supported on the ITC Linux clusters. Accounts on the clusters can be requested here. The MPI compilers can be invoked with mpcc (C), mpCC (C++) and mpf90 (Fortran). More details on the clusters can be found at the cluster web page.

Tutorials on the Web

Practical MPI Programming by IBM
MPI tutorials at Lawrence Livermore National Laboratory
MPICH home page
LAM home page
MPI Forum
MPI at the University of San Francisco

Implementations of MPI

Two popular open-source implementations of MPI are MPICH http://www-unix.mcs.anl.gov/mpi/mpich/ and LAM http://www.lam-mpi.org/. Both will work with Windows. MPICH2 is recommended for Windows but works only under 2000/XP. LAM works with Windows within the cygwin environment. Both MPICH and LAM work under all Unix operating systems. In addition, vendors often supply their own versions of MPI for many non-Linux Unix operating systems.

Books on MPI

The following books can be found in UVa libraries or at the Research Computing Support Center, room 244 Wilson Hall.


Parallel Programming with MPI by Peter Pacheco
Using MPI: Portable Parallel Programming with the Message-Passing Interface by William Gropp, Ewing Lusk, Anthony Skjellum
Using MPI-2: Advanced Features of the Message-Passing Interface by William Gropp, Ewing Lusk, Rajeev Thakur
MPI the Complete Reference: The MPI Core by Marc Snir
MPI the Complete Reference: The MPI-2 Extensions by William Gropp, Steven Huss-Lederman, Bill Nitzberg and Ewing Lusk

© 2008 by the Rector and Visitors of the University of Virginia.
