Contents
1 Introduction
2 SRSPrecompiler
3 Installation
4 Requirements
5 Utils
5.1 stop_application
5.2 rss_ckpt
5.3 rss_restore
5.4 change_ckpt_interval
5.5 ibp_move
5.6 bandwidth_matrix_gen.sh
6 Examples
6.1 SRS Restart example
6.2 SRS_Register example
6.3 SRS_Check_Stop example
6.4 SRS_Read examples
6.5 SRS_DistributeFunc_Create example
6.6 SRS_DistributeMap_Create example
6.7 The big picture, a working example
7 Test Programs
8 Credits
9 References
Motivation
The SRS library allows the user to stop a running application and restart
it on a different number of processors with a different data distribution.
This is useful in a number of cases:
• Application migration
The machines in a cluster on which the application is currently running
may become unavailable after a period of time. After learning this from
the system administrators, the user may determine that the application
cannot complete within the time remaining. The user can then stop the
application, move it to another cluster and continue it from the point
where it was stopped.
Application migration is also useful for resource management systems like
Condor, where an application has to be migrated when the workstation
owner returns to using the machine. Since Condor does not have adequate
support for MPI programs, the SRS library can be used to support MPI
programs within the Condor framework.
Certain scheduling systems in distributed systems use preemptive methods
to ensure the progress of different applications. In this case, the stop and
restart mechanism provided by the SRS library can be used for context
switching between different applications.
• Expanding the processors set
In many cases, users of parallel programs do not know the exact number
of processors to use and may want to determine it by trial and error.
The user can start the application on an initial set of processors,
determine that it is not running fast enough, stop it, restart and
continue it on more processors, stop it again, and so on.
• Reducing the processor set
The user may want to reduce the number of processors he is using for the
application either to increase the performance of the application or due to
non-availability of some resources.
• Changing the data distribution
As with the number of processors, users of parallel programs are often
unsure which data distribution to use for the data in their programs.
The user can start with an initial data distribution, say block, and run
the application. If the performance is not satisfactory, the user can
stop the application, compile it with a new data distribution, say
block-cyclic, restart it and continue from the point where it was stopped,
this time with the block-cyclic data distribution, note the performance
change, stop the application again, and so on.
• Fault tolerance
Apart from the pro-active stopping of applications by the user, the
application may terminate abnormally due to sudden host failures. SRS
provides periodic checkpointing so that when the host is brought back up,
the application can be restarted and continued from the last checkpoint
taken before it terminated abnormally.
Introduction
SRSPrecompiler
• Introduction:
SRSPrecompiler is intended to automate the process of inserting the SRS
calls into the application, so that the resulting programs become fault-
tolerant, migratable and malleable.
• Prerequisites:
1. gdsl-1.4 (Generic Data Structure Library): It can be downloaded from
http://download.gna.org/gdsl/. Follow the installation steps specified in
the INSTALL file of gdsl.
2. gcc-3.4.4: To parse the user program, SRSPrecompiler uses the gcc-3.4.4
macros.
• Installation:
a)>> gunzip SRS-1.2.1.tgz
b)>> tar -xvf SRS-1.2.1.tar
Let LOC be the location of SRS-1.2.1.
c)>> cd LOC/SRSPrecompiler
d)>> ./make-precompiler.sh
Step d) should produce the binary LOC/bin/srs_compile.
To remove the srs_compile binary, run make clean in LOC/SRSPrecompiler.
e)>> cd LOC
f) Follow the steps given in "Quick start to install SRS", starting from
step 3.
• Compilation:
This section describes how to compile a plain C/MPI program. The
srs_compile utility converts a plain C/MPI application into a C/MPI
application that is self-restartable, self-checkpointing, malleable and
migratable. Compiling source code with srs_compile is similar to compiling
with gcc; the only differences are the binary name and the output file.
Put the srs_compile binary in your path.
If you are using tcsh:
>>setenv PATH $PATH:LOC/SRSPrecompiler/bin
If you are using bash:
>>export PATH=$PATH:LOC/SRSPrecompiler/bin
In the application, place SRS_PollPoint() function calls at the locations
where checkpoints may be taken (see the redistribute_test example in the
test directory).
An example of such a file with SRS_PollPoint() is
/SRS/test/redistribute_test.c. The following statements illustrate
Installation
Requirements
• MPI
The SRS library is built on top of MPI. Hence any application that uses
SRS must be an MPI-based program.
• Headers
The application that uses SRS must include 2 header files, srs.h and
datatype.h.
• Libs
The user must link the application with libsrs.a (lib/ directory) and
with the pthread library, which is needed for the IBP functionality.
• Runtime
Before starting the application, the user must run the executable rss (bin/
directory). This is a sequential program and can be run on a machine that
can be different from the machines where the actual application is run.
• Optional
– IBP
This is needed when the user wants to use the IBP distributed storage
framework for storing checkpoints. IBP depots must be started on
the pool of machines where the user can potentially run the
application (see ibp_server_mt). Note that the default storage
mechanism is simple file-based storage. If you want to use IBP, change
Line 4 in SRS/include/dsi.h from "#define FSERVER" to "#define IBPD"
and recompile SRS.
– HADOOP
This is needed when the user wants to use Yahoo!’s Hadoop
infrastructure for storing checkpoints.
• Config file
Before starting the application, the user must have the file ’srs.config’ in
the same location where the process 0 of his application will be executing.
A sample srs.config is given below.
RUNTIME_HOST = torc1.cs.utk.edu
RUNTIME_PORT = 9009
FAULT_TOLERANCE = yes
CKPT_INTERVAL = 40
NO_IBP_SERVERS = 3
IBP_SERVERS = garl-intel4 garl-intel2 garl-intel3
The RUNTIME_HOST line points to the host where rss was started.
The RUNTIME_PORT line points to the port where rss will be accepting
connections. This port is printed out when running the rss program.
The SRS library also does periodic checkpointing of data to provide fault
tolerance. Providing ’y’ or ’yes’ in the FAULT_TOLERANCE line enables
this mechanism (you can also give ’n’ or ’no’). The interval of periodic
checkpointing in seconds can be specified using CKPT_INTERVAL. The
checkpointing interval can also be changed during application execution
(see Utils).
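The "KEY = value" layout of srs.config can be illustrated with a short C sketch. It is not part of SRS and does not reproduce how the library itself reads the file. The struct SrsConfig and the function parse_srs_config_line are hypothetical names chosen for illustration, and the parsing rules (optional blanks around '=', only 'y' or 'yes' enabling fault tolerance) are assumptions based on the sample above. The IBP-related keys are deliberately ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the srs.config fields described above.
 * This is an illustration only, not a type defined by SRS. */
typedef struct {
    char runtime_host[256];
    int  runtime_port;
    int  fault_tolerance;   /* 1 if 'y' or 'yes', 0 otherwise */
    int  ckpt_interval;     /* seconds between periodic checkpoints */
} SrsConfig;

/* Parse one "KEY = value" line into cfg.
 * Returns 1 if a known key was stored, 0 otherwise. */
int parse_srs_config_line(const char *line, SrsConfig *cfg)
{
    char key[64], value[256];

    assert(line != NULL && cfg != NULL);

    /* Key: everything up to a blank, tab or '='; value: next token. */
    if (sscanf(line, " %63[^ \t=] = %255s", key, value) != 2)
        return 0;

    if (strcmp(key, "RUNTIME_HOST") == 0) {
        snprintf(cfg->runtime_host, sizeof cfg->runtime_host, "%s", value);
    } else if (strcmp(key, "RUNTIME_PORT") == 0) {
        cfg->runtime_port = atoi(value);
    } else if (strcmp(key, "FAULT_TOLERANCE") == 0) {
        cfg->fault_tolerance =
            (strcmp(value, "y") == 0 || strcmp(value, "yes") == 0);
    } else if (strcmp(key, "CKPT_INTERVAL") == 0) {
        cfg->ckpt_interval = atoi(value);
    } else {
        return 0;   /* unknown or unhandled key, e.g. IBP_SERVERS */
    }
    return 1;
}
```

A caller would read the file line by line (for example with fgets) and pass each line to parse_srs_config_line, starting from a zero-initialized SrsConfig.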
Utils
SRS provides some utilities, which are placed in the bin/ directory after
SRS has been built successfully.
5.1 stop_application
Whenever the user wants to stop the program, the stop_application program
included in the SRS library is used.
The command is:
>> stop_application <runtime host> <runtime port>
• runtime host
the host where rss was started
• runtime port
the port where rss will be accepting connections. This port will be printed
out when running the rss program.
The user can restart his application in the same way he started his application
initially.
The rss program must still be running and srs.config must not have been
changed between runs. When the application runs to completion, the rss
program terminates.
5.2 rss_ckpt
This utility is useful when using a large distributed infrastructure with
different clusters. When the application is migrated from cluster-1 to
cluster-2, instead of requiring the application to contact the rss daemon
of cluster-1, the rss daemon can also be checkpointed and migrated from
cluster-1 to cluster-2 using rss_ckpt and rss_restore (next section).
Whenever the user wants to store the rss daemon, the rss_ckpt program
included in the SRS library is used.
The command is:
>> rss_ckpt <runtime host> <runtime port>
• runtime host
the host where rss was started
• runtime port
the port where rss will be accepting connections. This port will be printed
out when running the rss program.
The rss_ckpt utility stores the state of the rss daemon that is currently
running on the given port to a file, rss_ckpt.dat. The user can load
rss_ckpt.dat into an rss daemon running on another machine. The user can
then restart the application in the same way it was started initially.
When the application runs to completion, the rss program terminates normally.
5.3 rss_restore
Whenever the user wants to load the rss daemon state previously stored in
the file rss_ckpt.dat into the current rss daemon, the rss_restore program
included in the SRS library is used.
The command is:
>> rss_restore <runtime host> <runtime port>
• runtime host
the host where rss was started
• runtime port
the port where rss will be accepting connections. This port will be printed
out when running the rss program.
The rss_restore utility loads rss_ckpt.dat into the rss daemon currently
running on the runtime port. The user can then restart the application in
the same way it was started initially.
When the application runs to completion, the rss program terminates.
5.4 change_ckpt_interval
This utility changes the interval between checkpoints. It applies only if
FAULT_TOLERANCE is set to yes in srs.config. The command is:
>> change_ckpt_interval <runtime host> <runtime port> <new interval>
• runtime host
the host where rss was started
• runtime port
the port where rss will be accepting connections. This port will be printed
out when running the rss program.
• new interval
the new time to wait between two checkpoints.
This change will only take effect after calling SRS_Check_Stop in the user’s
application.
5.5 ibp_move
This utility moves the IBP depots from one set of machines to a different
set of machines. Before using the utility, the user must have the file
’ibp_servers.config’ in the same location where the utility is called
from. This utility cannot be used while the application is running. It
can be used once the application has been stopped, and is useful when the
user wants to restart the application on a different set of machines from
which the present IBP servers are not accessible. The command is:
>> ibp_move <runtime host> <runtime port>
• runtime host
the host where rss was started
• runtime port
the port where rss will be accepting connections.
A sample ibp_servers.config is given below.
CHANGES = 3
garl-intel1 garl-intel2
garl-intel3 garl-intel2
garl-intel4 garl-intel2
The CHANGES line specifies the number of IBP depots to be moved. Each
subsequent line contains a source IBP depot (first field) and the
destination IBP depot (second field) to which its data is moved. In this
sample config file, the data from IBP depots “garl-intel1”, “garl-intel3”
and “garl-intel4” is moved to “garl-intel2”. When rss has finished moving
the IBP depots, it prints out a message. Once this utility has been
executed, the IBP depots on the source machines (“garl-intel1”,
“garl-intel3” and “garl-intel4”) can be shut down and the user application
restarted with only the destination IBP depot running (“garl-intel2”).
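The layout of the sample config file (a CHANGES count followed by that many source/destination pairs) can be read with a few lines of C. The following is only a sketch of the file format as described above, not the code used by ibp_move or rss. The type IbpMove and the function parse_ibp_moves are hypothetical names, and the 127-character limit on depot names is an arbitrary assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record: data on depot `src` is moved to depot `dst`. */
typedef struct {
    char src[128];
    char dst[128];
} IbpMove;

/* Parse the text of an ibp_servers.config file held in memory.
 * Returns the number of moves stored in `moves`, or -1 if the text
 * is malformed or lists more than max_moves entries. */
int parse_ibp_moves(const char *text, IbpMove *moves, int max_moves)
{
    int changes = 0, consumed = 0, i;

    assert(text != NULL && moves != NULL);

    /* First line: "CHANGES = <count>"; %n records how far we have read. */
    if (sscanf(text, " CHANGES = %d%n", &changes, &consumed) != 1)
        return -1;
    if (changes < 0 || changes > max_moves)
        return -1;
    text += consumed;

    /* Each following line: "<source depot> <destination depot>". */
    for (i = 0; i < changes; i++) {
        if (sscanf(text, " %127s %127s%n",
                   moves[i].src, moves[i].dst, &consumed) != 2)
            return -1;
        text += consumed;
    }
    return changes;
}
```

Applied to the sample file above, the function would report three moves, all with “garl-intel2” as the destination.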
5.6 bandwidth_matrix_gen.sh
This utility is used to generate the file “bandwidth_matrix.dat”. The file
is generic and contains the bandwidth between the machines on which the
application can be run and the machines on which the IBP depots are
available.
The command is:
>> sh bandwidth_matrix_gen.sh machines_info.txt &
A sample machines_info.txt is given below.
garl-intel1
hosts
garl-intel1
garl-intel2
ibpservers
garl-intel4
Examples
6.1 SRS Restart example
If the application uses a matrix that is checkpointed when the application
is stopped in the middle of its execution, then the matrix needs to be
initialized with initial values only when the application is executed for
the first time.
#include <stdlib.h>
#include "srs.h"
int main(){
    int* matrix;
    int restart_value;
    int i;
    matrix = (int*)malloc(sizeof(int)*10);
    restart_value = SRS_Restart_Value();
    if(restart_value == 0){
        /* first run: no checkpoint exists, so set the initial values */
        for(i=0; i<10; i++){
            matrix[i] = i;
        }
    }
    free(matrix);
    return 0;
}
6.2 SRS_Register example
int main(){
    double x[10];
    int i;
    int local_size = 2;
    SRS_Register("X", x, GRADS_DOUBLE, 10, CYCLIC, &local_size);
    SRS_Register("iterator", &i, GRADS_INT, 1, 0, NULL);
}
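6.3 SRS_Check_Stop example
int main(){
    int stop_value;
    int *a;
    a = (int*)malloc(sizeof(int)*10);
    stop_value = SRS_Check_Stop(NULL);
    if(stop_value == 1){
        /* a stop was requested: release resources and exit */
        free(a);
        MPI_Finalize();
        exit(0);
    }
}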
6.4 SRS_Read examples
In the following examples, only partial code statements are shown to
demonstrate the SRS_Read call.
Example 1:
This is a simple example in which an array of integers is copied from the
set of processes in the old application run to the corresponding set of
processes in the new application run.
int main(){
    int A[10];
    int i;
    int restart_value;
    SRS_Init();
    restart_value = SRS_Restart_Value();
    if(restart_value == 1){
        SRS_Read("A", A, 0, NULL);
    }
    SRS_Register("A", A, GRADS_INT, 10, 0, NULL);
}
Example 2:
In this example, block data distribution (BLOCK) is used for both the old
and new application runs. Thus the same data is distributed in a block
fashion over a new set of processes when the application is restarted.
Unlike Example 1, this example can be stopped and restarted on a different
set of processes.
int main(){
    int A[10];
    int i;
    int restart_value;
    SRS_Init();
    restart_value = SRS_Restart_Value();
    if(restart_value == 1){
        SRS_Read("A", A, BLOCK, NULL);
    }
    SRS_Register("A", A, GRADS_INT, 10, BLOCK, NULL);
}
Example 3:
This example demonstrates the use of the SAME value for the new
distribution in SRS_Read(). Here SAME is used to propagate the
checkpointed iterator to all the processes, so that all the processes in
the current application run can start from the same iteration.
int main(){
    int i, iter_start = 0;   /* start from iteration 0 on the first run */
    int restart_value;
    SRS_Init();
    restart_value = SRS_Restart_Value();
    if(restart_value == 1){
        SRS_Read("iterator", &iter_start, SAME, NULL);
    }
    SRS_Register("iterator", &i, GRADS_INT, 1, 0, NULL);
    for(i=iter_start; i<10; i++){
        // computation
    }
}
6.5 SRS_DistributeFunc_Create example
int main(){
    int A[10];
    int restart_value;
    int distributefunc_handle;
    DataMapInfo* (*distribute_func)(int, int, void*, char*);
    MPI_Init(NULL, NULL);
    SRS_Init();
    restart_value = SRS_Restart_Value();
    /* block_distribution is a user-defined function (not shown here)
       with the signature declared above */
    distribute_func = block_distribution;
    SRS_DistributeFunc_Create(distribute_func, &distributefunc_handle);
    SRS_Register("A", A, GRADS_INT, 10, distributefunc_handle, NULL);
    SRS_Finish();
    MPI_Finalize();
}
6.6 SRS_DistributeMap_Create example
In this example, the block-cyclic data distribution is constructed using
the data map structure and a handle is created using
SRS_DistributeMap_Create(). This handle is used in the subsequent
SRS_Register() call.
int main(){
    int A[10];
    int handle;
    int restart_value;
    DataMapInfo* dataMap;
    MPI_Init(NULL, NULL);
    SRS_Init();
    restart_value = SRS_Restart_Value();
    /* 10 elements in 5 blocks of 2, dealt cyclically to processes 0, 1, 2 */
    dataMap = (DataMapInfo*)malloc(sizeof(DataMapInfo));
    dataMap->info_count = 5;
    dataMap->offset = (int*)malloc(sizeof(int)*5);
    dataMap->size = (int*)malloc(sizeof(int)*5);
    dataMap->proc = (int*)malloc(sizeof(int)*5);
    dataMap->offset[0] = 0;
    dataMap->size[0] = 2;
    dataMap->proc[0] = 0;
    dataMap->offset[1] = 2;
    dataMap->size[1] = 2;
    dataMap->proc[1] = 1;
    dataMap->offset[2] = 4;
    dataMap->size[2] = 2;
    dataMap->proc[2] = 2;
    dataMap->offset[3] = 6;
    dataMap->size[3] = 2;
    dataMap->proc[3] = 0;
    dataMap->offset[4] = 8;
    dataMap->size[4] = 2;
    dataMap->proc[4] = 1;
    SRS_DistributeMap_Create(dataMap, &handle);
    SRS_Register("A", A, GRADS_INT, 10, handle, NULL);
    SRS_Finish();
    MPI_Finalize();
}
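The fifteen hand-written assignments in the example above follow a regular pattern, which can be generated in a loop. The sketch below is an illustration under stated assumptions, not SRS code. BlockMap is a hypothetical stand-in that only mirrors the four fields used above (info_count, offset, size, proc) and is not the DataMapInfo type from srs.h. The functions block_cyclic_map and block_map_free are likewise invented names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in mirroring the fields used in the example above.
 * It is NOT the DataMapInfo type declared by SRS. */
typedef struct {
    int  info_count;   /* number of blocks */
    int *offset;       /* global start index of each block */
    int *size;         /* number of elements in each block */
    int *proc;         /* process that owns each block */
} BlockMap;

/* Build a block-cyclic map: n elements are cut into blocks of
 * block_size, and the blocks are dealt round-robin to nprocs
 * processes. Returns NULL if memory cannot be allocated. */
BlockMap *block_cyclic_map(int n, int block_size, int nprocs)
{
    BlockMap *map;
    int count, b;

    assert(n > 0 && block_size > 0 && nprocs > 0);
    count = (n + block_size - 1) / block_size;   /* round up */

    map = malloc(sizeof *map);
    if (map == NULL)
        return NULL;
    map->info_count = count;
    map->offset = malloc(sizeof(int) * count);
    map->size   = malloc(sizeof(int) * count);
    map->proc   = malloc(sizeof(int) * count);
    if (!map->offset || !map->size || !map->proc) {
        free(map->offset); free(map->size); free(map->proc); free(map);
        return NULL;
    }
    for (b = 0; b < count; b++) {
        int start = b * block_size;
        map->offset[b] = start;
        /* the last block may be shorter than block_size */
        map->size[b] = (n - start < block_size) ? n - start : block_size;
        map->proc[b] = b % nprocs;
    }
    return map;
}

/* Release a map created by block_cyclic_map. */
void block_map_free(BlockMap *map)
{
    if (map != NULL) {
        free(map->offset); free(map->size); free(map->proc); free(map);
    }
}
```

Calling block_cyclic_map(10, 2, 3) yields the same five entries as the example: offsets 0, 2, 4, 6, 8, sizes of 2, and owners 0, 1, 2, 0, 1.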
6.7 The big picture, a working example
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include "mpi.h"
#include "srs.h"
#include "datatype.h"
int main(int argc, char** argv){
    int* global_A;
    int* local_A;
    int rank, size;
    int global_size, local_size;
    int proc_number, local_index;
    int i, j, iter_start, restart_value, stop_value;
    MPI_Comm comm = MPI_COMM_WORLD;
    MPI_Init(&argc, &argv);
    SRS_Init();
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    /* the global array size is given as the first command-line argument */
    global_size = atoi(argv[1]);
    local_size = global_size/size;
    restart_value = SRS_Restart_Value();
    global_A = (int*)malloc(sizeof(int)*global_size);
    local_A = (int*)malloc(sizeof(int)*local_size);
    if(restart_value == 0){
        /* first run: initialize on process 0 and scatter the blocks */
        if(rank == 0){
            for(i=0; i<global_size; i++){
                global_A[i] = i;
            }
        }
        MPI_Scatter(global_A, local_size, MPI_INT, local_A, local_size,
                    MPI_INT, 0, comm);
        iter_start = 0;
    }
    else{
        /* restarted run: read the checkpointed data and iterator */
        SRS_Read("A", local_A, BLOCK, NULL);
        SRS_Read("iterator", &iter_start, SAME, NULL);
    }
    SRS_Register("A", local_A, GRADS_INT, local_size, BLOCK, NULL);
    SRS_Register("iterator", &i, GRADS_INT, 1, 0, NULL);
    printf("Proc. %d initial: ", rank);
    for(j=0; j<local_size; j++){
        printf("%d ", local_A[j]);
    }
    printf("\n");
    for(i=iter_start; i<global_size; i++){
        stop_value = SRS_Check_Stop();
        if(stop_value == 1){
            /* a stop was requested: release resources and exit */
            free(global_A);
            free(local_A);
            MPI_Finalize();
            exit(0);
        }
        /* only the process that owns global element i updates it */
        proc_number = i/local_size;
        local_index = i%local_size;
        if(rank == proc_number){
            local_A[local_index] += 10;
        }
        printf("Proc. %d Iter. %d: ", rank, i);
        for(j=0; j<local_size; j++){
            printf("%d ", local_A[j]);
        }
        printf("\n");
        sleep(1);
    }
    free(global_A);
    free(local_A);
    SRS_Finish();
    MPI_Finalize();
    exit(0);
}
Test Programs
To help you with your first SRS experiences, we provide some ready-to-use
applications.
• In the bin/ directory, two basic test applications are built:
– simple_test: a simple MPI test.
– redistribute_test: a more complex test including redistribution
(don’t forget the size argument).
• In the src/test/ directory, some well-known numerical applications are
instrumented with SRS. Currently, the ScaLAPACK eigenvalue problem and
PETSc CG, CGS, BICG and BCGS are provided. In the corresponding
directories, you will find the specific README file for compiling and
executing the applications.
Credits
References
1. MPICH. http://www-unix.mcs.anl.gov/mpi/mpich2/ or
2. MPI-LAM. http://www.lam-mpi.org/
3. The Internet Backplane Protocol, http://loci.cs.utk.edu/ibp/
4. Vadhiyar, S. and Dongarra, J. SRS - A Framework for Developing
Malleable and Migratable Parallel Applications for Distributed Systems.
Parallel Processing Letters, Vol. 13, number 2, pp. 291-312, June 2003.
http://garl.serc.iisc.ernet.in/SRS/SRS.htm
5. Vadhiyar, S. and Dongarra, J. Performance Oriented Migration
Framework for the Grid. Proceedings of The 3rd IEEE/ACM International
Symposium on Cluster Computing and the Grid (CCGrid 2003), pp 130-137,
May 2003, Tokyo, Japan.
http://garl.serc.iisc.ernet.in/SRS/SRS.htm