This page documents the parallelization system developed at the Joint Research Centre (Marco Ratto) for Dynare. This work has been funded by FP7 Project MONFISPOL [Grant no.: 225149].

The idea is to provide a general framework in Dynare for parallelizing tasks which require very little inter-process communication.

The implementation runs several MATLAB or Octave processes, either on local or on remote machines. Communication between master and slave processes is done through SMB on Windows and SSH on UNIX. Input and output data, as well as some short status messages, are exchanged through network filesystems.

Currently the system works only with homogeneous grids: only Windows or only UNIX machines.

Routines currently parallelized:

  • the Metropolis-Hastings algorithm (implemented in random_walk_metropolis_hastings.m)

  • the independent Metropolis-Hastings algorithm (implemented in independent_metropolis_hastings.m)

  • the Metropolis-Hastings diagnostics (implemented in McMCDiagnostics.m)

  • pm3.m (plotting routine)

  • Posterior_IRF.m

  • prior_posterior_statistics.m

1. Requirements

1.1. For a Windows grid

  1. a standard Windows network (SMB) must be in place
  2. PsTools must be installed in the path of the master Windows machine

  3. the Windows user on the master machine must be a user of every slave machine in the cluster; that user will be used for the remote computations.

1.2. For a UNIX/Mac grid

  1. SSH must be installed on the master and on the slave machines
  2. SSH keys must be installed so that the SSH connection from the master to the slaves can be done without passwords, or using an SSH agent (see SshKeysHowto)

2. User Interface

2.1. Parallel Computation options

Parallel computation will be triggered by the following options passed to the dynare command:

  • Command line options:
    • conffile=<path>: specify the location of the configuration file if it is not standard ($HOME/.dynare under Unix/Mac, %APPDATA%\dynare.ini on Windows)

    • parallel: trigger parallel computation using the first cluster specified in the configuration file

    • parallel=<clustername>: trigger parallel computation, using the given cluster

    • parallel_slave_open_mode: use the leaveSlaveOpen mode in the cluster

    • parallel_test: just test the cluster, don’t actually run the MOD file
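For illustration, a typical invocation from the MATLAB/Octave prompt might look as follows (the model name, cluster name and configuration path are hypothetical):

```
% run mymodel.mod with parallel computation on cluster c2,
% reading the cluster definition from a non-standard configuration file
dynare mymodel.mod parallel=c2 conffile=/home/user/my_dynare_config

% only test that the cluster is properly set up, without running the model
dynare mymodel.mod parallel_test
```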

2.2. Configuration File

To configure a cluster, the user must specify information about every node and cluster to be used in a separate configuration file. A valid configuration file will contain at least one cluster and one node. NB: All options are case-sensitive.

2.2.1. Node Options

A default of (stop processing) means the option is mandatory: processing stops if it is missing.

Option           | Type                | Example                        | Default           | Meaning                                                               | Req. Local Win | Req. Remote Win | Req. Local Unix | Req. Remote Unix
-----------------|---------------------|--------------------------------|-------------------|-----------------------------------------------------------------------|----------------|-----------------|-----------------|-----------------
Name             | string              | n1                             | (stop processing) | name of the node                                                      | *              | *               | *               | *
CPUnbr           | int or MATLAB array | 1, [2:4]                       | (stop processing) | the consecutively-numbered CPUs to be used on the node                | *              | *               | *               | *
ComputerName     | string              | localhost, karaba.cepremap.org | (stop processing) | computer name on the network, or IP address                           | *              | *               | *               | *
UserName         | string              | houtanb                        | empty             | required for remote login                                             |                | *               |                 | *
Password         | string              | password                       | empty             | required for remote login (only under Windows)                        |                | *               |                 |
RemoteDrive      | string              | c                              | empty             | drive to be used on the remote computer                               |                | *               |                 |
RemoteDirectory  | string              | /home/houtanb                  | empty             | directory to be used on the remote computer                           |                | *               |                 | *
DynarePath       | string              | /home/houtanb/dynare           | empty             | path to the matlab directory within the Dynare installation directory |                |                 |                 |
MatlabOctavePath | string              | matlab                         | empty             | path to the MATLAB or Octave executable                               |                |                 |                 |
SingleCompThread | boolean             | true                           | true              | disable MATLAB's native multithreading?                               |                |                 |                 |

2.2.2. Cluster Options

Option  | Type                   | Example     | Default | Meaning                             | Required
--------|------------------------|-------------|---------|-------------------------------------|---------
Name    | string                 | c1          | empty   | name of the cluster                 | *
Members | space-separated string | n1 n2 n3 n4 | empty   | list of the members of this cluster | *

2.2.3. Example

The syntax of the configuration file will take the following form (the order in which the clusters and nodes are listed is not significant):

[cluster]
Name = c1
Members = n1 n2 n3

[cluster]
Name = c2
Members = n2 n3

[node]
Name = n1
ComputerName = localhost
CPUnbr = 1

[node]
Name = n2
ComputerName = karaba.cepremap.org
CPUnbr = 5
UserName = houtanb
RemoteDirectory = /home/houtanb/Remote
DynarePath = /home/houtanb/dynare/matlab
MatlabOctavePath = matlab

[node]
Name = n3
ComputerName = hal.cepremap.ens.fr
CPUnbr = [2:4]
UserName = houtanb
RemoteDirectory = /home/houtanb/Remote
DynarePath = /home/houtanb/dynare/matlab
MatlabOctavePath = matlab

3. Information for Dynare developers

3.1. General architecture of the system

The generic parallelization system is organized around five routines: masterParallel.m, slaveParallel.m, fParallel.m, fMessageStatus.m and closeSlave.m.

  • masterParallel is the entry point to the parallelization system.

    • It is called from the master computer at the point where the parallelization system should be activated. Its main arguments are the name of the function containing the task to be run on every slave computer, the inputs to that function (stored in two structures, one for local and one for global variables), and the configuration of the cluster. The function exits when the task has finished on all computers of the cluster, and returns the output in a structure vector (one entry per slave);
    • all file exchange through the filesystem is concentrated in masterParallel: it prepares and sends the input information for the slaves, it retrieves the status information that the slave processes store on the remote machines, and finally it retrieves the outputs stored on the remote machines by the slave processes;

    • there are two modes of parallel execution, triggered by option parallel_slave_open_mode:

      • when parallel_slave_open_mode=0, the slave processes are closed after the completion of each task, and new instances are initiated when a new job is required; this mode is managed by fParallel.m;

      • when parallel_slave_open_mode=1, the slave processes are kept running after the completion of each task, and wait for new jobs to be performed; this mode is managed by slaveParallel.m;

  • slaveParallel.m and fParallel.m are the top-level functions run on every slave. Their main arguments are the name of the function to be run (containing the computing task) and some information identifying the slave. They read the input information previously prepared and sent by masterParallel through the filesystem, call the computing task, and finally store the outputs locally on the remote machine, from where masterParallel retrieves them back to the master computer;

  • fMessageStatus.m provides the core of the simple message passing performed during slave execution: using this routine, slave processes can store locally, on the remote machine, basic information on the progress of their computations. This information is retrieved by the master process (i.e. masterParallel.m), allowing it to echo the progress of remote computations on the master. fMessageStatus.m is also the entry point where an interruption signal sent by the master can be checked and executed; this routine typically replaces calls to waitbar.m;

  • closeSlave.m is the utility that sends remote slaves a signal to close themselves. In standard operation this is only needed when parallel_slave_open_mode=1, and it is called when the Dynare computations are completed: at that point, slaveParallel.m gets a signal to terminate and no longer waits for new jobs. However, this utility is also useful in any parallel mode if, for any reason, the master needs to interrupt remote computations which are running.
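The file-based status and interrupt signalling described above can be sketched as follows. This Python sketch is purely illustrative (Dynare implements the mechanism in MATLAB; all function and file names here are hypothetical): a slave periodically writes its progress to a status file on the shared filesystem, the master polls that file, and the master requests termination by creating a signal file that the slave checks on each status update.

```python
import json
import os

def slave_report_status(shared_dir, slave_id, fraction_done):
    """Like fMessageStatus.m (hypothetical sketch): store progress on the
    shared filesystem, and check for an interruption signal from the master.
    Returns False if the slave should stop."""
    status_file = os.path.join(shared_dir, "status_%d.json" % slave_id)
    with open(status_file, "w") as f:
        json.dump({"slave": slave_id, "done": fraction_done}, f)
    # the status call is also the point where the interrupt signal is checked
    return not os.path.exists(os.path.join(shared_dir, "close_signal"))

def master_read_status(shared_dir, slave_id):
    """Like the polling done by masterParallel.m: retrieve the progress
    information written by a slave, to echo it on the master."""
    status_file = os.path.join(shared_dir, "status_%d.json" % slave_id)
    with open(status_file) as f:
        return json.load(f)

def master_send_close(shared_dir):
    """Like closeSlave.m: create the signal file telling slaves to stop."""
    open(os.path.join(shared_dir, "close_signal"), "w").close()
```

Because all exchanges go through ordinary files, the same pattern works over SMB shares on Windows and over network filesystems reached via SSH on UNIX.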

The parallel toolbox also includes a number of utilities:

  • AnalyseComputationalEnviroment.m: a testing utility that checks that the cluster works properly and echoes error messages when problems are detected;

  • InitializeComputationalEnviroment.m: initializes some internal variables and remote directories;

  • distributeJobs.m: uses a simple algorithm to distribute jobs evenly across the available CPUs;

  • a number of generalized routines that perform delete, copy, mkdir and rmdir commands through the network filesystem (i.e. used from the master to operate on slave machines); the routines adapt to the actual environment (Windows or Unix):

    • dynareParallelDelete.m: generalized delete;

    • dynareParallelDir.m: generalized dir;

    • dynareParallelGetFiles.m: generalized copy FROM slaves TO master machine;

    • dynareParallelMkDir.m: generalized mkdir on remote machines;

    • dynareParallelRmDir.m: generalized rmdir on remote machines;

    • dynareParallelSendFiles.m: generalized copy TO slaves FROM master machine;
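The even allocation performed by distributeJobs.m can be illustrated with a short sketch. The following Python function is not the actual Dynare algorithm, only an illustration of the idea, under the assumption that jobs are split into contiguous ranges whose sizes differ by at most one:

```python
def distribute_jobs(n_blocks, n_cpus):
    """Split n_blocks consecutive blocks across n_cpus as evenly as possible.

    Returns one (first_block, last_block) 1-based inclusive range per CPU;
    range sizes differ by at most one (illustrative sketch, not Dynare code).
    """
    base, extra = divmod(n_blocks, n_cpus)
    ranges = []
    first = 1
    for cpu in range(n_cpus):
        # the first `extra` CPUs each get one additional block
        size = base + (1 if cpu < extra else 0)
        ranges.append((first, first + size - 1))
        first += size
    return ranges
```

For instance, distributing 10 blocks over 4 CPUs yields the ranges (1, 3), (4, 6), (7, 8), (9, 10).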

A more complete developer documentation (but a bit outdated) is in parallel.pdf.

3.2. Internal representation

The parallelization mechanism is triggered by the option options_.parallel. By default, this option is equal to zero and no parallelization is used.

To trigger the parallelization, this option must be filled with a vector of structures. Each structure represents a slave machine (possibly using several CPU cores on the machine).

The fields are:

  • Local: equal to 0 or 1. Use 1 if this slave is the local machine, 0 if it is a remote machine
  • PcName: for a remote slave, name of the machine. Use the NETBIOS name under Windows, or the DNS name under Unix

  • NumCPU: a vector of integers representing the CPU cores to be used on that slave machine. The first core has number 0, so on a quad-core machine use [0:3] to use all four cores
  • user: for a remote slave, the username to be used. On Windows, the group also needs to be specified here, like DEPT\JohnSmith, i.e. user JohnSmith in Windows group DEPT

  • passwd: for a remote slave, password associated to the username
  • RemoteDrive: for a remote Windows slave, letter of the remote drive (C, D, ...) where the computations will take place

  • RemoteFolder: for a remote slave, path of the directory on the remote drive where the computations will take place

There is currently no interface in the preprocessor to construct this option structure vector; this has to be done by hand by the user in the MOD file.

3.2.1. Example syntax for Windows and Unix, for local parallel runs (assuming a quad-core machine)

All fields are empty, except Local and NumCPU:

options_.parallel = struct('Local', 1, 'PcName', '', 'NumCPU', [0:3], 'user', '', 'passwd', '', ...
    'RemoteDrive', '', 'RemoteFolder', '', 'MatlabOctavePath', '', 'DynarePath', '');

3.2.2. Example Windows syntax for remote runs

  • on Windows, passwd has to be typed explicitly!
  • RemoteDrive has to be typed explicitly!

  • for user, the Windows group ALSO has to be specified, like DEPT\JohnSmith, i.e. user JohnSmith in Windows group DEPT

  • PcName is the name of the computer in the Windows network (i.e. the output of hostname) or the full IP address

options_.parallel = struct('Local', 0, 'PcName', 'RemotePCName', 'NumCPU', [4:6], 'user', ...
    'DEPT\JohnSmith', 'passwd', '****', 'RemoteDrive', 'C', 'RemoteFolder', 'dynare_calcs\Remote', 'MatlabOctavePath', 'matlab', 'DynarePath', 'd:\dynare\matlab');

3.2.2.1. Example using several remote PCs to build a grid

A vector of parallel structures has to be built:

options_.parallel = struct('Local', 0, 'PcName', 'RemotePCName1', 'NumCPU', [0:3], ...
    'user', 'DEPT\JohnSmith', 'passwd', '****', 'RemoteDrive', 'C', 'RemoteFolder', 'dynare_calcs\Remote', 'MatlabOctavePath', 'matlab', 'DynarePath', 'c:\dynare\matlab');

options_.parallel(2) = struct('Local', 0, 'PcName', 'RemotePCName2', 'NumCPU', [0:3], ...
    'user', 'DEPT\JohnSmith', 'passwd', '****', 'RemoteDrive', 'D', 'RemoteFolder', 'dynare_calcs\Remote', 'MatlabOctavePath', 'matlab', 'DynarePath', 'c:\dynare\matlab');

options_.parallel(3) = struct('Local', 0, 'PcName', 'RemotePCName3', 'NumCPU', [0:1], ...
    'user', 'DEPT\JohnSmith', 'passwd', '****', 'RemoteDrive', 'C', 'RemoteFolder', 'dynare_calcs\Remote', 'MatlabOctavePath', 'matlab', 'DynarePath', 'c:\dynare\matlab');

options_.parallel(4) = struct('Local', 0, 'PcName', 'RemotePCName4', 'NumCPU', [0:3], ...
    'user', 'DEPT\JohnSmith', 'passwd', '****', 'RemoteDrive', 'C', 'RemoteFolder', 'dynare_calcs\Remote', 'MatlabOctavePath', 'matlab', 'DynarePath', 'c:\dynare\matlab');

3.2.2.2. Example of combining local and remote runs

options_.parallel = struct('Local', 1, 'PcName', '', 'NumCPU', [0:3], ...
    'user', '', 'passwd', '', 'RemoteDrive', '', 'RemoteFolder', '', 'MatlabOctavePath', '', 'DynarePath', '');

options_.parallel(2) = struct('Local', 0, 'PcName', 'RemotePCName', 'NumCPU', [0:1], ...
    'user', 'DEPT\JohnSmith', 'passwd', '****', 'RemoteDrive', 'C', 'RemoteFolder', 'dynare_calcs\Remote', 'MatlabOctavePath', 'matlab', 'DynarePath', 'c:\dynare\matlab');

3.2.3. Example Unix/Mac syntax for remote runs

  • no passwd and no RemoteDrive needed!

  • PcName: the DNS name or the full IP address

3.2.3.1. Example with only one remote slave

options_.parallel = struct('Local', 0, 'PcName', 'name.domain.org', 'NumCPU', [0:3], ...
    'user', 'JohnSmith', 'passwd', '', 'RemoteDrive', '', 'RemoteFolder', '/home/rattoma/Remote', 'MatlabOctavePath', 'matlab', 'DynarePath', '/home/rattoma/dynare/matlab');

3.2.3.2. Example of combining local and remote runs (on unix):

options_.parallel = struct('Local', 1, 'PcName', '', 'NumCPU', [0:3], ...
    'user', '', 'passwd', '', 'RemoteDrive', '', 'RemoteFolder', '', 'MatlabOctavePath', '', 'DynarePath', '');

options_.parallel(2)=struct('Local', 0, 'PcName','name.domain.org','NumCPU', [0:3], 'user','JohnSmith','passwd','', 'RemoteDrive', '', 'RemoteFolder','/home/rattoma/Remote','MatlabOctavePath', 'matlab', 'DynarePath', '/home/rattoma/dynare/matlab');

3.3. Improvements to be made (by decreasing order of importance)

  • Improve the way we deal with MATLAB's native multithreading, which was introduced in MATLAB 7.4 and enabled by default since MATLAB 7.6 (see MatlabVersionsCompatibility). The default behavior of the parallel toolbox should be to disable that feature, which can be done using maxNumCompThreads or -singleCompThread depending on the MATLAB version. An option of the parallel code should give the user control over the number of threads used by MATLAB.

  • Rename internal options to reflect the names of options in the config file (done!)
  • Implement console mode for MATLAB (already done for Octave), by testing options_.console_mode

  • Allow for the possibility of specifying a weight for each slave in the cluster, to take into account heterogeneous performance; slaves with a low weight would be allocated fewer blocks. [NOTE: this will require a generalization of distributeJobs.m plus a syntax to provide weights (i.e. a new field CPUWeight in the options_.parallel structure)];

  • Network performance: let the master download files from the slaves continuously, instead of waiting for the slaves to end their computations, in order to minimize transfer time. [NOTE: the infrastructure built with fMessageStatus.m can be used here: slave processes can provide information about the files they produce; masterParallel.m can retrieve this information and fetch files while remote computations are still running.]

DynareWiki: ParallelDynare (last edited 2012-05-09 10:05:10 by HoutanBastani)