Differences between revisions 2 and 3
Revision 2 as of 2009-06-03 07:45:22
Size: 3777
Editor: MarcoRatto
Comment:
Revision 3 as of 2009-11-10 15:30:55
Size: 6059
Comment:
Line 1: Line 1:
= MATLAB Parallel computation utilities for Dynare. =
Parallelized routines:
#pragma section-numbers 1
Line 4: Line 3:
 * Random_walkMetropolis_hastings.m;
 * mcmcdiagnostics.m.
This page documents the parallelization system developed by Marco Ratto for Dynare.
Line 7: Line 5:
== For windows we assume: ==
 1. a standard windows network;
 2. PsTools are in the path of the WIndows machine
 3. the user on the PC of the master thread has to be user of any other PC in the cluster.
The idea is to provide a general framework in Dynare for parallelizing tasks which require no inter-process communication.
Line 12: Line 7:
== For linux we assume: ==
 1. matlab is in the path of the machine, e.g. in the etc/environment file the matlab bin folder is in the PATH variable;
 2. ssh is properly installed and working;
 3. use ssh-keygen to allow remote machines to be connected to the master machine, for grid computation;
 4. for a grid of unix machines, sshfs also needs to be installed and working;
The implementation is done by running several MATLAB or Octave processes, either on local or on remote machines. Communication between master and slave processes is done through SMB on Windows and SSH on UNIX. Input and output data, and also some short status messages, are exchanged through network filesystems.
Line 18: Line 9:
So far, mixed platform grids (linux and win) are not working: only homogeneous grids (unix or win) are tested.
Currently the system works only with homogeneous grids: only Windows or only Unix machines.

Two routines are currently parallelized:
 * the Metropolis-Hastings algorithm (implemented in {{{Random_walkMetropolis_hastings.m}}})
 * the Metropolis-Hastings diagnostics (implemented in {{{McMCDiagnostics.m}}})

= Requirements =

== For a Windows grid ==

 1. a standard Windows network (SMB) must be in place
 2. [[http://technet.microsoft.com/en-us/sysinternals/bb896649.aspx|PsTools]] must be installed in the path of the master Windows machine
 3. the Windows user on the master machine has to be user of any other slave machine in the cluster, and that user will be used for the remote computations

== For a UNIX grid ==

 1. MATLAB executable must be in the path of the slave machines
 2. SSH must be installed on the master and on the slave machines
 3. the UNIX user on the master machine has to be user of any other slave machine in the cluster, and that user will be used for the remote computations
 4. SSH keys must be installed so that the SSH connections from the slaves to the master can be done without passwords (see SshKeysHowto)
 5. SSHFS must be installed on the slave machines
Line 21: Line 32:
A substructure of options_ named "parallel" defines the charactieristics of the grid:
Line 23: Line 33:
=== Common syntax for win and unix, for local parallel runs (assuming quad-core) ===
The parallelization mechanism is triggered by the use of {{{options_.parallel}}}. By default, this option is equal to zero and no parallelization is used.

To trigger the parallelization, this option must be filled with a vector of structures. Each structure represents a slave machine (possibly using several CPU cores on the machine).

The fields are:

 * Local: equal to 0 or 1. Use 1 if this slave is the local machine, 0 if it is a remote machine
 * Pc``Name: for a remote slave, name of the machine. Use the NETBIOS name under Windows, or the DNS name under Unix
 * NumCPU: a vector of integers representing the CPU cores to be used on that slave machine. The first core has number 0. So, on a quadcore, use [0:3] here to use the four cores
 * user: for a remote slave, username to be used. On Windows, the group needs also to be specified here, like {{{DEPT\JohnSmith}}}, i.e. user {{{JohnSmith}}} in windows group {{{DEPT}}}
 * passwd: for a remote slave, password associated to the username
 * Remote``Drive: for a remote Windows slave, letter of the remote drive (C, D, ...) where the computations will take place
 * Remote``Folder: for a remote slave, path of the directory on the remote drive where the computations will take place

There is currently no interface in the preprocessor to construct this option structure vector; this has to be done by hand by the user in the MOD file.

== Example syntax for win and unix, for local parallel runs (assuming quad-core) ==
Line 31: Line 57:
=== Windows syntax for remote runs (Local=0) ===
== Example Windows syntax for remote runs ==
Line 33: Line 59:
 * RemoteDrive has to be typed explicitly!
 * for user, ALSO the group has to be specified, like DEPT\JohnSmith, i.e. user JohnSmith in windows group DEPT
 * PcName is the name of the computer in the windows network, i.e. the output of hostname, or the full IP adress
 * Remote``Drive has to be typed explicitly!
 * for user, ALSO the group has to be specified, like {{{DEPT\JohnSmith}}}, i.e. user {{{JohnSmith}}} in windows group {{{DEPT}}}
 * Pc``Name is the name of the computer in the windows network, i.e. the output of hostname, or the full IP address
Line 42: Line 68:
==== Example to use several remote PC's to build a grid on windows: ====
=== Example to use several remote PC's to build a grid ===
Line 60: Line 86:
==== Example of combining local and remote runs (on win): ====
=== Example of combining local and remote runs ===
Line 70: Line 96:
=== Unix syntax for remote runs (Local=0): ===
 * no passwd and RemoteDrive needed!
 * PcName: full IP address or address
== Example Unix syntax for remote runs ==
 * no passwd and Remote``Drive needed!
 * Pc``Name: IP address or DNS name of the remote machine

=== Example with only one remote slave ===
Line 79: Line 107:
==== Example of combining local and remote runs (on unix): ====
=== Example of combining local and remote runs (on unix): ===

This page documents the parallelization system developed by Marco Ratto for Dynare.

The idea is to provide a general framework in Dynare for parallelizing tasks which require no inter-process communication.

The implementation is done by running several MATLAB or Octave processes, either on local or on remote machines. Communication between master and slave processes is done through SMB on Windows and SSH on UNIX. Input and output data, and also some short status messages, are exchanged through network filesystems.

Currently the system works only with homogeneous grids: only Windows or only Unix machines.

Two routines are currently parallelized:

  • the Metropolis-Hastings algorithm (implemented in Random_walkMetropolis_hastings.m)
  • the Metropolis-Hastings diagnostics (implemented in McMCDiagnostics.m)

1. Requirements

1.1. For a Windows grid

  1. a standard Windows network (SMB) must be in place
  2. PsTools must be installed in the path of the master Windows machine
  3. the Windows user on the master machine must also be a user on every slave machine in the cluster; that user is used for the remote computations

1.2. For a UNIX grid

  1. the MATLAB executable must be in the path of the slave machines
  2. SSH must be installed on the master and on the slave machines
  3. the UNIX user on the master machine must also be a user on every slave machine in the cluster; that user is used for the remote computations
  4. SSH keys must be installed so that the SSH connections from the slaves to the master can be made without passwords (see SshKeysHowto)
  5. SSHFS must be installed on the slave machines
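
Requirement 4 can be checked by hand before launching a run. The lines below are only a sketch: they build and print an ssh command that fails instead of prompting when key-based login is not set up. The user and host are placeholders borrowed from the examples on this page, and BatchMode and ConnectTimeout are standard OpenSSH client options. Run the command on whichever machine initiates the connection.

```matlab
% Placeholder account and machine; replace with real values before running.
user = 'JohnSmith';
host = 'name.domain.org';

% BatchMode=yes makes ssh fail rather than ask for a password, so a zero
% exit status means that key-based login works.
cmd = sprintf('ssh -o BatchMode=yes -o ConnectTimeout=5 %s@%s true', user, host);
disp(cmd);

% Uncomment to run the check from MATLAB or Octave:
% status = system(cmd);
% if status ~= 0
%     warning('Passwordless SSH to %s failed', host);
% end
```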

2. Usage

The parallelization mechanism is triggered by the use of options_.parallel. By default, this option is equal to zero and no parallelization is used.

To trigger the parallelization, this option must be filled with a vector of structures. Each structure represents a slave machine (possibly using several CPU cores on the machine).

The fields are:

  • Local: equal to 0 or 1. Use 1 if this slave is the local machine, 0 if it is a remote machine
  • PcName: for a remote slave, name of the machine. Use the NETBIOS name under Windows, or the DNS name under Unix
  • NumCPU: a vector of integers representing the CPU cores to be used on that slave machine. The first core has number 0, so on a quad-core machine use [0:3] here to use the four cores
  • user: for a remote slave, username to be used. On Windows, the group must also be specified here, like DEPT\JohnSmith, i.e. user JohnSmith in Windows group DEPT
  • passwd: for a remote slave, password associated with the username
  • RemoteDrive: for a remote Windows slave, letter of the remote drive (C, D, ...) where the computations will take place
  • RemoteFolder: for a remote slave, path of the directory on the remote drive where the computations will take place
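
Since the structure is written by hand, a quick sanity check can catch a missing field early. The following is a minimal sketch, not something Dynare is known to run itself: the variable names (slave, required, missing, ok) are illustrative, and the remote/local convention follows the examples on this page, where remote entries use Local = 0.

```matlab
% Example entry in the documented format (placeholder values).
slave = struct('Local', 0, 'PcName', 'name.domain.org', 'NumCPU', [0:3], ...
               'user', 'JohnSmith', 'passwd', '', ...
               'RemoteDrive', '', 'RemoteFolder', '/home/rattoma/Remote');

% The seven fields listed above.
required = {'Local', 'PcName', 'NumCPU', 'user', 'passwd', ...
            'RemoteDrive', 'RemoteFolder'};

% isfield returns one logical per name when given a cell array.
missing = required(~isfield(slave, required));

ok = isempty(missing);
if ok
    % A remote entry needs a machine name and a working directory;
    % core numbers start at 0.
    is_remote = (slave.Local == 0);
    ok = all(slave.NumCPU >= 0) && ...
         (~is_remote || (~isempty(slave.PcName) && ~isempty(slave.RemoteFolder)));
end

if ok
    disp('slave entry looks complete');
else
    disp('slave entry is incomplete');
end
```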

There is currently no interface in the preprocessor to construct this option structure vector; this has to be done by hand by the user in the MOD file.
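
To cut down on repetition when writing these structures by hand, the field names can be wrapped once in an anonymous function. This is only a convenience sketch: make_slave is a hypothetical helper that is not part of Dynare, and the values are the placeholders used in the examples on this page.

```matlab
% Hypothetical helper (not part of Dynare): builds one slave description
% with the seven documented fields, always in the same order.
make_slave = @(is_local, pc_name, cpus, user, passwd, drive, folder) ...
    struct('Local', is_local, 'PcName', pc_name, 'NumCPU', cpus, ...
           'user', user, 'passwd', passwd, ...
           'RemoteDrive', drive, 'RemoteFolder', folder);

% Local machine, four cores; the remote-only fields stay empty.
options_.parallel = make_slave(1, '', 0:3, '', '', '', '');

% One remote Unix slave appended as the second element of the vector.
options_.parallel(2) = make_slave(0, 'name.domain.org', 0:3, ...
                                  'JohnSmith', '', '', '/home/rattoma/Remote');
```

Because every entry is created by the same call, all elements share the same field names, which MATLAB and Octave require when assigning into a structure array.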

2.1. Example syntax for win and unix, for local parallel runs (assuming quad-core)

All fields are empty except Local and NumCPU:

options_.parallel = struct('Local', 1, 'PcName','', 'NumCPU', [0:3], 'user','','passwd','', ...
'RemoteDrive', '', 'RemoteFolder','');

2.2. Example Windows syntax for remote runs

  • the Windows password (passwd) has to be typed explicitly!
  • RemoteDrive has to be typed explicitly!
  • for user, the group ALSO has to be specified, like DEPT\JohnSmith, i.e. user JohnSmith in Windows group DEPT
  • PcName is the name of the computer in the Windows network, i.e. the output of hostname, or the full IP address

options_.parallel = struct('Local', 0, 'PcName','RemotePCName','NumCPU', [4:6], 'user', ...
'DEPT\JohnSmith','passwd','****', 'RemoteDrive', 'C', 'RemoteFolder','dynare_calcs\Remote');

2.2.1. Example using several remote PCs to build a grid

A vector of parallel structures has to be built:

options_.parallel = struct('Local', 0, 'PcName','RemotePCName1','NumCPU', [0:3], ...
'user', 'DEPT\JohnSmith', 'passwd','****', 'RemoteDrive', 'C', 'RemoteFolder','dynare_calcs\Remote');

options_.parallel(2) = struct('Local', 0, 'PcName','RemotePCName2','NumCPU', [0:3], ...
'user', 'DEPT\JohnSmith','passwd','****', 'RemoteDrive', 'D', 'RemoteFolder','dynare_calcs\Remote');

options_.parallel(3) = struct('Local', 0, 'PcName','RemotePCName3','NumCPU', [0:1], ...
'user','DEPT\JohnSmith','passwd','****', 'RemoteDrive', 'C', 'RemoteFolder','dynare_calcs\Remote');

options_.parallel(4) = struct('Local', 0, 'PcName','RemotePCName4','NumCPU', [0:3], ...
'user','DEPT\JohnSmith','passwd','****', 'RemoteDrive', 'C', 'RemoteFolder','dynare_calcs\Remote');
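
Once such a vector exists, it can be useful to see how many cores the grid provides in total. The sketch below assumes a two-entry vector in the same format; the variable names slaves, cores_per_slave and total_cores are illustrative and are not used by Dynare.

```matlab
% Two-entry grid in the format shown above (placeholder names and password).
slaves = struct('Local', 0, 'PcName', 'RemotePCName1', 'NumCPU', [0:3], ...
                'user', 'DEPT\JohnSmith', 'passwd', '****', ...
                'RemoteDrive', 'C', 'RemoteFolder', 'dynare_calcs\Remote');
slaves(2) = struct('Local', 0, 'PcName', 'RemotePCName3', 'NumCPU', [0:1], ...
                   'user', 'DEPT\JohnSmith', 'passwd', '****', ...
                   'RemoteDrive', 'C', 'RemoteFolder', 'dynare_calcs\Remote');

% One count per machine: the length of its NumCPU vector.
cores_per_slave = arrayfun(@(s) numel(s.NumCPU), slaves);
total_cores = sum(cores_per_slave);

for k = 1:numel(slaves)
    fprintf('%s: %d core(s)\n', slaves(k).PcName, cores_per_slave(k));
end
fprintf('total: %d core(s)\n', total_cores);
```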

2.2.2. Example of combining local and remote runs

options_.parallel=struct('Local', 1, 'PcName','','NumCPU', [0:3], ...
'user','','passwd','','RemoteDrive', '', 'RemoteFolder','');

options_.parallel(2)=struct('Local', 0, 'PcName','RemotePCName','NumCPU', [0:1], ...
'user','DEPT\JohnSmith','passwd','****', 'RemoteDrive', 'C', 'RemoteFolder','dynare_calcs\Remote');

2.3. Example Unix syntax for remote runs

  • passwd and RemoteDrive are not needed and can be left empty!
  • PcName: IP address or DNS name of the remote machine

2.3.1. Example with only one remote slave

options_.parallel=struct('Local', 0, 'PcName','name.domain.org','NumCPU', [0:3], ...
'user','JohnSmith','passwd','', 'RemoteDrive', '', 'RemoteFolder','/home/rattoma/Remote');

2.3.2. Example of combining local and remote runs (on Unix)

options_.parallel=struct('Local', 1, 'PcName','','NumCPU', [0:3], ...
'user','','passwd','','RemoteDrive', '', 'RemoteFolder','');

options_.parallel(2)=struct('Local', 0, 'PcName','name.domain.org','NumCPU', [0:3], ...
'user','JohnSmith','passwd','', 'RemoteDrive', '', 'RemoteFolder','/home/rattoma/Remote');

DynareWiki: ParallelDynare (last edited 2012-05-09 10:05:10 by HoutanBastani)