Nvidia CUDA

Posted: Tue Oct 20, 2009 11:39 pm
by ldx00
Any chance we could see a CUDA implementation of Dynare in the future? Or has anybody already done this to run in Matlab? I am willing to do it myself if I have to; I just wanted to save myself the trouble if someone has already done it or if it might be implemented soon.

Thanks

ldx00

Re: Nvidia CUDA

Posted: Wed Oct 21, 2009 12:59 pm
by SébastienVillemot
Hi,

As far as I know, Dynare has no support for CUDA. This would require MATLAB itself to take advantage of it, and we don't have any control over MATLAB development.

However, there is a prototype of parallelization of Dynare codes, developed by Marco Ratto. See http://www.dynare.org/DynareWiki/ParallelDynare

Note that this works only with the snapshot, not with version 4.0.

Best,

Re: Nvidia CUDA

Posted: Wed Oct 21, 2009 3:58 pm
by ldx00
Sébastien,

Thanks for the reply. It is true that this would require Matlab to run, at least the way I describe it. I am going to try compiling the C++ binaries that are already available so that they run on the GPU through Matlab. I can produce a walkthrough if you want (assuming I get it working and it genuinely provides a speed boost). The steps should not differ much across versions of Matlab or CUDA. At the moment, things are taking far too long to process, even on a quad-core machine (I'm trying to do Metropolis-Hastings iterations).

Thanks

ldx00

Re: Nvidia CUDA

Posted: Mon Oct 26, 2009 10:32 am
by SébastienVillemot
Hi,

What do you mean by recompiling C++ binaries? Actually, to take advantage of CUDA, my understanding is that it is not only a matter of recompiling: you first need to identify the pieces of code which can take advantage of parallelization, and then modify the C++ source code by introducing calls to the CUDA primitives at those places.

So this is not a straightforward task. Furthermore, there are currently not many places in the C++ code that can take advantage of parallelization (the only relevant ones are the two Kronecker product DLLs, where we have already tried some parallelization optimizations using OpenMP).
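
To illustrate the kind of loop-level parallelism that a Kronecker product allows, here is a minimal sketch using an OpenMP pragma. It is not the code of the Dynare DLLs: the `Matrix` type and the `kron` function are hypothetical names introduced for this example only, and a dense row-major layout is assumed. Each output row is computed independently of the others, which is what makes the outer loop safe to split across threads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense row-major matrix; a hypothetical helper type for this sketch only.
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;  // element (i, j) is stored at data[i * cols + j]

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Kronecker product C = A (x) B. With B of size p x q:
//   C(i * p + k, j * q + l) = A(i, j) * B(k, l)
Matrix kron(const Matrix& a, const Matrix& b) {
    assert(a.data.size() == a.rows * a.cols);
    assert(b.data.size() == b.rows * b.cols);

    Matrix c(a.rows * b.rows, a.cols * b.cols);
    const long out_rows = static_cast<long>(c.rows);

    // Every iteration writes to a different row of C, so no locking is needed.
    // If the compiler is not given an OpenMP flag, the pragma is ignored and
    // the loop simply runs serially.
    #pragma omp parallel for
    for (long r = 0; r < out_rows; ++r) {
        const std::size_t row = static_cast<std::size_t>(r);
        const std::size_t i = row / b.rows;  // row index into A
        const std::size_t k = row % b.rows;  // row index into B
        for (std::size_t j = 0; j < a.cols; ++j) {
            const double aij = a.at(i, j);
            for (std::size_t l = 0; l < b.cols; ++l) {
                c.at(row, j * b.cols + l) = aij * b.at(k, l);
            }
        }
    }
    return c;
}
```

With GCC or Clang, OpenMP is typically enabled with `-fopenmp`. A CUDA version would assign output elements to GPU threads instead, which requires rewriting the loop as a kernel rather than only recompiling.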

In the future, we hope to provide the full estimation chain in C++: this will be the right place to introduce parallel optimizations, especially in the Metropolis-Hastings algorithm.
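
As a rough idea of what parallel Metropolis-Hastings could look like, here is a minimal sketch that runs several independent random-walk chains, one per OpenMP thread. It is an assumption-laden illustration, not a planned Dynare design: `log_target`, `Chain`, `run_chain` and `run_chains` are hypothetical names, and the target is a standard normal density standing in for a model's posterior. The draws within one chain depend on each other and stay sequential; only the chains themselves run concurrently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical target: log-density of a standard normal, up to a constant.
// A real estimation would evaluate the model's log posterior here.
double log_target(double x) { return -0.5 * x * x; }

struct Chain {
    std::vector<double> draws;
    double acceptance_rate = 0.0;
};

// One random-walk Metropolis-Hastings chain with symmetric Gaussian proposals.
Chain run_chain(unsigned seed, std::size_t n_draws, double step, double start) {
    assert(step > 0.0);
    std::mt19937 rng(seed);  // each chain owns its generator: no shared state
    std::normal_distribution<double> proposal(0.0, step);
    std::uniform_real_distribution<double> unif(0.0, 1.0);

    Chain chain;
    chain.draws.reserve(n_draws);
    double x = start;
    double lx = log_target(x);
    std::size_t accepted = 0;

    for (std::size_t t = 0; t < n_draws; ++t) {
        const double y = x + proposal(rng);
        const double ly = log_target(y);
        // Accept with probability min(1, target(y) / target(x)).
        if (std::log(unif(rng)) < ly - lx) {
            x = y;
            lx = ly;
            ++accepted;
        }
        chain.draws.push_back(x);  // a rejected proposal repeats the old state
    }
    if (n_draws > 0) {
        chain.acceptance_rate = static_cast<double>(accepted) / static_cast<double>(n_draws);
    }
    return chain;
}

// Run n_chains independent chains; the loop over chains is the parallel part.
std::vector<Chain> run_chains(int n_chains, std::size_t n_draws, double step) {
    std::vector<Chain> chains(static_cast<std::size_t>(n_chains > 0 ? n_chains : 0));
    #pragma omp parallel for
    for (int c = 0; c < n_chains; ++c) {
        // Distinct seeds keep the chains from producing identical draws.
        chains[static_cast<std::size_t>(c)] =
            run_chain(1000u + static_cast<unsigned>(c), n_draws, step, 0.0);
    }
    return chains;
}
```

For example, `run_chains(4, 20000, 1.0)` would return four chains of 20000 draws each. On a quad-core machine the four chains can run at the same time, but the speed-up is bounded by the number of chains, not by the number of draws.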

Best,