## Hessian computation

This forum is closed. You can read the posts but cannot write. We have migrated the forum to a new location (https://forum.dynare.org) where you will have to reset your password.

### Hessian computation

Hi All,
A common problem that plagues the estimation of DSGE models is that the Hessian at the posterior mode is not positive definite, so its inverse cannot be used as a proposal covariance.
I am aware that mode_compute=5 offers a different way of computing the Hessian via the option `optim={'Hessian',2}`. This seems to work well, from what I have seen, but there is very little documentation on it in the manual.
Do other optimisers also offer similar options? Perhaps undocumented ones?
Cheers
Reuben
reubenpjacob

Posts: 133
Joined: Fri Oct 06, 2006 3:23 pm
Location: Reserve Bank of New Zealand

### Re: Hessian computation

Dear Reuben,

I believe the answer is no. However, mode_compute=6, which is based on an MCMC, provides an estimate of the posterior covariance matrix that is not based on the inverse of the Hessian matrix (we use the MCMC draws, so it works as long as the acceptance ratio is not too close to zero). Also, it is not mandatory to estimate the posterior mode (with the Hessian at the estimated mode) before running a Metropolis. You can optionally use another covariance matrix for the jumping distribution.

Best,
Stéphane.
Université du Maine, GAINS and DynareTeam

Posts: 429
Joined: Wed Jan 05, 2005 4:24 pm
Location: Paris, France.

### Re: Hessian computation

mode_compute=5 with `optim={'Hessian',2}` relies on the outer product of gradients and requires the use of a univariate Kalman filter. Therefore, something similar is not available for the other optimizers. As Stéphane indicated, mode_compute=6 also differs.
The reason for using the inverse Hessian at the mode is that it would be the most efficient choice if the posterior were normal. But any positive definite matrix will do for the proposal density. You can provide an arbitrary matrix via the `mcmc_jumping_covariance` option.
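A minimal sketch of how these options fit into an estimation call. The datafile `my_data`, the mode file `my_model_mode`, and the covariance file `my_jump_cov.mat` are placeholder names, not anything from this thread, and the exact accepted forms of `mcmc_jumping_covariance` should be checked against your version of the reference manual:

```matlab
% Mode finding with mode_compute=5 and the outer-product-of-gradients Hessian:
estimation(datafile=my_data, mode_compute=5, optim=('Hessian',2), mh_replic=0);

% Alternatively, run the MCMC with a user-supplied proposal covariance.
% If I recall the manual correctly, mcmc_jumping_covariance accepts a .mat
% file (here my_jump_cov.mat) containing the desired positive definite matrix:
estimation(datafile=my_data, mode_compute=0, mode_file=my_model_mode,
           mcmc_jumping_covariance=my_jump_cov, mh_replic=20000);
```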
------------
Johannes Pfeifer
University of Cologne
jpfeifer

Posts: 6940
Joined: Sun Feb 21, 2010 4:02 pm
Location: Cologne, Germany

### Re: Hessian computation

Sorry for my question. You wrote:

> we use MCMC draws, so it works as long as the acceptance ratio is not too close to zero

Which stage of this method are you referring to? I ask because I see several acceptance rates at different stages of the procedure.

Is the acceptance ratio not guaranteed to be 1/3? I ask because I see in the manual: 'AcceptanceRateTarget'. Default: 1.0/3.0
Aldo

Posts: 56
Joined: Fri Aug 07, 2015 4:47 pm

### Re: Hessian computation

We use the draws of the Metropolis (run inside mode_compute=6) to estimate the posterior covariance matrix. We need to have enough variability in these draws. If all the draws were identical (or, more generally, if the number of distinct draws were smaller than the number of estimated parameters), the sample covariance matrix would obviously not be full rank. That is why we target, by default, an acceptance rate of one third (the value commonly considered in the literature).

The optimization routine has several steps:

1. We run a Metropolis in which we tune the scale parameter so that the acceptance rate is close to the target. In this first step the draws are not used to estimate the covariance matrix; we only continuously keep a record of the parameter vector that gives the best value of the objective. In this first step the covariance matrix of the jumping distribution is an identity matrix.

2. We run a Metropolis to estimate the covariance matrix. Again we continuously keep a record of the parameter vector that gives the best value of the objective.

3. We return to 1. with the new estimate of the posterior covariance matrix.

By default we iterate on these steps two or three times (I do not remember exactly; see the reference manual). In the last step we run a final Metropolis in which we only update our estimate of the posterior mode, slowly decreasing the size of the jumps.
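The steps above are controlled through the optimizer options of mode_compute=6. A sketch, assuming the option names given in the reference manual ('NumberOfMh', 'ncov-mh', 'AcceptanceRateTarget') and a placeholder datafile; verify the spelling and defaults for your Dynare version before using:

```matlab
% Sketch: tune the number of tuning/covariance iterations and the targeted
% acceptance rate of the mode_compute=6 routine.
estimation(datafile=my_data, mode_compute=6,
           optim=('NumberOfMh',3,'ncov-mh',20000,'AcceptanceRateTarget',0.33),
           mh_replic=20000);
```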

Best,
Stéphane.
Université du Maine, GAINS and DynareTeam

Posts: 429
Joined: Wed Jan 05, 2005 4:24 pm
Location: Paris, France.

### Re: Hessian computation

This discussion was very useful. Thanks a lot!
When we use mode_compute=6 sequentially, after using a 'more efficient' optimiser such as #4 or #8, will the MCMC be initialised at the latest available mode file, or will it still use the values specified in the estimated_params_init block?
Thanks again!
reubenpjacob

Posts: 133
Joined: Fri Oct 06, 2006 3:23 pm
Location: Reserve Bank of New Zealand

### Re: Hessian computation

Dear Reuben,

The initial state of the MCMC will be the posterior mode estimate returned by the last optimization routine (or will be centered around it, if you have more than one chain).

Best,
Stéphane.
Université du Maine, GAINS and DynareTeam

Posts: 429
Joined: Wed Jan 05, 2005 4:24 pm
Location: Paris, France.

### Re: Hessian computation

That depends on what exactly you are doing. Stephane is right when you use the `mode_file` option: the MCMC will then start at the last found mode.
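A sketch of that sequential workflow, assuming the first run stores its mode under the placeholder name `my_model_mode`:

```matlab
% First run: find the mode with a derivative-based optimizer, no MCMC yet.
estimation(datafile=my_data, mode_compute=4, mh_replic=0);

% Second run: start from the stored mode file and launch the Metropolis,
% skipping mode finding (mode_compute=0).
estimation(datafile=my_data, mode_file=my_model_mode, mode_compute=0,
           mh_replic=20000, mh_nblocks=2);
```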
------------
Johannes Pfeifer
University of Cologne
jpfeifer

Posts: 6940
Joined: Sun Feb 21, 2010 4:02 pm
Location: Cologne, Germany

### Re: Hessian computation

Thanks Johannes.
Do you recommend any readings to understand why the Hessian found by the numerical optimiser(s) is not positive definite?
Since this seems to be a very common problem, I thought a few readings would be useful!
Thanks
Reuben
reubenpjacob

Posts: 133
Joined: Fri Oct 06, 2006 3:23 pm
Location: Reserve Bank of New Zealand

### Re: Hessian computation

Dear Reuben,

In Dynare we do not use the Hessian matrices returned by the optimizers. For instance, mode_compute=4, the default algorithm derived from Chris Sims' code, returns a crude estimate of the Hessian that Dynare does not use (see any presentation of BFGS, an algorithm close to what is done here; there is a nice page on Wikipedia). Instead, we compute the Hessian with finite differences by calling the hessian.m function (except for mode_compute=6, and mode_compute=5, which uses the outer product of gradients approach).

The main culprit is that the optimization routine failed to find a (local) minimum of minus the likelihood (or posterior kernel). That is why you need to try other optimization routines and/or other initial guesses. Another culprit may be noise in the objective function. In this case you have to change the length of the steps in the finite difference routine (controlled by options_.gstep).
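A sketch of changing the step length from the .mod file. The exact layout of options_.gstep (scalar or vector, and which element does what) varies across Dynare versions, so this is an assumption to check against your reference manual rather than a recipe:

```matlab
% Assumed: the first element of options_.gstep scales the finite difference
% step used by hessian.m; enlarging it can average out noise in the objective.
options_.gstep(1) = 1e-2;
estimation(datafile=my_data, mode_compute=4);
```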

Best,
Stéphane.
Université du Maine, GAINS and DynareTeam

Posts: 429
Joined: Wed Jan 05, 2005 4:24 pm
Location: Paris, France.

### Re: Hessian computation

- the Hessian only needs to be positive definite at an interior solution. If you have a corner solution, i.e. you are at the bound of your prior parameter space, there will be a problem
- if a parameter is not identified, or if there is collinearity in the Jacobian of the likelihood, the Hessian will also be non-positive definite

That is why a look at the mode_check plots is often revealing: it exposes many pathological issues that are not simply due to the finite difference approximation of the Hessian.
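Requesting those diagnostic plots is a one-option change (datafile name is a placeholder):

```matlab
% mode_check plots the posterior kernel in a neighborhood of the mode for each
% estimated parameter; flat or kinked slices point to weak identification or
% a mode at the bound of the parameter space.
estimation(datafile=my_data, mode_compute=4, mode_check, mh_replic=0);
```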
------------
Johannes Pfeifer
University of Cologne