Package daredare :: Package extern :: Module helpers

Module helpers

(Mostly time-series-related) functions needed and written by Sven Schreiber.

This is free but copyrighted software, distributed under the same license terms (as of January 2007) as the 'gretl' program by Allin Cottrell and others, see gretl.sf.net (in short: GPL v2, see www.gnu.org/copyleft/gpl.html).

(see end of this file for a changelog)

Functions
 
rank(m, rcond=1e-10)
Returns the (algebraic, not numpy-jargon) rank of m.
 
vec(m)
Returns all columns of the input as a stacked (column) vector.
 
unvec(m, rows, cols)
Turns (column) vector into matrix of shape == (rows, cols).
 
mat2gretlmatstring(m)
Turns numpy matrix or array (or scalar!) m into gretl string representation.
 
startobs2obslist(startperiod, numofobs)
Constructs list of observation labels following the input pattern.
 
writecsv(filename, data, orientation='cols', delim=',', varnames=[], obslabels=[], comments=[], commentchar='# ')
Saves array or matrix <data> in csv format in file <filename> (path string).
 
readcsv(filename, delim=',', commentchar='#', colheader='names', rowheader='obs')
Read in a csv file (may contain comments starting with commentchar).
 
floatAndNanConverter(datapoint, nacode='na')
Converts nacode to numpy.nan value.
 
dateString2dateFloat(datestring)
Converts '1999q2' -> 1999.25, '1999m2' -> 1999.0833, etc.
 
getQuarterlyDates(startyear, startquarter, t)
Constructs a list of quarterly date labels for t obs.
 
null(m, rcond=1e-10)
 
getOrthColumns(m)
Constructs the orthogonally complementing columns of the input.
 
addLags(m, maxlag)
Adds (contiguous) lags as additional columns to the TxN input.
 
getDeterministics(nobs, which='c', date=0.5)
Returns various useful deterministic terms for a given sample length T.
 
getImpulseDummies(sampledateslist, periodslist)
Returns a (numpy-)matrix of impulse dummies for the specified periods.
 
geneigsympos(A, B)
Solves symmetric-positive-def. generalized eigenvalue problem Az=lBz.
 
vecm2varcoeffs(gammas, maxlag, alpha, beta)
Converts Vecm coeffs to levels VAR representation.
 
gammas2alternativegammas(gammas, alpha, beta)
Converts Vecm-coeffs for ect at t-1 to the ones for ect at t-maxlag.
 
write_gretl_mat_xml(outfile, matrices, matnames=[])
Writes a gretl matrix xml file to transfer matrices.
 
autocovar(series, LagInput, Demeaned=False)
Computes the autocovariance of a uni- or multivariate time series.
 
longrunvar(series, Demeaned=False, LagTrunc=4)
Estimates the long-run variance (aka spectral density at frequency zero) of a uni- or multivariate time series.
 
commontrendstest(series, LagTrunc=4, determ='c', breakpoint=0.5)
The Nyblom&Harvey(2000)-type tests for K_0 against K>K_0 common stochastic trends in time series.

Variables
  quarter2month = {1: 1, 2: 4, 3: 7, 4: 10}
  month2quarter = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 3, 8: ...
  qNumber2qFloat = {1: 0.0, 2: 0.25, 3: 0.5, 4: 0.75}
  mNumber2mFloat = {1: 0.0, 2: 0.0833, 3: 0.1666, 4: 0.2499, 5: ...
  qFracstring2qString = {0.0: 1, 0.25: 2, 0.5: 3, 0.75: 4}
  mFloat2mNumber = {0.0: 1, 0.0833: 2, 0.1666: 3, 0.2499: 4, 0.3...
Function Details

vec(m)

Returns all columns of the input as a stacked (column) vector.

If m is a numpy-array, a 1d-array is returned. For a numpy-matrix m of
 shape (r, c), the output has shape (r*c, 1).

unvec(m, rows, cols)

Turns (column) vector into matrix of shape == (rows, cols).

Also accepts 1d-array input, but always returns numpy matrix.
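
As a rough illustration of the column-stacking convention used by vec and unvec, here is a minimal sketch with plain numpy arrays. The names vec_sketch and unvec_sketch are hypothetical and are not the module's code; unlike unvec, the sketch returns an ndarray rather than a numpy matrix.

```python
import numpy as np

def vec_sketch(m):
    """Stack the columns of a 2d input into one 1d array."""
    # order="F" reads the data column by column (column-major order).
    return np.asarray(m).flatten(order="F")

def unvec_sketch(v, rows, cols):
    """Refill a (rows, cols) array column by column from a stacked vector."""
    return np.asarray(v).reshape((rows, cols), order="F")

a = np.array([[1, 2], [3, 4], [5, 6]])
stacked = vec_sketch(a)                 # first column, then second column
restored = unvec_sketch(stacked, 3, 2)  # round trip back to the 3x2 shape
```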

startobs2obslist(startperiod, numofobs)

Constructs list of observation labels following the input pattern.

Example: startperiod = '1999q3', numofobs = 2 -> ['1999q3', '1999q4']
Currently supports only annual (pure number), monthly, quarterly.
Years must be in 4-digit format.
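
The label pattern can be reproduced with a little integer arithmetic. The following is only a sketch under assumptions: the name obslist_sketch is hypothetical, sub-period numbers are written without zero padding, and the actual function may parse its input differently.

```python
import re

def obslist_sketch(startperiod, numofobs):
    """Build consecutive annual, quarterly or monthly labels."""
    s = str(startperiod).lower()
    match = re.fullmatch(r"(\d{4})(?:([qm])(\d{1,2}))?", s)
    if match is None:
        raise ValueError("unsupported label: %r" % (startperiod,))
    year, freq, sub = match.groups()
    if freq is None:
        # Annual data: the label is just the year number.
        return [str(int(year) + i) for i in range(numofobs)]
    periods = 4 if freq == "q" else 12
    # Count sub-periods from year 0 so that year changes are automatic.
    start = int(year) * periods + int(sub) - 1
    labels = []
    for i in range(numofobs):
        y, p = divmod(start + i, periods)
        labels.append("%d%s%d" % (y, freq, p + 1))
    return labels
```

For instance, obslist_sketch('1999q4', 2) rolls over into the next year and gives ['1999q4', '2000q1'].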

writecsv(filename, data, orientation='cols', delim=',', varnames=[], obslabels=[], comments=[], commentchar='# ')

Saves array or matrix <data> in csv format in file <filename> (path string).

<comments> must be passed as a sequence of strings, one for each line,
 and will be written at the top of the file, each line starting with 
 <commentchar>.
<orientation> can be 'cols' or 'rows', determines whether the
 variable names will be used as column or row headers, and how to treat
 1d-input. (And observation labels will be written accordingly.)
<varnames> and <obslabels> must be sequences of strings.

readcsv(filename, delim=',', commentchar='#', colheader='names', rowheader='obs')

Read in a csv file (may contain comments starting with commentchar).

The contents of the first non-comment row and column must be indicated in
 rowheader and colheader as one of 'names', 'obs' (labels), or None.
The array (matrix) of the data is returned as is, i.e. w/o transpose, hence
 the caller must know whether variables are in rows or columns.
If both colheader and rowheader are not None, the upper-left cell (header
 of the first row/col) is ignored (but must be non-empty).

Returns a five-element tuple:
0. numpy-matrix of the actual data as floats
1. orientation of variables: 'cols', 'rows', or 'unknown'
2. 1d-array of variable names (or None)
3. 1d-array of observation labels (or None)
4. the type/frequency of the data
    (currently one of 'a', 'q', 'm', guessed from the first date label)
    (if this deduction failed, 'unknown' is returned here)

Easiest example with upper-left data cell in second row/second column:
mydata = readcsv('myfile.csv')[0]

floatAndNanConverter(datapoint, nacode='na')

Converts nacode to numpy.nan value.

Also returns other input as float (e.g. for matplotlib's load, asarray).
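
A converter of this kind is small; the sketch below (hypothetical name float_or_nan, not the module's code) shows the behaviour described above for string inputs.

```python
import numpy as np

def float_or_nan(datapoint, nacode="na"):
    """Map the missing-value code to NaN and everything else to float."""
    if datapoint == nacode:
        return np.nan
    return float(datapoint)

values = [float_or_nan(x) for x in ["1.5", "na", "3"]]
```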

dateString2dateFloat(datestring)

Converts '1999q2' -> 1999.25, '1999m2' -> 1999.0833, etc.

So far only for quarterly and monthly.
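
The conversion can be sketched as follows. The name datestring_to_float is hypothetical, and the monthly step of 0.0833 per month is an assumption inferred from the visible entries of mNumber2mFloat; the module itself appears to use lookup tables instead of arithmetic.

```python
def datestring_to_float(datestring):
    """Convert '1999q2' or '1999m2' style labels to a decimal year."""
    s = datestring.lower()
    # Assumed step sizes: a quarter is 0.25, a month is 0.0833 of a year.
    for sep, step in (("q", 0.25), ("m", 0.0833)):
        if sep in s:
            year, sub = s.split(sep)
            return int(year) + round((int(sub) - 1) * step, 4)
    raise ValueError("only quarterly and monthly labels are handled")
```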

getQuarterlyDates(startyear, startquarter, t)

Constructs a list of quarterly date labels for t obs.

Algorithm to get a sequence of strings relating to quarterly dates:

  1. start with first day in the startquarter, e.g. 2006-04-01
  2. map the month to quarter and make string year + 'q' + quarter
  3. the longest quarters are 3rd and 4th (2*31 days + 30 days = 92 days), 1st the shortest (90 or 91), so add a timedelta (in days, apparently default) of 100 days (anything between 92+1 and sum of shortest quarter plus one month = approx. 118)
  4. reset the day of that intermediate date to 1
  5. return to step 2
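
The steps above can be sketched with the standard datetime module. This is an illustrative reading of the algorithm, not the module's source; the name quarterly_labels is hypothetical, and the two lookup tables mirror quarter2month and month2quarter from the Variables section.

```python
import datetime

QUARTER_TO_MONTH = {1: 1, 2: 4, 3: 7, 4: 10}
MONTH_TO_QUARTER = {m: (m - 1) // 3 + 1 for m in range(1, 13)}

def quarterly_labels(startyear, startquarter, t):
    """Return t labels such as '2006q2', starting at the given quarter."""
    # Step 1: first day of the starting quarter.
    current = datetime.date(startyear, QUARTER_TO_MONTH[startquarter], 1)
    labels = []
    for _ in range(t):
        # Step 2: map the month to its quarter and build the label.
        quarter = MONTH_TO_QUARTER[current.month]
        labels.append("%dq%d" % (current.year, quarter))
        # Steps 3 and 4: jump 100 days ahead, which always lands inside
        # the next quarter, then reset the day of the month to 1.
        current = (current + datetime.timedelta(days=100)).replace(day=1)
    return labels
```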

getOrthColumns(m)

Constructs the orthogonally complementing columns of the input.

Input of the form pxr is assumed to have r<=p, and have either full column rank r or rank 0 (scalar or matrix).
Output is of the form px(p-r), except:
a) if M square and full rank p, returns scalar 0
b) if rank(M)=0 (zero matrix), returns I_p
(Note you cannot pass scalar zero, because dimension info would be missing.)
Return type is as input type.
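
One common way to obtain such a complement is through the singular value decomposition. The sketch below assumes that approach, which the documentation does not confirm, and uses the hypothetical name orth_complement. It differs from the description in one respect: for a square full-rank input it returns an empty px0 array instead of scalar 0.

```python
import numpy as np

def orth_complement(m, rcond=1e-10):
    """Return columns spanning the orthogonal complement of m's columns."""
    m = np.asarray(m, dtype=float)
    p = m.shape[0]
    # With full_matrices=True, u is pxp; its first `rank` columns span the
    # column space of m, the remaining ones span the complement.
    u, s, _ = np.linalg.svd(m, full_matrices=True)
    rank = int(np.sum(s > rcond))
    if rank == 0:
        return np.eye(p)   # zero matrix: everything is orthogonal to it
    return u[:, rank:]

m = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
comp = orth_complement(m)   # a single column, orthogonal to both columns of m
```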

addLags(m, maxlag)

Adds (contiguous) lags as additional columns to the TxN input.

Early periods first. If maxlag is zero, original input is returned. maxlag rows are deleted (the matrix is shortened)
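
A minimal sketch of the lag construction with array slicing follows. The name add_lags is hypothetical, and the column order (original columns first, then lag 1, lag 2, and so on) is an assumption that the description does not spell out.

```python
import numpy as np

def add_lags(m, maxlag):
    """Append lags 1..maxlag of a TxN input as extra columns."""
    m = np.asarray(m)
    if maxlag == 0:
        return m
    t = m.shape[0]
    # Lag k uses rows maxlag-k .. t-k-1, so all blocks have t-maxlag rows.
    blocks = [m[maxlag - lag : t - lag] for lag in range(maxlag + 1)]
    return np.hstack(blocks)

x = np.arange(5).reshape(5, 1)
lagged = add_lags(x, 2)   # 3 rows remain: the first two periods are dropped
```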

getDeterministics(nobs, which='c', date=0.5)

Returns various useful deterministic terms for a given sample length T.

Return object is a numpy-matrix-type of dimension Tx(len(which)) (early periods first, where relevant). In the 'which' argument pass a string composed of the following letters, in arbitrary order:
c - constant (=1) term
t - trend (starting with 0)
q - centered quarterly seasonal dummies (starting with 0.75, -0.25...)
m - centered monthly seasonal dummies (starting with 11/12, -1/12, ...)
l - level shift (date applies)
s - slope shift (date applies)
i - impulse dummy (date applies)

If the date argument is a floating point number (between 0 and 1), it is treated as the fraction of the sample where the break occurs. If instead it is an integer between 0 and T, then that observation is treated as the shift date.
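
To make the column layout concrete, here is a sketch covering only the letters that do not depend on the date argument. It rests on one reading of the description: each letter contributes a single column, and a seasonal column takes its high value in the first observation and every 4th (or 12th) observation after it. The name deterministics_sketch is hypothetical.

```python
import numpy as np

def deterministics_sketch(nobs, which="c"):
    """Build one column per letter in `which` (only c, t, q, m here)."""
    idx = np.arange(nobs)
    cols = []
    for letter in which:
        if letter == "c":
            cols.append(np.ones(nobs))
        elif letter == "t":
            cols.append(idx.astype(float))
        elif letter == "q":
            cols.append(np.where(idx % 4 == 0, 0.75, -0.25))
        elif letter == "m":
            cols.append(np.where(idx % 12 == 0, 11 / 12, -1 / 12))
        else:
            raise ValueError("letter not covered by this sketch: " + letter)
    return np.column_stack(cols)

d = deterministics_sketch(5, "ctq")
```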

getImpulseDummies(sampledateslist, periodslist)

Returns a (numpy-)matrix of impulse dummies for the specified periods.

sampledateslist must consist of 1999.25 -style dates (quarterly or monthly).
However, because periodslist is probably human-made, it expects strings
 such as '1999q3' or '1999M12'.
Variables in columns.
So far only for quarterly and monthly data.

geneigsympos(A, B)

Solves symmetric-positive-def. generalized eigenvalue problem Az=lBz.

Takes two real-valued symmetric matrices A and B (B must also be positive-definite) and returns the corresponding (also real-valued) eigenvalues and eigenvectors.

Return format: as in scipy.linalg.eig, tuple (l, Z); l is taken from eigh output (a 1-dim array of length A.shape[0] ?) ordered ascending, and Z is an array or matrix (depending on type of input A) with the corresponding eigenvectors in columns (hopefully).

Steps:

  1. get lower triang Choleski factor L of B, i.e. L*L.T = B; then Az = lBz <=> A (L^-1' L') z = l LL' z <=> (L^-1 A L^-1') (L'z) = l (L'z)
  2. standard eig problem, with same eigvals l
  3. premultiply eigvecs L'z by L^-1' to get z
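
The three steps can be sketched with numpy.linalg as follows. The name gen_eig_sym_pos is hypothetical, and the sketch works on plain arrays rather than preserving the input type.

```python
import numpy as np

def gen_eig_sym_pos(a, b):
    """Solve A z = l B z for symmetric A and symmetric positive-definite B."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Step 1: lower triangular Cholesky factor, L L' = B.
    chol = np.linalg.cholesky(b)
    chol_inv = np.linalg.inv(chol)
    # Step 2: standard symmetric problem for L^-1 A L^-1' with the same
    # eigenvalues; eigh returns them in ascending order.
    evals, y = np.linalg.eigh(chol_inv @ a @ chol_inv.T)
    # Step 3: y = L'z, so recover z by premultiplying with L^-1'.
    z = chol_inv.T @ y
    return evals, z

A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[4.0, 1.0], [1.0, 2.0]])
l, Z = gen_eig_sym_pos(A, B)
```

Each column of Z then satisfies A z = l B z for the eigenvalue in the same position of l.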

vecm2varcoeffs(gammas, maxlag, alpha, beta)

Converts Vecm coeffs to levels VAR representation.

Gammas need to be coeffs in shape #endo x (maxlag-1)*#endo, such that contemp_diff = alpha*ect + Gammas * lagged_diffs is okay when contemp_diff is #endo x 1. We expect matrix input!
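
For a VECM written as diff(y_t) = alpha*beta'*y_{t-1} + sum of Gamma_i*diff(y_{t-i}), the levels VAR coefficients follow from A_1 = I + alpha*beta' + Gamma_1, A_i = Gamma_i - Gamma_{i-1} for the intermediate lags, and A_maxlag = -Gamma_{maxlag-1}. A minimal sketch is given below; the name vecm_to_var and the horizontally stacked return format are assumptions.

```python
import numpy as np

def vecm_to_var(gammas, maxlag, alpha, beta):
    """Return [A_1, ..., A_maxlag] stacked horizontally (n x maxlag*n)."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    n = alpha.shape[0]
    pi = alpha @ beta.T
    g = [np.asarray(gammas, dtype=float)[:, i * n:(i + 1) * n]
         for i in range(maxlag - 1)]
    # Pad with Gamma_0 = Gamma_maxlag = 0 so one difference rule covers all lags.
    zero = np.zeros((n, n))
    g = [zero] + g + [zero]
    a = [g[i] - g[i - 1] for i in range(1, maxlag + 1)]
    a[0] = a[0] + np.eye(n) + pi
    return np.hstack(a)

G1 = np.array([[0.2, 0.1], [0.0, 0.3]])
G2 = np.array([[0.1, 0.0], [0.05, 0.1]])
alpha = np.array([[-0.5], [0.1]])
beta = np.array([[1.0], [-1.0]])
A = vecm_to_var(np.hstack([G1, G2]), 3, alpha, beta)
```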

gammas2alternativegammas(gammas, alpha, beta)

Converts Vecm-coeffs for ect at t-1 to the ones for ect at t-maxlag.

The input gammas (shortrun coeffs) refer to a Vecm where the levels are 
 lagged one period. In the alternative representation with the levels 
 lagged maxlag periods the shortrun coeffs are different; the relation is:
     alt_gamma_i = alpha * beta' + gamma_i

Actually with numpy's broadcasting the function is a one-liner so this here
 is mainly for documentation and reference purposes.
In terms of the levels VAR coefficients A_i (i=1..maxlag) the gammas are
 defined as:
     gamma_i = - \sum_{j=i+1}^{maxlag} A_j for i=1..maxlag-1;
 and the alternative gammas (used by Proietti e.g.) are:
     alt_gamma_i = -I + \sum_{j=1}^{i} A_j for i=1..maxlag-1.
 (And alpha * beta' = -I + \sum_{j=1}^{maxlag} A_j.)
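
The relation can be checked numerically. The sketch below assumes the gammas are stacked horizontally as in vecm2varcoeffs and uses the hypothetical name alternative_gammas; it then verifies the result against the definitions in terms of the levels coefficients A_j.

```python
import numpy as np

def alternative_gammas(gammas, alpha, beta):
    """Add alpha*beta' to every n x n block of the stacked gammas."""
    pi = np.asarray(alpha, dtype=float) @ np.asarray(beta, dtype=float).T
    gammas = np.asarray(gammas, dtype=float)
    reps = gammas.shape[1] // pi.shape[0]
    return gammas + np.tile(pi, (1, reps))

# Arbitrary levels VAR coefficients with maxlag = 3, for the check only.
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.2]])
A3 = np.array([[0.1, 0.05], [0.0, 0.1]])
I = np.eye(2)
pi = -I + A1 + A2 + A3
gammas = np.hstack([-(A2 + A3), -A3])            # gamma_1, gamma_2
expected = np.hstack([-I + A1, -I + A1 + A2])    # alt_gamma_1, alt_gamma_2
# Passing alpha = pi and beta = I reproduces alpha*beta' = pi.
result = alternative_gammas(gammas, pi, I)
```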

write_gretl_mat_xml(outfile, matrices, matnames=[])

Writes a gretl matrix xml file to transfer matrices.

outfile should be a path string,
matrices is a list of numpy matrices, 
matnames is a string list of wanted matrix names (if empty, matrices
 are named m1, m2, etc.)

autocovar(series, LagInput, Demeaned=False)

Computes the autocovariance of a uni- or multivariate time series.

Usage: autocovar(series, LagInput [, Demeaned=False]) returns the NxN autocovariance matrix (even for N=1), where series is a TxN matrix holding the N-variable T-period data (early periods first), and LagInput specifies the lag at which to compute the autocovariance. Specify Demeaned=True if passing ols-residuals to avoid double demeaning. Returns a numpy-matrix-type.
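
A sample autocovariance at a given lag can be sketched as below. The name autocovar_sketch is hypothetical, and the division by T (rather than T minus the lag) is an assumption; the documentation does not state which normalization the function uses.

```python
import numpy as np

def autocovar_sketch(series, lag, demeaned=False):
    """Return the NxN sample autocovariance of a TxN series at `lag`."""
    x = np.asarray(series, dtype=float)
    if x.ndim == 1:
        x = x[:, None]              # treat a 1d input as a single variable
    if not demeaned:
        x = x - x.mean(axis=0)
    t = x.shape[0]
    # Sum over t of x_t * x_{t-lag}', divided by the full sample length.
    return x[lag:].T @ x[:t - lag] / t

g0 = autocovar_sketch([1.0, 2.0, 3.0, 4.0], 0)
g1 = autocovar_sketch([1.0, 2.0, 3.0, 4.0], 1)
```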

longrunvar(series, Demeaned=False, LagTrunc=4)

Estimates the long-run variance (aka spectral density at frequency zero) of a uni- or multivariate time series.

Usage: lrv = longrunvar(series [, Demeaned, LagTrunc]), where series is a TxN matrix holding the N-variable T-period data (early periods first). The Bartlett weighting function is used up to the specified lag truncation (default = 4). Specify Demeaned=True when passing Ols-residuals etc. (default False). Returns an NxN matrix (even for N=1).
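
With Bartlett weights w_k = 1 - k/(LagTrunc+1), the estimate is the lag-0 autocovariance plus the weighted sum of the lag-k autocovariances and their transposes. The sketch below illustrates this under the same assumption of division by T; the name longrunvar_sketch is hypothetical.

```python
import numpy as np

def longrunvar_sketch(series, demeaned=False, lagtrunc=4):
    """Bartlett-weighted long-run variance of a TxN series (NxN result)."""
    x = np.asarray(series, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    if not demeaned:
        x = x - x.mean(axis=0)
    t = x.shape[0]

    def gamma(k):
        # Sample autocovariance at lag k, divided by T (assumed).
        return x[k:].T @ x[:t - k] / t

    lrv = gamma(0)
    for k in range(1, lagtrunc + 1):
        weight = 1.0 - k / (lagtrunc + 1.0)   # Bartlett kernel
        g = gamma(k)
        lrv = lrv + weight * (g + g.T)
    return lrv

lrv = longrunvar_sketch([1.0, 2.0, 3.0, 4.0], lagtrunc=1)
```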

commontrendstest(series, LagTrunc=4, determ='c', breakpoint=0.5)

The Nyblom&Harvey(2000)-type tests for K_0 against K>K_0
common stochastic trends in time series.

Usage: 
commontrendstest(series [, LagTrunc, Deterministics, breakpoint])
 returns a 1d N-array with the test statistics (partial sums of relevant
 eigenvalues), starting with the null hypothesis K_0=N-1 and ending with 
 K_0=0.

Input:
TxN array of data in series (early periods first).

LagTrunc:
determines the truncation lag of the nonparametric estimate of the
 longrun variance.

Deterministics (the determ argument):
'c' - constant mean,
't' - to automatically de-trend the data (linearly),

or use one of the following models with (one-time) deterministic shifts
(see Busetti 2002):
'1' - (a string!) for a level shift w/o trend,
'2' - for a model with breaks in the mean and the trend slope,
'2a' - for a trend model where only the mean shifts.
(Case 2b --broken trends with connected segments-- is not implemented.)

For these models '1' through '2a' the relative breakpoint in the sample can
 be chosen (otherwise it is ignored).


Variables Details

month2quarter

Value:
{1: 1,
 2: 1,
 3: 1,
 4: 2,
 5: 2,
 6: 2,
 7: 3,
 8: 3,
...

mNumber2mFloat

Value:
{1: 0.0,
 2: 0.0833,
 3: 0.1666,
 4: 0.2499,
 5: 0.3332,
 6: 0.4165,
 7: 0.4998,
 8: 0.5831,
...

mFloat2mNumber

Value:
{0.0: 1,
 0.0833: 2,
 0.1666: 3,
 0.2499: 4,
 0.3332: 5,
 0.4165: 6,
 0.4998: 7,
 0.5831: 8,
...