Multivariate¶
- MvNormal: Multivariate normal log-likelihood.
- MatrixNormal: Matrix-valued normal log-likelihood.
- KroneckerNormal: Multivariate normal log-likelihood with Kronecker-structured covariance.
- MvStudentT: Multivariate Student-T log-likelihood.
- Wishart: Wishart log-likelihood.
- LKJCholeskyCov: Wrapper function for covariance matrix with LKJ distributed correlations.
- LKJCorr: The LKJ (Lewandowski, Kurowicka and Joe) log-likelihood.
- Multinomial: Multinomial log-likelihood.
- Dirichlet: Dirichlet log-likelihood.
- DirichletMultinomial: Dirichlet Multinomial log-likelihood.
-
class
pymc3.distributions.multivariate.
Dirichlet
(name, *args, **kwargs)¶ Dirichlet log-likelihood.
\[f(\mathbf{x}|\mathbf{a}) = \frac{\Gamma(\sum_{i=1}^k a_i)}{\prod_{i=1}^k \Gamma(a_i)} \prod_{i=1}^k x_i^{a_i - 1}\]Support
\(x_i \in (0, 1)\) for \(i \in \{1, \ldots, K\}\) such that \(\sum x_i = 1\)
Mean
\(\dfrac{a_i}{\sum a_i}\)
Variance
\(\dfrac{a_i (a_0 - a_i)}{a_0^2 (a_0 + 1)}\) where \(a_0 = \sum a_i\)
- Parameters
- a: array
Concentration parameters (a > 0).
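A minimal usage sketch (illustrative; not part of the original docstring), assuming a symmetric prior over three categories:
import numpy as np
import pymc3 as pm

with pm.Model() as model:
    # a = 1 for every category gives a uniform distribution over the simplex
    p = pm.Dirichlet('p', a=np.ones(3))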
-
logp
(value)¶ Calculate log-probability of Dirichlet distribution at specified value.
- Parameters
- value: numeric
Value for which log-probability is calculated.
- Returns
- TensorVariable
-
random
(point=None, size=None)¶ Draw random values from Dirichlet distribution.
- Parameters
- point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
- size: int, optional
Desired size of random sample (returns one sample if not specified).
- Returns
- array
-
class
pymc3.distributions.multivariate.
DirichletMultinomial
(name, *args, **kwargs)¶ Dirichlet Multinomial log-likelihood.
Dirichlet mixture of Multinomials distribution, with a marginalized PMF.
\[f(x \mid n, a) = \frac{\Gamma(n + 1)\Gamma(\sum a_k)}{\Gamma(n + \sum a_k)} \prod_{k=1}^K \frac{\Gamma(x_k + a_k)}{\Gamma(x_k + 1)\Gamma(a_k)}\]
Support
\(x \in \{0, 1, \ldots, n\}\) such that \(\sum x_i = n\)
Mean
\(n \frac{a_i}{\sum{a_k}}\)
- Parameters
- n: int or array
Total counts in each replicate. If n is an array its shape must be (N,) with N = a.shape[0].
- a: one- or two-dimensional array
Dirichlet parameter. Elements must be strictly positive. The number of categories is given by the length of the last axis.
- shape: integer tuple
Describes the shape of the distribution. For example, if n=array([5, 10]) and a=array([1, 1, 1]), shape should be (2, 3).
-
logp
(value)¶ Calculate log-probability of DirichletMultinomial distribution at specified value.
- Parameters
- value: integer array
Value for which log-probability is calculated.
- Returns
- TensorVariable
-
random
(point=None, size=None)¶ Draw random values from Dirichlet-Multinomial distribution.
- Parameters
- point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
- size: int, optional
Desired size of random sample (returns one sample if not specified).
- Returns
- array
-
class
pymc3.distributions.multivariate.
KroneckerNormal
(name, *args, **kwargs)¶ Multivariate normal log-likelihood with Kronecker-structured covariance.
\[f(x \mid \mu, K) = \frac{1}{(2\pi)^{N/2} |K|^{1/2}} \exp\left\{ -\frac{1}{2} (x-\mu)^{\prime} K^{-1} (x-\mu) \right\}\]Support
\(x \in \mathbb{R}^N\)
Mean
\(\mu\)
Variance
\(K = \bigotimes K_i + \sigma^2 I_N\)
- Parameters
- mu: array
Vector of means, just as in MvNormal.
- covs: list of arrays
The set of covariance matrices \([K_1, K_2, ...]\) to be Kroneckered in the order provided \(\bigotimes K_i\).
- chols: list of arrays
The set of lower Cholesky matrices \([L_1, L_2, ...]\) such that \(K_i = L_i L_i'\).
- evds: list of tuples
The set of eigenvalue-vector, eigenvector-matrix pairs \([(v_1, Q_1), (v_2, Q_2), ...]\) such that \(K_i = Q_i \text{diag}(v_i) Q_i'\). For example:
v_i, Q_i = tt.nlinalg.eigh(K_i)
- sigma: scalar, variable
Standard deviation of the Gaussian white noise.
References
- 1
Saatchi, Y. (2011). “Scalable inference for structured Gaussian process models”
Examples
Define a multivariate normal variable with a covariance \(K = K_1 \otimes K_2\)
K1 = np.array([[1., 0.5], [0.5, 2]])
K2 = np.array([[1., 0.4, 0.2], [0.4, 2, 0.3], [0.2, 0.3, 1]])
covs = [K1, K2]
N = 6
mu = np.zeros(N)
with pm.Model() as model:
    vals = pm.KroneckerNormal('vals', mu=mu, covs=covs, shape=N)
Efficiency gains are made by Cholesky decomposing \(K_1\) and \(K_2\) individually rather than the larger \(K\) matrix. Although only two matrices \(K_1\) and \(K_2\) are shown here, an arbitrary number of submatrices can be combined in this way. Cholesky factors and eigendecompositions can be provided instead,
chols = [np.linalg.cholesky(Ki) for Ki in covs]
evds = [np.linalg.eigh(Ki) for Ki in covs]
with pm.Model() as model:
    vals2 = pm.KroneckerNormal('vals2', mu=mu, chols=chols, shape=N)
    # or
    vals3 = pm.KroneckerNormal('vals3', mu=mu, evds=evds, shape=N)
neither of which will be converted. Diagonal noise can also be added to the covariance matrix, \(K = K_1 \otimes K_2 + \sigma^2 I_N\). Although the noise removes the overall Kronecker structure of the matrix, KroneckerNormal can still make efficient calculations by utilizing eigendecompositions of the submatrices behind the scenes [1]. Thus,
sigma = 0.1
with pm.Model() as noise_model:
    vals = pm.KroneckerNormal('vals', mu=mu, covs=covs, sigma=sigma, shape=N)
    vals2 = pm.KroneckerNormal('vals2', mu=mu, chols=chols, sigma=sigma, shape=N)
    vals3 = pm.KroneckerNormal('vals3', mu=mu, evds=evds, sigma=sigma, shape=N)
are identical, with covs and chols each converted to eigendecompositions.
-
logp
(value)¶ Calculate log-probability of Multivariate Normal distribution with Kronecker-structured covariance at specified value.
- Parameters
- value: numeric
Value for which log-probability is calculated.
- Returns
- TensorVariable
-
random
(point=None, size=None)¶ Draw random values from Multivariate Normal distribution with Kronecker-structured covariance.
- Parameters
- point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
- size: int, optional
Desired size of random sample (returns one sample if not specified).
- Returns
- array
-
pymc3.distributions.multivariate.
LKJCholeskyCov
(name, eta, n, sd_dist, compute_corr=False, store_in_trace=True, *args, **kwargs)¶ Wrapper function for covariance matrix with LKJ distributed correlations.
This defines a distribution over Cholesky decomposed covariance matrices, such that the underlying correlation matrices follow an LKJ distribution [1] and the standard deviations follow an arbitrary distribution specified by the user.
- Parameters
- name: str
The name given to the variable in the model.
- eta: float
The shape parameter (eta > 0) of the LKJ distribution. eta = 1 implies a uniform distribution of the correlation matrices; larger values put more weight on matrices with few correlations.
- n: int
Dimension of the covariance matrix (n > 1).
- sd_dist: pm.Distribution
A distribution for the standard deviations.
- compute_corr: bool, default=False
If True, returns three values: the Cholesky decomposition, the correlations and the standard deviations of the covariance matrix. Otherwise, only returns the packed Cholesky decomposition. Defaults to False to ensure backwards compatibility.
- store_in_trace: bool, default=True
Whether to store the correlations and standard deviations of the covariance matrix in the posterior trace. If True, they will automatically be named as {name}_corr and {name}_stds respectively. Effective only when compute_corr=True.
- Returns
- packed_chol: TensorVariable
If compute_corr=False (default). The packed Cholesky covariance decomposition.
- chol: TensorVariable
If compute_corr=True. The unpacked Cholesky covariance decomposition.
- corr: TensorVariable
If compute_corr=True. The correlations of the covariance matrix.
- stds: TensorVariable
If compute_corr=True. The standard deviations of the covariance matrix.
Notes
Since the Cholesky factor is a lower triangular matrix, we use packed storage for the matrix: We store the values of the lower triangular matrix in a one-dimensional array, numbered by row:
[[0 - - -]
 [1 2 - -]
 [3 4 5 -]
 [6 7 8 9]]
The unpacked Cholesky covariance matrix is automatically computed and returned when you specify compute_corr=True in pm.LKJCholeskyCov (see example below). Otherwise, you can use pm.expand_packed_triangular(packed_cov, lower=True) to convert the packed Cholesky matrix to a regular two-dimensional array.
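For illustration, a small sketch (not from the original docstring) that unpacks the numbering shown above into a 4x4 lower triangular matrix:
import numpy as np
import theano.tensor as tt
import pymc3 as pm

packed = tt.as_tensor_variable(np.arange(10.0))  # values 0..9, stored row by row
chol = pm.expand_packed_triangular(4, packed, lower=True)
print(chol.eval())
# [[0. 0. 0. 0.]
#  [1. 2. 0. 0.]
#  [3. 4. 5. 0.]
#  [6. 7. 8. 9.]]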
References
- 1
Lewandowski, D., Kurowicka, D. and Joe, H. (2009). “Generating random correlation matrices based on vines and extended onion method.” Journal of multivariate analysis, 100(9), pp.1989-2001.
- 2
J. M. isn’t a mathematician (http://math.stackexchange.com/users/498/ j-m-isnt-a-mathematician), Different approaches to evaluate this determinant, URL (version: 2012-04-14): http://math.stackexchange.com/q/130026
Examples
with pm.Model() as model:
    # Note that we access the distribution for the standard
    # deviations, and do not create a new random variable.
    sd_dist = pm.Exponential.dist(1.0)
    chol, corr, sigmas = pm.LKJCholeskyCov('chol_cov', eta=4, n=10,
                                           sd_dist=sd_dist, compute_corr=True)

    # if you only want the packed Cholesky (default behavior):
    # packed_chol = pm.LKJCholeskyCov('chol_cov', eta=4, n=10, sd_dist=sd_dist)
    # chol = pm.expand_packed_triangular(10, packed_chol, lower=True)

    # Define a new MvNormal with the given covariance
    vals = pm.MvNormal('vals', mu=np.zeros(10), chol=chol, shape=10)

    # Or transform an uncorrelated normal:
    vals_raw = pm.Normal('vals_raw', mu=0, sigma=1, shape=10)
    vals = tt.dot(chol, vals_raw)

    # Or compute the covariance matrix
    cov = tt.dot(chol, chol.T)
Implementation
In the unconstrained space all values of the Cholesky factor are stored untransformed, except for the diagonal entries, where we use a log-transform to restrict them to positive values.
To correctly compute log-likelihoods for the standard deviations and the correlation matrix separately, we need to consider a second transformation: Given a Cholesky factorization \(LL^T = \Sigma\) of a covariance matrix we can recover the standard deviations \(\sigma\) as the Euclidean lengths of the rows of \(L\), and the Cholesky factor of the correlation matrix as \(U = \text{diag}(\sigma)^{-1}L\). Since each row of \(U\) has length 1, we do not need to store the diagonal. We define a transformation \(\phi\) such that \(\phi(L)\) is the lower triangular matrix containing the standard deviations \(\sigma\) on the diagonal and the correlation matrix \(U\) below. In this form we can easily compute the different likelihoods separately, as the likelihood of the correlation matrix only depends on the values below the diagonal, and the likelihood of the standard deviations depends only on the diagonal values.
We still need the determinant of the Jacobian of \(\phi^{-1}\). If we think of \(\phi\) as an automorphism on \(\mathbb{R}^{\tfrac{n(n+1)}{2}}\), where we order the dimensions as described in the notes above, the Jacobian is a block-diagonal matrix, where each block corresponds to one row of \(U\). Each block has arrowhead shape, and we can compute its determinant as described in [2]. Since the determinant of a block-diagonal matrix is the product of the determinants of the blocks, we get
\[\text{det}(J_{\phi^{-1}}(U)) = \left[ \prod_{i=2}^N u_{ii}^{i - 1} L_{ii} \right]^{-1}\]
-
class
pymc3.distributions.multivariate.
LKJCorr
(name, *args, **kwargs)¶ The LKJ (Lewandowski, Kurowicka and Joe) log-likelihood.
The LKJ distribution is a prior distribution for correlation matrices. If eta = 1 this corresponds to the uniform distribution over correlation matrices. As eta -> oo, the LKJ prior concentrates increasingly around the identity matrix.
Support
Upper triangular matrix with values in [-1, 1]
- Parameters
- n: int
Dimension of the covariance matrix (n > 1).
- eta: float
The shape parameter (eta > 0) of the LKJ distribution. eta = 1 implies a uniform distribution of the correlation matrices; larger values put more weight on matrices with few correlations.
Notes
This implementation only returns the values of the upper triangular matrix excluding the diagonal. Here is a schematic for n = 5, showing the indexes of the elements:
[[- 0 1 2 3]
 [- - 4 5 6]
 [- - - 7 8]
 [- - - - 9]
 [- - - - -]]
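A minimal usage sketch (illustrative; not part of the original docstring): with n = 5 as in the schematic, the variable is a length-10 vector holding the upper-triangular correlations in that order.
import pymc3 as pm

with pm.Model() as model:
    # eta = 1: uniform prior over 5x5 correlation matrices
    corr = pm.LKJCorr('corr', eta=1, n=5)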
References
- LKJ2009
Lewandowski, D., Kurowicka, D. and Joe, H. (2009). “Generating random correlation matrices based on vines and extended onion method.” Journal of multivariate analysis, 100(9), pp.1989-2001.
-
logp
(x)¶ Calculate log-probability of LKJ distribution at specified value.
- Parameters
- x: numeric
Value for which log-probability is calculated.
- Returns
- TensorVariable
-
random
(point=None, size=None)¶ Draw random values from LKJ distribution.
- Parameters
- point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
- size: int, optional
Desired size of random sample (returns one sample if not specified).
- Returns
- array
-
class
pymc3.distributions.multivariate.
MatrixNormal
(name, *args, **kwargs)¶ Matrix-valued normal log-likelihood.
\[f(x \mid \mu, U, V) = \frac{1}{(2\pi)^{mn/2} |U|^{n/2} |V|^{m/2}} \exp\left\{ -\frac{1}{2} \mathrm{Tr}[ V^{-1} (x-\mu)^{\prime} U^{-1} (x-\mu)] \right\}\]Support
\(x \in \mathbb{R}^{m \times n}\)
Mean
\(\mu\)
Row Variance
\(U\)
Column Variance
\(V\)
- Parameters
- mu: array
Array of means. Must be broadcastable with the random variable X such that the shape of mu + X is (m,n).
- rowcov: mxm array
Among-row covariance matrix. Defines variance within columns. Exactly one of rowcov or rowchol is needed.
- rowchol: mxm array
Cholesky decomposition of among-row covariance matrix. Exactly one of rowcov or rowchol is needed.
- colcov: nxn array
Among-column covariance matrix. If rowcov is the identity matrix, this functions as cov in MvNormal. Exactly one of colcov or colchol is needed.
- colchol: nxn array
Cholesky decomposition of among-column covariance matrix. Exactly one of colcov or colchol is needed.
Examples
Define a matrixvariate normal variable for given row and column covariance matrices:
colcov = np.array([[1., 0.5], [0.5, 2]])
rowcov = np.array([[1, 0, 0], [0, 4, 0], [0, 0, 16]])
m = rowcov.shape[0]
n = colcov.shape[0]
mu = np.zeros((m, n))
vals = pm.MatrixNormal('vals', mu=mu, colcov=colcov, rowcov=rowcov,
                       shape=(m, n))
Above, the ith row in vals has a variance that is scaled by 4^i. Alternatively, row or column Cholesky matrices could be substituted for either covariance matrix. MatrixNormal is a quicker way to compute MvNormal(mu, np.kron(rowcov, colcov)) that takes advantage of Kronecker product properties for inversion. For example, if draws from MvNormal had the same covariance structure, but were scaled by different powers of an unknown constant, both the covariance and scaling could be learned as follows (see the docstring of LKJCholeskyCov for more information about this):
# Setup data
true_colcov = np.array([[1.0, 0.5, 0.1],
                        [0.5, 1.0, 0.2],
                        [0.1, 0.2, 1.0]])
m = 3
n = true_colcov.shape[0]
true_scale = 3
true_rowcov = np.diag([true_scale**(2*i) for i in range(m)])
mu = np.zeros((m, n))
true_kron = np.kron(true_rowcov, true_colcov)
data = np.random.multivariate_normal(mu.flatten(), true_kron)
data = data.reshape(m, n)

with pm.Model() as model:
    # Setup right cholesky matrix
    sd_dist = pm.HalfCauchy.dist(beta=2.5, shape=3)
    colchol_packed = pm.LKJCholeskyCov('colcholpacked', n=3, eta=2,
                                       sd_dist=sd_dist)
    colchol = pm.expand_packed_triangular(3, colchol_packed)

    # Setup left covariance matrix
    scale = pm.Lognormal('scale', mu=np.log(true_scale), sigma=0.5)
    rowcov = tt.nlinalg.diag([scale**(2*i) for i in range(m)])

    vals = pm.MatrixNormal('vals', mu=mu, colchol=colchol, rowcov=rowcov,
                           observed=data, shape=(m, n))
-
logp
(value)¶ Calculate log-probability of Matrix-valued Normal distribution at specified value.
- Parameters
- value: numeric
Value for which log-probability is calculated.
- Returns
- TensorVariable
-
random
(point=None, size=None)¶ Draw random values from Matrix-valued Normal distribution.
- Parameters
- point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
- size: int, optional
Desired size of random sample (returns one sample if not specified).
- Returns
- array
-
class
pymc3.distributions.multivariate.
Multinomial
(name, *args, **kwargs)¶ Multinomial log-likelihood.
Generalizes the binomial distribution: instead of each trial resulting in "success" or "failure", each one results in exactly one of some fixed finite number k of possible outcomes over n independent trials. x[i] indicates the number of times outcome number i was observed over the n trials.
\[f(x \mid n, p) = \frac{n!}{\prod_{i=1}^k x_i!} \prod_{i=1}^k p_i^{x_i}\]Support
\(x \in \{0, 1, \ldots, n\}\) such that \(\sum x_i = n\)
Mean
\(n p_i\)
Variance
\(n p_i (1 - p_i)\)
Covariance
\(-n p_i p_j\) for \(i \ne j\)
- Parameters
- n: int or array
Number of trials (n > 0). If n is an array its shape must be (N,) with N = p.shape[0]
- p: one- or two-dimensional array
Probability of each one of the different outcomes. Elements must be non-negative and sum to 1 along the last axis. They will be automatically rescaled otherwise.
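A minimal usage sketch (illustrative; not part of the original docstring), pairing a Dirichlet prior on p with 20 trials over four outcomes:
import numpy as np
import pymc3 as pm

with pm.Model() as model:
    # Dirichlet prior over the outcome probabilities
    p = pm.Dirichlet('p', a=np.ones(4))
    counts = pm.Multinomial('counts', n=20, p=p, shape=4)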
-
logp
(x)¶ Calculate log-probability of Multinomial distribution at specified value.
- Parameters
- x: numeric
Value for which log-probability is calculated.
- Returns
- TensorVariable
-
random
(point=None, size=None)¶ Draw random values from Multinomial distribution.
- Parameters
- point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
- size: int, optional
Desired size of random sample (returns one sample if not specified).
- Returns
- array
-
class
pymc3.distributions.multivariate.
MvNormal
(name, *args, **kwargs)¶ Multivariate normal log-likelihood.
\[f(x \mid \mu, T) = \frac{|T|^{1/2}}{(2\pi)^{k/2}} \exp\left\{ -\frac{1}{2} (x-\mu)^{\prime} T (x-\mu) \right\}\]Support
\(x \in \mathbb{R}^k\)
Mean
\(\mu\)
Variance
\(T^{-1}\)
- Parameters
- mu: array
Vector of means.
- cov: array
Covariance matrix. Exactly one of cov, tau, or chol is needed.
- tau: array
Precision matrix. Exactly one of cov, tau, or chol is needed.
- chol: array
Cholesky decomposition of covariance matrix. Exactly one of cov, tau, or chol is needed.
- lower: bool, default=True
Whether chol is the lower triangular Cholesky factor.
Examples
Define a multivariate normal variable for a given covariance matrix:
cov = np.array([[1., 0.5], [0.5, 2]])
mu = np.zeros(2)
vals = pm.MvNormal('vals', mu=mu, cov=cov, shape=(5, 2))
Most of the time it is preferable to specify the Cholesky factor of the covariance instead. For example, we could fit a multivariate outcome like this (see the docstring of LKJCholeskyCov for more information about this):
mu = np.zeros(3)
true_cov = np.array([[1.0, 0.5, 0.1],
                     [0.5, 2.0, 0.2],
                     [0.1, 0.2, 1.0]])
data = np.random.multivariate_normal(mu, true_cov, 10)

sd_dist = pm.Exponential.dist(1.0, shape=3)
chol, corr, stds = pm.LKJCholeskyCov('chol_cov', n=3, eta=2,
                                     sd_dist=sd_dist, compute_corr=True)
vals = pm.MvNormal('vals', mu=mu, chol=chol, observed=data)
For unobserved values it can be better to use a non-centered parametrization:
sd_dist = pm.Exponential.dist(1.0, shape=3)
chol, _, _ = pm.LKJCholeskyCov('chol_cov', n=3, eta=2,
                               sd_dist=sd_dist, compute_corr=True)
vals_raw = pm.Normal('vals_raw', mu=0, sigma=1, shape=(5, 3))
vals = pm.Deterministic('vals', tt.dot(chol, vals_raw.T).T)
-
logp
(value)¶ Calculate log-probability of Multivariate Normal distribution at specified value.
- Parameters
- value: numeric
Value for which log-probability is calculated.
- Returns
- TensorVariable
-
random
(point=None, size=None)¶ Draw random values from Multivariate Normal distribution.
- Parameters
- point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
- size: int, optional
Desired size of random sample (returns one sample if not specified).
- Returns
- array
-
class
pymc3.distributions.multivariate.
MvStudentT
(name, *args, **kwargs)¶ Multivariate Student-T log-likelihood.
\[f(\mathbf{x} \mid \nu, \mu, \Sigma) = \frac{\Gamma\left[(\nu+p)/2\right]}{\Gamma(\nu/2)\,\nu^{p/2}\,\pi^{p/2}\,\left|\Sigma\right|^{1/2}} \left[1 + \frac{1}{\nu} ({\mathbf x}-{\mu})^T {\Sigma}^{-1}({\mathbf x}-{\mu})\right]^{-(\nu+p)/2}\]Support
\(x \in \mathbb{R}^p\)
Mean
\(\mu\) if \(\nu > 1\) else undefined
Variance
\(\frac{\nu}{\nu-2}\Sigma\) if \(\nu > 2\) else undefined
- Parameters
- nu: int
Degrees of freedom.
- Sigma: matrix
Covariance matrix. Use cov in new code.
- mu: array
Vector of means.
- cov: matrix
The covariance matrix.
- tau: matrix
The precision matrix.
- chol: matrix
The cholesky factor of the covariance matrix.
- lower: bool, default=True
Whether the Cholesky factor is given as a lower triangular matrix.
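A minimal usage sketch (illustrative; not part of the original docstring), using the cov parametrization recommended above:
import numpy as np
import pymc3 as pm

cov = np.array([[1., 0.5],
                [0.5, 2.]])
with pm.Model() as model:
    x = pm.MvStudentT('x', nu=5, mu=np.zeros(2), cov=cov, shape=2)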
-
logp
(value)¶ Calculate log-probability of Multivariate Student’s T distribution at specified value.
- Parameters
- value: numeric
Value for which log-probability is calculated.
- Returns
- TensorVariable
-
random
(point=None, size=None)¶ Draw random values from Multivariate Student’s T distribution.
- Parameters
- point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
- size: int, optional
Desired size of random sample (returns one sample if not specified).
- Returns
- array
-
class
pymc3.distributions.multivariate.
Wishart
(name, *args, **kwargs)¶ Wishart log-likelihood.
The Wishart distribution is the probability distribution of the maximum-likelihood estimator (MLE) of the precision matrix of a multivariate normal distribution. If V=1, the distribution is identical to the chi-square distribution with nu degrees of freedom.
\[f(X \mid \nu, T) = \frac{|T|^{\nu/2} |X|^{(\nu-k-1)/2}}{2^{\nu k/2} \Gamma_p(\nu/2)} \exp\left\{ -\frac{1}{2} \mathrm{Tr}(TX) \right\}\]where \(k\) is the rank of \(X\).
Support
\(X\), a \(p \times p\) positive definite matrix
Mean
\(\nu V\)
Variance
\(\nu (v_{ij}^2 + v_{ii} v_{jj})\)
- Parameters
- nu: int
Degrees of freedom, > 0.
- V: array
p x p positive definite matrix.
Notes
This distribution is unusable in a PyMC3 model. You should instead use LKJCholeskyCov or LKJCorr.
-
logp
(X)¶ Calculate log-probability of Wishart distribution at specified value.
- Parameters
- X: numeric
Value for which log-probability is calculated.
- Returns
- TensorVariable
-
random
(point=None, size=None)¶ Draw random values from Wishart distribution.
- Parameters
- point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
- size: int, optional
Desired size of random sample (returns one sample if not specified).
- Returns
- array
-
pymc3.distributions.multivariate.
WishartBartlett
(name, S, nu, is_cholesky=False, return_cholesky=False, testval=None)¶ Bartlett decomposition of the Wishart distribution. As the Wishart distribution requires the matrix to be symmetric positive semi-definite, it is impossible for MCMC to ever propose acceptable matrices.
Instead, we can use the Bartlett decomposition, which samples a lower triangular matrix. Specifically:
\[\text{If } A \sim \begin{pmatrix} \sqrt{c_1} & 0 & 0 \\ z_{21} & \sqrt{c_2} & 0 \\ z_{31} & z_{32} & \sqrt{c_3} \end{pmatrix} \text{ with } c_i \sim \chi^2(\nu - i + 1) \text{ and } z_{ij} \sim \mathcal{N}(0, 1), \text{ then } L A A^T L^T \sim \text{Wishart}(L L^T, \nu)\]See http://en.wikipedia.org/wiki/Wishart_distribution#Bartlett_decomposition for more information.
- Parameters
- S: ndarray
p x p positive definite matrix Or: p x p lower-triangular matrix that is the Cholesky factor of the covariance matrix.
- nu: int
Degrees of freedom, > dim(S).
- is_cholesky: bool (default=False)
Input matrix S is already Cholesky decomposed as S.T * S
- return_cholesky: bool (default=False)
Only return the Cholesky decomposed matrix.
- testval: ndarray
p x p positive definite matrix used to initialize
Notes
This is not a standard Distribution class but follows a similar interface. Besides the Wishart distribution, it will add RVs name_c and name_z to your model which make up the matrix.
This distribution is usually a bad idea to use as a prior for a multivariate normal. You should instead use LKJCholeskyCov or LKJCorr.
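A minimal usage sketch (illustrative; not part of the original docstring), with an identity scale matrix; note that nu must exceed dim(S):
import numpy as np
import pymc3 as pm

S = np.eye(3)  # prior scale matrix
with pm.Model() as model:
    # behind the scenes this also adds 'prec_c' and 'prec_z' RVs to the model
    prec = pm.WishartBartlett('prec', S=S, nu=5)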