Mixture
- Mixture: Mixture log-likelihood
- NormalMixture: Normal mixture log-likelihood
- MixtureSameFamily: Mixture Same Family log-likelihood. This distribution handles mixtures of multivariate distributions in a vectorized manner.
class pymc3.distributions.mixture.Mixture(name, *args, **kwargs)
Mixture log-likelihood
Often used to model subpopulation heterogeneity.
\[f(x \mid w, \theta) = \sum_{i = 1}^n w_i f_i(x \mid \theta_i)\]
Support
\(\cup_{i = 1}^n \textrm{support}(f_i)\)
Mean
\(\sum_{i = 1}^n w_i \mu_i\)
- Parameters
- w: array of floats
the mixture weights, with w >= 0 and w <= 1
- comp_dists: multidimensional PyMC3 distribution (e.g. `pm.Poisson.dist(…)`) or iterable of PyMC3 distributions
the component distributions \(f_1, \ldots, f_n\)
Examples
import numpy as np
import pymc3 as pm

# `data` below stands for the observed values and must be defined by the caller.

# 2-Mixture Poisson distribution
with pm.Model() as model:
    lam = pm.Exponential('lam', lam=1, shape=(2,))  # `shape=(2,)` indicates two mixture components.

    # As we just need the logp, rather than add a RV to the model, we need to call .dist()
    components = pm.Poisson.dist(mu=lam, shape=(2,))

    w = pm.Dirichlet('w', a=np.array([1, 1]))  # two mixture component weights.

    like = pm.Mixture('like', w=w, comp_dists=components, observed=data)

# 2-Mixture Poisson using iterable of distributions.
with pm.Model() as model:
    lam1 = pm.Exponential('lam1', lam=1)
    lam2 = pm.Exponential('lam2', lam=1)

    pois1 = pm.Poisson.dist(mu=lam1)
    pois2 = pm.Poisson.dist(mu=lam2)

    w = pm.Dirichlet('w', a=np.array([1, 1]))

    like = pm.Mixture('like', w=w, comp_dists=[pois1, pois2], observed=data)

# npop-Mixture of multidimensional Gaussian
npop = 5
nd = (3, 4)
with pm.Model() as model:
    mu = pm.Normal('mu', mu=np.arange(npop), sigma=1, shape=npop)  # Each component has an independent mean
    w = pm.Dirichlet('w', a=np.ones(npop))
    components = pm.Normal.dist(mu=mu, sigma=1, shape=nd + (npop,))  # nd + (npop,) shaped multidimensional Normal
    like = pm.Mixture('like', w=w, comp_dists=components, observed=data, shape=nd)  # The resulting mixture is nd-shaped

# Multidimensional Mixture as stacked independent mixtures
with pm.Model() as model:
    mu = pm.Normal('mu', mu=np.arange(5), sigma=1, shape=5)  # Each component has an independent mean
    w = pm.Dirichlet('w', a=np.ones((3, 5)))  # w is a stack of 3 independent 5 component weight arrays
    components = pm.Normal.dist(mu=mu, sigma=1, shape=(3, 5))

    # The mixture is an array of 3 elements.
    # Each can be thought of as an independent scalar mixture of 5 components
    like = pm.Mixture('like', w=w, comp_dists=components, observed=data, shape=3)
infer_comp_dist_shapes(point=None)
Try to infer the shapes of the component distributions, `comp_dists`, and how they should broadcast together. The behavior is slightly different if `comp_dists` is a `Distribution` as compared to when it is a list of `Distribution`s. When it is a list, the following procedure is repeated for each element in the list:
1. Look up the `comp_dists.shape`.
2. If it is not empty, use it as `comp_dist_shape`.
3. If it is an empty tuple, a single random sample is drawn by calling `comp_dists.random(point=point, size=None)`, and the returned `test_sample`'s shape is used as the inferred `comp_dists.shape`.
- Parameters
- point: None or dict (optional)
Dictionary that maps rv names to values, to supply to self.comp_dists.random
- Returns
- comp_dist_shapes: shape tuple or list of shape tuples.
If `comp_dists` is a `Distribution`, it is a shape tuple of the inferred distribution shape. If `comp_dists` is a list of `Distribution`s, it is a list of shape tuples inferred for each element in `comp_dists`.
- broadcast_shape: shape tuple
The shape that results from broadcasting all components' shapes together.
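To make the meaning of broadcast_shape concrete, the following is a minimal plain-Python sketch that assumes the component shapes combine under NumPy-style broadcasting rules (shapes are right-aligned and a size of 1 stretches to match). The helper name `broadcast_shapes` and the example shapes are hypothetical and only illustrate the idea; this is not PyMC3's own code.

```python
def broadcast_shapes(*shapes):
    """Broadcast shape tuples with NumPy-style rules (right-aligned, 1 stretches)."""
    ndim = max((len(s) for s in shapes), default=0)
    result = []
    for axis in range(1, ndim + 1):
        # Sizes found on this axis, counted from the right, ignoring size-1 axes.
        sizes = {s[-axis] for s in shapes if len(s) >= axis and s[-axis] != 1}
        if len(sizes) > 1:
            raise ValueError(f"shapes {shapes} cannot be broadcast together")
        result.append(sizes.pop() if sizes else 1)
    return tuple(reversed(result))


# Hypothetical inferred shapes for three components: scalar, vector, column.
comp_dist_shapes = [(), (3,), (4, 1)]
broadcast_shape = broadcast_shapes(*comp_dist_shapes)
```

Under these assumptions a scalar component places no constraint on the result, which is why an empty shape tuple is compatible with any other component shape.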
logp(value)
Calculate the log-probability of the defined Mixture distribution at the specified value.
- Parameters
- value: numeric
Value(s) for which the log-probability is calculated. If the log-probabilities of multiple values are desired, the values must be provided in a numpy array or theano tensor.
- Returns
- TensorVariable
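For intuition about what logp evaluates, the mixture density \(\sum_i w_i f_i(x \mid \theta_i)\) is commonly computed in log space with the log-sum-exp trick, which avoids underflow when the component densities are tiny. The sketch below is a plain-Python illustration for a two-component Poisson mixture, not PyMC3's implementation. `poisson_logpmf` and `mixture_logp` are hypothetical helper names, the rates and weights are made up, and the weights are assumed to be strictly positive.

```python
import math


def poisson_logpmf(k, mu):
    """Log-probability of a Poisson(mu) variable at the non-negative integer k."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def mixture_logp(value, weights, comp_logps):
    """Return log(sum_i w_i * f_i(value)) using the log-sum-exp trick.

    comp_logps is a list of functions, each returning log f_i(value).
    Assumes every weight is strictly positive.
    """
    terms = [math.log(w) + logp(value) for w, logp in zip(weights, comp_logps)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


weights = [0.3, 0.7]
components = [
    lambda k: poisson_logpmf(k, 2.0),   # hypothetical low-rate component
    lambda k: poisson_logpmf(k, 10.0),  # hypothetical high-rate component
]
logp_at_4 = mixture_logp(4, weights, components)
```

Subtracting the largest term before exponentiating keeps every exponent at or below zero, so the sum cannot overflow and at least one term equals exactly 1.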
random(point=None, size=None)
Draw random values from the defined Mixture distribution.
- Parameters
- point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
- size: int, optional
Desired size of random sample (returns one sample if not specified).
- Returns
- array
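Conceptually, a draw from a mixture can be produced in two steps: pick a component index with probability given by the weights, then sample from that component. The sketch below shows this with the standard library only. It is an illustration under stated assumptions rather than a description of how PyMC3 implements random; `sample_mixture` is a hypothetical helper and the two Gaussian components are invented for the example.

```python
import random


def sample_mixture(weights, samplers, size, rng):
    """Draw `size` values: choose a component index by weight, then sample it."""
    indices = rng.choices(range(len(weights)), weights=weights, k=size)
    return [samplers[i](rng) for i in indices]


rng = random.Random(0)  # seeded so the run is repeatable
# Two hypothetical Gaussian components, far enough apart to tell them apart.
samplers = [lambda r: r.gauss(-5.0, 1.0), lambda r: r.gauss(5.0, 1.0)]
draws = sample_mixture([0.2, 0.8], samplers, size=10_000, rng=rng)

# Fraction of draws that landed near the second component.
share_of_second = sum(d > 0 for d in draws) / len(draws)
```

With well-separated components the share of positive draws should be close to the second weight, which gives a quick sanity check on the sampler.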
class pymc3.distributions.mixture.MixtureSameFamily(name, *args, **kwargs)
Mixture Same Family log-likelihood
This distribution handles mixtures of multivariate distributions in a vectorized manner. Use it instead of the Mixture distribution when the mixture components are not on the last axis of the component distribution.
Support
\(\textrm{support}(f)\)
Mean
\(w\mu\)
- Parameters
- w: array of floats
the mixture weights, with w >= 0 and w <= 1
- comp_dists: PyMC3 distribution (e.g. `pm.Multinomial.dist(…)`)
The comp_dists can be a scalar or multidimensional distribution. If its shape is (i_0, …, i_n, mixture_axis, i_n+1, …, i_N), the mixture_axis is consumed, and the resulting mixture has shape (i_0, …, i_n, i_n+1, …, i_N).
- mixture_axis: int, default = -1
Axis representing the mixture components to be reduced in the mixture.
Notes
By default this behaves like the Mixture distribution: the last axis of the component distribution is reduced.
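The shape rule for mixture_axis can be sketched as a small helper that removes one axis from the component shape. `mixture_shape` is a hypothetical name used only for illustration, and the example shapes are made up; the helper mirrors the rule stated in the parameter description and is not PyMC3 code.

```python
def mixture_shape(comp_shape, mixture_axis=-1):
    """Shape left after the mixture axis of comp_shape is reduced away."""
    ndim = len(comp_shape)
    if not -ndim <= mixture_axis < ndim:
        raise ValueError(f"mixture_axis {mixture_axis} is out of range for {comp_shape}")
    axis = mixture_axis % ndim  # normalise negative axes
    return comp_shape[:axis] + comp_shape[axis + 1:]


# Default: the last axis holds the 5 components.
default_case = mixture_shape((3, 4, 5))
# Components on the middle axis instead.
middle_axis = mixture_shape((3, 5, 4), mixture_axis=1)
```

Both calls describe a mixture of 5 components over a (3, 4) grid; only the position of the component axis differs.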
logp(value)
Calculate the log-probability of the defined MixtureSameFamily distribution at the specified value.
- Parameters
- value: numeric
Value(s) for which the log-probability is calculated. If the log-probabilities of multiple values are desired, the values must be provided in a numpy array or theano tensor.
- Returns
- TensorVariable
random(point=None, size=None)
Draw random values from the defined MixtureSameFamily distribution.
- Parameters
- point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
- size: int, optional
Desired size of random sample (returns one sample if not specified).
- Returns
- array
class pymc3.distributions.mixture.NormalMixture(name, *args, **kwargs)
Normal mixture log-likelihood
\[f(x \mid w, \mu, \sigma^2) = \sum_{i = 1}^n w_i N(x \mid \mu_i, \sigma^2_i)\]
Support
\(x \in \mathbb{R}\)
Mean
\(\sum_{i = 1}^n w_i \mu_i\)
Variance
\(\sum_{i = 1}^n w_i (\sigma^2_i + \mu_i^2) - \left(\sum_{i = 1}^n w_i \mu_i\right)^2\)
- Parameters
- w: array of floats
the mixture weights, with w >= 0 and w <= 1
- mu: array of floats
the component means
- sigma: array of floats
the component standard deviations
- tau: array of floats
the component precisions
- comp_shape: shape of the Normal component
Note that this should differ from the shape of the mixture distribution, with one axis being the number of components.
Notes
Pass either sigma or tau, but not both.
Examples
import numpy as np
import pymc3 as pm

# `data` stands for a 1-D array of observations and must be defined by the caller.
n_components = 3

with pm.Model() as gauss_mix:
    μ = pm.Normal(
        "μ",
        data.mean(),
        10,
        shape=n_components,
        transform=pm.transforms.ordered,
        testval=[1, 2, 3],
    )
    σ = pm.HalfNormal("σ", 10, shape=n_components)
    weights = pm.Dirichlet("w", np.ones(n_components))

    pm.NormalMixture("y", w=weights, mu=μ, sigma=σ, observed=data)
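As a library-independent check on the moments of a Normal mixture, the sketch below computes the mean as \(\sum_i w_i \mu_i\) and the variance through the law of total variance, \(\sum_i w_i (\sigma_i^2 + \mu_i^2)\) minus the squared mean. It also shows the sigma-or-tau convention, assuming the usual definition of precision as inverse variance. `normal_mixture_moments` is a hypothetical helper, not part of PyMC3, and the numbers are invented.

```python
def normal_mixture_moments(w, mu, sigma=None, tau=None):
    """Return (mean, variance) of a univariate Normal mixture.

    Exactly one of sigma (standard deviations) or tau (precisions) is expected.
    """
    if (sigma is None) == (tau is None):
        raise ValueError("pass exactly one of sigma or tau")
    if sigma is None:
        # Precision is the inverse variance, so sigma = tau ** -0.5.
        sigma = [t ** -0.5 for t in tau]
    mean = sum(wi * mi for wi, mi in zip(w, mu))
    # Law of total variance: E[sigma^2 + mu^2] minus the squared mean.
    second_moment = sum(wi * (si ** 2 + mi ** 2) for wi, mi, si in zip(w, mu, sigma))
    return mean, second_moment - mean ** 2


# Two equally weighted unit-variance components centred at -1 and +1.
mean, variance = normal_mixture_moments([0.5, 0.5], mu=[-1.0, 1.0], sigma=[1.0, 1.0])
```

The spread between the component means contributes to the variance on top of the within-component variances, so the mixture here is wider than either component alone.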