Bayes Classifiers and Naive Bayes

IPython Notebook Tutorial

Bayes classifiers are simple probabilistic classification models based on Bayes' theorem. See the above tutorial for a full primer on how they work and on the distinction between a naive Bayes classifier and a Bayes classifier. Essentially, each class is modeled by a probability distribution, and classifications are made according to which distribution fits the data best. They are a supervised version of general mixture models: the predict, predict_proba, and predict_log_proba methods return the same values for the same underlying distributions, but instead of using expectation-maximization to fit to new data they can use the provided labels directly.
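As a rough, library-independent illustration of the idea, the sketch below applies Bayes' rule to two univariate normal classes in plain Python. The helper names (normal_pdf, posterior) and the class parameters are hypothetical and are not part of pomegranate.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posterior(x, components, priors):
    """Return P(class | x) for each class via Bayes' rule."""
    # Joint probability of class and observation: prior * likelihood.
    joint = [p * normal_pdf(x, mean, std)
             for p, (mean, std) in zip(priors, components)]
    total = sum(joint)  # P(x), the normalizing constant
    return [j / total for j in joint]

# Two hypothetical classes, each described by (mean, standard deviation).
components = [(1.0, 2.0), (0.0, 1.0)]
priors = [0.5, 0.5]

probs = posterior(0.0, components, priors)
# The predicted class is the one with the largest posterior.
predicted = max(range(len(probs)), key=lambda i: probs[i])
print(probs, predicted)
```

The classifier's three prediction methods correspond to the argmax of these values, the values themselves, and their logarithms.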

Initialization

Bayes classifiers and naive Bayes models can both be initialized in one of two ways, depending on whether you know the parameters of the model beforehand: (1) passing a list of pre-initialized distributions to the model, or (2) using the from_samples class method to initialize the model directly from data. For naive Bayes models on multivariate data, the pre-initialized distributions must be a list of IndependentComponentsDistribution objects, since each dimension is modeled independently of the others. For Bayes classifiers on multivariate data, a list of any type of multivariate distribution can be provided. For univariate data the two models produce identical results, and both can be passed a list of univariate distributions. For example:

from pomegranate import *
d1 = IndependentComponentsDistribution([NormalDistribution(5, 2), NormalDistribution(6, 1), NormalDistribution(9, 1)])
d2 = IndependentComponentsDistribution([NormalDistribution(2, 1), NormalDistribution(8, 1), NormalDistribution(5, 1)])
d3 = IndependentComponentsDistribution([NormalDistribution(3, 1), NormalDistribution(5, 3), NormalDistribution(4, 1)])
model = NaiveBayes([d1, d2, d3])

would create a three-class naive Bayes classifier that models data with three dimensions. Alternatively, we can initialize a Bayes classifier in the following manner:

from pomegranate import *
# NormalDistribution takes a standard deviation, so each diagonal entry is its square (the variance)
d1 = MultivariateGaussianDistribution([5, 6, 9], [[4, 0, 0], [0, 1, 0], [0, 0, 1]])
d2 = MultivariateGaussianDistribution([2, 8, 5], [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
d3 = MultivariateGaussianDistribution([3, 5, 4], [[1, 0, 0], [0, 9, 0], [0, 0, 1]])
model = BayesClassifier([d1, d2, d3])

The two examples above functionally create the same model, as the Bayes classifier uses multivariate Gaussian distributions with the same means and a diagonal covariance matrix containing only the variances. However, if we were to fit these models to data later on, the Bayes classifier would learn a full covariance matrix while the naive Bayes model would only learn the diagonal.
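The equivalence can be checked numerically without the library. The sketch below compares the sum of independent univariate normal log densities with the log density of a Gaussian that has a diagonal covariance. It assumes the normal distributions are parameterized by standard deviation, so the diagonal entries are the squared values; the function names are hypothetical.

```python
import math

def independent_normal_logpdf(x, means, stds):
    """Sum of univariate normal log densities, one per dimension."""
    total = 0.0
    for xi, mean, std in zip(x, means, stds):
        z = (xi - mean) / std
        total += -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)
    return total

def diagonal_gaussian_logpdf(x, means, variances):
    """Log density of a multivariate Gaussian whose covariance is diagonal."""
    d = len(x)
    log_det = sum(math.log(v) for v in variances)  # log|Sigma| for diagonal Sigma
    mahalanobis = sum((xi - m) ** 2 / v
                      for xi, m, v in zip(x, means, variances))
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + mahalanobis)

means = [5.0, 6.0, 9.0]
stds = [2.0, 1.0, 1.0]
variances = [s * s for s in stds]  # diagonal covariance entries: 4, 1, 1

x = [4.0, 6.5, 8.0]  # an arbitrary test point
a = independent_normal_logpdf(x, means, stds)
b = diagonal_gaussian_logpdf(x, means, variances)
print(a, b)  # the two values agree up to floating point error
```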

If we instead wish to initialize our model directly from data, we use the from_samples class method.

from pomegranate import *
import numpy
X = numpy.load('data.npy')
y = numpy.load('labels.npy')
model = NaiveBayes.from_samples(NormalDistribution, X, y)

This would create a naive Bayes model directly from the data, with normal distributions modeling each of the dimensions and a number of components equal to the number of classes in y. Alternatively, if we wanted to create a model with a different distribution for each dimension, we can do the following:

model = NaiveBayes.from_samples([NormalDistribution, ExponentialDistribution], X, y)

This assumes that your data is two-dimensional and that you want to model the first dimension as a normal distribution and the second dimension as an exponential distribution.
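Conceptually, fitting a naive Bayes model from labeled data amounts to grouping the samples by label and estimating each dimension separately. The following is a minimal sketch of that idea for normal distributions in plain Python. It is not pomegranate's implementation, and fit_gaussian_naive_bayes is a hypothetical helper.

```python
import math
from collections import defaultdict

def fit_gaussian_naive_bayes(X, y):
    """Estimate priors and per-dimension (mean, std) for each class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)

    priors, params = {}, {}
    for label, rows in groups.items():
        # Prior: fraction of samples carrying this label.
        priors[label] = len(rows) / len(X)
        stats = []
        for col in zip(*rows):  # transpose: one tuple per dimension
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col)  # maximum likelihood
            stats.append((mean, math.sqrt(var)))
        params[label] = stats
    return priors, params

# Hypothetical toy data: four two-dimensional samples in two classes.
X = [[6.0, 5.0], [3.5, 4.0], [7.5, 1.5], [7.0, 7.0]]
y = [0, 0, 1, 1]
priors, params = fit_gaussian_naive_bayes(X, y)
print(priors)
print(params)
```

Because the labels are known, no iterative procedure such as expectation-maximization is needed; a single pass over the data is enough.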

We can do pretty much the same thing with Bayes classifiers, except passing in a more complex model.

model = BayesClassifier.from_samples(MultivariateGaussianDistribution, X, y)

One can use much more complex models than just a multivariate Gaussian with a full covariance matrix when using a Bayes classifier. Specifically, you can also have your distributions be general mixture models, hidden Markov models, and Bayesian networks. For example:

model = BayesClassifier.from_samples(BayesianNetwork, X, y)

This currently requires that the data be discrete-valued, and structure learning may take too long if its settings are not chosen appropriately. However, it is possible. Currently, one cannot simply pass in GeneralMixtureModel or HiddenMarkovModel, despite them having a from_samples method, because there is a great deal of flexibility in terms of the structure or emission distributions. The easiest way to set up one of these more complex models is to build each of the components separately and then feed them into the Bayes classifier using the first initialization method.

d1 = GeneralMixtureModel.from_samples(MultivariateGaussianDistribution, n_components=5, X=X[y==0])
d2 = GeneralMixtureModel.from_samples(MultivariateGaussianDistribution, n_components=5, X=X[y==1])
model = BayesClassifier([d1, d2])

Prediction

Bayes classifiers and naive Bayes models support the same three prediction methods that the other models support: predict, predict_proba, and predict_log_proba. These methods return the most likely class given the data (argmax_m P(M|D)), the probability of each class given the data (P(M|D)), and the log probability of each class given the data (log P(M|D)). It is best to always pass in a 2D matrix, even for univariate data, where it would have a shape of (n, 1).

The predict method takes in samples and returns the most likely class given the data.

from pomegranate import *
import numpy as np
model = NaiveBayes([NormalDistribution(5, 2), UniformDistribution(0, 10), ExponentialDistribution(1.0)])
model.predict(np.array([[0], [1], [2], [3], [4]]))
[2, 2, 2, 0, 0]

Calling predict_proba on five samples for a Naive Bayes with univariate components would look like the following.

from pomegranate import *
import numpy as np
model = NaiveBayes([NormalDistribution(5, 2), UniformDistribution(0, 10), ExponentialDistribution(1)])
model.predict_proba(np.array([[0], [1], [2], [3], [4]]))
[[ 0.00790443  0.09019051  0.90190506]
 [ 0.05455011  0.20207126  0.74337863]
 [ 0.21579499  0.33322883  0.45097618]
 [ 0.44681566  0.36931382  0.18387052]
 [ 0.59804205  0.33973357  0.06222437]]
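The relationship between these probabilities and the labels returned by predict can be checked by hand: taking the index of the largest entry in each row reproduces the predicted classes. The short sketch below does this in plain Python; argmax is a hypothetical helper, not a library function.

```python
# Posterior rows copied from the predict_proba output above.
posteriors = [
    [0.00790443, 0.09019051, 0.90190506],
    [0.05455011, 0.20207126, 0.74337863],
    [0.21579499, 0.33322883, 0.45097618],
    [0.44681566, 0.36931382, 0.18387052],
    [0.59804205, 0.33973357, 0.06222437],
]

def argmax(row):
    """Index of the largest entry in a row."""
    return max(range(len(row)), key=lambda i: row[i])

labels = [argmax(row) for row in posteriors]
print(labels)  # [2, 2, 2, 0, 0]
```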

Multivariate models work the same way.

from pomegranate import *
import numpy as np
d1 = MultivariateGaussianDistribution([5, 5], [[1, 0], [0, 1]])
d2 = IndependentComponentsDistribution([NormalDistribution(5, 2), NormalDistribution(5, 2)])
model = BayesClassifier([d1, d2])
model.predict_proba(np.array([[0, 4],
                              [1, 3],
                              [2, 2],
                              [3, 1],
                              [4, 0]]))
array([[ 0.00023312,  0.99976688],
       [ 0.00220745,  0.99779255],
       [ 0.00466169,  0.99533831],
       [ 0.00220745,  0.99779255],
       [ 0.00023312,  0.99976688]])

predict_log_proba works the same way, returning the log probabilities instead of the probabilities.

Fitting

Both naive Bayes and Bayes classifiers also have a fit method that updates the parameters of the model based on new data. The major difference between these methods and the others presented is that these are supervised methods and so need to be passed labels in addition to data. This change also propagates to the summarize method, where labels are provided as well.

from pomegranate import *
import numpy as np
d1 = MultivariateGaussianDistribution([5, 5], [[1, 0], [0, 1]])
d2 = IndependentComponentsDistribution([NormalDistribution(5, 2), NormalDistribution(5, 2)])
model = BayesClassifier([d1, d2])
X = np.array([[6.0, 5.0],
              [3.5, 4.0],
              [7.5, 1.5],
              [7.0, 7.0]])
y = np.array([0, 0, 1, 1])
model.fit(X, y)

As we can see, there are four samples, with the first two labeled as class 0 and the last two labeled as class 1. Keep in mind that the training samples must match the input requirements of the models used: with a univariate distribution each sample must contain one item, and with a bivariate distribution, two. For hidden Markov models, a sample can be a list of observations of any length. An example using hidden Markov models would be the following.

d1 = HiddenMarkovModel...
d2 = HiddenMarkovModel...
d3 = HiddenMarkovModel...
model = BayesClassifier([d1, d2, d3])
X = np.array([list('HHHHHTHTHTTTTH'),
              list('HHTHHTTHHHHHTH'),
              list('TH'),
              list('HHHHT')])
y = np.array([2, 2, 1, 0])
model.fit(X, y)

API Reference

class pomegranate.NaiveBayes.NaiveBayes

A naive Bayes model, a supervised alternative to GMM.

A naive Bayes classifier treats each dimension independently of the others. It is a simpler version of the Bayes classifier, which can use any distribution with any covariance structure, including Bayesian networks and hidden Markov models.

Parameters:

models : list

A list of initialized distributions.

weights : list or numpy.ndarray or None, default None

The prior probabilities of the components. If None is passed in then defaults to the uniformly distributed priors.

Examples

>>> from pomegranate import *
>>> X = [0, 2, 0, 1, 0, 5, 6, 5, 7, 6]
>>> y = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1]
>>> clf = NaiveBayes.from_samples(NormalDistribution, X, y)
>>> clf.predict_proba([6])
array([[0.01973451,  0.98026549]])
>>> from pomegranate import *
>>> clf = NaiveBayes([NormalDistribution(1, 2), NormalDistribution(0, 1)])
>>> clf.predict_log_proba([[0], [1], [2], [-1]])
array([[-1.1836569 , -0.36550972],
       [-0.79437677, -0.60122959],
       [-0.26751248, -1.4493653 ],
       [-1.09861229, -0.40546511]])

Attributes

models (list) The model objects, either initialized by the user or fit to data.

weights (numpy.ndarray) The prior probability of each component of the model.

clear_summaries()

Remove the stored sufficient statistics.

Parameters:

None

Returns:

None

copy()

Return a deep copy of this distribution object.

This object will not be tied to any other distribution or connected in any form.

Parameters:

None

Returns:

distribution : Distribution

A copy of the distribution with the same parameters.

fit()

Fit the naive Bayes model to the data by passing data to its components.

Parameters:

X : numpy.ndarray or list

The dataset to operate on. For most models this is a numpy array with columns corresponding to features and rows corresponding to samples. For Markov chains and HMMs this will be a list of variable-length sequences.

y : numpy.ndarray or list or None, optional

Data labels for supervised training algorithms. Default is None

weights : array-like or None, shape (n_samples,), optional

The initial weights of each sample in the matrix. If nothing is passed in then each sample is assumed to be the same weight. Default is None.

n_jobs : int

The number of jobs to use to parallelize, either the number of threads or the number of processes to use. Default is 1.

inertia : double, optional

Inertia used for training the distributions.

pseudocount : double, optional

A pseudocount to add to the emission of each distribution. This effectively smooths the states to prevent zero-probability symbols if they don’t happen to occur in the data. Default is 0.

stop_threshold : double, optional, positive

The threshold at which EM will terminate for the improvement of the model. If the model does not improve its fit of the data by a log probability of 0.1 then terminate. Only required if doing semisupervised learning. Default is 0.1.

max_iterations : int, optional, positive

The maximum number of iterations to run EM for. If this limit is hit then it will terminate training, regardless of how well the model is improving per iteration. Only required if doing semisupervised learning. Default is 1e8.

verbose : bool, optional

Whether or not to print out improvement information over iterations. Only required if doing semisupervised learning. Default is False.

Returns:

self : object

Returns the fitted model
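To illustrate what the pseudocount parameter described above is for, here is a small sketch of additive smoothing on discrete symbols in plain Python. The exact normalization pomegranate applies internally is an assumption here, and smoothed_probabilities is a hypothetical helper.

```python
from collections import Counter

def smoothed_probabilities(symbols, alphabet, pseudocount=0.0):
    """Estimate symbol probabilities, adding a pseudocount to every symbol."""
    counts = Counter(symbols)
    total = len(symbols) + pseudocount * len(alphabet)
    return {s: (counts[s] + pseudocount) / total for s in alphabet}

observed = list("HHHHT")
alphabet = ["H", "T", "X"]  # "X" never occurs in the observed data

raw = smoothed_probabilities(observed, alphabet)
smoothed = smoothed_probabilities(observed, alphabet, pseudocount=1.0)
print(raw["X"], smoothed["X"])  # 0.0 without smoothing, 0.125 with it
```

Without smoothing, any sequence containing the unseen symbol would receive zero probability under this class.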

freeze()

Freeze the distribution, preventing updates from occurring.

from_samples()

Create a mixture model directly from the given dataset.

First, k-means will be run using the given initializations, in order to define initial clusters for the points. These clusters are used to initialize the distributions used. Then, EM is run to refine the parameters of these distributions.

A homogeneous mixture can be defined by passing in a single distribution callable as the first parameter and specifying the number of components, while a heterogeneous mixture can be defined by passing in a list of callables of the appropriate type.

Parameters:

distributions : array-like, shape (n_components,) or callable

The components of the model. If array, corresponds to the initial distributions of the components. If callable, must also pass in the number of components and kmeans++ will be used to initialize them.

n_components : int

If a callable is passed into distributions then this is the number of components to initialize using the kmeans++ algorithm.

X : array-like, shape (n_samples, n_dimensions)

This is the data to train on. Each row is a sample, and each column is a dimension to train on.

weights : array-like, shape (n_samples,), optional

The initial weights of each sample in the matrix. If nothing is passed in then each sample is assumed to be the same weight. Default is None.

pseudocount : double, optional, positive

A pseudocount to add to the emission of each distribution. This effectively smooths the states to prevent zero-probability symbols if they don’t happen to occur in the data. Only affects mixture models defined over discrete distributions. Default is 0.

stop_threshold : double, optional, positive

The threshold at which EM will terminate for the improvement of the model. If the model does not improve its fit of the data by a log probability of 0.1 then terminate. Only required if doing semisupervised learning. Default is 0.1.

max_iterations : int, optional, positive

The maximum number of iterations to run EM for. If this limit is hit then it will terminate training, regardless of how well the model is improving per iteration. Only required if doing semisupervised learning. Default is 1e8.

verbose : bool, optional

Whether or not to print out improvement information over iterations. Only required if doing semisupervised learning. Default is False.

Returns:

model : NaiveBayes

The fit naive Bayes model.

from_summaries()

Fit the model to the collected sufficient statistics.

Fit the parameters of the model to the sufficient statistics gathered during the summarize calls. This should return an exact update.

Parameters:

inertia : double, optional

The weight of the previous parameters of the model. The new parameters will roughly be old_param*inertia + new_param*(1-inertia), so an inertia of 0 means ignore the old parameters, whereas an inertia of 1 means ignore the new parameters. Default is 0.0.

pseudocount : double, optional

A pseudocount to add to the emission of each distribution. This effectively smooths the states to prevent zero-probability symbols if they don’t happen to occur in the data. For discrete data, this smooths both the prior probabilities of each component and the emissions of each component. Otherwise, it only smooths the prior probabilities of each component. Default is 0.

Returns:

None
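The inertia formula given above can be written out directly. The sketch below is only an illustration of that weighted average for a single scalar parameter; apply_inertia is a hypothetical name, not a pomegranate function.

```python
def apply_inertia(old_param, new_param, inertia=0.0):
    """Blend the previous parameter with the newly estimated one."""
    if not 0.0 <= inertia <= 1.0:
        raise ValueError("inertia must lie in [0, 1]")
    return old_param * inertia + new_param * (1.0 - inertia)

old_mean, new_mean = 5.0, 7.0
print(apply_inertia(old_mean, new_mean, inertia=0.0))   # 7.0, old value ignored
print(apply_inertia(old_mean, new_mean, inertia=1.0))   # 5.0, new value ignored
print(apply_inertia(old_mean, new_mean, inertia=0.25))  # 6.5
```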

log_probability()

Calculate the log probability of a point under the distribution.

The probability of a point is the sum of the probabilities of each distribution multiplied by the weights. Thus, the log probability is the log of the sum, over components, of the exponentiated log probability plus log prior.

This is the python interface.

Parameters:

X : numpy.ndarray, shape=(n, d) or (n, m, d)

The samples to calculate the log probability of. Each row is a sample and each column is a dimension. If emissions are HMMs then shape is (n, m, d) where m is variable length for each observation, and X becomes an array of n (m, d)-shaped arrays.

Returns:

log_probability : double

The log probability of the point under the distribution.
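A weighted sum of probabilities is usually evaluated in log space with the log-sum-exp trick, so that very small densities do not underflow. The following is a generic sketch of that computation in plain Python, not the library's code; the function names and the example values are hypothetical, and the priors are assumed to be strictly positive.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def mixture_log_probability(component_log_probs, priors):
    """log P(x) = log sum_k prior_k * P(x | k), computed in log space."""
    return log_sum_exp([lp + math.log(w)
                        for lp, w in zip(component_log_probs, priors)])

# Hypothetical per-component log densities for one sample.
component_log_probs = [-1000.0, -1001.0]
priors = [0.5, 0.5]
total = mixture_log_probability(component_log_probs, priors)
print(total)
```

Exponentiating these log densities directly would give 0.0 for both components, which is why the maximum is subtracted before exponentiating.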

predict()

Predict the most likely component which generated each sample.

Calculate the posterior P(M|D) for each sample and return the index of the component most likely to fit it. This corresponds to a simple argmax over the responsibility matrix.

This is a sklearn wrapper for the maximum_a_posteriori method.

Parameters:

X : array-like, shape (n_samples, n_dimensions)

The samples to do the prediction on. Each sample is a row and each column corresponds to a dimension in that sample. For univariate distributions, a single array may be passed in.

Returns:

y : array-like, shape (n_samples,)

The predicted component which fits the sample the best.

predict_log_proba()

Calculate the posterior log P(M|D) for data.

Calculate the log probability of each item having been generated from each component in the model. This returns normalized log probabilities such that the probabilities in each row sum to 1.

This is a sklearn wrapper for the original posterior function.

Parameters:

X : array-like, shape (n_samples, n_dimensions)

The samples to do the prediction on. Each sample is a row and each column corresponds to a dimension in that sample. For univariate distributions, a single array may be passed in.

Returns:

y : array-like, shape (n_samples, n_components)

The normalized log probability log P(M|D) for each sample. This is the log probability that the sample was generated from each component.

predict_proba()

Calculate the posterior P(M|D) for data.

Calculate the probability of each item having been generated from each component in the model. This returns normalized probabilities such that each row should sum to 1.

Since calculating the log probability is much faster, this is just a wrapper which exponentiates the log probability matrix.

Parameters:

X : array-like, shape (n_samples, n_dimensions)

The samples to do the prediction on. Each sample is a row and each column corresponds to a dimension in that sample. For univariate distributions, a single array may be passed in.

Returns:

probability : array-like, shape (n_samples, n_components)

The normalized probability P(M|D) for each sample. This is the probability that the sample was generated from each component.
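The relationship between the two methods can be checked on the predict_log_proba output shown in the Examples for this class: exponentiating each entry gives probabilities whose rows sum to 1. This is a small plain Python check, not library code.

```python
import math

# Rows copied from the predict_log_proba example output in this reference.
log_posteriors = [
    [-1.1836569, -0.36550972],
    [-0.79437677, -0.60122959],
    [-0.26751248, -1.4493653],
    [-1.09861229, -0.40546511],
]

# Exponentiate every entry to recover the posterior probabilities.
posteriors = [[math.exp(v) for v in row] for row in log_posteriors]
row_sums = [sum(row) for row in posteriors]
print(row_sums)  # each value is 1 up to rounding of the printed digits
```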

probability()

Return the probability of the given symbol under this distribution.

Parameters:

symbol : object

The symbol to calculate the probability of

Returns:

probability : double

The probability of that point under the distribution.

sample()

Generate a sample from the model.

First, randomly select a component weighted by the prior probability. Then, use the sample method from that component to generate a sample.

Parameters:

n : int, optional

The number of samples to generate. Defaults to 1.

Returns:

sample : array-like or object

A randomly generated sample from the model of the type modelled by the emissions. An integer if using most distributions, or an array if using multivariate ones, or a string for most discrete distributions. If n=1 return an object, if n>1 return an array of the samples.
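The two-step procedure described for sample can be sketched with the standard library. This version assumes univariate normal components and, purely for illustration, returns the chosen component index alongside each value; sample_mixture is a hypothetical helper and does not mirror the library's return type.

```python
import random

def sample_mixture(components, priors, n=1, seed=None):
    """Draw n (label, value) pairs: choose a component, then sample from it."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # Pick a component index with probability proportional to its prior.
        k = rng.choices(range(len(components)), weights=priors)[0]
        mean, std = components[k]
        draws.append((k, rng.gauss(mean, std)))
    return draws

components = [(5.0, 2.0), (0.0, 1.0)]  # hypothetical (mean, std) per class
priors = [0.9, 0.1]
draws = sample_mixture(components, priors, n=1000, seed=0)

# Most draws should come from the component with the larger prior.
share_of_first = sum(1 for k, _ in draws if k == 0) / len(draws)
print(share_of_first)
```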

summarize()

Summarize data into stored sufficient statistics for out-of-core training.

Parameters:

X : array-like, shape (n_samples, variable)

Array of the samples, which can be either fixed size or variable depending on the underlying components.

y : array-like, shape (n_samples,)

Array of the known labels as integers

weights : array-like, shape (n_samples,) optional

Array of the weight of each sample, a positive float

n_jobs : int

The number of jobs to use to parallelize, either the number of threads or the number of processes to use. Default is 1.

Returns:

None
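Out-of-core training relies on the fact that some distributions can be fit from a few running totals rather than from the full dataset. The sketch below shows the idea for univariate normal classes in plain Python. The class and function names are hypothetical and the bookkeeping pomegranate performs internally may differ.

```python
import math

class NormalSummary:
    """Running sufficient statistics (count, sum, sum of squares) for one class."""

    def __init__(self):
        self.n = 0
        self.total = 0.0
        self.total_sq = 0.0

    def summarize(self, values):
        """Fold one batch of values into the stored statistics."""
        for v in values:
            self.n += 1
            self.total += v
            self.total_sq += v * v

    def from_summaries(self):
        """Return the (mean, std) implied by everything summarized so far."""
        mean = self.total / self.n
        var = max(self.total_sq / self.n - mean * mean, 0.0)
        return mean, math.sqrt(var)

def summarize_batch(summaries, X, y):
    """Route each univariate sample to the summary of its labeled class."""
    for x, label in zip(X, y):
        summaries.setdefault(label, NormalSummary()).summarize([x])

summaries = {}
# Two hypothetical batches that never need to be in memory together.
summarize_batch(summaries, [0, 2, 0, 1, 0], [0, 0, 0, 0, 0])
summarize_batch(summaries, [5, 6, 5, 7, 6], [1, 1, 0, 1, 1])
params = {label: s.from_summaries() for label, s in summaries.items()}
print(params)
```

The resulting parameters are the same as those obtained by fitting on all ten samples at once, which is what makes the update exact.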

thaw()

Thaw the distribution, re-allowing updates to occur.

class pomegranate.BayesClassifier.BayesClassifier

A Bayes classifier, a supervised alternative to GMM.

Parameters:

models : list or constructor

Must either be a list of initialized distribution/model objects, or the constructor for a distribution object:

  • Initialized : NaiveBayes([NormalDistribution(1, 2), NormalDistribution(0, 1)])
  • Constructor : NaiveBayes(NormalDistribution)

weights : list or numpy.ndarray or None, default None

The prior probabilities of the components. If None is passed in then defaults to the uniformly distributed priors.

Examples

>>> from pomegranate import *
>>> clf = NaiveBayes( NormalDistribution )
>>> X = [0, 2, 0, 1, 0, 5, 6, 5, 7, 6]
>>> y = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1]
>>> clf.fit(X, y)
>>> clf.predict_proba([6])
array([[ 0.01973451,  0.98026549]])
>>> from pomegranate import *
>>> clf = NaiveBayes([NormalDistribution(1, 2), NormalDistribution(0, 1)])
>>> clf.predict_log_proba([[0], [1], [2], [-1]])
array([[-1.1836569 , -0.36550972],
       [-0.79437677, -0.60122959],
       [-0.26751248, -1.4493653 ],
       [-1.09861229, -0.40546511]])

Attributes

models (list) The model objects, either initialized by the user or fit to data.

weights (numpy.ndarray) The prior probability of each component of the model.

clear_summaries()

Remove the stored sufficient statistics.

Parameters:

None

Returns:

None

copy()

Return a deep copy of this distribution object.

This object will not be tied to any other distribution or connected in any form.

Parameters:

None

Returns:

distribution : Distribution

A copy of the distribution with the same parameters.

fit()

Fit the Bayes classifier to the data by passing data to its components.

Parameters:

X : numpy.ndarray or list

The dataset to operate on. For most models this is a numpy array with columns corresponding to features and rows corresponding to samples. For Markov chains and HMMs this will be a list of variable-length sequences.

y : numpy.ndarray or list or None, optional

Data labels for supervised training algorithms. Default is None

weights : array-like or None, shape (n_samples,), optional

The initial weights of each sample in the matrix. If nothing is passed in then each sample is assumed to be the same weight. Default is None.

n_jobs : int

The number of jobs to use to parallelize, either the number of threads or the number of processes to use. Default is 1.

inertia : double, optional

Inertia used for training the distributions.

pseudocount : double, optional

A pseudocount to add to the emission of each distribution. This effectively smooths the states to prevent zero-probability symbols if they don’t happen to occur in the data. Default is 0.

stop_threshold : double, optional, positive

The threshold at which EM will terminate for the improvement of the model. If the model does not improve its fit of the data by a log probability of 0.1 then terminate. Only required if doing semisupervised learning. Default is 0.1.

max_iterations : int, optional, positive

The maximum number of iterations to run EM for. If this limit is hit then it will terminate training, regardless of how well the model is improving per iteration. Only required if doing semisupervised learning. Default is 1e8.

verbose : bool, optional

Whether or not to print out improvement information over iterations. Only required if doing semisupervised learning. Default is False.

Returns:

self : object

Returns the fitted model

freeze()

Freeze the distribution, preventing updates from occurring.

from_samples()

Create a mixture model directly from the given dataset.

First, k-means will be run using the given initializations, in order to define initial clusters for the points. These clusters are used to initialize the distributions used. Then, EM is run to refine the parameters of these distributions.

A homogeneous mixture can be defined by passing in a single distribution callable as the first parameter and specifying the number of components, while a heterogeneous mixture can be defined by passing in a list of callables of the appropriate type.

Parameters:

distributions : array-like, shape (n_components,) or callable

The components of the model. If array, corresponds to the initial distributions of the components. If callable, must also pass in the number of components and kmeans++ will be used to initialize them.

n_components : int

If a callable is passed into distributions then this is the number of components to initialize using the kmeans++ algorithm.

X : array-like, shape (n_samples, n_dimensions)

This is the data to train on. Each row is a sample, and each column is a dimension to train on.

weights : array-like, shape (n_samples,), optional

The initial weights of each sample in the matrix. If nothing is passed in then each sample is assumed to be the same weight. Default is None.

pseudocount : double, optional, positive

A pseudocount to add to the emission of each distribution. This effectively smooths the states to prevent zero-probability symbols if they don’t happen to occur in the data. Only affects mixture models defined over discrete distributions. Default is 0.

Returns:

model : BayesClassifier

The fit Bayes classifier model.

from_summaries()

Fit the model to the collected sufficient statistics.

Fit the parameters of the model to the sufficient statistics gathered during the summarize calls. This should return an exact update.

Parameters:

inertia : double, optional

The weight of the previous parameters of the model. The new parameters will roughly be old_param*inertia + new_param*(1-inertia), so an inertia of 0 means ignore the old parameters, whereas an inertia of 1 means ignore the new parameters. Default is 0.0.

pseudocount : double, optional

A pseudocount to add to the emission of each distribution. This effectively smooths the states to prevent zero-probability symbols if they don’t happen to occur in the data. For discrete data, this smooths both the prior probabilities of each component and the emissions of each component. Otherwise, it only smooths the prior probabilities of each component. Default is 0.

Returns:

None

log_probability()

Calculate the log probability of a point under the distribution.

The probability of a point is the sum of the probabilities of each distribution multiplied by the weights. Thus, the log probability is the log of the sum, over components, of the exponentiated log probability plus log prior.

This is the python interface.

Parameters:

X : numpy.ndarray, shape=(n, d) or (n, m, d)

The samples to calculate the log probability of. Each row is a sample and each column is a dimension. If emissions are HMMs then shape is (n, m, d) where m is variable length for each observation, and X becomes an array of n (m, d)-shaped arrays.

Returns:

log_probability : double

The log probability of the point under the distribution.

predict()

Predict the most likely component which generated each sample.

Calculate the posterior P(M|D) for each sample and return the index of the component most likely to fit it. This corresponds to a simple argmax over the responsibility matrix.

This is a sklearn wrapper for the maximum_a_posteriori method.

Parameters:

X : array-like, shape (n_samples, n_dimensions)

The samples to do the prediction on. Each sample is a row and each column corresponds to a dimension in that sample. For univariate distributions, a single array may be passed in.

Returns:

y : array-like, shape (n_samples,)

The predicted component which fits the sample the best.

predict_log_proba()

Calculate the posterior log P(M|D) for data.

Calculate the log probability of each item having been generated from each component in the model. This returns normalized log probabilities such that the probabilities in each row sum to 1.

This is a sklearn wrapper for the original posterior function.

Parameters:

X : array-like, shape (n_samples, n_dimensions)

The samples to do the prediction on. Each sample is a row and each column corresponds to a dimension in that sample. For univariate distributions, a single array may be passed in.

Returns:

y : array-like, shape (n_samples, n_components)

The normalized log probability log P(M|D) for each sample. This is the log probability that the sample was generated from each component.

predict_proba()

Calculate the posterior P(M|D) for data.

Calculate the probability of each item having been generated from each component in the model. This returns normalized probabilities such that each row should sum to 1.

Since calculating the log probability is much faster, this is just a wrapper which exponentiates the log probability matrix.

Parameters:

X : array-like, shape (n_samples, n_dimensions)

The samples to do the prediction on. Each sample is a row and each column corresponds to a dimension in that sample. For univariate distributions, a single array may be passed in.

Returns:

probability : array-like, shape (n_samples, n_components)

The normalized probability P(M|D) for each sample. This is the probability that the sample was generated from each component.

probability()

Return the probability of the given symbol under this distribution.

Parameters:

symbol : object

The symbol to calculate the probability of

Returns:

probability : double

The probability of that point under the distribution.

sample()

Generate a sample from the model.

First, randomly select a component weighted by the prior probability. Then, use the sample method from that component to generate a sample.

Parameters:

n : int, optional

The number of samples to generate. Defaults to 1.

Returns:

sample : array-like or object

A randomly generated sample from the model of the type modelled by the emissions. An integer if using most distributions, or an array if using multivariate ones, or a string for most discrete distributions. If n=1 return an object, if n>1 return an array of the samples.

summarize()

Summarize data into stored sufficient statistics for out-of-core training.

Parameters:

X : array-like, shape (n_samples, variable)

Array of the samples, which can be either fixed size or variable depending on the underlying components.

y : array-like, shape (n_samples,)

Array of the known labels as integers

weights : array-like, shape (n_samples,) optional

Array of the weight of each sample, a positive float

n_jobs : int

The number of jobs to use to parallelize, either the number of threads or the number of processes to use. Default is 1.

Returns:

None

thaw()

Thaw the distribution, re-allowing updates to occur.