# Markov Chains

IPython Notebook Tutorial

Markov chains are a form of structured model over sequences. They represent the probability of each symbol in a sequence as a conditional probability given the previous k symbols. For example, a third-order Markov chain conditions each symbol on the previous three symbols, while a zeroth-order Markov chain is a naive predictor in which each symbol is independent of all other symbols. Currently pomegranate supports only discrete-emission Markov chains, where each observation is a discrete symbol rather than a continuous number ('A', 'B', 'C' instead of 17.32 or 19.65).

## Initialization

Markov chains can almost be represented by a single conditional probability table (CPT); the exception is the first k elements (for a k-th order Markov chain), whose probabilities cannot be represented by that table without resorting to special characters. Because of this, pomegranate takes in a series of k+1 distributions: an unconditional distribution for the first element, followed by conditional distributions of increasing order. For example, for a second-order Markov chain:

```
from pomegranate import *

d1 = DiscreteDistribution({'A': 0.25, 'B': 0.75})
d2 = ConditionalProbabilityTable([['A', 'A', 0.1],
                                  ['A', 'B', 0.9],
                                  ['B', 'A', 0.6],
                                  ['B', 'B', 0.4]], [d1])
d3 = ConditionalProbabilityTable([['A', 'A', 'A', 0.4],
                                  ['A', 'A', 'B', 0.6],
                                  ['A', 'B', 'A', 0.8],
                                  ['A', 'B', 'B', 0.2],
                                  ['B', 'A', 'A', 0.9],
                                  ['B', 'A', 'B', 0.1],
                                  ['B', 'B', 'A', 0.2],
                                  ['B', 'B', 'B', 0.8]], [d1, d2])
model = MarkovChain([d1, d2, d3])
```

## Probability

The probability of a sequence under the Markov chain is the probability of the first character under the first distribution, times the probability of the second character under the second distribution, and so on until the (k+1)-th distribution is reached; every character from that point on is evaluated under the (k+1)-th distribution. We can calculate the probability or log probability in the same manner as with any of the other models. Given the model shown above:

```
>>> model.log_probability(['A', 'B', 'B', 'B'])
-3.324236340526027
>>> model.log_probability(['A', 'A', 'A', 'A'])
-5.521460917862246
```
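These values are just sums of log conditional probabilities, so they can be checked by hand. The following pure-Python sketch (independent of pomegranate, with the tables from the initialization example hard-coded as plain dictionaries) reproduces both values for the second-order model:

```python
import math

# Distributions from the second-order example above.
p1 = {'A': 0.25, 'B': 0.75}                      # P(x1)
p2 = {('A', 'A'): 0.1, ('A', 'B'): 0.9,
      ('B', 'A'): 0.6, ('B', 'B'): 0.4}          # P(x2 | x1)
p3 = {('A', 'A', 'A'): 0.4, ('A', 'A', 'B'): 0.6,
      ('A', 'B', 'A'): 0.8, ('A', 'B', 'B'): 0.2,
      ('B', 'A', 'A'): 0.9, ('B', 'A', 'B'): 0.1,
      ('B', 'B', 'A'): 0.2, ('B', 'B', 'B'): 0.8}  # P(x3 | x1, x2)

def log_probability(seq):
    # First symbol under P(x1), second under P(x2 | x1) ...
    logp = math.log(p1[seq[0]])
    if len(seq) > 1:
        logp += math.log(p2[(seq[0], seq[1])])
    # ... and every later symbol conditions on the previous two.
    for i in range(2, len(seq)):
        logp += math.log(p3[(seq[i - 2], seq[i - 1], seq[i])])
    return logp

print(log_probability(['A', 'B', 'B', 'B']))  # ≈ -3.3242
```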

## Fitting

Markov chains are not very complicated to train. For each sequence, the appropriate symbols are sent to the appropriate distributions, and maximum likelihood estimates are used to update the parameters of those distributions. There are no latent factors to train, so no expectation-maximization or other iterative algorithms are needed.
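Concretely, maximum likelihood estimation here is just counting and normalizing. The sketch below (pure Python, not pomegranate's implementation; the function name and first-order restriction are choices made for illustration) trains a first-order chain from a list of sequences:

```python
from collections import Counter

def fit_first_order(sequences):
    """MLE for a first-order Markov chain: count, then normalize."""
    starts, pairs = Counter(), Counter()
    for seq in sequences:
        starts[seq[0]] += 1
        for a, b in zip(seq, seq[1:]):
            pairs[(a, b)] += 1
    # Initial distribution: fraction of sequences starting with each symbol.
    total = sum(starts.values())
    p_start = {s: c / total for s, c in starts.items()}
    # Transition table: normalize counts within each conditioning context.
    ctx = Counter()
    for (a, _), c in pairs.items():
        ctx[a] += c
    p_trans = {(a, b): c / ctx[a] for (a, b), c in pairs.items()}
    return p_start, p_trans

p0, pt = fit_first_order([list('ABAB'), list('ABBA')])
# Both training sequences start with 'A', and every 'A' is followed by 'B'.
```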

## API Reference

class pomegranate.MarkovChain.MarkovChain

A Markov Chain.

Implemented as a series of conditional distributions, the Markov chain models P(X_i | X_i-1, ..., X_i-k) for a k-th order Markov chain. The conditional dependencies are directly on the emissions, and not on a hidden state as in a hidden Markov model.

Parameters

distributions : list, shape (k+1)

A list of the conditional distributions which make up the Markov chain, beginning with P(X_i), then P(X_i | X_i-1), and so on. For a k-th order Markov chain you must pass in k+1 distributions.

Examples

```
>>> from pomegranate import *
>>> d1 = DiscreteDistribution({'A': 0.25, 'B': 0.75})
>>> d2 = ConditionalProbabilityTable([['A', 'A', 0.33],
                                      ['B', 'A', 0.67],
                                      ['A', 'B', 0.82],
                                      ['B', 'B', 0.18]], [d1])
>>> mc = MarkovChain([d1, d2])
>>> mc.log_probability(list('ABBAABABABAABABA'))
-8.9119890701808213
```
Attributes

distributions : list, shape (k+1)

The distributions which make up the chain.

fit()

Fit the model to new data using MLE.

The underlying distributions are fed their appropriate points and weights and are updated accordingly.

Parameters

sequences : array-like, shape (n_samples, variable)

The data to train on. Each row is a sample containing a sequence of variable length.

weights : array-like, shape (n_samples,), optional

The initial weight of each sample. If nothing is passed in, each sample is assumed to have the same weight. Default is None.

inertia : double, optional

The weight of the previous parameters of the model. The new parameters will roughly be old_param*inertia + new_param*(1-inertia), so an inertia of 0 means ignore the old parameters, whereas an inertia of 1 means ignore the new parameters. Default is 0.0.

Returns
None
from_json()

Read in a serialized model and return the appropriate classifier.

Parameters

s : str

A JSON formatted string containing the model.

Returns
model : object

A properly initialized and baked model.

from_samples()

Learn the Markov chain from data.

Takes in the memory of the chain (k) and learns the initial distribution and probability tables associated with the proper parameters.

Parameters

X : array-like, list or numpy.array

The data to fit the structure to, given as a list of sequences of variable length. Because the sequences vary in length, there is no fixed shape.

weights : array-like, shape (n_samples,), optional

The weight of each sample as a positive double. Default is None.

k : int, optional

The number of previous symbols to condition on in the model. Default is 1.

Returns
model : MarkovChain

The learned Markov chain model.

from_summaries()

Fit the model to the collected sufficient statistics.

Fit the parameters of the model to the sufficient statistics gathered during the summarize calls. This should return an exact update.

Parameters

inertia : double, optional

The weight of the previous parameters of the model. The new parameters will roughly be old_param*inertia + new_param*(1-inertia), so an inertia of 0 means ignore the old parameters, whereas an inertia of 1 means ignore the new parameters. Default is 0.0.

Returns
None
log_probability()

Calculate the log probability of the sequence under the model.

The first symbols are evaluated in slices of increasing size under the corresponding first few components of the model until context size k is reached; every subsequent symbol is evaluated under the final component.

Parameters
sequence : array-like

An array of observations.

Returns
logp : double

The log probability of the sequence under the model.

sample()

Create a random sample from the model.

Parameters
length : int or Distribution

Give either the length of the sample you want to generate, or a distribution object which will be randomly sampled for the length. Continuous distributions will have their sample rounded to the nearest integer, minimum 1.

Returns
sequence : array-like, shape = (length,)

A sequence randomly generated from the Markov chain.
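Sampling from a chain is ancestral: draw the first symbol from the initial distribution, then repeatedly draw the next symbol conditioned on the previous context. A pure-Python sketch for a first-order chain (the function name and table layout are illustrative choices, not pomegranate's internals):

```python
import random

def sample_chain(p_start, p_trans, length, seed=None):
    """Ancestral sampling from a first-order Markov chain."""
    rng = random.Random(seed)
    symbols = list(p_start)
    # Draw the first symbol from the initial distribution.
    seq = rng.choices(symbols, weights=[p_start[s] for s in symbols])
    while len(seq) < length:
        prev = seq[-1]
        # Draw the next symbol conditioned on the previous one.
        w = [p_trans.get((prev, s), 0.0) for s in symbols]
        seq.append(rng.choices(symbols, weights=w)[0])
    return seq
```

With a deterministic transition table (e.g. 'A' always followed by 'B' and vice versa), the sampler reproduces the expected alternating sequence regardless of seed.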

summarize()

Summarize a batch of data and store sufficient statistics.

This will summarize the sequences into sufficient statistics stored in each distribution.

Parameters
sequences : array-like, shape (n_samples, variable)

The data to summarize. Each row is a sample containing a sequence of variable length.

weights : array-like, shape (n_samples,), optional

The initial weight of each sample. If nothing is passed in, each sample is assumed to have the same weight. Default is None.

Returns
None
to_json()

Serialize the model to a JSON.

Parameters
separators : tuple, optional

The two separators to pass to the json.dumps function for formatting. Default is (',', ' : ').

indent : int, optional

The indentation to use at each level. Passed to json.dumps for formatting. Default is 4.

Returns
json : str

A properly formatted JSON object.
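Since the separators and indent parameters are forwarded to json.dumps, their effect can be shown with plain Python (the dictionary below is a hand-built stand-in for a serialized model; pomegranate's actual JSON schema may differ):

```python
import json

# Illustrative schema only, not pomegranate's real serialization layout.
model = {
    'class': 'MarkovChain',
    'k': 1,
    'p_start': {'A': 0.25, 'B': 0.75},
}

# The defaults described above: separators (',', ' : ') and indent 4.
s = json.dumps(model, separators=(',', ' : '), indent=4)
restored = json.loads(s)   # round-trips back to the original dictionary
```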