class Lproportion

module lcompte

The documentation is here.

This class behaves like a dictionary in which the keys are descriptor numbers and the items are the corresponding Proportions. It also stores the transition probabilities between these descriptors.

The aim of this class is to make hidden Markov chains easy to build.
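As a rough illustration of this structure, the sketch below mimics it with plain Python dictionaries. It is not the lcompte implementation: the class name LproportionSketch, the method set_inter and the argument lists are assumptions made for the example, and each Proportion is replaced by a simple dictionary of "prior|posterior" words.

```python
class LproportionSketch:
    """Hypothetical stand-in for Lproportion, built on plain dicts."""

    def __init__(self):
        self._props = {}   # descriptor number -> {"prior|posterior": proportion}
        self._inter = {}   # (from_number, to_number) -> transition proportion

    def __getitem__(self, number):
        return self._props[number]

    def __setitem__(self, number, proportion):
        self._props[number] = proportion

    def num(self):
        """List of descriptor numbers."""
        return sorted(self._props)

    def inter(self, i, j):
        """Proportion of transitions from descriptor i to descriptor j."""
        return self._inter.get((i, j), 0.0)

    def set_inter(self, i, j, p):
        """Store a transition proportion (assumed name, valid numbers only)."""
        if i not in self._props or j not in self._props:
            raise KeyError("unknown descriptor number")
        self._inter[(i, j)] = p


model = LproportionSketch()
# Illustrative values only.
model[1] = {"A|A": 0.7, "A|B": 0.3, "B|A": 0.5, "B|B": 0.5}
model[2] = {"A|A": 0.4, "A|B": 0.6}
for (i, j), p in {(1, 1): 0.8, (1, 2): 0.2, (2, 2): 0.4, (2, 1): 0.6}.items():
    model.set_inter(i, j, p)
```

The two dictionaries together are what a hidden Markov chain needs: per-state letter probabilities and between-state transition probabilities.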

Construction

__init__

The optional keyword fic builds the object by reading the named file, written in the specific format (see Input-Output).

read_nf

Builds from a file name; the file must be in the specific format.

read_Lcompte

Builds from a Lcompte.

Optional keywords:

lpost=l: specifies the maximum length of the posterior words, i.e. the words whose frequencies are computed. Default: the maximum length of the words;
lprior=l: specifies the length of the prior words, i.e. the words on which the computed words depend, in a Markovian context. Default: 0;
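To make the roles of lprior and lpost concrete, here is a minimal sketch that counts "prior|posterior" words directly in a string and normalises them per prior. The helpers count_words and to_proportions are hypothetical and are not part of lcompte; the sketch also assumes posterior words of exactly lpost letters, whereas the real method may handle shorter words too.

```python
from collections import Counter


def count_words(text, lprior=0, lpost=1):
    """Count 'prior|posterior' words in text (hypothetical helper)."""
    counts = Counter()
    for i in range(lprior, len(text) - lpost + 1):
        prior = text[i - lprior:i]      # the lprior letters before position i
        posterior = text[i:i + lpost]   # the lpost letters starting at i
        counts[prior + "|" + posterior] += 1
    return counts


def to_proportions(counts):
    """Normalise counts so that, for each prior, the posteriors sum to 1."""
    totals = Counter()
    for key, c in counts.items():
        totals[key.split("|")[0]] += c
    return {key: c / totals[key.split("|")[0]] for key, c in counts.items()}


counts = count_words("ABAAB", lprior=1, lpost=1)
props = to_proportions(counts)
```

With lprior=0 the prior is empty and the proportions reduce to plain letter frequencies.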

Handling

__getitem__: returns the Proportion of specified descriptor number;
__setitem__: assigns the specified Proportion to the specified descriptor number;
num: returns the list of descriptor numbers;
lg_max: returns the length of the longest word (prior+posterior);
lg_max_prior: returns the length of the longest prior;
lg_max_posterior: returns the length of the longest posterior;
inter: returns the proportion of transitions between two descriptor numbers;
g_inter: sets the proportion of transitions between two valid descriptor numbers;
alph: returns the list of used letters;
KL_MC: computes the Kullback-Leibler divergence to another Lproportion by Monte Carlo simulation, using several Sequences (default: 100) of a given length (default: 1000) generated by the method read_Lprop of Sequence. See Sequence generation.
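The Monte Carlo idea behind KL_MC can be sketched on a much simpler case. The code below is not the library's method: it assumes two models that emit independent letters (no descriptors, no priors), and it assumes the estimate is normalised per letter. The function names are hypothetical.

```python
import math
import random


def sample(model, length, rng):
    """Draw `length` independent letters according to `model`."""
    letters = sorted(model)
    weights = [model[a] for a in letters]
    return rng.choices(letters, weights=weights, k=length)


def log_likelihood(model, seq):
    """Log-probability of `seq` under an independent-letter model."""
    return sum(math.log(model[a]) for a in seq)


def kl_monte_carlo(p, q, n_seq=100, length=1000, seed=0):
    """Estimate KL(p || q) per letter from sequences simulated under p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_seq):
        seq = sample(p, length, rng)
        total += (log_likelihood(p, seq) - log_likelihood(q, seq)) / length
    return total / n_seq


p = {"A": 0.7, "B": 0.3}
q = {"A": 0.5, "B": 0.5}
estimate = kl_monte_carlo(p, q)
# Closed form, available here only because the letters are independent.
exact = sum(p[a] * math.log(p[a] / q[a]) for a in p)
```

For models with hidden descriptors the log-likelihood no longer factorises letter by letter, which is why a simulation-based estimate is useful.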

Input-Output

Specific format is:

description

sections of

descriptor number:

lines of prior|posterior whitespace count

lines of transition probabilities between descriptor numbers, in the format:

number1, number2 whitespace proportion

example

A|B 0.3
A|A 0.7
B|B 0.5
B|A 0.5
0.1
0.9
AB|B 0.5
AB|A 0.5
AA|A 0.4
AA|B 0.6
A|B 0.9
0.1
1,2 0.2
1,1 0.8
2,2 0.4
2,1 0.6
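The two kinds of data lines in this format can be read with a few lines of Python. This is only a sketch under assumptions: the function names are hypothetical, the section headers that introduce each descriptor number are not handled, and the real reader in lcompte may be stricter or more lenient.

```python
def parse_word_line(line):
    """Parse a 'prior|posterior <whitespace> value' line."""
    word, value = line.split()
    prior, posterior = word.split("|")
    return prior, posterior, float(value)


def parse_transition_line(line):
    """Parse a 'number1, number2 <whitespace> proportion' line."""
    # Split on the last run of whitespace so that '1, 2' stays together.
    pair, value = line.rsplit(None, 1)
    src, dst = (int(x) for x in pair.split(","))
    return src, dst, float(value)


word = parse_word_line("AB|B 0.5")
transition = parse_transition_line("1,2 0.2")
```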

__str__: outputs in specific format;
loglex: returns the corresponding Lexique. See read_Lprop in that class;

Sequence generation

From a Lproportion, a Sequence (or part of one) can be generated randomly with the method read_Lprop.

The process is:

for the first position, select a descriptor number for the initial state, either randomly or as chosen by the user. With that descriptor, select a letter at this position using the same process as read_prop, see Sequence generation.
for each subsequent position i, in increasing order:
- select a descriptor number given the previous one, proportionally to the between-descriptor transition probabilities;
- with that descriptor, select a letter at this position using the same process as read_prop, see Sequence generation.
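The process above can be sketched as follows. This is not the code of read_Lprop: the function names and the dictionary layout are assumptions, and each descriptor here emits letters independently of the preceding letters (prior length 0), which is simpler than what read_prop supports.

```python
import random


def draw(weights, rng):
    """Draw one key of `weights`, proportionally to its value."""
    keys = list(weights)
    return rng.choices(keys, weights=[weights[k] for k in keys], k=1)[0]


def generate(emissions, transitions, length, start=None, seed=None):
    """Return (letters, descriptor numbers) for a simulated sequence."""
    rng = random.Random(seed)
    # Initial descriptor: the user's choice, or a uniform random one.
    state = start if start is not None else rng.choice(sorted(emissions))
    letters, states = [], []
    for i in range(length):
        if i > 0:
            # Move to a descriptor given the previous one.
            state = draw(transitions[state], rng)
        letters.append(draw(emissions[state], rng))
        states.append(state)
    return "".join(letters), states


# Illustrative values only.
emissions = {1: {"A": 0.1, "B": 0.9}, 2: {"A": 0.9, "B": 0.1}}
transitions = {1: {1: 0.8, 2: 0.2}, 2: {2: 0.4, 1: 0.6}}
seq, states = generate(emissions, transitions, 50, start=1, seed=1)
```

Returning the descriptor numbers alongside the letters makes it easy to check a later segmentation against the states that actually produced the sequence.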