class Lproportion
module lcompte
The documentation is here.
This class behaves like a dictionary in which the keys are descriptor
numbers and the items are the corresponding
Proportions. In addition, the
probabilities of transitions between these descriptors are stored.
The aim of this class is to make it easy to build hidden Markov chains.
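
As a rough mental model only, the structure can be pictured as a mapping from descriptor numbers to per-descriptor proportions, plus a table of transitions between descriptors. The sketch below is not the lcompte implementation: the class name ToyLproportion, its attributes, and the method set_inter are hypothetical, and a plain dict stands in for a Proportion.

```python
class ToyLproportion:
    """Hypothetical stand-in for Lproportion; NOT the lcompte implementation."""

    def __init__(self):
        self._props = {}   # descriptor number -> {word: proportion}
        self._inter = {}   # (from_descriptor, to_descriptor) -> proportion

    def __getitem__(self, n):
        # Dictionary-style read access by descriptor number.
        return self._props[n]

    def __setitem__(self, n, prop):
        # Dictionary-style assignment by descriptor number.
        self._props[n] = prop

    def num(self):
        # List of the stored descriptor numbers.
        return sorted(self._props)

    def set_inter(self, i, j, p):
        # Hypothetical setter: only already-stored descriptor numbers are accepted.
        if i not in self._props or j not in self._props:
            raise KeyError("unknown descriptor number")
        self._inter[(i, j)] = p

    def inter(self, i, j):
        # Proportion of transitions from descriptor i to descriptor j.
        return self._inter.get((i, j), 0.0)


lp = ToyLproportion()
lp[1] = {"A": 0.1, "B": 0.9}   # toy values
lp[2] = {"A": 0.5, "B": 0.5}   # toy values
lp.set_inter(1, 1, 0.8)
lp.set_inter(1, 2, 0.2)
print(lp.num(), lp.inter(1, 2))
```

Keeping the transitions in a separate table keyed by pairs of descriptor numbers is one simple design choice; the real class may store them differently.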
Construction

__init__
 optional keyword fic allows construction by reading from a file
in the specific format;
 read_nf
 builds from a file in the specific format;
 read_Lcompte
 builds from a Lcompte.
Optional keywords:

lpost=l
 specifies the maximum length of the posterior words, i.e. the
words whose frequencies are computed.
Default: the maximum length of the words;
 lprior=l
 specifies the length of the prior words, i.e. the words on which
the computed words depend, in a Markovian context. Default: 0.
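
To illustrate what prior and posterior words mean, here is a minimal sketch that estimates the proportions of posterior words conditioned on the preceding prior word, directly from a string. It is an illustration of the idea only, not how read_Lcompte works: the function name conditional_proportions is hypothetical, and for simplicity it uses a fixed posterior length rather than a maximum length.

```python
from collections import Counter, defaultdict


def conditional_proportions(text, lprior=0, lpost=1):
    """Proportions of posterior words of length lpost, given the lprior
    letters that precede them (hypothetical helper, fixed lengths only)."""
    counts = defaultdict(Counter)
    for i in range(lprior, len(text) - lpost + 1):
        prior = text[i - lprior:i]      # empty string when lprior == 0
        post = text[i:i + lpost]
        counts[prior][post] += 1
    # Normalise the counts separately for each prior word.
    return {
        prior: {w: c / sum(cnt.values()) for w, c in cnt.items()}
        for prior, cnt in counts.items()
    }


props = conditional_proportions("AABAB", lprior=1, lpost=1)
print(props)
```

With lprior=0 the only prior is the empty word, so the result reduces to plain letter frequencies.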
Handling

__getitem__
 returns the
Proportion of specified
descriptor number;
 __setitem__
 assigns the specified
Proportion to the specified
descriptor number;
 num
 returns the list of descriptor numbers;
 lg_max
 returns the length of the longest word (prior+posterior);
 lg_max_prior
 returns the length of the longest
prior;
 lg_max_posterior
 returns the length of the longest
posterior;
 inter
 returns the proportion of transitions between
two descriptor numbers;
 g_inter
 sets the proportion of transitions between two
valid descriptor numbers;
 alph
 returns the list of used letters;
 KL_MC
 computes the Kullback-Leibler divergence to a
Lproportion, by Monte Carlo simulation on several
(default: 100) Sequences of a
given length (default: 1000) generated by the method
read_Lprop of
Sequence. See
Sequence generation;
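
The Monte Carlo idea behind KL_MC can be sketched as follows: simulate sequences from one model, then average the difference of their log-likelihoods under the two models. This is a simplified illustration under stated assumptions, not the method's actual code: each model is a single first-order Markov chain over letters (no descriptors), the dict layout and all function names are hypothetical, and since this page does not say whether KL_MC normalises by sequence length, the sketch returns the unnormalised average.

```python
import math
import random


def sample(model, length, rng):
    # Draw the first letter from "init", then each next letter from "trans".
    letters = sorted(model["init"])
    seq = rng.choices(letters, weights=[model["init"][a] for a in letters])
    while len(seq) < length:
        row = model["trans"][seq[-1]]
        seq += rng.choices(letters, weights=[row[a] for a in letters])
    return seq


def log_likelihood(model, seq):
    # Log-probability of seq under a first-order Markov chain.
    ll = math.log(model["init"][seq[0]])
    for prev, cur in zip(seq, seq[1:]):
        ll += math.log(model["trans"][prev][cur])
    return ll


def kl_monte_carlo(p, q, n_seq=100, length=1000, seed=0):
    # Average of log p(s) - log q(s) over sequences s simulated from p.
    # Assumes q gives non-zero probability to everything p can generate.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_seq):
        seq = sample(p, length, rng)
        total += log_likelihood(p, seq) - log_likelihood(q, seq)
    return total / n_seq


# Toy models (illustrative values).
P = {"init": {"A": 0.1, "B": 0.9},
     "trans": {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.5, "B": 0.5}}}
Q = {"init": {"A": 0.5, "B": 0.5},
     "trans": {"A": {"A": 0.5, "B": 0.5}, "B": {"A": 0.5, "B": 0.5}}}

print(kl_monte_carlo(P, Q, n_seq=20, length=200))
```

The estimate is exactly zero when both arguments are the same model, and it is not symmetric in its two arguments.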
Input-Output
Specific format is:

description:
sections of
descriptor number:
lines of prior and posterior, whitespace, count

lines of transition probabilities
between descriptor numbers, in the format:
number1,number2 whitespace proportion

example:

1: 
AB  0.3 
AA  0.7 
BB  0.5 
BA  0.5 
A  0.1 
B  0.9 

2: 
ABB  0.5 
ABA  0.5 
AAA  0.4 
AAB  0.6 
AB  0.9 
A  0.1 

1,2  0.2 
1,1  0.8 
2,2  0.4 
2,1  0.6 
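
As an illustration of the last part of the format, the following sketch extracts the between-descriptor transition lines from a text in this format and ignores everything else. It is not the module's own reader: the function name parse_transitions is hypothetical, and the word lines of each section are deliberately left unparsed.

```python
def parse_transitions(text):
    """Collect 'number1,number2 <whitespace> proportion' lines into a dict
    keyed by (number1, number2). Hypothetical helper, not part of lcompte."""
    trans = {}
    for line in text.splitlines():
        # Split off the last whitespace-separated field (the proportion).
        parts = line.rsplit(None, 1)
        if len(parts) != 2 or "," not in parts[0]:
            continue  # section headers, word lines and blank lines
        i, j = (int(x) for x in parts[0].split(","))
        trans[(i, j)] = float(parts[1])
    return trans


EXAMPLE = """
1:
A  0.1
B  0.9

1,2  0.2
1,1  0.8
2,2  0.4
2,1  0.6
"""

trans = parse_transitions(EXAMPLE)
print(trans)

# Sanity check: the proportions leaving each descriptor should sum to 1.
for src in {i for i, _ in trans}:
    row_sum = sum(p for (i, _), p in trans.items() if i == src)
    print(src, round(row_sum, 9))
```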


__str__
 outputs in specific format;
 loglex
 returns the corresponding
Lexique. See
read_Lprop in that class;
Sequence generation
From an Lproportion, (part of) a
Sequence can be generated randomly
by the method read_Lprop.
The process is:

for the first position, select a descriptor number for the
initial state, either randomly or as chosen by the user. With
that descriptor, select a letter at this position using the same
process as read_prop, see
Sequence generation;
 for each subsequent position i, in increasing order:

select a descriptor number given the previous one,
proportionally to the between-descriptors transition probabilities;
 with that descriptor, select a letter at this position using
the same process as
read_prop, see
Sequence generation.
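
The process above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code of read_Lprop: the function names draw and generate and the dict layouts are hypothetical, the values are toy values, and letter selection is simplified to a prior of at most one letter, falling back to the zero-prior proportions when no matching prior is stored.

```python
import random


def draw(dist, rng):
    # Pick a key of dist with probability proportional to its value.
    items = sorted(dist)
    return rng.choices(items, weights=[dist[k] for k in items])[0]


def generate(props, inter, length, start=None, seed=None):
    """Return (descriptor numbers, letters) for a random sequence.

    props: descriptor -> {prior: {letter: proportion}}, "" is the empty prior
    inter: descriptor -> {next descriptor: proportion}
    """
    rng = random.Random(seed)
    # Initial state: chosen by the user, or drawn uniformly (an assumption).
    state = start if start is not None else rng.choice(sorted(props))
    states, letters = [], []
    for i in range(length):
        if i > 0:
            # Next descriptor, given the previous one.
            state = draw(inter[state], rng)
        table = props[state]
        prior = letters[-1] if letters else ""
        # Use the one-letter prior if stored, else the zero-prior proportions.
        dist = table.get(prior, table[""])
        letters.append(draw(dist, rng))
        states.append(state)
    return states, "".join(letters)


# Toy model (illustrative values).
PROPS = {
    1: {"": {"A": 0.1, "B": 0.9},
        "A": {"A": 0.7, "B": 0.3},
        "B": {"A": 0.5, "B": 0.5}},
    2: {"": {"A": 0.5, "B": 0.5}},
}
INTER = {1: {1: 0.8, 2: 0.2}, 2: {2: 0.4, 1: 0.6}}

states, seq = generate(PROPS, INTER, length=20, start=1, seed=0)
print(states)
print(seq)
```

Returning the descriptor path alongside the letters makes the hidden states visible, which is convenient when checking a model by eye.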