class Lproportion
module lcompte
The documentation is here.
This class is like a dictionary in which the keys are the descriptor
numbers and the items are the corresponding Proportions. In addition,
the probabilities of transitions between these descriptors are stored.
The aim of this class is to make hidden Markov chains easy to build.
Construction
- __init__: optional keyword fic allows construction by reading from a file name in the specific format;
- read_nf: builds from a file name in the specific format;
- read_Lcompte: builds from a Lcompte. Optional keywords:
  - lpost=l: specifies the maximum length of the posterior words, i.e. the words whose frequencies are computed. Default: the maximum length of the words;
  - lprior=l: specifies the length of the prior words, i.e. the words on which the computed words depend, in a Markovian context. Default: 0;
Handling
- __getitem__: returns the Proportion of the specified descriptor number;
- __setitem__: assigns the specified Proportion to the specified descriptor number;
- num: returns the list of descriptor numbers;
- lg_max: returns the length of the longest word (prior+posterior);
- lg_max_prior: returns the length of the longest prior;
- lg_max_posterior: returns the length of the longest posterior;
- inter: returns the proportion of transitions between two descriptor numbers;
- g_inter: gets the proportion of transitions between two valid descriptor numbers;
- alph: returns the list of used letters;
- KL_MC: computes the Kullback-Leibler divergence to a Lproportion, by Monte Carlo simulation on several (default: 100) Sequences of a given length (default: 1000) generated by method read_Lprop of Sequence. See Sequence generation;
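The Monte Carlo idea behind KL_MC can be illustrated with a small, self-contained Python sketch. It is not the lcompte implementation and makes several assumptions: the two models are plain first-order Markov chains on letters rather than Lproportion objects, sequences are drawn from the first model, the result is a per-sequence (not per-letter) value, and the names sample_chain, log_likelihood and kl_monte_carlo are hypothetical. The second model must give non-zero probability to everything the first one can produce.

```python
import math
import random

def sample_chain(length, init, trans, rng):
    """Draw a sequence (length >= 1) from a first-order Markov chain."""
    letters = list(init)
    seq = rng.choices(letters, weights=[init[a] for a in letters])
    while len(seq) < length:
        row = trans[seq[-1]]
        seq += rng.choices(list(row), weights=list(row.values()))
    return seq

def log_likelihood(seq, init, trans):
    """Log-probability of seq under the chain (init, trans)."""
    total = math.log(init[seq[0]])
    for a, b in zip(seq, seq[1:]):
        total += math.log(trans[a][b])
    return total

def kl_monte_carlo(p, q, nb_seq=100, length=1000, seed=None):
    """Estimate KL(p || q) as the mean of log p(x) - log q(x), x ~ p.

    p and q are (init, trans) pairs; the defaults mirror the
    documented ones (100 sequences of length 1000).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(nb_seq):
        seq = sample_chain(length, *p, rng=rng)
        total += log_likelihood(seq, *p) - log_likelihood(seq, *q)
    return total / nb_seq

# Toy models: a "sticky" chain and a uniform one.
sticky = ({"A": 0.5, "B": 0.5},
          {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.1, "B": 0.9}})
uniform = ({"A": 0.5, "B": 0.5},
           {"A": {"A": 0.5, "B": 0.5}, "B": {"A": 0.5, "B": 0.5}})
estimate = kl_monte_carlo(sticky, uniform, nb_seq=50, length=200, seed=1)
```

Increasing nb_seq reduces the variance of the estimate, while length controls how much of the chain's long-run behaviour each sequence captures.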
Input-Output
Specific format is:
description:

    sections of
        descriptor number:
        lines of prior|posterior whitespace count

    lines of probability transitions
    between descriptor numbers in format:

        number1, number2 whitespace proportion

example:

    1:
    A|B     0.3
    A|A     0.7
    B|B     0.5
    B|A     0.5
    |A      0.1
    |B      0.9

    2:
    AB|B    0.5
    AB|A    0.5
    AA|A    0.4
    AA|B    0.6
    A|B     0.9
    A|      0.1

    1,2     0.2
    1,1     0.8
    2,2     0.4
    2,1     0.6
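As an illustration of the format described above, here is a minimal Python sketch of a reader. It is not the lcompte implementation: the function name parse_lprop_text, the plain dictionaries it returns, and the tolerance for an optional space after the comma in transition lines are assumptions made for this example.

```python
def parse_lprop_text(text):
    """Parse the text format into (proportions, transitions).

    proportions: {descriptor number: {(prior, posterior): value}}
    transitions: {(number1, number2): proportion}
    Hypothetical helper, not part of lcompte.
    """
    proportions = {}
    transitions = {}
    current = None  # descriptor number of the section being read
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            # Section header such as "1:"
            current = int(line[:-1])
            proportions[current] = {}
        elif "|" in line:
            # Word line: prior|posterior <whitespace> value
            if current is None:
                raise ValueError("word line before any section: %r" % raw)
            prior, rest = line.split("|", 1)
            fields = rest.split()
            # An empty posterior leaves only the value after the bar.
            posterior = fields[0] if len(fields) == 2 else ""
            proportions[current][(prior, posterior)] = float(fields[-1])
        else:
            # Transition line: number1,number2 <whitespace> proportion
            pair, value = line.rsplit(None, 1)
            n1, n2 = (int(x) for x in pair.split(","))
            transitions[(n1, n2)] = float(value)
    return proportions, transitions

sample = "1:\nA|B 0.3\n|A\t0.1\n\n2:\nA| 0.1\n\n1,2 0.2\n2, 1 0.6\n"
props, trans = parse_lprop_text(sample)
```

Keeping the word values and the transitions in two separate dictionaries mirrors the two kinds of lines in the format; a real reader would build Proportion objects instead.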
- __str__: outputs in the specific format;
- loglex: returns the corresponding Lexique. See read_Lprop in that class;
Sequence generation
From a Lproportion, a (part of a) Sequence can be generated randomly
by the method read_Lprop.
The process is:
- for the first position, select a descriptor number for the initial state, either randomly or as chosen by the user. With that descriptor, select a letter at this position using the same process as read_prop, see Sequence generation;
- for each following position i, in increasing order:
  - select a descriptor number given the previous one, proportionally to the between-descriptor transition probabilities;
  - with that descriptor, select a letter at this position using the same process as read_prop, see Sequence generation.
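
The process above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the read_Lprop implementation: each descriptor is reduced to a plain letter distribution with no prior (so the read_prop step becomes a single weighted draw), a random initial descriptor is drawn uniformly, and the names generate, emissions and transitions are hypothetical.

```python
import random

def generate(length, emissions, transitions, start=None, rng=random):
    """Draw a list of letters and the descriptor path that produced them.

    emissions:   {descriptor number: {letter: probability}}
    transitions: {(number1, number2): probability of moving 1 -> 2}
    start:       initial descriptor number, drawn uniformly if None
    """
    numbers = sorted(emissions)
    state = rng.choice(numbers) if start is None else start
    letters, path = [], []
    for position in range(length):
        if position > 0:
            # Choose the next descriptor proportionally to the
            # transition probabilities out of the current one.
            weights = [transitions.get((state, n), 0.0) for n in numbers]
            state = rng.choices(numbers, weights=weights)[0]
        # Choose a letter from the current descriptor's distribution.
        dist = emissions[state]
        letter = rng.choices(list(dist), weights=list(dist.values()))[0]
        letters.append(letter)
        path.append(state)
    return letters, path

# Two descriptors with the transition proportions of the format example.
emissions = {1: {"A": 0.1, "B": 0.9}, 2: {"A": 0.5, "B": 0.5}}
transitions = {(1, 2): 0.2, (1, 1): 0.8, (2, 2): 0.4, (2, 1): 0.6}
letters, path = generate(20, emissions, transitions, start=1,
                         rng=random.Random(0))
```

Returning the descriptor path alongside the letters makes the hidden state visible, which is convenient when checking that the transitions behave as expected.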