3  Building and using HMM

This part uses the data file seq_hmm.fa and the Lproportion file lprop2.

We want to compute partitions of a set of sequences with an HMM, then compute proportions from the resulting partitions.

  1. We load the sequences and the HMM.
    import lsequence
    ls=lsequence.Lsequence()
    ls.read_nf('seq_hmm.fa')
    import lcompte
    lpr=lcompte.Lproportion(fic="lprop2")
    print lpr

    Note that these proportions contain no transition proportion for the beginning of the sequences.

  2. We compute a Lpartition with the Viterbi algorithm, and print the number of segments of each partition.
    import lpartition
    lpa=lpartition.Lpartition()
    lpa.add_Lseq(ls)
    import lexique
    lx=lexique.Lexique()
    lx.read_Lprop(lpr)
    lpa.viterbi(lx)
    for i in lpa:
        print len(i[1]),
  3. We build a new Lcompte of 2-length words from the segmentations.
    lc=lcompte.Lcompte()
    lc.read_Lpart(lpa,2)
    print lc
  4. We compute a new HMM from the 1|1-proportions on the latter Lcompte.
    lpr2=lcompte.Lproportion()
    lpr2.read_Lcompte(lc.rstrip(),lprior=1,lpost=1)
  5. We compute the Kullback-Leibler divergence from lpr to lpr2, using Monte Carlo (MC) simulation on 100 sequences of length 5000.
  6. On the same sequences, we compute new partitions with this new HMM, using the forward-backward (FB) algorithm.
    lx2=lexique.Lexique()
    lx2.read_Lprop(lpr2)
    lpa2=lpartition.Lpartition()
    lpa2.add_Lseq(lpa.Lseq())
    lpa2.fb(lx2)
  7. And so on.
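
Step 5 gives no code, so here is a rough, library-independent sketch of what a Monte Carlo estimate of the Kullback-Leibler divergence between two HMMs can look like. It does not use the package's classes: the functions `sample_hmm`, `log_likelihood` and `kl_monte_carlo`, the dictionary layout of the models, and the toy models `p_model` and `q_model` are all assumptions made for illustration, written in plain Python 3. The idea is to draw sequences from the first model, score each one under both models with the forward algorithm, and average the log-likelihood differences. Dividing by the sequence length (a per-symbol rate) is a choice made here and may differ from what the package reports.

```python
import math
import random

def sample_hmm(init, trans, emit, length, rng):
    """Draw one observation sequence of the given length from an HMM."""
    states = list(init)
    state = rng.choices(states, weights=[init[s] for s in states])[0]
    obs = []
    for _ in range(length):
        symbols = list(emit[state])
        weights = [emit[state][c] for c in symbols]
        obs.append(rng.choices(symbols, weights=weights)[0])
        targets = list(trans[state])
        weights = [trans[state][s] for s in targets]
        state = rng.choices(targets, weights=weights)[0]
    return obs

def log_likelihood(init, trans, emit, obs):
    """Log-probability of obs under the HMM (scaled forward algorithm)."""
    loglik = 0.0
    alpha = None
    for x in obs:
        if alpha is None:
            # First symbol: start from the initial state distribution.
            alpha = {s: init[s] * emit[s].get(x, 0.0) for s in init}
        else:
            alpha = {s: emit[s].get(x, 0.0)
                        * sum(alpha[r] * trans[r].get(s, 0.0) for r in alpha)
                     for s in init}
        norm = sum(alpha.values())
        if norm == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        loglik += math.log(norm)
        # Rescale to avoid underflow on long sequences.
        alpha = {s: a / norm for s, a in alpha.items()}
    return loglik

def kl_monte_carlo(p, q, n_seq=100, length=5000, seed=0):
    """Per-symbol Monte Carlo estimate of KL(p || q), sampling from p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_seq):
        obs = sample_hmm(*p, length, rng)
        total += log_likelihood(*p, obs) - log_likelihood(*q, obs)
    return total / (n_seq * length)

# Hypothetical two-state toy models over the alphabet {a, c}.
init = {"A": 0.5, "B": 0.5}
trans = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.1, "B": 0.9}}
p_model = (init, trans, {"A": {"a": 0.8, "c": 0.2}, "B": {"a": 0.2, "c": 0.8}})
q_model = (init, trans, {"A": {"a": 0.5, "c": 0.5}, "B": {"a": 0.5, "c": 0.5}})

# Small sample sizes keep the demonstration fast.
print(kl_monte_carlo(p_model, q_model, n_seq=10, length=200))
```

With the settings of step 5 one would call `kl_monte_carlo(p, q, n_seq=100, length=5000)`. The estimate is not symmetric: swapping the two models samples from the other HMM and generally gives a different value.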

