## 6  Predicting the number of classes

In this section, we build a random sequence, on a given number of classes, from a set of markovian models, and we use the probability algorithm to predict the given number.

In this example, the model is in file lprop1.
1. We build a Partition, on a virtual sequence of length 10000, in 23 segments, such that each segment is at least 50 positions long.
```nbcl=23  # for example
import lcompte
lp=lcompte.Lproportion(fic="lprop1")
import partition
p=partition.Partition()
p.build_random(10000,nbcl,ec=50)
```

2. We attribute predictor numbers to the Segments of the Partition.
```lnum=lp.num()
import random
numr=random.choice(lnum)
for s in p:
s.g_num([numr])
x=random.choice(lnum)
while x==numr:
x=random.choice(lnum)
numr=x
```

3. Then we build a Sequence from the Partition and the Lproportion.
```import sequence
s=sequence.Sequence()
```

4. We compute the segmentation probabilities on the sequence from the models used to build the sequence, up to 100 classes.
```import lexique
lx=lexique.Lexique(Lprop=lp)
lprob=lx.probability(s,100)

from rpy import *   #using R to plot it
r.plot(range(1,len(lprob)+1),lprob,ylab="Log-probability",xlab="Nbe of classes")

```

5. We compute the logarithm of bayesian factor: the probability of the sequence under the segmentations in k-segments for all k>=n, divided by the probability of the sequence under the segmentations in k-segments for all k<n:
```import math
def expmean(l): #computes the logarithm of the mean of exponentials
m=max(l)
x=0
for i in l:
x+=math.exp(i-m)
return(math.log(x/len(l))+m)

lbay=[]
for i in range(1,len(lprob)):
lbay.append(expmean(lprob[i:])-expsmean(lprob[:i]))

r.plot(range(2,len(lprob)+1),lbay,ylab="Log of the Bayes factor",xlab="Nbe of classes")
```

6. We select the biggest number of classes that is over 3.2.
```r.abline(h=3.2,col="red")

nbc=lbay.index(filter(lambda x: x>=3.2,lbay)[-1])+2
print nbc
```