Here we treat the pattern matching
problem as a token sequence generation problem. That is, we take the token sequence
t1, ..., tL from the test instance and calculate its probability according to
the training data. In a typical bigram model, this generation probability is
the product of the bigram probabilities. Here, we use linear interpolation to smooth
the bigram probabilities and introduce two terms….
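
As an illustration only, the following is a minimal sketch of a generic linearly interpolated bigram model, not necessarily the exact formulation used here. It assumes the common textbook form, in which the bigram estimate is mixed with a unigram estimate using a single weight. The weight `lam`, the function names, and the toy data are all hypothetical, and the sketch does not attempt to reproduce the additional terms mentioned above.

```python
import math
from collections import Counter


def train_counts(sequences):
    """Collect unigram, bigram, and history counts from training token sequences."""
    unigrams = Counter()
    bigrams = Counter()
    contexts = Counter()  # how often each token occurs as a bigram history
    for seq in sequences:
        unigrams.update(seq)
        bigrams.update(zip(seq, seq[1:]))
        contexts.update(seq[:-1])
    return unigrams, bigrams, contexts


def interpolated_bigram_prob(prev, cur, counts, lam=0.7):
    """Assumed form: lam * P_ML(cur | prev) + (1 - lam) * P_ML(cur)."""
    unigrams, bigrams, contexts = counts
    total = sum(unigrams.values())
    p_uni = unigrams[cur] / total if total else 0.0
    p_bi = bigrams[(prev, cur)] / contexts[prev] if contexts[prev] else 0.0
    return lam * p_bi + (1.0 - lam) * p_uni


def sequence_log_prob(tokens, counts, lam=0.7):
    """Log generation probability of t1..tL as a sum of log token probabilities."""
    if not tokens:
        return 0.0
    unigrams = counts[0]
    total = sum(unigrams.values())
    # The first token has no history, so it is scored with the unigram estimate.
    probs = [unigrams[tokens[0]] / total if total else 0.0]
    probs += [
        interpolated_bigram_prob(prev, cur, counts, lam)
        for prev, cur in zip(tokens, tokens[1:])
    ]
    if any(p == 0.0 for p in probs):
        return float("-inf")  # a token never seen in training
    return sum(math.log(p) for p in probs)


# Toy training data (hypothetical).
counts = train_counts([["a", "b", "a", "b"], ["a", "c"]])
score = sequence_log_prob(["a", "b"], counts, lam=0.5)
```

Working in log space avoids numerical underflow when L is large. Note that interpolating with the unigram estimate alone still assigns zero probability to tokens that never occur in the training data, which is why the sketch returns negative infinity in that case.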