P(Ins) = P(known|S-2) + P(as|S-1) + P(,|S1) + P(DT$|S2) + P(known as) + P(, DT$)
To estimate the interpolation mixture weights λ:
Use the Expectation Maximization (EM) algorithm
Count words and general tags separately
This keeps the high frequency counts of general tags from overwhelming the word counts
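
The EM estimation of the mixture weights can be sketched as follows. This is a minimal illustration, not the original implementation: it assumes each component probability in P(Ins) is multiplied by its own weight λ_k, that the weights sum to 1, and that component probabilities have already been computed on held-out data. The function name em_mixture_weights and the input layout are hypothetical.

```python
def em_mixture_weights(component_probs, iterations=50, tol=1e-9):
    """Estimate linear-interpolation weights with EM.

    component_probs: list of rows, one per held-out event; row i holds the
    probability each component model assigns to event i (assumed layout).
    Returns a list of weights, one per component, summing to 1.
    """
    k = len(component_probs[0])
    weights = [1.0 / k] * k  # start from uniform weights

    for _ in range(iterations):
        totals = [0.0] * k
        for row in component_probs:
            # Mixture probability of this event under the current weights.
            mix = sum(w * p for w, p in zip(weights, row))
            if mix == 0.0:
                continue  # no component explains this event; skip it
            # E-step: share of the event credited to each component.
            for j, (w, p) in enumerate(zip(weights, row)):
                totals[j] += w * p / mix

        norm = sum(totals)
        if norm == 0.0:
            break  # nothing to learn from this data

        # M-step: new weight is the normalized expected count.
        new_weights = [t / norm for t in totals]
        change = max(abs(a - b) for a, b in zip(weights, new_weights))
        weights = new_weights
        if change < tol:
            break

    return weights


# Toy held-out data (made up for illustration): two components, three events.
toy = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1]]
lambdas = em_mixture_weights(toy)
```

In this toy input the first component assigns the higher probability to every event, so EM shifts nearly all of the weight onto it. With real data, words and general tags would supply separate component probabilities, which lets the weights balance them instead of the raw counts.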