• Bigram model can deal with gaps
– Unseen tokens have small smoothing probabilities in specific positions
• P(","|S1) = P("whose"|S2) = P("book"|S3) = P("is"|S4) = small smoothing probability
• P("known"|S3) = 0.3,  P("for"|S4) = 0.21
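A minimal sketch of the idea, assuming the P(token|S_i) values are position-conditioned token probabilities estimated with add-k smoothing; the function name, the smoothing constant k, and the toy training sentences are illustrative, not from the original slides.

```python
from collections import Counter, defaultdict

def position_token_probs(sentences, k=0.01):
    """Estimate P(token | position S_i) with add-k smoothing.

    sentences: list of token lists aligned to positions S1, S2, ...
    k: small constant so tokens unseen in a given position still
       receive a small, nonzero smoothing probability.
    """
    counts = defaultdict(Counter)   # position -> token counts
    vocab = set()
    for tokens in sentences:
        for pos, tok in enumerate(tokens, start=1):
            counts[pos][tok] += 1
            vocab.add(tok)

    def prob(token, pos):
        total = sum(counts[pos].values())
        # Unseen (token, position) pairs get k / (total + k*|V|)
        return (counts[pos][token] + k) / (total + k * len(vocab))

    return prob

# Tokens frequently seen in a position (e.g. "known" at S3, "for" at S4)
# get relatively large probabilities; unseen ones like "book" at S3 fall
# back to the small smoothed value.
prob = position_token_probs([
    ["she", "is", "known", "for", "poetry"],
    ["he", "is", "known", "for", "novels"],
])
print(prob("known", 3))   # relatively large
print(prob("book", 3))    # small smoothing probability
```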