Bootstrapping
Only 320 annotated instances: too few to build a language model
Typical language models are trained on millions of examples
Try bootstrapping a model
Take each sample's annotation and apply it to all the
instances that sample represents
More data, but also more noise
Self-labeled corpus
290K instances
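The propagation step above can be sketched roughly as follows. This is a minimal illustration, not the original pipeline: the slides do not say how a sample comes to "represent" other instances, so the `groups` mapping, the `propagate_labels` helper, and the toy texts and labels are all assumptions made for the example.

```python
from collections import Counter


def propagate_labels(annotations, groups):
    """Copy each annotated sample's label to every instance it represents.

    annotations: dict mapping a sample id to its hand-assigned label
    groups: dict mapping a sample id to the list of instances it represents
            (hypothetical structure; the grouping method is not given in the source)
    Returns a list of (instance, label) pairs forming the self-labeled corpus.
    """
    corpus = []
    for sample_id, members in groups.items():
        label = annotations.get(sample_id)
        if label is None:
            # Skip groups whose representative sample was never annotated.
            continue
        corpus.extend((instance, label) for instance in members)
    return corpus


# Toy data, invented for illustration only.
annotations = {"s1": "positive", "s2": "negative"}
groups = {
    "s1": ["great phone", "great phone!!", "really great phone"],
    "s2": ["battery died", "battery died again"],
    "s3": ["no annotation for this one"],
}

self_labeled = propagate_labels(annotations, groups)
label_counts = Counter(label for _, label in self_labeled)
```

Every propagated label inherits whatever error the original annotation had, and an instance that fits its representative poorly gets a wrong label. That is the source of the extra noise noted above.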