Adapted co-training:
Sample balancing: preserve the ratio of noisily labeled examples; performance is poor without it
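
The sample-balancing step can be illustrated with a minimal sketch. It assumes that "ratio" means the positive-to-negative class ratio among the examples the classifier labels for itself in each round, and that the classifier returns a signed confidence score (as boosting-style learners typically do). The function name `balanced_selection` and the parameters `n_pos` and `n_neg` are hypothetical, introduced only for illustration.

```python
from typing import List, Tuple


def balanced_selection(
    scored: List[Tuple[str, float]],
    n_pos: int,
    n_neg: int,
) -> Tuple[List[str], List[str]]:
    """Pick the most confident predictions in a fixed class ratio.

    scored: (example_id, score) pairs, where score > 0 means "predicted
    positive", score < 0 means "predicted negative", and the magnitude
    is the confidence (an assumption of this sketch).
    Returns at most n_pos positives and n_neg negatives, most confident first.
    """
    ranked = sorted(scored, key=lambda pair: pair[1])  # most negative first
    negatives = [ex for ex, score in ranked[:n_neg] if score < 0]
    positives = [ex for ex, score in reversed(ranked) if score > 0][:n_pos]
    return positives, negatives


# Hypothetical scores for six unlabeled pages.
scores = [("a", 2.0), ("b", -1.5), ("c", 0.3), ("d", -0.2), ("e", 1.1), ("f", -3.0)]
positives, negatives = balanced_selection(scores, n_pos=1, n_neg=2)
```

Fixing `n_pos` and `n_neg` per round keeps the self-labeled additions from drifting toward whichever class the classifier happens to be most confident about.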
Replace unlabeled data at each round
Use BoosTexter: handles word features easily
Five-fold cross-validation
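
As a rough illustration of the five-fold protocol, the sketch below splits item indices into five folds and holds each one out in turn. The helper `k_fold_splits` and its round-robin fold assignment are assumptions made for this example, not the setup used in the experiments. In practice the data would usually be shuffled first.

```python
from typing import List, Tuple


def k_fold_splits(n_items: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs covering range(n_items).

    Items are dealt to folds round-robin, so fold sizes differ by at most one.
    """
    indices = list(range(n_items))
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for held_out in range(k):
        test = folds[held_out]
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        splits.append((train, test))
    return splits


splits = k_fold_splits(10, k=5)
```

Each item appears in exactly one test fold, so averaging the five scores uses every labeled example for evaluation once.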
General performance?
Specific performance on:
Fine-grained classification?
XHTML / DIV pages?
Other tasks?