Stylistic and lexical co-training for webpage block classification

Web page blocks

Uses of block classification

Approaches to classification

Which approach to use

Co-training (Blum and Mitchell)

Co-training (cont’d)

Architecture

PARCELS

Target Classification

Lexical and Stylistic Co-training

Stylistic Features

Lexical Features

Evaluations

General performance

XHTML / DIV Evaluation

Coarse-grained model

Conclusion

Question time!