|
|
|
Compared to the
baseline, using lexical analysis alone, we can reduce 17% of the errors. This
indicates that a careful study of language features helps categorization.
|
|
As we have found
earlier, using metrics or context-sensitive features alone does not give us a
good performance, but when they are coupled with lexical features the overall
performance can be increased. Using the final feature set we can perform the
categorization task with an accuracy 93.95%, resulting in a 52% error
reduction.
|
|
Our experiment also
showed that syntax features are noisy when coupled with other features, we
therefore excluded this feature set in subsequent evaluations.
|
|
As we can see, we
have many weak classifiers, which can not perform well alone, but when they
are combined together, a good classifier is built. This is considered
ensemble method.
|