|
|
|
Next, we investigate
whether syntax features can improve categorization performance.
|
|
We parse each JavaScript
source file into a tree structure. Syntax features can then be
extracted from the tree.
|
|
In this
example, a JavaScript function is parsed into a tree structure as shown
below, and the syntax (structural) features are extracted from the parse
tree.
|
|
For example, a
syntax feature can be extracted as a level 2 sub-tree like this.
|
|
Another syntax
feature can be extracted as a level 3 sub-tree like this.
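
To make the level 2 and level 3 sub-trees concrete, here is a minimal sketch. It assumes that a "level k" sub-tree means a node together with its descendants down to k levels, which is our reading of the description rather than a stated definition. The tree is written by hand with ESTree-like labels instead of coming from a real parser, and the names `Node`, `height`, `truncate`, `subtrees` and `TREE` are hypothetical.

```python
from typing import Iterator, Tuple

# A node is (label, children). Labels loosely follow ESTree node type names.
# ASSUMPTION: this tree is hand-written for illustration, not parser output.
Node = Tuple[str, list]


def height(node: Node) -> int:
    """Number of levels in the tree rooted at node (a leaf has height 1)."""
    _, children = node
    return 1 + max((height(c) for c in children), default=0)


def truncate(node: Node, levels: int) -> Node:
    """Return a copy of node that keeps only its top `levels` levels."""
    label, children = node
    if levels <= 1:
        return (label, [])
    return (label, [truncate(c, levels - 1) for c in children])


def subtrees(root: Node, levels: int) -> Iterator[Node]:
    """Yield, in pre-order, one `levels`-deep sub-tree per tall-enough node."""
    if height(root) >= levels:
        yield truncate(root, levels)
    for child in root[1]:
        yield from subtrees(child, levels)


# Hypothetical parse tree for: function add(a, b) { return a + b; }
TREE: Node = (
    "FunctionDeclaration",
    [
        ("Identifier", []),  # function name
        ("Identifier", []),  # parameter a
        ("Identifier", []),  # parameter b
        ("BlockStatement", [
            ("ReturnStatement", [
                ("BinaryExpression", [("Identifier", []), ("Identifier", [])]),
            ]),
        ]),
    ],
)

level2 = list(subtrees(TREE, 2))  # sub-trees spanning two levels
level3 = list(subtrees(TREE, 3))  # sub-trees spanning three levels

for feature in level2:
    print(feature)
```

Under this reading, nodes that are too shallow (for example a lone `Identifier`) contribute no level 2 feature, so every extracted feature captures some parent-child structure.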
|
|
Such syntax features
are then serialized as text tokens and passed to the classifier.
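
The serialization step could look like the sketch below. The exact token format is not specified here, so the bracketed, whitespace-free encoding is an assumption, as are the names `serialize`, `features` and `document`. The point is only that each sub-tree becomes a single token that a text classifier can treat like a word.

```python
from collections import Counter


def serialize(node) -> str:
    """Encode a (label, children) sub-tree as one whitespace-free token.

    ASSUMPTION: the bracketed format is illustrative, not the original one.
    """
    label, children = node
    if not children:
        return label
    return label + "(" + ",".join(serialize(c) for c in children) + ")"


# Two hand-written sub-tree features (hypothetical, ESTree-like labels).
features = [
    ("BinaryExpression", [("Identifier", []), ("Identifier", [])]),
    ("ReturnStatement", [
        ("BinaryExpression", [("Identifier", []), ("Identifier", [])]),
    ]),
]

tokens = [serialize(f) for f in features]

# One source file becomes a space-separated "document" of feature tokens.
document = " ".join(tokens)

# A bag-of-words view of the document, as a text classifier might consume it.
bag = Counter(document.split())
print(document)
```

Because each token contains no spaces, a standard whitespace tokenizer recovers exactly one token per sub-tree, so the features can be fed to the same text pipeline as ordinary words.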
|