Syntax (structure) features are extracted from the parse tree
Syntax feature, level=2: SETNAME[BINDNAME→GETPROP]
Syntax feature, level=3: STMT[SETNAME→[BINDNAME→GETPROP]]
Next, we investigate whether syntax features can improve categorization performance.
We parse each JavaScript source file into a tree structure. Syntax features are then extracted from that tree.
In this particular example, a JavaScript function is parsed into a tree structure as shown, and the syntax (structure) features are extracted from the parse tree.
For example, one syntax feature is the level 2 sub-tree SETNAME[BINDNAME→GETPROP].
Another syntax feature is the level 3 sub-tree STMT[SETNAME→[BINDNAME→GETPROP]].
These syntax features are then serialized as text tokens and passed to the classifier.
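The extraction and serialization steps can be sketched roughly as follows. This is a minimal illustration, not the method used in the original work: the `Node` class, the `height`, `serialize`, and `subtree_features` helpers, and the hand-built example tree are all hypothetical. The bracket-and-arrow token format is only loosely modeled on the feature labels above and may not match the original notation exactly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical parse-tree node: a label plus ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def height(node: Node) -> int:
    """Number of levels in the sub-tree rooted at `node` (a leaf has height 1)."""
    return 1 + max((height(child) for child in node.children), default=0)


def serialize(node: Node, level: int) -> str:
    """Serialize the sub-tree rooted at `node`, truncated to `level` levels."""
    if level <= 1 or not node.children:
        return node.label
    inner = "→".join(serialize(child, level - 1) for child in node.children)
    return f"{node.label}[{inner}]"


def subtree_features(root: Node, level: int) -> List[str]:
    """Emit one text token per node whose sub-tree spans at least `level` levels."""
    tokens = []
    stack = [root]
    while stack:
        node = stack.pop()
        if height(node) >= level:
            tokens.append(serialize(node, level))
        # Push children in reverse so they are visited left to right.
        stack.extend(reversed(node.children))
    return tokens


# Hand-built stand-in for a parsed assignment statement (assumed shape).
tree = Node("STMT", [Node("SETNAME", [Node("BINDNAME"), Node("GETPROP")])])

level2_tokens = subtree_features(tree, 2)
level3_tokens = subtree_features(tree, 3)
```

Each resulting token contains no whitespace, so the tokens can be joined with spaces and handed to a text classifier in the same way as ordinary word tokens.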