Source code categorization: Ugurel et al.'s work
Published in 2002
Seminal work on the subject
Consists of two tasks:
  Programming language classification: find out the type of programming language from keywords and bi-grams (non-interesting!)
  Topic classification: find out the topic related to the code. Relies heavily on external resources (e.g. README file, code header); did not analyze the source code itself much
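To make the first task concrete, here is a minimal sketch of language classification from keywords and bi-grams. It is not Ugurel et al.'s implementation: the `PROFILES` table, its entries, and the simple count-based scoring are assumptions chosen for illustration, where a real system would learn the features from labeled code.

```python
import re
from collections import Counter

# Hypothetical hand-picked profiles: plain strings are keywords,
# tuples are token bi-grams. Real profiles would be learned from data.
PROFILES = {
    "python": {"def", "elif", "self", "lambda", ("if", "__name__")},
    "java": {"public", "class", "void", "extends",
             ("public", "static"), ("static", "void")},
    "c": {"include", "typedef", "struct",
          ("include", "stdio"), ("int", "main")},
}


def tokenize(code):
    """Return identifier-like tokens in order of appearance."""
    return re.findall(r"[A-Za-z_]\w*", code)


def features(code):
    """Count single tokens and adjacent token pairs (bi-grams)."""
    tokens = tokenize(code)
    counts = Counter(tokens)
    counts.update(zip(tokens, tokens[1:]))
    return counts


def guess_language(code):
    """Pick the language whose profile features occur most often."""
    counts = features(code)
    scores = {
        lang: sum(counts[f] for f in profile)
        for lang, profile in PROFILES.items()
    }
    return max(scores, key=scores.get)


print(guess_language("public static void main(String[] args) {}"))
```

Even this crude scoring separates the languages, which is consistent with the slide's view that the task is the less interesting of the two.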
Can we extract features from the source code itself?
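One possible answer, sketched under assumptions: treat the words hidden inside identifiers as topic features. The helper names `split_identifier` and `code_features` are hypothetical, and splitting on camelCase and snake_case is just one illustrative choice, not a method taken from the cited work.

```python
import re
from collections import Counter


def split_identifier(name):
    """Split camelCase / snake_case names into lowercase words."""
    # Acronym runs (e.g. "HTTP"), capitalized or lowercase words, digit runs.
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def code_features(code):
    """Bag of words built only from identifiers found in the code."""
    words = Counter()
    for identifier in re.findall(r"[A-Za-z_]\w*", code):
        words.update(split_identifier(identifier))
    return words


sample = "def parseHttpRequest(raw_socket): pass"
print(code_features(sample))
```

The resulting word counts could feed any standard text classifier, without relying on a README file or code header.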