|
|
|
Specifically, in
this paper, we are going to present a case study on JavaScript
categorization.
|
|
You probably want to
know why we chose to categorize JavaScript rather than other languages.
|
|
There are a few
reasons for that.
|
|
As we all know,
JavaScript is a language that is frequently used in Web pages. The current
generation of Web pages relies heavily on JavaScript to achieve certain
functionality: for example, form processing, pop-up advertisements, page
generation, page redirection, and so on. This information is sometimes
important for end users of the Web, so JavaScript often conveys
crucial information. However, this information is ignored by most
crawlers and indexers. For example, when you issue a query to Google, it
returns a list of Web pages, each with a brief summary, but those summaries
do not cover the JavaScript on the page. If JavaScript
information were also summarized by the search engine, it would help users
predict the usefulness of a page. People could also build
applications to block unwanted JavaScript on a page.
|
|
So the question now
becomes: can we have this information summarized automatically?
|
|
One way of doing so
is to categorize JavaScript code into a set of predefined categories.
|
|
But what would be
the categories?
|