topic-oriented, informative multi-document summarization, where the goal is to produce a single text as a compressed version of a set of documents
Topic Creation: As you are reading the documents, look for one or more aspects that interest you; make a note of these aspects, if you wish. Formulate a DUC topic, which is a request for information about the aspects of the TREC topic that interest you. This can be in the form of a question or set of related questions.
a) The answer can be found in the relevant documents;
c) At least 25 documents must each contribute some material to the answer.
Specific summary: describe and name specific events, people, places, etc.
General summary: refer to categories/types of things
Michael Lesk. Automatic Sense Disambiguation Using Machine Readable Dictionaries: How to Tell a Pine Cone from an Ice Cream Cone. In Proceedings of SIGDOC’86, Toronto, Ontario, June 1986.
Satanjeev Banerjee and Ted Pedersen. Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI-03), Acapulco, Mexico, August 9-15, 2003, pp. 805-810.
word senses that are related to each other are often defined using some shared words
gloss (the explanation of the sense with possibly specific examples);
hypernymy (is-a relation);
meronymy (has-a relation)
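The gloss-overlap idea can be illustrated with a minimal sketch. It assumes a tiny hand-written stopword list and toy glosses (both hypothetical, chosen only for illustration), and it counts shared single words only. Banerjee and Pedersen additionally reward multi-word phrase overlaps, which this sketch does not attempt.

```python
import re

# Hypothetical, deliberately tiny stopword list for the illustration.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "with", "for", "in", "to"}


def content_words(gloss):
    """Lowercase a gloss and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", gloss.lower()) if w not in STOPWORDS}


def gloss_overlap(gloss_a, gloss_b):
    """Count the content words that two glosses share."""
    return len(content_words(gloss_a) & content_words(gloss_b))


def extended_overlap(glosses_a, glosses_b):
    """Sum overlaps over all gloss pairs, e.g. a sense's own gloss plus
    the glosses of its hypernyms and meronyms."""
    return sum(gloss_overlap(a, b) for a in glosses_a for b in glosses_b)


# Toy glosses (assumed wording, not taken from any particular dictionary).
pine_tree = "kind of evergreen tree with needle-shaped leaves"
cone_fruit = "fruit of certain evergreen tree"
cone_ice = "crisp wafer for holding ice cream"

print(gloss_overlap(pine_tree, cone_fruit))  # shares "evergreen" and "tree"
print(gloss_overlap(pine_tree, cone_ice))    # shares nothing
```

Under these toy glosses, the "fruit" sense of cone overlaps with pine while the "ice cream" sense does not, which is the intuition behind picking the sense with the larger overlap.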
Assume each concept in a sentence can be linked to only one concept in any other given sentence.
The strength of concept links between two sentences, sim(s_i, s_j), is in turn calculated as the weighted sum of the strength of each concept link in SL(s_i, s_j).
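One way to read the one-to-one assumption and the weighted sum is sketched below. The greedy strongest-first selection, the `weight` function (defaulting to 1.0), and the example link strengths are all assumptions made for illustration. The slides do not specify how links are chosen or weighted.

```python
def sentence_similarity(links, weight=lambda a, b: 1.0):
    """Sum weighted link strengths between two sentences.

    links: list of (concept_in_si, concept_in_sj, strength) tuples.
    Each concept is used in at most one link; here the strongest links
    are taken first (a greedy choice assumed for this sketch).
    """
    used_i, used_j = set(), set()
    total = 0.0
    for a, b, strength in sorted(links, key=lambda t: t[2], reverse=True):
        if a in used_i or b in used_j:
            continue  # one of the two concepts is already linked
        used_i.add(a)
        used_j.add(b)
        total += weight(a, b) * strength
    return total


# Hypothetical candidate links between the concepts of two sentences.
candidate_links = [
    ("bank", "river", 0.25),
    ("bank", "money", 1.0),
    ("loan", "money", 0.75),
    ("loan", "interest", 0.5),
]
print(sentence_similarity(candidate_links))  # keeps bank-money and loan-interest
```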
We use the data available at http://elib.cs.berkeley.edu/docfreq to get term document frequencies. The site describes itself as follows: “Welcome to the Web Term Document Frequency and Rank site! Available from this site are the document frequency and rank of 31,928,892 terms found on 49,602,191 pages of the Web.”
Maximal Marginal Relevance (MMR)
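As a reminder of how Maximal Marginal Relevance works in general, the sketch below greedily picks the sentence that best trades off relevance to the topic against redundancy with the sentences already selected. The relevance scores, the Jaccard word-overlap similarity, and the trade-off value `lam` are assumptions for illustration, not the settings used in the system described here.

```python
def mmr_select(candidates, query_sim, pair_sim, k, lam=0.7):
    """Greedy MMR: repeatedly pick the candidate maximizing
    lam * relevance - (1 - lam) * max similarity to already selected items."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(c):
            redundancy = max((pair_sim(c, s) for s in selected), default=0.0)
            return lam * query_sim(c) - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def jaccard(a, b):
    """Word-set Jaccard similarity (an assumed stand-in for sentence similarity)."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb)


# Hypothetical sentences and topic-relevance scores.
s1 = "the storm flooded the coastal town"
s2 = "the storm flooded the coastal town overnight"
s3 = "relief funds were approved by the government"
relevance = {s1: 0.9, s2: 0.85, s3: 0.6}

summary = mmr_select([s1, s2, s3], relevance.get, jaccard, k=2, lam=0.5)
print(summary)
```

With these toy values, the near-duplicate `s2` is passed over in favor of the less relevant but novel `s3`, which is the behavior MMR is designed to produce.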
All.avg_scaled_responsiveness: 10>(5)>4>15>(29)>11>17>(8)>7>14 … (not in pan score list)  out of 32
15 ROUGE-SU4 Average_R: 0.13163342777777815 ROUGE-SU4 Average_P: 0.131227383333333 F=13.14
0.13163342777777815*0.131227383333333*2/(0.13163342777777815+0.131227383333333)
15 ROUGE-2 Average_R: 0.072510115 ROUGE-2 Average_P: 0.0723482777777778 F=7.24
0.072510115*0.0723482777777778*2/(0.072510115+0.0723482777777778)
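The F values above are the harmonic mean of average recall and average precision, F = 2PR/(P+R), reported as a percentage. A minimal check of the arithmetic, using the numbers from the slide (the function name `f_measure` is just a label for this sketch):

```python
def f_measure(recall, precision):
    """Balanced F-measure: harmonic mean of recall and precision."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


rouge_su4_f = f_measure(0.13163342777777815, 0.131227383333333)
rouge_2_f = f_measure(0.072510115, 0.0723482777777778)

print(f"ROUGE-SU4 F = {100 * rouge_su4_f:.2f}")  # prints ROUGE-SU4 F = 13.14
print(f"ROUGE-2   F = {100 * rouge_2_f:.2f}")    # prints ROUGE-2   F = 7.24
```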
Document boundaries are ignored. (No sentence position)
No grammar needed: this implies the system is far from understanding the content accurately.