Understanding Documents via Concept Links
DUC 2005 System Task
Targeted Sentences
Our Approach
System Overview
Concept Link
Sentence Similarity
Sentence Ranker: A modified MMR
Evaluation
Conclusions
Task Definition in [Amigo et al, 04]
… topic-oriented, informative multi-document summarization, … compressed version of a set of documents …
Topic Creation Instructions
to formulate a topic out of interesting aspects
“At least 25 documents must each contribute some material to the answer” to a question on the topic
Our view of the task
A general and topic-oriented summary.
Good DUC 2005 summary: an extract consisting of sentences that are
highly representative
highly relevant to the topic
General
Specific: named entities are favored
with minimal redundancy
There exists a Concept Link between each pair of similar concepts
Concept Similarity: maximal sense overlap (Banerjee et al, 2003)
Consider all senses of each concept
Extended sense Sx:
Synset + Gloss + hypernymy + meronymy set (1 level)
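The concept-similarity idea above can be sketched roughly as follows. This is a minimal illustration, not the system's actual code: the lexicon `TOY_SENSES` is a hypothetical hand-made stand-in for WordNet extended senses (synset, gloss, and one level of hypernyms and meronyms), a plain Jaccard overlap replaces the gloss-overlap scoring of Banerjee et al, and the `threshold` value is an assumption.

```python
from itertools import product

# Hypothetical toy lexicon: each concept maps to a list of extended senses.
# Each extended sense is a bag of words standing in for
# synset + gloss + one level of hypernyms and meronyms.
TOY_SENSES = {
    "war": [
        {"war", "warfare", "armed", "conflict", "military", "action", "battle"},
        {"war", "struggle", "campaign", "against", "effort"},
    ],
    "conflict": [
        {"conflict", "struggle", "battle", "armed", "fight", "group_action"},
        {"conflict", "disagreement", "opposition", "ideas", "state"},
    ],
    "carpet": [
        {"carpet", "rug", "floor", "covering", "fabric", "furnishing"},
    ],
}


def sense_overlap(sense_a, sense_b):
    """Jaccard overlap of two extended senses (an assumed stand-in measure)."""
    union = sense_a | sense_b
    return len(sense_a & sense_b) / len(union) if union else 0.0


def concept_similarity(a, b, lexicon=TOY_SENSES):
    """Consider all senses of each concept and keep the maximal overlap."""
    if a == b:
        return 1.0
    pairs = product(lexicon.get(a, []), lexicon.get(b, []))
    return max((sense_overlap(x, y) for x, y in pairs), default=0.0)


def has_concept_link(a, b, threshold=0.2, lexicon=TOY_SENSES):
    """A concept link exists when the similarity reaches the (assumed) threshold."""
    return concept_similarity(a, b, lexicon) >= threshold


war_conflict = concept_similarity("war", "conflict")
war_carpet = concept_similarity("war", "carpet")
```

Taking the maximum over all sense pairs avoids an explicit word sense disambiguation step, at the cost of occasionally matching on a sense that the context does not support.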
1) A year ago Mr Douglas Hurd foreign secretary became the first UK cabinet minister to visit Argentina since the 1982 Falkland islands conflict.
2) Today Argentina gets out the red carpet for the UK Duke of York the first official royal visitor since the end of the Anglo Argentine Falklands war in 1982.
Concept Links between sentences
Sum of the “strength” of concept links
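A minimal sketch of this sentence similarity, under stated assumptions: each sentence is already reduced to a list of concepts (the lists below are hand-picked for illustration), `toy_concept_sim` and its `TOY_STRENGTH` table are hypothetical placeholders for a real concept-similarity function, and the `threshold` that decides whether a link exists is an assumed value.

```python
def sentence_similarity(concepts_a, concepts_b, concept_sim, threshold=0.2):
    """Sum the strengths of all concept links between two sentences.

    A pair of concepts counts as a link only if its similarity reaches
    the threshold; weaker pairs contribute nothing.
    """
    total = 0.0
    for a in concepts_a:
        for b in concepts_b:
            strength = concept_sim(a, b)
            if strength >= threshold:
                total += strength
    return total


# Hypothetical link strengths for a few concept pairs (illustration only).
TOY_STRENGTH = {
    frozenset(["conflict", "war"]): 0.3,
    frozenset(["minister", "royal"]): 0.1,
}


def toy_concept_sim(a, b):
    """Identical concepts get 1.0; other pairs are looked up in the toy table."""
    if a == b:
        return 1.0
    return TOY_STRENGTH.get(frozenset([a, b]), 0.0)


# Hand-picked concepts loosely based on the two example sentences.
s1 = ["minister", "visit", "argentina", "falkland", "conflict"]
s2 = ["argentina", "royal", "visitor", "falklands", "war"]

score = sentence_similarity(s1, s2, toy_concept_sim)
```

In this toy run the score comes from two links, the shared concept "argentina" and the related pair "conflict"/"war"; the weak "minister"/"royal" pair falls below the threshold and is ignored.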
Original Weight: Representative Power
MMR modified
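The slide's formulas are not available here, so the following is only a sketch of the general technique: standard greedy MMR (Maximal Marginal Relevance) selection, not the modified variant the slide refers to. It assumes that a sentence's representative power is the sum of its similarities to all other sentences; the function names, the matrix `sim`, and the trade-off parameter `lam` are all hypothetical.

```python
def representative_power(sim):
    """Assumed initial weight: total similarity of a sentence to all others."""
    n = len(sim)
    return [sum(sim[i][j] for j in range(n) if j != i) for i in range(n)]


def mmr_select(sim, weights, k, lam=0.5):
    """Greedy standard MMR: trade off a sentence's weight against its
    maximal similarity to the sentences already selected."""
    selected = []
    candidates = list(range(len(weights)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return lam * weights[i] - (1 - lam) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy symmetric sentence-similarity matrix: sentences 0 and 1 are near-duplicates.
sim = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.3, 0.1],
    [0.2, 0.3, 0.0, 0.4],
    [0.1, 0.1, 0.4, 0.0],
]

weights = representative_power(sim)
with_penalty = mmr_select(sim, weights, k=2, lam=0.5)
without_penalty = mmr_select(sim, weights, k=2, lam=1.0)
```

With `lam=1.0` the redundancy term vanishes and the two near-duplicate sentences are both chosen; with `lam=0.5` the second pick shifts to a less redundant sentence.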
Conclusions:
A simple system featuring:
Concept Link: a new way to calculate sentence similarity;
no chunker/parser involved
concept differs from NPs in Lexical Chain
Considering sentence similarity/relatedness via Concept Link:
Alleviates the influence of expression variations (but might involve inaccurate sense guesses)
Outperforms the word co-occurrence approach
Minimizing redundancy via Modified MMR;
No extra heuristics involved.
Error analysis;
How to automatically set parameters;
Comparison with alternative Similarity Measures;
How about more knowledge (syntactic, semantic parsers …)?
…