System Architecture

What's New This Year

Approximate matching of grammatical dependency relations for answer extraction
Soft matching patterns in identifying definition sentences
  See [Cui et al., 2004a] and [Cui et al., 2004b]
Exploiting definitions to answer factoid and list questions

Outline

System architecture
New Features in TREC-13 QA Main Task
Approximate Dependency Relation Matching for Answer Extraction
Soft Matching Patterns for Definition Generation
Definition Sentences in Answering Topically-Related Factoid/List Questions
Conclusion

Dependency Relation Matching in QA

Tried before
  PIQASso and the MIT system have applied dependency relations in QA.
  They used exact matching of relations to extract answers directly.
Why consider dependency relations?
  NE-based answer extraction has an upper bound of about 70% (Light et al., 2001).
  Many NEs of the same type often appear close to each other.
  Some questions do not have NE-type targets.
    E.g., what does AARP stand for?

Extracting Dependency Relation Triples

Minipar-based dependency parsing (Lin, 1998)
Relation triple: two anchor words and their relationship
  E.g., <“desk”, complement, “on”> for “on the desk”
Relation path: the chain of relations between two words (see the sketch below)
  E.g., <“desk”, mod, complement, “floor”> for “on the desk at the fourth floor”
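
As an illustration of how relation paths can be read off a dependency parse, here is a minimal sketch. It uses spaCy as a stand-in for Minipar, so the relation labels differ from those on this slide (e.g. "prep"/"pobj" instead of "mod"/"pcomp-n"), and all helper names are illustrative rather than the system's actual code.

```python
# Sketch only: spaCy stands in for Minipar; labels and helper names are illustrative.
import spacy

nlp = spacy.load("en_core_web_sm")

def ancestors(tok):
    """The token's chain of heads up to the root, starting with the token itself."""
    chain = [tok]
    while chain[-1].head.i != chain[-1].i:
        chain.append(chain[-1].head)
    return chain

def relation_path(w1, w2):
    """Dependency relations along the tree path between two tokens."""
    c1, c2 = ancestors(w1), ancestors(w2)
    pos_in_c2 = {t.i: k for k, t in enumerate(c2)}
    # Lowest common ancestor: first token on w1's chain that also lies on w2's chain.
    k1 = next(k for k, t in enumerate(c1) if t.i in pos_in_c2)
    k2 = pos_in_c2[c1[k1].i]
    up = [t.dep_ for t in c1[:k1]]      # relations walking up from w1 to the LCA
    down = [t.dep_ for t in c2[:k2]]    # relations walking up from w2 to the LCA
    return up + list(reversed(down))

doc = nlp("the book on the desk at the fourth floor")
toks = {t.text: t for t in doc}
print(relation_path(toks["desk"], toks["floor"]))   # a relation list such as ['prep', 'pobj']
```

The returned relation list, together with the two anchor words, corresponds to the <word, relations, word> notation used on this slide.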

Examples of relation triples

Q: What American revolutionary general turned over West Point to the British?
  q1) <general, sub obj, West Point>
  q2) <West Point, mod pcomp-n, British>
A: … Benedict Arnold’s plot to surrender West Point to the British …
  s1) <Benedict Arnold, poss s sobj, West Point>
  s2) <West Point, mod pcomp-n, British>
These answer paths cannot be extracted by exact matching of relations.

Learning Relation Similarity

We need a measure of the similarity between two different paths.
We adopt a statistical method to learn similarity from past QA pairs.
Training data preparation
  Around 1,000 factoid question-answer pairs from the past two years' TREC QA tasks.
  Extract all relation paths between all non-trivial words: 2,557 path pairs.
  Align the paths according to identical anchor nodes.

Using Mutual Information to Measure Relation Co-occurrence

The similarity between two relations is measured by their co-occurrence in question and answer paths.
We use a variation of mutual information (MI), sketched below.
  α discounts the score of two relations that appear in long paths.
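
A hedged reconstruction of the MI-style score described above; the slide does not show the formula itself, so the exact normalization may differ from the original system:

```latex
\mathrm{sim}(rel_Q,\, rel_S) \;=\; \log \frac{\alpha \times f_{QS}(rel_Q,\, rel_S)}{f_Q(rel_Q) \times f_S(rel_S)}
```

Here f_QS counts how often the two relations co-occur in aligned question/answer path pairs, f_Q and f_S are their individual frequencies in the training paths, and α shrinks the co-occurrence contribution when the relations come from long paths.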

Measuring Path Similarity – 1

We adopt two methods to compute path similarity, using different relation alignment strategies.
Option 1: ignore the words attached to the relations along the given paths – Total Path Matching.
  A path consists of only a list of relations.
  Relations are aligned by considering all possible permutations.
  We adopt IBM's Model 1 for statistical translation (a sketch follows).
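
One plausible Model 1-style formulation of Total Path Matching, consistent with the bullets above but not necessarily the paper's exact form: every relation in the answer path is allowed to align with every relation in the question path, and the per-pair relation similarities are combined multiplicatively.

```latex
\mathrm{Sim}(P_Q,\, P_S) \;\propto\; \prod_{j=1}^{|P_S|} \frac{1}{|P_Q|} \sum_{i=1}^{|P_Q|} \mathrm{sim}(rel^{Q}_{i},\, rel^{S}_{j})
```

where sim(·, ·) is the MI-based relation similarity from the previous slide.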

Measuring Path Similarity – 2

Option 2: consider the words attached to the relations along a path – Triple Matching.
  A path consists of a list of relations and their words.
  Only relations whose words match are counted (see the sketch below).
  Long-distance dependency relations are deliberately ignored.
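
A minimal sketch of the Triple Matching idea under the same assumptions as the earlier sketch: paths are kept as (word, relation) pairs and only positions whose words match contribute a relation-similarity term. All names are illustrative.

```python
def triple_match_score(q_path, s_path, rel_sim):
    """q_path, s_path: lists of (word, relation) pairs along each path.
    rel_sim: a learned relation-similarity function (e.g. the MI-based score).
    Only relations anchored on matching words are compared; unmatched ones are
    skipped, which also ignores long-distance relations the parser handles poorly."""
    s_by_word = {}
    for word, rel in s_path:
        s_by_word.setdefault(word.lower(), []).append(rel)
    score, matched = 0.0, 0
    for word, q_rel in q_path:
        for s_rel in s_by_word.get(word.lower(), []):
            score += rel_sim(q_rel, s_rel)
            matched += 1
    return score / matched if matched else 0.0
```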

Selecting Answer Strings Statistically

Use the top 50 sentences ranked by the passage retrieval module for answer extraction.
Evaluate the path similarity between the relation paths connecting the question target / answer candidate to the other question terms, in the question and in the candidate sentence respectively (see the sketch below).
For non-NE questions, evaluate all noun/verb phrases as candidates.
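
Putting the pieces together, here is a hedged sketch of the candidate-ranking step described above. `path_similarity` stands for either Total Path Matching or Triple Matching, and every name is illustrative rather than the system's actual code.

```python
def rank_candidates(question_terms, candidates, q_paths, s_paths, path_similarity):
    """Score each answer candidate by summing path similarities between
    corresponding question paths and sentence paths.
    q_paths[term]         -> relation path from the question target to the term
    s_paths[(cand, term)] -> relation path from the candidate to the term
                             in the candidate sentence."""
    best, best_score = None, float("-inf")
    for cand in candidates:
        score = 0.0
        for term in question_terms:
            q_path = q_paths.get(term)
            s_path = s_paths.get((cand, term))
            if q_path and s_path:
                score += path_similarity(q_path, s_path)
        if score > best_score:
            best, best_score = cand, score
    return best
```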

Discussions on Evaluation Results

Approximate relation matching outperforms our previous answer extraction technique.
  22% improvement over all questions.
  45% improvement on non-NE questions (69 out of 230 questions).
The two path similarity measures do not show an obvious difference.
  Total Path Matching performs slightly better than Triple Matching.
  Triple Matching does not degrade performance because Minipar cannot resolve long-distance dependencies well anyway.

Outline

System architecture
New Experiments in TREC-13 QA Main Task
Approximate Dependency Relation Matching for Answer Extraction
Soft Matching Patterns for Definition Generation
Definition Sentences in Answering Topically-Related Factoid/List Questions
Conclusion

Question Typing and Passage Retrieval for Factoid/List Q's

Question typing
  Leverages our past question typology and rule-based question typing module.
  The whole TREC corpus is tagged offline with our rule-based named entity tagger.
Passage retrieval, over two sources:
  The topic-relevant document set from the document retrieval module: NUSCHUA1 and NUSCHUA2.
  Definition sentences for the specific topic from the definition generation module: NUSCHUA3.
    Question-specific wrappers are applied to the definitions.

Exploiting Definition Sentences to Answer Factoid/List Questions

Conduct passage retrieval for factoid/list questions over the definition sentences about the topic.
  Much more efficient due to the smaller search space.
Average accuracy of 0.50, lower than that obtained over all topic-related documents.
  Due to low recall: the cut-off imposed when selecting definition sentences (a naïve use of definitions).
  Some sentences needed to answer factoid/list questions are not definition sentences.

Exploiting Definitions from External Knowledge

Pre-compiled wrappers extract specific fields of information for list questions.
  Works, product names, and person titles.
Applied to both generated definition sentences and existing definitions, for cross-validation.
Achieves an F-measure of 0.81 on 8 list questions about works.

Outline

System architecture
New Experiments in TREC-13 QA Main Task
Approximate Dependency Relation Matching for Answer Extraction
Soft Matching Patterns for Definition Generation
Definition Sentences in Answering Topically-Related Factoid/List Questions
Conclusion

Conclusion

Approximate relation matching for answer extraction
  We still have a hard time dealing with difficult questions.
  Dependency relation alignment problem: words often cannot be matched due to linguistic variations.
  Semantic matching of words/phrases is needed alongside relation matching.
More effective use of topic-related sentences in answering factoid/list questions.

Q & A