3
- Approximate matching of grammatical dependency relations for answer
extraction
- Soft matching patterns for identifying definition sentences.
- See [Cui et al., 2004a] and [Cui et al., 2004b]
- Exploiting definitions to answer factoid and list questions.
4
- System architecture
- New Features in TREC-13 QA Main Task
- Approximate Dependency Relation Matching for Answer Extraction
- Soft Matching Patterns for Definition Generation
- Definition Sentences in Answering Topically-Related Factoid/List
Questions
- Conclusion
5
- Tried before
- PIQASso and MIT systems have applied dependency relations in QA.
- Used exact match of relations to extract answers directly.
- Why do we need to consider dependency relations?
- NE-based answer extraction has an upper bound of 70% (Light et al., 2001).
- Many NEs of the same type may appear close to each other.
- Some questions don't have NE-type answer targets.
- E.g., "What does AARP stand for?"
6
- Minipar-based (Lin, 1998) dependency parsing
- Relation triple: two anchor words and their relationship
- E.g. <“desk”, complement, “on”> for “on the desk”.
- Relation path: path of relations between two words
- E.g., <“desk”, mod complement, “floor”> for “on the desk at the fourth floor” (see the sketch below).
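To make these structures concrete, here is a minimal Python sketch of how relation triples and relation paths could be represented; the class and field names are illustrative assumptions, not the actual NUS implementation.

from dataclasses import dataclass
from typing import List

@dataclass
class RelationTriple:
    """One dependency relation between two anchor words, e.g. <"desk", complement, "on">."""
    word1: str
    relation: str
    word2: str

@dataclass
class RelationPath:
    """A path of relations linking two non-adjacent words, e.g.
    <"desk", [mod, complement], "floor"> for "on the desk at the fourth floor"."""
    start: str
    relations: List[str]  # relation labels along the path
    end: str

# Instances matching the slide's examples
triple = RelationTriple("desk", "complement", "on")
path = RelationPath("desk", ["mod", "complement"], "floor")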
7
- Q: What American revolutionary general turned over West Point to the British?
- q1) <“General”, sub obj, “West Point”>
- q2) <“West Point”, mod pcomp-n, “British”>
- A: …… Benedict Arnold’s plot to surrender West Point to the British ……
- s1) <“Benedict Arnold”, poss s sobj, “West Point”>
- s2) <“West Point”, mod pcomp-n, “British”>
- The answer can't be extracted by an exact match of relations (the paths q1 and s1 differ).
8
- We need a measure to find the similarity between two different paths.
- Adopt a statistical method to learn similarity from past QA pairs.
- Training data preparation
- Around 1,000 factoid question-answer pairs from the past two years' TREC QA tasks.
- Extract all relation paths between pairs of non-trivial words.
- Align question paths with answer paths that share identical anchor nodes (see the sketch below).
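A minimal sketch of this training step, assuming each aligned example pairs a question path with an answer-sentence path over the same anchor words; the variable names and toy data are illustrative, not the system's actual code.

from collections import Counter
from itertools import product

# Each aligned example pairs a question path with an answer path that shares
# the same two anchor words, e.g. (["sub", "obj"], ["poss", "s", "sobj"]).
aligned_paths = [
    (["sub", "obj"], ["poss", "s", "sobj"]),
    (["mod", "pcomp-n"], ["mod", "pcomp-n"]),
]

rel_q_counts = Counter()  # occurrences of each relation in question paths
rel_a_counts = Counter()  # occurrences of each relation in answer paths
co_counts = Counter()     # co-occurrences of (question rel, answer rel) pairs

for q_path, a_path in aligned_paths:
    rel_q_counts.update(q_path)
    rel_a_counts.update(a_path)
    # Every relation pair across the two aligned paths co-occurs once; the
    # length-based discount is applied later in the similarity formula.
    co_counts.update(product(q_path, a_path))

These counts feed the MI-style relation similarity described next.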
9
- The similarity of two relations is measured by their co-occurrence in the question and answer paths.
- Variation of mutual information (MI)
- α discounts the score of two relations that appear in long paths (a reconstructed formula is sketched below).
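The slide only names the measure; a hedged reconstruction of the MI-style similarity, with \alpha as the length discount, might take the following form (the exact formula in the original paper may differ).

\mathrm{sim}(rel_Q, rel_A) = \log \frac{\alpha \times f_{co}(rel_Q, rel_A)}{f_Q(rel_Q) \times f_A(rel_A)}

Here f_{co} counts how often the two relations co-occur in aligned question/answer paths, f_Q and f_A are their individual occurrence counts, and \alpha becomes smaller when the paths containing the two relations are longer.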
10
- We compute path similarity in two ways, using different relation alignment strategies.
- Option 1: ignore the words of those relations along the given paths – Total
Path Matching.
- A path consists of only a list of relations.
- Relations are aligned by considering all possible permutations.
- Adopt IBM’s Model 1 for statistical translation:
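A hedged sketch of how an IBM Model 1 style score could combine the pairwise relation similarities; the normalization term is an assumption, and the exact form used in the system may differ.

\mathrm{Sim}(P_Q, P_S) = \frac{1}{(|P_S| + 1)^{|P_Q|}} \prod_{i=1}^{|P_Q|} \sum_{j=1}^{|P_S|} \mathrm{sim}(rel_Q^{(i)}, rel_S^{(j)})

As in Model 1, each question-path relation rel_Q^{(i)} is softly aligned to every sentence-path relation rel_S^{(j)}, so no word matching is required.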
11
- Option 2: consider the words of those relations along a path – Triple
Matching.
- A path consists of a list of relations and their words.
- Only those relations with matched words count.
- Long-distance dependency relationships are deliberately ignored (see the sketch below).
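A rough sketch of the Triple Matching idea, under the assumption that each path element carries the word it attaches to, so only relations whose words also appear in the question path contribute; the function and data layout are illustrative, not the actual implementation.

def triple_match_score(q_path, s_path, rel_sim):
    """Sum relation similarities only over relations whose attached words match.

    q_path / s_path: lists of (word, relation) pairs along the two paths.
    rel_sim: dict mapping (question relation, sentence relation) -> similarity.
    """
    s_rel_by_word = {word: rel for word, rel in s_path}
    score = 0.0
    for word, q_rel in q_path:
        if word in s_rel_by_word:  # only relations with matched words count
            score += rel_sim.get((q_rel, s_rel_by_word[word]), 0.0)
    # Relations whose words find no match are skipped, which effectively
    # ignores long-distance dependencies the parser resolves unreliably.
    return score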
12
- Use the top 50 ranked sentences from the passage retrieval module for
answer extraction.
- Evaluate the path similarity between the relation paths linking the question target / answer candidate to the other question terms.
- For non-NE questions: evaluate all noun/verb phrases as candidates (see the ranking sketch below).
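Putting the extraction step together, a simplified sketch of the candidate-ranking loop; the helper names (candidates, path_to) and the sentence interface are assumptions for illustration, and path_similarity stands for either Total Path Matching or Triple Matching.

def rank_candidates(question_paths, sentences, path_similarity, top_k=50):
    """Rank answer candidates in the top retrieved sentences.

    question_paths: dict mapping each question term to the relation path that
                    links it to the question target (expected answer position).
    sentences:      retrieved sentence objects exposing candidates() and
                    path_to(term, candidate) -- hypothetical helpers.
    """
    scored = []
    for sent in sentences[:top_k]:          # top 50 ranked sentences
        for cand in sent.candidates():      # NEs, or noun/verb phrases for non-NE questions
            score = 0.0
            for term, q_path in question_paths.items():
                s_path = sent.path_to(term, cand)
                if s_path is not None:
                    score += path_similarity(q_path, s_path)
            scored.append((score, cand))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)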
13
- The use of approximate relation matching outperforms our previous answer
extraction technique.
- 22% improvement on the full question set.
- 45% improvement on non-NE questions (69 out of 230 questions).
- The two path similarity measures do not make an obvious difference.
- Total Path Matching performs slightly better than Triple Matching.
- Triple Matching doesn't degrade performance because Minipar cannot resolve long-distance dependencies well anyway.
14
- System architecture
- New Experiments in TREC-13 QA Main Task
- Approximate Dependency Relation Matching for Answer Extraction
- Soft Matching Patterns for Definition Generation
- Definition Sentences in Answering Topically-Related Factoid/List
Questions
- Conclusion
15
- Question typing
- Leveraging our past question typology and rule-based question typing
module.
- Offline tagging of the whole TREC corpus using our rule-based named
entity tagger.
- Passage retrieval – on two sources:
- The topic-relevant document set returned by the document retrieval module: runs NUSCHUA1 and NUSCHUA2.
- Definition sentences for the specific topic produced by the definition generation module: run NUSCHUA3.
- Question-specific wrappers on definitions.
16
- Conduct passage retrieval for factoid/list questions on the definition
sentences about the topic.
- Much more efficient due to smaller search space.
- Average accuracy of 0.50, lower than that obtained over all topic-related documents.
- Due to low recall: a cut-off is imposed when selecting definition sentences (a naïve use of definitions).
- Some sentences needed for answering factoid/list questions are not definition sentences.
17
- Pre-compiled wrappers for extracting specific fields of information for list questions
- Works, product names and person titles.
- From both generated definition sentences and existing definitions:
cross validation.
- Achieves an F-measure of 0.81 on the 8 list questions about works (a toy wrapper is sketched below).
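As an illustration of what such a wrapper might look like, a toy surface pattern for pulling titles of works out of definition sentences; the regular expression and the example sentence are invented for illustration and are not taken from the actual wrappers.

import re

# Toy wrapper: capture a quoted title following verbs typical of definition
# sentences about a person's works (illustrative pattern only).
WORK_PATTERN = re.compile(r'\b(?:wrote|published|authored|directed)\s+"([^"]+)"')

sentence = 'Heaney wrote "Death of a Naturalist" in 1966.'  # invented example
print(WORK_PATTERN.findall(sentence))  # ['Death of a Naturalist']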
18
- System architecture
- New Experiments in TREC-13 QA Main Task
- Approximate Dependency Relation Matching for Answer Extraction
- Soft Matching Patterns for Definition Generation
- Definition Sentences in Answering Topically-Related Factoid/List
Questions
- Conclusion
19
- Approximate relation matching for answer extraction
- Still has a hard time dealing with difficult questions.
- Dependency relation alignment problem: words often can't be matched due to linguistic variations.
- Semantic matching of words/phrases is needed alongside relation matching.
- More effective use of topic-related sentences in answering factoid/list questions.