1
|
|
2
|
|
3
|
- Hard Matching
- Rule induction
- Generalizing training instances into regular expression represented
rules
- Performing slot by slot matching
- Soft Matching
- Hidden Markov Models (HMM)
- Soft pattern matching for definitional QA (Cui et al., 2004)
|
4
|
- Lack of flexibility in matching
- Can’t deal with gaps between rules and test instances
|
5
|
- …… The channel Iqra is owned by the …
…… severance packages,
known as golden parachutes, included ……
A battery is a cell
which can provide electricity.
|
6
|
- Ad-hoc in model parameter estimation
- Cui et al., 2004: Lack of formalization
- Not generic
- Task specific topology for HMM
- Difficult to port to other applications
|
7
|
- Propose two generic soft pattern models
- Bigram model
- Formalization of our previous model
- Profile Hidden Markov Model (PHMM)
- More complex model that handles gaps better
- Parameter estimation by EM algorithm
- Evaluations on definitional question answering
- Can be applied to other pattern matching applications
|
8
|
- Overview of Definitional QA
- Bigram Soft Pattern Model
- PHMM Soft Pattern Model
- Evaluations
- Conclusions
|
9
|
- Overview of Definitional QA
- Bigram Soft Pattern Model
- PHMM Soft Pattern Model
- Evaluations
- Conclusions
|
10
|
- To answer questions like “Who is Gunter Blobel” or “What is Wicca”.
- Why evaluating on definition sentence retrieval?
- Diverse patterns
- Definitional QA is one of the least explored areas in QA
|
11
|
|
12
|
- Overview of Definitional QA
- Bigram Soft Pattern Model
- PHMM Soft Pattern Model
- Evaluations
- Conclusions
|
13
|
- To estimate the interpolation mixture weight λ
- Expectation Maximization (EM) algorithm
- Count words and general tags separately
- Avoid overwhelming frequency count of general tags
|
14
|
- Bigram model can deal with gaps
- Unseen tokens have small smoothing probabilities in specific positions
|
15
|
- Overview of Definitional QA
- Bigram Soft Pattern Model
- PHMM Soft Pattern Model
- Evaluations
- Conclusions
|
16
|
- Better solution for dealing with gaps
- Left to right Hidden Markov Model with insertion and deletion states
|
17
|
- Calculating generative probability given a test instance
- Find the most probable path by Viterbi algorithm
- Efficient calculation by forward-backward algorithm
|
18
|
- Estimated by Baum-Welch algorithm
- Using the most probable path during training
- Random or uniform initialization may lead to unsatisfactory model
- Extreme diversity of definition patterns and not sufficient training
data
- Assume path should favor match states over others
- P( token | Match ) > P ( token | Insertion )
- Using smoothed ML estimates
|
19
|
- Overview of Definitional QA
- Bigram Soft Pattern Model
- PHMM Soft Pattern Model
- Evaluations
- Overall performance evaluation
- Sensitivity to model length
- Sensitivity to size of training data
- Conclusions
|
20
|
- Data set
- Test data: TREC-13 question answering task data
- AQUAINT corpus and 64 definition questions with answers
- Training data
- 761 manually labeled definition sentences from TREC-12 question
answering task data
- Comparison systems
- State-of-the-art manually constructed patterns
- Most comprehensive manually constructed patterns to our knowledge
- Previously proposed soft pattern in Cui et al., 2004
|
21
|
- Manually checked F3 measure
- Based on essential/acceptable answer nuggets
- NR – proportion of returned essential answer nuggets
- NP – penalty to longer answers
- Weighting NR 3 times as NP
- Subject to inconsistent scoring among assessors
- Automatic ROUGE score
- Gold standard: sentences containing answer nuggets
- Counting the trigrams shared in the gold standard and system answers
- ROUGE-3-ALL (R3A) and ROUGE-3-ESSENTIAL (R3E)
|
22
|
- Soft pattern matching outperforms hard matching
- Bigram and PHMM models perform better than the previously proposed soft
pattern method
- Previous soft pattern method is not optimized
- Manual F3 scores correlate well with automatic R3 scores
|
23
|
- PHMM is less sensitive to model length
- PHMM may handle longer sequences
|
24
|
- PHMM requires more training data to improve
|
25
|
- Capture the same information
- The importance of a token’s position in the context of the search term
- The sequential order of tokens
- Different in complexity
- Bigram model
- Simplified Markov model with each token as a state
- Captures token sequential information by bigram probabilities
- PHMM model
- More complex – aggregated token sequential information by hidden state
transition probabilities
- Experimental results show
- PHMM is less sensitive to model length
- PHMM may benefit more by using more training data
|
26
|
- Overview of Definitional QA
- Bigram Soft Pattern Model
- PHMM Soft Pattern Model
- Evaluations
- Conclusions
|
27
|
- Proposed Bigram model and PHMM model
- Generic in the forms
- Systematic parameter estimation by EM algorithm
- These two models can be applied to other applications using surface text
patterns
- Soft patterns have been applied to information extraction (Xiao et al.,
2004)
- PHMM is more flexible in dealing with gaps, but requires more training
data to converge
|
28
|
|