August 17, 2005
Generic Soft Pattern Models for Definitional QA
Performance Evaluation
• Soft pattern matching outperforms hard matching
• Bigram and PHMM models perform better than the previously proposed soft pattern method
  – Previous soft pattern method is not optimized
• Manual F3 scores correlate well with automatic R3 scores

       HP       Original SP          Bigram SP            PHMM SP
R3A    0.2106   0.2233 (+6.00%)      0.2303 (+9.37%)      0.2234 (+6.08%)
R3E    0.2286   0.2378 (+4.00%)      0.2553 (+11.67%)*    0.2496 (+9.18%)
F3     0.4633   0.4937 (+6.56%)**    0.5088 (+9.83%)**    0.4971 (+7.30%)**
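
The percentages in parentheses appear to be relative gains over the HP (hard pattern) baseline. The slide does not state this, so it is an assumption here, but recomputing from the scores reported above supports it. The Python sketch below does that arithmetic; the names `scores`, `relative_gain`, and `gains_over_baseline` are illustrative and not from the original work. The recomputed values differ from the printed ones by a few hundredths of a percentage point at most, which would be expected if the slide's percentages were derived from unrounded scores.

```python
# Scores copied from the slide; HP is assumed to be the baseline system.
scores = {
    "R3A": {"HP": 0.2106, "Original SP": 0.2233, "Bigram SP": 0.2303, "PHMM SP": 0.2234},
    "R3E": {"HP": 0.2286, "Original SP": 0.2378, "Bigram SP": 0.2553, "PHMM SP": 0.2496},
    "F3":  {"HP": 0.4633, "Original SP": 0.4937, "Bigram SP": 0.5088, "PHMM SP": 0.4971},
}


def relative_gain(score, baseline):
    """Relative improvement of `score` over `baseline`, in percent."""
    return (score - baseline) / baseline * 100.0


def gains_over_baseline(table, baseline="HP"):
    """Map each metric to {system: percent gain over the baseline system}."""
    return {
        metric: {
            system: relative_gain(value, row[baseline])
            for system, value in row.items()
            if system != baseline
        }
        for metric, row in table.items()
    }


if __name__ == "__main__":
    for metric, row in gains_over_baseline(scores).items():
        for system, gain in row.items():
            print(f"{metric:4s} {system:12s} {gain:+.2f}%")
```

Under this reading, Bigram SP shows the largest gain over HP on all three metrics, which matches the slide's second bullet.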