Min-Yen Kan and Danny C. C. Poo
Known Item Queries (JCDL 2005)
Judgments
•Participants graded on a 9-point Likert scale
•We also simplified the scale to a binary class (1-2 → yes; 3-9 → no)
•Let’s look at two examples:
•Practical digital libraries
•Practical digital archiving
•Query judgments are subjective and may depend on subject familiarity. Thus, we calculate inter-judge agreement to:
–establish whether the tasks are well-defined
–establish a performance upper bound
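A minimal sketch of the two steps above: collapsing 9-point grades to the binary class (1-2 → yes; 3-9 → no) and measuring agreement between two judges. The slide does not name the agreement statistic, so Cohen's kappa is an assumption here. The function names and the sample grades are hypothetical and for illustration only.

```python
from collections import Counter


def to_binary(score: int) -> str:
    """Collapse a 9-point Likert grade to the binary class (1-2 -> yes; 3-9 -> no)."""
    if not 1 <= score <= 9:
        raise ValueError(f"grade must be between 1 and 9, got {score}")
    return "yes" if score <= 2 else "no"


def cohen_kappa(labels_a, labels_b) -> float:
    """Cohen's kappa for two judges labeling the same items.

    Assumption: the slide only says "inter-judge agreement"; kappa is one
    common chance-corrected choice, not necessarily the one the authors used.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both judges must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each judge's own label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[l] * count_b[l] for l in set(count_a) | set(count_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both judges used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical grades from two judges for the same six queries.
judge_a = [1, 2, 5, 9, 2, 7]
judge_b = [2, 4, 6, 8, 1, 3]

binary_a = [to_binary(s) for s in judge_a]
binary_b = [to_binary(s) for s in judge_b]
kappa = cohen_kappa(binary_a, binary_b)
print(f"kappa = {kappa:.3f}")
```

A kappa near 1 would suggest the task is well-defined. Lower values indicate how much the judges themselves disagree, which bounds the performance an automatic classifier can reasonably be expected to reach.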