
A scenario
Looking for “Journal of housing for the elderly”
Tries using the default
keyword search
But lots of results don’t necessarily equate with finding the item …

OPAC Query Types
Slone (2000) categorizes three types of queries:
Known Item: find a title that the patron knows exists
Area: identify area of library for certain resources
Unknown Item: identify resources to solve problem or address issue

Importance of Known Item Queries
Kilgour has noted the effectiveness of the author/title combination
Up to 50% of keyword searches are known item queries (Larson, 1991)
despite having entry points for author, title and subject search
Partial answers normally don’t help in known item search – either you find the item or you don’t

Problem Statement
Two tasks:
Query classification: is this query searching for a known item?
Search result classification: which, if any, of the search results are the known item(s) sought?
Use supervised machine learning to solve both tasks

Learning Architecture

Outline
Known Item Queries (KIQs)
Data Collection
Features of KIQs
Evaluation
Conclusions

Data Collection
Used queries drawn from local OPAC query collection
Anonymized, sessioned queries
Over 290K queries, purposefully sampled for a wide array of query characteristics
320 resulting queries were judged by 9 participants; 1500 item judgments
Most queries annotated by two participants

 Data Collection
Tasks:
Query Judgment
Query Judgment, with search results
Search Results Judgment

Judgments
Participants graded on a 9-point Likert scale
We also simplified the scale to a binary class (1-2 → yes; 3-9 → no)
Let’s look at two examples:
Practical digital libraries
Practical digital archiving
Query judgments are subjective and may depend on subject familiarity. Thus, we calculate inter-judge agreement (sketched below) to:
establish whether the tasks are well-defined
establish a performance upper bound
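A minimal sketch of the agreement computation, assuming two judges’ ratings are stored as parallel lists; the exact statistic reported in the study may differ:

```python
# Sketch: inter-judge agreement on the 9-point and binarized scales.
# The ratings below are hypothetical examples, not data from the study.
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

judge_a = [1, 2, 7, 9, 1, 3, 8, 2]   # hypothetical 9-point Likert ratings
judge_b = [2, 1, 6, 9, 2, 4, 9, 1]

def binarize(ratings):
    """Collapse the 9-point scale to the binary class (1-2 -> yes, 3-9 -> no)."""
    return [1 if r <= 2 else 0 for r in ratings]

r, _ = pearsonr(judge_a, judge_b)                                # correlation on the raw scale
kappa = cohen_kappa_score(binarize(judge_a), binarize(judge_b))  # chance-corrected binary agreement
print(f"Pearson r = {r:.2f}, Cohen's kappa = {kappa:.2f}")
```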

Agreement levels
Data analysis:
Relatively strong correlation (mostly above .6)
Stronger correlation with search results shown; easier task with more information
Most search results are not known items, hence the high agreement on the final task

Outline
Known Item Queries (KIQs)
Data Collection
Features of KIQs
Query features
Search result features
Evaluation
Conclusion

Query Classification Features
Two examples:
Hill Raymond Coding Theory – A First Course
japan and cultural
Distinguishing characteristics:
Longer: cut-and-paste, copying from a reference
Mixed Case
Determiners: not present in unknown item or area searches
Proper Nouns: specific subjects or author names
Advanced Operators: title or author restrictions
Keywords: indicative of a type of publication e.g., “journal”, “textbook”, “course”
Use POS tagging to create a total of 16 features that embody these characteristics; a few are sketched below
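A minimal sketch of a few of these features using off-the-shelf NLTK POS tagging; the feature names and the field-operator syntax are illustrative, not the study’s exact 16-feature set:

```python
# Sketch: query-side features for KIQ classification.
# Assumes the NLTK punkt and averaged_perceptron_tagger data are installed.
import re
import nltk

PUB_KEYWORDS = {"journal", "textbook", "course", "handbook", "proceedings"}

def query_features(query: str) -> dict:
    tokens = nltk.word_tokenize(query)
    tags = nltk.pos_tag(tokens)
    return {
        "length": len(tokens),                                         # long queries suggest cut-and-paste
        "mixed_case": query != query.lower() and query != query.upper(),
        "has_determiner": any(t == "DT" for _, t in tags),             # "a", "the", ...
        "has_proper_noun": any(t.startswith("NNP") for _, t in tags),  # author names, specific subjects
        "has_operator": bool(re.search(r"\b(ti|au|su):", query)),      # hypothetical field-restriction syntax
        "has_pub_keyword": any(w.lower() in PUB_KEYWORDS for w in tokens),
    }

print(query_features("Hill Raymond Coding Theory - A First Course"))
print(query_features("japan and cultural"))
```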

 Language Modeling
Idea: Model KIQs as a separate “language” from non-KIQs
Model: simple bigram language model
Create a language model for each point on the scale, or for each class
Then, test a new query’s goodness of fit to each LM (sketched below)
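A minimal sketch of the idea, assuming add-one smoothing and one model per binary class; the study’s smoothing and per-scale-point models may differ:

```python
# Sketch: one bigram language model per class (KIQ vs. non-KIQ), scoring a new
# query by log-likelihood and picking the better-fitting class.
import math
from collections import Counter

class BigramLM:
    def __init__(self, queries):
        self.unigrams, self.bigrams = Counter(), Counter()
        for q in queries:
            toks = ["<s>"] + q.lower().split() + ["</s>"]
            self.unigrams.update(toks)
            self.bigrams.update(zip(toks, toks[1:]))
        self.vocab = len(self.unigrams)

    def logprob(self, query):
        toks = ["<s>"] + query.lower().split() + ["</s>"]
        return sum(
            math.log((self.bigrams[(a, b)] + 1) / (self.unigrams[a] + self.vocab))
            for a, b in zip(toks, toks[1:])
        )

# Toy training queries, purely illustrative
kiq_lm = BigramLM(["practical digital libraries", "hill coding theory a first course"])
non_kiq_lm = BigramLM(["japan and cultural", "digital archiving issues"])
q = "practical digital libraries"
print("KIQ" if kiq_lm.logprob(q) > non_kiq_lm.logprob(q) else "non-KIQ")
```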

Bootstrapping
Only 320 annotated instances is a small sample for constructing a language model
Usual language models use millions of examples
Try bootstrapping a model
Use a sampled query’s annotation and apply it to all the queries it represents (sketched below)
More data, but also more noise
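A rough sketch of the label-propagation idea, assuming each annotated query represents a stratum of similar queries in the log; the stratum key here is purely hypothetical:

```python
# Sketch: bootstrap more LM training data by propagating an annotated query's
# label to the unannotated queries in the stratum it was sampled to represent.
from collections import defaultdict

def bootstrap_labels(annotated, all_queries, stratum_key):
    """annotated: {query: label}; returns an enlarged {query: label} mapping."""
    strata = defaultdict(list)
    for q in all_queries:
        strata[stratum_key(q)].append(q)
    expanded = {}
    for q, label in annotated.items():
        for other in strata[stratum_key(q)]:
            expanded[other] = label        # more data, but also more noise
    return expanded

# e.g. a hypothetical stratum key: (query length, mixed case or not)
stratum_key = lambda q: (len(q.split()), q != q.lower())
```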

Features with search results
What about when we have search results?
We look at the pairing of search results and the queries that generated them.
Characteristics of query-search result pairs (a few sketched below):
Sequences of words overlap significantly
First and last positions are particularly important
Publication keyword match
Higher number of relevant search results
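A minimal sketch of a few pairing features between a query and one result title; the names are illustrative approximations of the properties listed above:

```python
# Sketch: query/search-result pairing features for known item detection.
PUB_KEYWORDS = {"journal", "textbook", "course", "handbook"}

def pair_features(query: str, title: str) -> dict:
    q = [w.strip(".,:;") for w in query.lower().split()]
    t = [w.strip(".,:;") for w in title.lower().split()]
    overlap = set(q) & set(t)
    return {
        "word_overlap": len(overlap) / max(len(set(q)), 1),  # fraction of query words found in the title
        "first_word_match": bool(q and t) and q[0] == t[0],
        "last_word_match": bool(q and t) and q[-1] == t[-1],
        "pub_keyword_match": bool(PUB_KEYWORDS & set(q) & set(t)),
    }

print(pair_features("journal of housing for the elderly",
                    "Journal of Housing for the Elderly."))
```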

BLEU and NIST
Judge the fitness of a system translation against a reference translation
examine multiple granularities of n-gram overlap
BLEU – normalized between 0-1
NIST – only uses trigrams
Use these scores as features to model subtle overlap properties (sketched below)
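A minimal sketch using NLTK’s BLEU and NIST implementations, treating the result title as the “reference” and the query as the “hypothesis”; the study’s exact n-gram orders and smoothing are not shown here:

```python
# Sketch: MT-style overlap scores between a query and a result title,
# used as classifier features.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.nist_score import sentence_nist

query = "journal of housing for the elderly".split()
title = "journal of housing for the elderly".split()

bleu = sentence_bleu([title], query, smoothing_function=SmoothingFunction().method1)
nist = sentence_nist([title], query, n=3)   # trigram-level overlap
print(f"BLEU = {bleu:.2f}, NIST = {nist:.2f}")
```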

Outline
Known Item Queries (KIQs)
Data Collection
Features of KIQs
Evaluation
Conclusions

Machine Learning Paradigms
Use Waikato’s Weka Machine Learning Toolkit:
Decision Trees (J48 module): for the understandability of their hypotheses
Support vector machines (SVM, SMO module): for their robustness and general performance
Compare against the majority-class baseline (lower bound) and inter-judge agreement (upper bound); the setup is sketched below
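A minimal sketch of the setup using scikit-learn stand-ins for the Weka modules (DecisionTreeClassifier for J48, SVC for SMO, DummyClassifier for the majority baseline); the original experiments were run in Weka itself:

```python
# Sketch: compare classifiers against a majority-class baseline via cross-validation.
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X = [[6, 1, 0, 1], [2, 0, 0, 0], [7, 1, 1, 1], [3, 0, 0, 0]]  # toy feature vectors
y = [1, 0, 1, 0]                                              # 1 = known item query

for name, clf in [("majority", DummyClassifier(strategy="most_frequent")),
                  ("decision tree", DecisionTreeClassifier()),
                  ("SVM", SVC(kernel="linear"))]:
    scores = cross_val_score(clf, X, y, cv=2)
    print(f"{name}: mean accuracy = {scores.mean():.2f}")
```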

Task 1: Query Classification

Task 2: Query Classification w/ search results

Task 3: Query Results Classification

Applications of Classifiers
Route patrons to intended search results faster
For KIQs: turn off the fancy footwork: no query expansion, no spelling correction
If we have it, skip to circulation info
If we don’t, show alternatives:
Interlibrary Loan (ILL)
Suggest a purchase

Conclusions
First such work to demonstrate an automated system that does query and search result classification
System performance tied to human performance
Known item search possibly more important than unknown item search
Often we want to recall where something is
Known items are subjective; one searcher’s known item is another’s unknown

Future Work and Acknowledgments
Extending query classification to other areas
E.g., the Web (Levinson and Rose, 2003)
Extending to user query patterns
Take advantage of sessioned query logs
Thanks to the NUS library staff for their cooperation with our research!!
Especially Ng Kok Koon and Yow Wei Chui
Thanks for sticking it out till the end …
Questions?

Consulting the librarian
Five minutes later … stumped!