•Past work (Lee et al. 05, Han et al. 05) used internal resources:
–Knowledge encoded in the records themselves
–Used field similarity, common co-author strings for clustering
•Problems with using only internal resources
–May provide insufficient information, or information that is difficult to extract
–e.g., two papers on the same topic using disjoint keywords in their titles
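A minimal sketch of why internal evidence alone can fail. It assumes Jaccard overlap as the similarity measure, which is an illustrative choice and not necessarily what the cited work used. The two records, the stopword list, and the helper names (`jaccard`, `title_keywords`) are all hypothetical.

```python
# Hypothetical stopword list, chosen only for this illustration.
STOPWORDS = frozenset({"a", "an", "the", "of", "for", "in", "on", "and", "with", "using"})


def jaccard(a, b):
    """Jaccard overlap of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def title_keywords(title):
    """Lowercased title words with stopwords removed."""
    return {w for w in title.lower().split() if w not in STOPWORDS}


# Two made-up records on the same topic (text classification) that could
# belong to the same author, yet share no title keywords and no co-authors.
rec1 = {
    "title": "Kernel methods for text categorization",
    "coauthors": {"A. Kumar"},
}
rec2 = {
    "title": "Support vector machines in document classification",
    "coauthors": {"B. Chen"},
}

title_sim = jaccard(title_keywords(rec1["title"]), title_keywords(rec2["title"]))
coauthor_sim = jaccard(rec1["coauthors"], rec2["coauthors"])

print(title_sim, coauthor_sim)
```

Both scores are zero for this pair, so a clustering step driven only by these internal fields would have no evidence for merging the two records.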
•Therefore, we use resources external to the citation data