•Observation
–Not all URLs are equally useful
–e.g., aggregator services
•Desired weighting scheme
–Low weights to aggregator web sites
–High weights to personal and group publication pages
•Inverse Host Frequency (IHF)
–Similar to Inverse Document Frequency (IDF) in information retrieval
•Consider citations of top 100 authors in DBLP (by number of citations)
•For each such citation, query search engine with its title to obtain URLs, truncate them to their hostnames
•If a hostname h has frequency f(h), then its IHF is
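
The IHF formula itself is not reproduced in this text, so the sketch below does not claim to be it. It only illustrates the counting steps above (truncate each returned URL to its hostname, then tally f(h)) and, purely as an assumption borrowed from the IDF analogy, weights each host by log(N / f(h)), where N is the total number of hostname occurrences. The function names, the choice to count every returned URL once, and the logarithmic form are all hypothetical.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def hostname(url):
    """Truncate a URL to its hostname ('' if none can be parsed)."""
    # urlsplit(...).hostname is already lowercased, or None when absent.
    return urlsplit(url).hostname or ""


def host_frequencies(urls_per_citation):
    """Tally f(h) over the URL lists returned for each citation query.

    Assumption: every returned URL counts once toward its hostname.
    """
    counts = Counter()
    for urls in urls_per_citation:
        counts.update(h for h in map(hostname, urls) if h)
    return counts


def ihf_idf_style(counts):
    """ASSUMED IDF-style weight log(N / f(h)); not the slide's own formula."""
    total = sum(counts.values())
    return {h: math.log(total / f) for h, f in counts.items()}
```

Under this assumed form, a hostname that appears for many citations (an aggregator) receives a lower weight than one that appears rarely (a personal or group publication page), which matches the desired weighting scheme.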