• Observation
  – Not all URLs are equally useful
  – e.g., aggregator services
• Desired weighting scheme
  – Low weights to aggregator web sites
  – High weights to personal and group publication pages
• Inverse Host Frequency (IHF)
  – Similar to Inverse Document Frequency (IDF) in information retrieval
• Consider citations of top 100 authors in DBLP (by number of citations)
• For each such citation, query search engine with its title to obtain URLs, truncate them to their hostnames
• If a hostname h has frequency f(h), then its IHF is
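The counting step described above (truncate each returned URL to its hostname, then tally f(h)) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the search-engine call is replaced by a hard-coded, made-up result list, f(h) is assumed to count every returned URL, and `idf_style_weight` is a hypothetical IDF-like stand-in (log of total over f(h)), since the actual IHF formula is not reproduced here.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def hostname(url: str) -> str:
    """Truncate a URL to its lowercase hostname ('' if none can be parsed)."""
    return (urlsplit(url).hostname or "").lower()


def host_frequencies(urls_per_citation):
    """Count f(h): how many returned URLs fall on each hostname h.

    ASSUMPTION: every returned URL counts once; the source does not say
    whether f(h) counts URLs or distinct citations.
    """
    counts = Counter()
    for urls in urls_per_citation:
        for url in urls:
            h = hostname(url)
            if h:  # skip URLs with no parseable hostname
                counts[h] += 1
    return counts


def idf_style_weight(f_h: int, total: int) -> float:
    """ASSUMPTION: an IDF-like weight log(total / f(h)), for illustration only."""
    return math.log(total / f_h)


# Hypothetical search results for two citation titles (made-up URLs).
results = [
    ["http://aggregator.example.org/paper/1", "http://alice.example.edu/pubs.html"],
    ["http://aggregator.example.org/paper/2", "http://group.example.net/papers/"],
]

freq = host_frequencies(results)
total = sum(freq.values())
weights = {h: idf_style_weight(f, total) for h, f in freq.items()}
```

Under this assumed weighting, a hostname that appears for many citations (the aggregator) ends up with a lower weight than one that appears only once (a personal or group page), which matches the desired weighting scheme.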