TITLE: Creating a virtual library with HomePageSearch and Mops

AUTHOR(s) & AFFILIATION(s): Gerd Hoff, FB IV - Informatik, Universität Trier, Trier (Germany); and Martin Mundhenk, FB IV - Informatik, Universität Trier, Trier (Germany)

KEYWORD(s): virtual library, intelligent search techniques, scientific literature on the web, focused navigation

PRESENTER / CONTACT PERSON: Gerd Hoff

CONTACT EMAIL: hoffg@uni-trier.de

ABSTRACT: The fast dissemination of new research results on the World Wide Web poses a new challenge for search engines. In many research areas, scientists make their newest results available electronically on their web sites long before the results appear in conference proceedings or journals. It is therefore important to be able to find the newest related electronic publications on the web - in other words, to maintain a virtual library of not-yet-printed literature. We implemented a new approach for constructing a virtual library of scientific papers that specializes in a relatively small research area and makes it possible to find the latest documents.

In the first step, we find the locations on the web where we expect interesting documents to appear. We obtain the names of the scientists who are active in the research area under consideration from computer science bibliographies on the web, e.g. DBLP. Our HomePageSearch system then searches for these scientists' home pages. Since there are no fixed construction rules for home pages, we determined about 500 characteristics, which together form a topical ranking function. In a 2-ary process, ordinary search engines are used to perform the search.

In the second step, our search engine Mops creates a virtual and distributed library from the scientific papers found close to the locations determined during the first step, i.e. in the area close to the scientists' home pages.

We conclude that such focused, topic-oriented crawling is very effective for building high-quality virtual libraries using ordinary desktop hardware.
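
As a rough illustration of how a ranking function built from page characteristics might pick a scientist's home page among candidate URLs, consider the minimal sketch below. It is not the HomePageSearch implementation: the five characteristics, their weights, and the names `Candidate`, `features`, `score` and `best_home_page` are all hypothetical, chosen only to show the idea of summing weighted characteristics (the abstract mentions about 500 of them).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate page returned by an ordinary search engine (hypothetical type)."""
    url: str
    title: str
    text: str


# Hypothetical characteristics and weights, invented for this sketch.
# The system described in the abstract uses about 500 characteristics.
WEIGHTS = {
    "last_name_in_title": 2,
    "full_name_in_title": 3,
    "last_name_in_url": 2,
    "tilde_in_url": 1,
    "mentions_publications": 1,
}


def features(c: Candidate, first: str, last: str) -> dict:
    """Evaluate each illustrative characteristic as a boolean."""
    url, title, text = c.url.lower(), c.title.lower(), c.text.lower()
    first_l, last_l = first.lower(), last.lower()
    return {
        "last_name_in_title": last_l in title,
        "full_name_in_title": f"{first_l} {last_l}" in title,
        "last_name_in_url": last_l in url,
        "tilde_in_url": "~" in url,
        "mentions_publications": "publications" in text,
    }


def score(c: Candidate, first: str, last: str) -> int:
    """Sum the weights of all characteristics that hold for the candidate."""
    return sum(WEIGHTS[name] for name, hit in features(c, first, last).items() if hit)


def best_home_page(candidates, first: str, last: str):
    """Return the highest-scoring candidate, or None if there are none."""
    return max(candidates, key=lambda c: score(c, first, last), default=None)
```

In such a scheme, the top-ranked page for each name taken from the bibliography would serve as the starting location for the second step, the focused crawl for papers near the home page.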
Last modified October 28, 2001 by Scott Tilley.