Project Award Date: 0000-00-00
The amount of information available on the World Wide Web continues to grow, particularly in the area of electronic commerce. According to the Nielsen Internet Demographics Survey [http://www.commerce.net/nielsen/], over 50,000,000 adults in the United States and Canada are connected to the Internet, and 5,600,000 of them have purchased products online. Web search engines, which locate Web pages in response to user queries, are by far the most popular way of finding information. However, a growing number of products aim to save users the work of searching the Internet themselves: they surf the Web on the user's behalf and deliver new information to them. This technology, called "information filtering," is most appropriate for areas of long-term interest, where the user is not attempting to answer a fleeting query but rather wishes to keep abreast of a particular area on a continuing basis. The goal of this project is to enhance the information filtering component of ProFusion to produce a new system, called ProFilter. Current information filtering products (Pointcast, MyYahoo) allow users to automatically receive updates selected from a particular collection of vendor-defined information streams. ProFusion, on the other hand, allows users to receive updates on their own queries and thus create their own information streams. We wish to extend the existing filtering capability in four fundamental ways:
1) to summarize the new information received for quick update;
2) to expand the types of streams that users can choose to receive to include individual Web sites, newsgroups and user-selected URLs;
3) to allow users to store the contents of their information stream, creating a personal database; and
4) to provide searching, browsing and visualization of the personal database.
ProFilter will allow users to create a personal (or company-wide) collection of useful information collected from the Web. Currently, only two types of databases of Web pages are supported: everything on the Web (for example, Alta Vista) or everything at a collection of one or more sites (for example, WebWhacker). We intend to produce a personal agent that collects a subset of Web pages, where the subset is defined not only by where the pages are located but also by what the pages contain. In addition, we will incorporate state-of-the-art analysis and visualization techniques that will allow users to quickly see what is new and to browse and search their personal collections.
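As a rough illustration of a subset defined by both location and content, the sketch below keeps a page only when its host is on a user-chosen list and its text contains every term of interest. This is not ProFilter's actual method; the names `Page`, `in_collection`, `allowed_hosts` and `required_terms` are hypothetical, and simple word matching stands in for whatever content analysis the real system would use.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Page:
    # Hypothetical record for a fetched Web page.
    url: str
    text: str


def in_collection(page, allowed_hosts, required_terms):
    """Return True only if the page passes both the location and content tests."""
    # Location test: the page must live on one of the user-selected hosts.
    host = urlparse(page.url).hostname or ""
    located_ok = host in allowed_hosts
    # Content test (an assumption): every term must appear as a word in the text.
    words = set(page.text.lower().split())
    content_ok = all(term.lower() in words for term in required_terms)
    return located_ok and content_ok


pages = [
    Page("http://example.com/a", "New filtering agents for electronic commerce"),
    Page("http://example.com/b", "Weather report for Tuesday"),
    Page("http://example.org/c", "Electronic commerce and filtering"),
]

# Only the first page is both on an allowed host and about the chosen topic.
selected = [p.url for p in pages
            if in_collection(p, {"example.com"}, ["filtering", "commerce"])]
print(selected)
```

The second page is rejected on content and the third on location, which is the distinction from a whole-Web index or a whole-site copy.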