DiSC: A Distributed Searchable Cache
DiSC is a project inspired by a trip I made to Ghana in 2001. During this trip, I observed that while many Internet cafes in Ghana had modern computers with fast processors, ample memory, large hard drives, and fast LAN connections, their WAN connections were slow and intermittent. Furthermore, the university library I visited could not afford Internet connectivity, although it had computers available for the task.

I therefore conceived of a web cache that is distributed over the entire local network. This allows the cache to take advantage of all of the excess storage in the network and avoids the need for a centralized proxy server, thus reducing hardware costs (and completely eliminating additional capital outlay for existing networks). By making the cache searchable, we allow for completely disconnected operation and avoid latency for search queries.

I envision a system in a library that caches only library-like documents: technical papers, news articles, online books, online journals, and other relatively static, mostly textual documents. I believe that even with a modest network, a sizable percentage of such documents on the Internet could be cached and made searchable.

In a project for the class IT for Developing Regions at UC Berkeley, my partners and I found some evidence of a significant correlation between queries and data in the cache. We suspect that this locality will actually result in more relevant results, since the data has been "hand picked" by other users of the cache. I intend to study this relationship further, including techniques for exploiting locality in queries to improve cache performance and vice versa, as well as other search ranking techniques that take advantage of the cache's ability to monitor all web activity.
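
To make the idea of a proxy-less, searchable cache spread over LAN machines more concrete, here is a minimal in-memory sketch in Python. It is not the DiSC implementation: the class names (`CacheNode`, `DistributedCache`), the use of rendezvous hashing to pick which machine stores a page, and the per-node inverted index are all assumptions chosen for illustration, and real nodes would talk over the network rather than live in one process.

```python
import hashlib
import re
from collections import defaultdict


def _score(node_name, url):
    """Deterministic pseudo-random weight for a (node, url) pair."""
    digest = hashlib.sha256(f"{node_name}|{url}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def _terms(text):
    """Lowercase alphanumeric tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


class CacheNode:
    """Hypothetical stand-in for one LAN machine donating spare disk space."""

    def __init__(self, name):
        self.name = name
        self.pages = {}                 # url -> cached page text
        self.index = defaultdict(set)   # term -> urls containing the term

    def store(self, url, text):
        # Simplification: re-storing a URL does not purge its old index terms.
        self.pages[url] = text
        for term in set(_terms(text)):
            self.index[term].add(url)

    def search(self, terms):
        """Return URLs on this node that contain every query term."""
        if not terms:
            return set()
        return set.intersection(*(self.index.get(t, set()) for t in terms))


class DistributedCache:
    """No central proxy: any client can compute which node owns a URL."""

    def __init__(self, node_names):
        self.nodes = {name: CacheNode(name) for name in node_names}

    def owner(self, url):
        # Rendezvous (highest-random-weight) hashing, an assumed placement rule.
        return max(self.nodes.values(), key=lambda n: _score(n.name, url))

    def put(self, url, text):
        self.owner(url).store(url, text)

    def get(self, url):
        return self.owner(url).pages.get(url)

    def search(self, query):
        # Fan the query out to every node and merge the local results,
        # so search works even when the WAN link is down.
        terms = _terms(query)
        hits = set()
        for node in self.nodes.values():
            hits |= node.search(terms)
        return sorted(hits)
```

In this sketch, `DistributedCache(["pc-1", "pc-2", "pc-3"])` followed by a few `put` calls spreads pages across the three nodes, and `search("malaria vaccine")` returns only the URLs whose cached text contains both words. Ranking by past query locality, which the project intends to study, is left out entirely.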
Department of Computer Science, 205 Cory Hall #1772, University of California, Berkeley, CA 94720-1776. My office is at: 545S Cory Hall.

Last Modified: Tuesday, 05-Oct-2004 12:55:02 PDT