Thursday, October 30, 2008

Looking up Data in P2P Systems

This article gives a brief introduction to some of the difficulties of P2P computing. It first lists the reasons P2P systems are attractive, including the fault tolerance that comes from their decentralized nature. The challenge, of course, is how to use such a distributed system efficiently. The article focuses on the lookup problem: finding which node holds a given piece of data in a huge distributed system. It briefly explains and compares a broad range of techniques, all of them based on a distributed hash table. These include Chord's skiplist-like routing, tree-like routing, and multi-dimensional routing.
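To make the lookup problem a bit more concrete, here is a minimal sketch of Chord-style routing as I understand it, not code from the article. It simulates the whole ring in one process, and the names (`ident`, `successor`, `build_fingers`, `lookup`) and the 16-bit identifier space are my own assumptions for illustration. Each node keeps a "finger table" of nodes at power-of-two distances around the ring, and a lookup keeps jumping to the closest finger that precedes the key until it reaches the node responsible for it.

```python
import hashlib
from bisect import bisect_left

M = 16            # identifier bits; assumed small for illustration
RING = 2 ** M     # size of the identifier ring


def ident(name: str) -> int:
    """Hash a node name or data key onto the identifier ring."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % RING


def successor(ids, point):
    """First node ID at or clockwise after `point` (`ids` must be sorted)."""
    i = bisect_left(ids, point % RING)
    return ids[i % len(ids)]  # wrap around past the largest ID


def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps; a == b covers the whole ring


def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b


def build_fingers(ids):
    """Finger k of node n is the successor of n + 2**k on the ring."""
    return {n: [successor(ids, n + 2 ** k) for k in range(M)] for n in ids}


def lookup(fingers, start, key_id):
    """Route from `start` to the node responsible for `key_id`.

    Returns (owner_id, hops), where hops counts node-to-node forwards.
    """
    node, hops = start, 0
    while True:
        succ = fingers[node][0]
        if in_half_open(key_id, node, succ):
            return succ, hops
        # Forward to the farthest finger that still precedes the key.
        nxt = next((f for f in reversed(fingers[node])
                    if in_open(f, node, key_id)), node)
        if nxt == node:  # defensive; not expected with consistent fingers
            return succ, hops
        node, hops = nxt, hops + 1


if __name__ == "__main__":
    ids = sorted({ident(f"node-{i}") for i in range(50)})
    fingers = build_fingers(ids)
    owner, hops = lookup(fingers, ids[0], ident("some-file.mp3"))
    print(f"key stored at node {owner}, reached in {hops} hops")
```

A real system would spread these tables across machines and would have to cope with nodes joining and leaving, which this sketch ignores entirely. The point is only that each node stores a small number of pointers (here `M`) yet a lookup needs at most `M` forwards.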

It's hard to determine which routing method is best, since the article's summary lists several open questions about fault tolerance, routing, and security in the presence of malicious nodes. The article acknowledges that most people directly associate P2P applications with illegal file sharing, such as Napster or KaZaA. The paper mentions that there are other applications, but I'm not sure I can think of any off the top of my head...

1 comment:

Matei Zaharia said...

One interesting place where peer-to-peer concepts are used is in data centers. For example, Amazon's Dynamo key-value store roughly resembles Chord (except every node knows about pretty much every other node). Other than that, file downloads are pretty much the only thing they're used for. Centralization just works better for anything that's not absolutely mission-critical.