Supporting unstructured peer-to-peer social networking

NSF Collaborative Grant: co-investigators: Gustavo de Veciana (U.T. Austin) and George Kesidis (Penn State)
Graduate Students: Virag Shah, Ayis Ziotopoulos (joined Google, CA)
Support: This project is supported by the National Science Foundation Grant CNS 0915928

Project Overview:

Peer-to-peer (P2P) systems have seen continued growth, in terms of traffic volume, and as the architecture of choice for applications and network services; notable benefits lie in their distributed design leading to higher reliability and flexibility. However as users/peers increasingly become content providers, and generally conduct more of their business on the network, privacy is a critical concern. Our research attempts to leverage trust relationships among interconnected peers, e.g., their willingness to reveal their ability to resolve certain classes of queries to their neighbors, their reputation, etc, to improve performance, while maintaing a degree of privacy.

In particular, peer-to-peer systems can be designed so as to dynamically adapt, based on the outcomes of previous peer transactions. Such adaptation can, for example, serve to guide the routing of future queries for particular classes of content/files/information to nodes most likely to to resolve them-- this is one of the key areas of inquiry for this project. We focus on unstructured or hybrid networks where peer-membership correlations (in content and ability to resolve classes of queries) among communities of interest can be learned to improve the search performance by appropriately biasing random walks and limited-scope flooding. P2P systems can also be designed to capture, store and share dynamic spatio-temporal information in a manner that scales -- in particular by recognizing that often such information will be consumed by mobiles/users that are close to where it was generated. Such locality in consumption and production of information, can be integrated into the design of P2P networks, and serve to once again adapt the infrastructure to heterogeneity in loads and available resources.

Technical Accomplishments:

Throughput optimal routing queries in P2P networks: As mentioned above one of our key objectives is to devise new ways to route queries in unstructured (hybrid) peer-to-peer systems which can learn to better resolve queries based on observing the outcomes of previous queries. The basic challenge in finding files/resources in an unstructured P2P network is routing a query to the resource (or a peer that knows where the file/resource would be), without access to an addressing system or directory. A typical solution is simply to route queries at random, i.e., following a random walk over the P2P overlay until the file/resource is found. Indeed, much recent research work has focused on the analysis of such mechanisms, e.g., speed of convergence and/or time to find the object of interest. Such an approach has several problems, including in particular that it may take some time to find the object of interest, but more importantly, it is completely oblivious to the query loads traffic generated on the system. This is important since query resolution is carried out by peers which may have different levels of altruism and performance is dependent on ensuring that queries are routed in a manner that is consistent with available resources.

To that end we have devised an approach to routing queries that has several new characteristics. Queries are classified into classes, where classes can be a broadly defined, e.g., types of music video or files. In our approach query routing is class based and exploits a type of back pressure-based scheduling and routing mechanism. Our approach is completely novel, in various respects --in particular:

It dynamically accounts for heterogeneity in (super-) peerâ service rate, reflecting their query processing ability, altruism, and/or bounds on query forwarding rates to mitigate DOS attacks. To the best of our knowledge, this is the first work to rigorously account for such heterogeneity in devising a search mechanism for P2P networks.
It is based on classifying queries into classes. This classification serves as a type of name aggregation, which enables nodes to infer the likelihoods of resolving class queries, which, in turn, are used in learning how to forward queries.
Our approach is fully distributed in that it involves information sharing only amongst neighbors, and achieves throughput optimality subject to a Grade of Service (GoS) constraint on query resolution. Throughput optimality here refers to stabilizing the query queues if at all possible. The GoS constraint corresponds to guaranteeing that each query class follows a route over which it has a reasonable chance, of being resolved.

In our work we provide and evaluate several interesting variations on our throughput optimal mechanism that help significantly improve the delay performance, and further reduce complexity, making it amenable to implementation. Specifically, we formally show that back pressure with aggregated queues, where aggregation is based on queries' histories, is throughput optimal for fully connected super-peer networks. This provides a basis for substantially reducing complexity by approximations, e.g., in the case where content is randomly placed amongst peers.

Although there has been substantial work on back pressure-based mechanisms for routing traffic in wireline and wireless networks, we believe this result represents a significant technical contribution that has a substantially different character. Our approach effectively guides a routing of queries (rather than traffic) to locations where they are likely to be resolved. Our approach not only achieves throughput-optimality, i.e., stability, but does so under Grade of Service constraints on query resolution, i.e., it achieves the maximal capacity region subject to such GoS constraints.

P2P network architecture for the storage and query of dynamic spatio-temporal flow of events: As part of this project we have also worked towards developing a novel approach to store and query real-time (contextual) information from/to a population of wireless/wired devices generating a stream of spatio-temporal events corresponding to sensors/applications and/or humans. The approach is based on a novel peer-to-peer overlay network architecture which exploits the spatial relevance of events and the locality of queries to store and query events close to where they are likely to be consumed. To date we have developed most of the basic protocols necessary to operate such a system and proposed a simple model for such systems that can be used to analyze the sensitivity of performance (query delay) to node density and overlay connectivity.

Representative Publications:

Learning to Route Queries in Unstructured P2P Networks: Achieving Throughput Optimality Subject to Query Resolution Constraints.
V. Shah, G. de Veciana, and G Kesidis, IEEE INFOCOM, March 2012.
P2P Network for Storage and Query of a Spatio-Temporal Flow of Events.
A. Ziotopoulos, and G. de Veciana in International Workshop on Mobile Peer-to-Peer Computing in conjunction with the IEEE Conference on Pervasive Computing and Communications (Percom), March 2011
Addressing Non-homogeneities in a Ubiquitious P2P Platform for Context Exchange.
A. Ziotopoulos and G. de Veciana, IEEE WoWMoM Workshop on Autonomic and Opportuistic Communications, June 2012.
STARR-DCS: Spatio-Temporal Adaptation of Random Replication for Data Centric Storage.
A. Cuevas, and M. Uruena and G. de Veciana and A. Yadav ACM Trasactions on Sensor Networks, 10(3):, to appear, 2014.
Dynamic Data-Centric Storage for Long-Term Storage in Wireless Sensor and Actor Networks.
Angel Cuevas, Manuel Uruena, Gustavo de Veciana, Ruben Cuevas and Noel Crespi Wireless Networks Journal, May 2013
Decentralized Capacity Reallocation for a Loss Network.
A. Kurve and G. Pang and G. Kesidis and G. de Veciana, Conference on Information Sciences and Systems, March 2012.
Mobile Applications and Algorithms to Facilitate Electric Vehicle Deployment.
Yuhuan Du and Gustavo de Veciana, Proceedings of the IEEE, January 2013.