Abstract
Detecting node failures within Peer-to-Peer networks is an inherent trade-off between timely detection and consuming bandwidth on network maintenance. In the absence of user-driven messages, the majority of P2P networks rely upon the exchange of periodic keep-alive messages to maintain connections and network topology. We investigate three novel algorithms which prioritise keep-alive messages to nodes that are more likely to have failed. In doing so, these algorithms significantly reduce the expected delay between failures occurring and their subsequent detection in comparison to the standard approach, whilst consuming similar levels of bandwidth. Our algorithms build upon several studies that have shown that older peers are more likely to remain in the network than their short-lived counterparts. Each of our algorithms increase the interval between successive keep-alive messages as peers age in the system, based upon the distribution of peer session times and the current age of peers. We extensively describe the details of each algorithm, before comparing them to the standard periodic approach using simulations based upon measured network data. Furthermore, we show that these algorithms are complimentary to existing gossip-based mechanisms and investigate alternate methods of ascertaining a node’s age so that our algorithms can be robustly deployed in untrustworthy environments.
Similar content being viewed by others
Notes
We would like to thank Matson Systems for providing us fully anonymized data from the LegalTorrents.com community trackers.
References
Gnutella (2010) Gnutella protocol development wiki [Online]. http://wiki.limewire.org/index.php?title=GDF
Cohen B (2003) Incentives build robustness in BitTorrent. In: Workshop on economics of Peer-to-Peer systems, vol 6
Stoica I, Morris R, Karger D, Kaashoek F, Balakrishnan H (2001) Chord: a scalable Peer-To-Peer lookup service for internet applications. In: Proceedings of the 2001 ACM SIGCOMM conference, pp 149–160. [Online]. Available: citeseer.ist.psu.edu/stoica01chord.html
Rowstron AIT, Druschel P (2001) Pastry: scalable, decentralized object location, and routing for large-scale Peer-to-Peer systems. In: Proc. of the IFIP/ACM int. conf. on distributed systems platforms. Springer, Heidelberg, pp 329–350
Rhea S, Geels D, Roscoe T, Kubiatowicz J (2004) Handling churn in a dht. In: ATEC ’04: Proceedings of the annual conference on USENIX annual technical conference. USENIX Association, Berkeley, p 10
Alima L, El-Ansary S, Brand P, Haridi S (2003) DKS (N, k, f): a family of low communication, scalable and fault-tolerant infrastructures for P2P applications. In: Proceedings of the 3st international symposium on cluster computing and the grid. IEEE Computer Society, Washington, DC, p 344
Krishnamurthy S, El-Ansary S, Aurell E, Haridi S (2008) Comparing maintenance strategies for overlays. In: Proceedings of the 16th Euromicro conference on parallel, distributed and network-based processing (PDP 2008). IEEE Computer Society, Washington, DC, pp 473–482
Ghinita G, Teo YM (2006) An adaptive stabilization framework for distributed hash tables, p 10
Castro M, Costa M, Rowstron A (2004) Performance and dependability of structured Peer-to-Peer overlays. In: DSN ’04: Proceedings of the 2004 international conference on dependable systems and networks. IEEE Computer Society, Washington, DC, p 9
Zhuang S, Geels D, Stoica I, Katz R (2005) On failure detection algorithms in overlay networks. In: Proceedings IEEE INFOCOM 2005. 24th annual joint conference of the IEEE Computer and Communications Societies, vol 3
Dedinski I, Hofmann A, Sick B (2007) Cooperative keep-alives: an efficient outage detection algorithm for P2P overlay networks. In: P2P ’07: Proceedings of the seventh IEEE international conference on Peer-to-Peer computing. IEEE Computer Society, Washington, DC, pp 140–150
Li J, Stribling J, Morris R, Kaashoek MF (2005) Bandwidth-efficient management of dht routing tables. In: NSDI’05: Proceedings of the 2nd conference on symposium on networked systems design & implementation. USENIX Association, Berkeley, pp 99–114
So KCW, Sirer EG (2007) Latency and bandwidth-minimizing failure detectors. SIGOPS Oper Syst Rev 41(3):89–99
Stutzbach D, Rejaie R (2006) Understanding churn in Peer-to-Peer networks. In: IMC ’06: Proceedings of the 6th ACM SIGCOMM on Internet measurement. ACM, New York, pp 189–202. [Online]. Available: http://portal.acm.org/citation.cfm?id=1177105
Steiner M, En-Najjary T, Biersack EW (2009) Long term study of peer behavior in the kad dht. IEEE/ACM Trans Netw 17(5):1371–1384
Bustamante FE, Qiao Y (2004) Friendships that last: peer lifespan and its role in P2P protocols. In: Web content caching and distribution: proceedings of the 8th international workshop. Kluwer, Norwell, pp 233–246
Stutzbach D, Rejaie R, Sen S (2005) Characterizing unstructured overlay topologies in modern P2P file-sharing systems. In: Proc. of Internet measurement conference (IMC). [Online]. Available: https://db.usenix.org/events/imc05/tech/stutzbach.html
Shimizu R (1979) On a lack of memory property of the exponential distribution. Ann Inst Stat Math 31(1):309–313
Saroiu S, Gummadi KP, Gribble SD (2003) Measuring and analyzing the characteristics of napster and gnutella hosts. Multimedia Syst 9(2):170–184
Leonard D, Rai V, Loguinov D (2005) On lifetime-based node failure and stochastic resilience of decentralized Peer-to-Peer networks. In: SIGMETRICS ’05: Proceedings of the 2005 ACM SIGMETRICS international conference on measurement and modeling of computer systems. ACM, New York, pp 26–37
Izal M (2010) Bittorrent traces and tools. World Wide Web, http://mikel.tlm.unavarra.es/~mikel/bt_pam2004/
Izal M, Urvoy-Keller G, Biersack EW, Felber P, Al Hamra A, Garcés-Erice L (2004) Dissecting bittorrent: five months in a torrent’s lifetime. In: Barakat C, Pratt I (eds) Passive and active network measurement, ser. Lecture notes in computer science, vol 3015. Springer, Berlin, pp 1–11. [Online]. Available: http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.5.7345
Saroiu S, Gummadi P, Gribble S (2002) A measurement study of Peer-to-Peer file sharing systems. [Online]. Available: http://citeseer.ist.psu.edu/saroiu02measurement.html
Gummadi KP, Dunn RJ, Saroiu S, Gribble SD, Levy HM, Zahorjan J (2003) Measurement, modeling, and analysis of a Peer-to-Peer file-sharing workload. In: SOSP ’03: Proceedings of the nineteenth ACM symposium on operating systems principles. ACM, New York, pp 314–329
Mahajan R, Castro M, Rowstron A (2003) Controlling the cost of reliability in Peer-to-Peer overlays. In: IPTPS
Chen W, Toueg S, Aguilera MK (2000) On the quality of service of failure detectors. IEEE Trans Comput 51:561–580
Li J, Stribling J, Morris R, Kaashoek MF, Gil TM (2005) A performance vs. cost framework for evaluating DHT design tradeoffs under churn. In: Proc. of the 24th Infocom
Bustamante FE, QiaoY (2008) Designing less-structured P2P systems for the expected high churn. IEEE/ACM Trans Netw 16(3):617–627
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Price, R., Tiňo, P. & Theodoropoulos, G. Still Alive: Extending Keep-Alive Intervals in P2P Overlay Networks. Mobile Netw Appl 17, 378–394 (2012). https://doi.org/10.1007/s11036-011-0317-3
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11036-011-0317-3