Abstract
This paper presents Software Implemented Fault Tolerance (SIFT) for hypercubes, implemented as a software layer that runs on each node of the nCube parallel computer. SIFT combines application-level error-detection software with a fast reconfiguration algorithm for avoiding faulty nodes. The Balance Spanning Tree (BST) is used to embed tree-based algorithms into the hypercube topology. The software layer tolerates any single faulty node in the hypercube, and more than 90% of multiple faults can be tolerated without backtracking. The SIFT approach has been successfully implemented for a quadtree data compression algorithm on 64x64 and 128x128 compressible and uncompressible data. The experiments were run on 4- and 16-node nCubes. The time overhead (reconfiguration and recomputation time) incurred by an injected fault was estimated experimentally. The coverage factor provided by the error-detection software was estimated by means of nSOFIT for the quadtree data compression algorithm.
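To make the idea of reconfiguring a tree around a faulty node concrete, here is a minimal sketch. It is not the paper's BST construction or its reconfiguration algorithm: it substitutes a plain breadth-first spanning tree, and the function name `spanning_tree` and its parameters are hypothetical. It assumes only that hypercube nodes are numbered so that neighbors differ in exactly one address bit.

```python
from collections import deque


def spanning_tree(dim, root=0, faulty=frozenset()):
    """Build a BFS spanning tree over the healthy nodes of a hypercube.

    Illustrative stand-in only (assumption): a plain breadth-first tree,
    not the Balance Spanning Tree used in the paper.
    Returns a dict mapping each reachable healthy node to its parent;
    the root maps to None.
    """
    if root in faulty:
        raise ValueError("root node is faulty")
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Neighbors in a hypercube differ from `node` in exactly one bit.
        for bit in range(dim):
            neighbor = node ^ (1 << bit)
            if neighbor not in faulty and neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


# Rebuild the tree for a 16-node cube after node 5 is marked faulty.
tree = spanning_tree(4, root=0, faulty={5})
```

For a 16-node cube with one faulty node, the returned tree covers the 15 healthy nodes, and every tree edge is a physical hypercube link. Rebuilding from scratch like this is the simplest possible policy; a practical scheme would presumably reconfigure only the affected subtree.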
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Avresky, D.R., Geoghegan, S. (1999). Software Implemented Fault Tolerance in Hypercube. In: Amestoy, P., et al. Euro-Par’99 Parallel Processing. Euro-Par 1999. Lecture Notes in Computer Science, vol 1685. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48311-X_71
DOI: https://doi.org/10.1007/3-540-48311-X_71
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66443-7
Online ISBN: 978-3-540-48311-3
eBook Packages: Springer Book Archive