Abstract
Grid computing, or the computational grid, has long been a broad research field in academia. A computational grid provides resource sharing across multi-institutional virtual organizations for dynamic problem solving. Heterogeneous resources belonging to different administrative domains are distributed virtually across different networks in a computational grid. Consequently, a failure of any type can occur at any time, and any node running in the grid environment might fail. Fault tolerance is therefore an important and challenging issue in grid computing, since the dependability of individual grid resources cannot be guaranteed. A fault-tolerant system is necessary to make computational grids more effective and reliable. The objective of this paper is to test transient crash and omission failures in resource scheduling. The paper presents an overview of fault tolerance and its techniques, task replication, and the most fitting resource allocation algorithm.