Abstract
Distributed file systems are client-server based systems in which a central server stores files that clients can access given the proper authorization rights. Like an operating system, a distributed file system manages the overall system through naming conventions and mapping schemes. The Google File System (GFS) is a proprietary system developed by Google for its own use, deployed on commodity hardware to retain the enormous amounts of data the company generates. The Hadoop Distributed File System (HDFS), an open source community project developed largely at Yahoo!, was designed to store very large data sets reliably while providing high bandwidth for streaming data to client applications. To process the data stored in HDFS, Hadoop provides users with a programming model called MapReduce. This model allows users to reliably decompose a large problem into smaller sub-problems distributed across many nodes in a cluster. In this paper, we describe GFS and HDFS, compare and contrast the two systems along a list of attributes, and present the basic functionality of the MapReduce framework.
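To make the MapReduce model mentioned above concrete, the following is a minimal sketch of its map, shuffle, and reduce phases as a single-process word-count simulation in Python. This is a toy illustration of the programming model only, not Hadoop's actual (Java-based) MapReduce API; the function names and input documents are invented for the example.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every input document.
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups.items()

def reduce_phase(grouped):
    # Reduce: sum the counts emitted for each word.
    return {key: sum(values) for key, values in grouped}

# Hypothetical input "files" standing in for data stored in HDFS.
docs = ["hadoop stores data", "hadoop processes data"]
counts = reduce_phase(shuffle(map_phase(docs)))
# counts == {"hadoop": 2, "stores": 1, "data": 2, "processes": 1}
```

In a real Hadoop deployment, the map and reduce functions run in parallel on many cluster nodes, and the framework handles the shuffle, scheduling, and fault tolerance that this single-process sketch omits.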