Extracting Structured Data From Unstructured Data Through HiveQL

Authors

K. Balakrishna, C. Penchalaiah, Smt. S. Jessica Saritha

Abstract

A relational database management system (RDBMS) can store structured data only up to some gigabytes. Processing larger data sets is difficult and time-consuming. Hadoop is used to overcome these problems. Apache Hadoop is a framework for big data management and analysis. The Hadoop core stores structured, unstructured, and semi-structured data in the Hadoop Distributed File System (HDFS) and provides a simple MapReduce programming model to process and analyze, in parallel, the data stored in this distributed system. Apache Hive is a data warehouse built on top of Hadoop that allows you to query and manage large data sets in distributed storage using a SQL-like language called HiveQL; Hive translates queries into a series of MapReduce jobs. In the existing system, unstructured data stored in HDFS cannot be retrieved in a structured format through HiveQL. This project converts Twitter data into a structured format by using HiveQL with a SerDe (serializer/deserializer). The Twitter data is stored in HDFS by a data streaming process.
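The deserialization step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and not Hive's SerDe interface; it only shows the idea of mapping one JSON tweet per line onto a fixed set of columns, which is roughly what a JSON SerDe does for a Hive table. The column list, the helper names `get_path` and `deserialize`, and the sample tweet are all assumptions made for illustration.

```python
import json

# Hypothetical column list, standing in for the columns of a Hive table
# definition. Dotted names refer to fields nested inside the tweet.
COLUMNS = ["id", "created_at", "text", "user.screen_name", "user.followers_count"]


def get_path(record, dotted_path):
    """Follow a dotted path through nested dicts; return None if a step is missing."""
    value = record
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def deserialize(line, columns=COLUMNS):
    """Turn one JSON tweet (a single line of text) into a fixed-width row tuple."""
    record = json.loads(line)
    return tuple(get_path(record, col) for col in columns)


# Made-up sample record; the "lang" field is ignored because it is not a column,
# and the missing followers_count becomes None (a NULL in table terms).
sample = json.dumps({
    "id": 1,
    "created_at": "2015-04-01",
    "text": "hello hive",
    "user": {"screen_name": "alice"},
    "lang": "en",
})
row = deserialize(sample)
```

Fields that are absent from a tweet come back as `None`, so every row has the same width regardless of how irregular the input is. That fixed shape is what makes the data queryable with SQL-like statements.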

Article Details

Published

2017-12-31

Section

Articles

How to Cite

Extracting Structured Data From Unstructured Data Through HiveQL. (2017). International Journal of Engineering and Computer Science, 4(04). https://ijecs.in/index.php/ijecs/article/view/1739