Big data is a collection of data sets so large, and often so unstructured, that traditional database systems have difficulty storing and processing them.
The 3V's model is commonly used to define big data:
- high volume
- high velocity
- high variety
Apache Hadoop is an open-source framework for storing and processing big data.
It implements a distributed file system (HDFS) so that data can be stored and processed in parallel across a cluster of machines.
To configure and set up Hadoop on your computer, install a Linux OS and then install Hadoop on it.
During setup, you need to edit the following configuration files (a minimal sketch follows the list):
1) core-site.xml - core settings, such as the default file system URI
2) mapred-site.xml - MapReduce settings
3) hdfs-site.xml - HDFS settings, such as the block replication factor
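For a single-node (pseudo-distributed) setup, the two snippets below show what core-site.xml and hdfs-site.xml might contain. This is only a minimal sketch: the port 9000 and the replication factor of 1 are common single-node choices, not requirements.

    <!-- core-site.xml: tell clients where the HDFS namenode runs -->
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>

    <!-- hdfs-site.xml: one replica is enough on a single machine -->
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>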
Apache Hive is a data warehouse infrastructure built on top of Hadoop for data summarization, querying, and analysis.
HiveQL, an SQL-like language, is used to query data stored in Hive.
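As a quick taste of HiveQL, the sketch below creates a table over comma-separated log data and runs a simple aggregation. The table name, columns, and file path are made up for illustration; Hive compiles such queries into jobs that run on Hadoop.

    -- create a table over comma-separated records (hypothetical schema)
    CREATE TABLE logs (ip STRING, url STRING, hits INT)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

    -- load a file from HDFS into the table (hypothetical path)
    LOAD DATA INPATH '/user/demo/logs.csv' INTO TABLE logs;

    -- SQL-like aggregation: total hits per URL
    SELECT url, SUM(hits) AS total_hits
    FROM logs
    GROUP BY url
    ORDER BY total_hits DESC;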