O'Reilly - Hands-On Big Data Analysis with Hadoop 3
by Tomasz Lelek | Released August 2018 | ISBN: 9781788999908
Perform real-time data analytics with Hadoop

About This Video
- Analyze large volumes of data effectively by combining the power of big data processing tools such as Hadoop and Spark Streaming
- Work with different kinds of data and perform real-life data operations
- Explore best use cases, identify problem areas, and solve them with the best open source tools

In Detail
This course is your guide to performing real-time data analytics and stream processing with Spark. Use different components and tools such as HDFS, HBase, and Hive to process raw data. Learn how tools such as Hive and Pig aid in this process.

In this course, you will start off by learning data analysis techniques with Hadoop using tools such as Hive. Furthermore, you will learn to apply these techniques in real-world big data applications. Also, you will delve into Spark and its related tools to perform real-time data analytics, streaming, and batch processing on your application.

Finally, you'll learn how to extend your analytics solutions to the cloud.

Please note that this course is based on Hadoop 3.0, but the code used in the course is compatible with Hadoop 3.2.
- Chapter 1 : HDFS and HBase – The Hadoop Database
- The Course Overview 00:01:51
- Why HBase? 00:04:29
- HDFS and HBase 00:02:56
- Column-Oriented Database Concepts 00:04:03
- Creating an HBase Database – Using HBase from Java 00:08:27
- Using Sqoop to Import Data to HDFS 00:04:50
- Chapter 2 : Data Processing Using MapReduce
- MapReduce Job Architecture 00:05:07
- Learning Spark’s Key Concepts – Spark Context, Driver, and RDD 00:06:15
- Spark API – Functional Programming Using Spark 00:05:02
- Spark Transformations and Actions 00:06:17
- Writing MapReduce Jobs Using Apache Spark 00:05:40
- Chapter 3 : Analyzing Data Using Hive and Pig
- Introduction to Pig 00:06:27
- Hive Architecture and Use Cases 00:05:45
- Hive Query Language 00:05:40
- Using Hive and Pig to Perform MapReduce Query 00:07:05
- Chapter 4 : Performing Real-Time Events Analysis Using Spark Streaming
- Introducing Spark Streaming 00:06:14
- Handling Time in High-Velocity Streams 00:05:58
- Building Streaming Application 00:04:09
- Filtering Bots from a Stream of Page View Events 00:06:52