Last updated 11/2020
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz
Language: English | Size: 5.37 GB | Duration: 10h 56m
Run your own Hadoop clusters on your own machine or in the cloud

What you'll learn
Understand the Hadoop 2.x architecture
Create MapReduce jobs
Plan, install and configure core Hadoop services on a cluster
Validate the cluster using HDFS, MapReduce and Spark
Understand the cluster life cycle and performance tuning of a Hadoop cluster
Hands-on solutions to your perplexing, real-world big data problems

Requirements
Good knowledge of Java

Description
Hadoop is one of the most popular, reliable and scalable distributed computing and storage frameworks for Big Data solutions. It comprises components designed to enable tasks on a distributed scale, across multiple servers and thousands of machines.

This comprehensive 3-in-1 training course gives you a strong foundation by exploring the Hadoop ecosystem with real-world examples. You’ll discover how to set up an HDFS cluster, format it, and transfer data between your local storage and the Hadoop filesystem. You’ll also get hands-on solutions to 10 real-world use cases using Hadoop.

Contents and Overview
This training program includes 3 complete courses, carefully chosen to give you the most comprehensive training possible.

The first course, Getting Started with Hadoop 2.x, opens with an introduction to the world of Hadoop, where you will learn about nodes, data sets, and operations such as map and reduce. The second section deals with HDFS, Hadoop's file system used to store data. Further on, you’ll discover the differences between jobs and tasks, and get to know the Hadoop UI. After this, we turn our attention to storing data in HDFS and data transformations. Lastly, we will learn how to implement an algorithm the Hadoop MapReduce way and analyze the overall performance.

The second course, Hadoop Administration and Cluster Management, starts by installing Apache Hadoop for a cluster setup and configuring the required services. You’ll learn various cluster operations such as validation and expanding and shrinking Hadoop services.
You will then move on to gain a better understanding of administrative tasks like planning your cluster, monitoring, logging, security, troubleshooting and best practices. Techniques to keep your Hadoop clusters highly available and reliable are also covered in this course.

The third course, Solving 10 Hadoop'able Problems, covers the core parts of the Hadoop ecosystem, giving you a broad understanding and getting you up and running fast. Next, it describes a number of common problems, presented as case-study projects, that Hadoop is able to solve. These are broken down into sections by project, each serving as a specific use case for solving big data problems.

By the end of this Learning Path, you’ll be able to plan, deploy, manage, monitor and performance-tune your Hadoop cluster with Apache Hadoop.

About the Author
A K M Zahiduzzaman is a software engineer with NewsCred Dhaka. He is a software developer and technology enthusiast. He was a Ruby on Rails developer, but now works with Node.js, AngularJS and Python. He is also working with a much wider vision as a technology company. The next goal is introducing SOA within the current applications to scale development via microservices. Zahiduzzaman has a lot of experience with Spark and is passionate about it. He is also a guitarist and has a band too. He was also a speaker at an international event in Dhaka. He is very enthusiastic and loves to share his knowledge.

Gurmukh Singh is a technology professional with 14+ years of industry experience in infrastructure design, distributed systems, performance optimization, and networks. He has worked in the big data domain for the last 5 years and provides consultancy and training on various technologies. He has worked with companies such as HP, JP Morgan, and Yahoo and has authored the book Monitoring Hadoop.

Tomasz Lelek is a Software Engineer and Co-Founder of InitLearn. He mostly does programming in Java and Scala. He dedicates his time and efforts to get better at everything.
He is currently delving into big data technologies. Tomasz is very passionate about everything associated with software development. He has been a speaker at a few conferences in Poland (Confitura and JDD) and at the Krakow Scala User Group. He has also conducted a live coding session at Geecon Conference. He was also a speaker at an international event in Dhaka. He is very enthusiastic and loves to share his knowledge.

Overview

Section 1: Getting Started with Hadoop 2.x
Lecture 1 The Course Overview
Lecture 2 Installing Hadoop in Local
Lecture 3 Bring Process to Data
Lecture 4 NameNode Versus DataNode
Lecture 5 Map and Reduce Operations
Lecture 6 Order of Execution and Parallel Thinking
Lecture 7 Formatting a HDFS
Lecture 8 Formatting a HDFS
Lecture 9 Some Helpful Commands to Communicate with the HDFS
Lecture 10 HDFS Protocol and Using It in Applications
Lecture 11 Hadoop Jobs Versus Tasks
Lecture 12 The Hadoop UI for Task Progress
Lecture 13 Running a Couple of Example Jobs
Lecture 14 Analyze the Work Flow/Data Flow/Process Flow
Lecture 15 Introduction to the Movie Dataset
Lecture 16 Data Transformation and Storing to HDFS
Lecture 17 Devise a Simple Algorithm for Recommendation
Lecture 18 Implement the Algorithm in Hadoop Map-Reduce Way and Analyze Performance

Section 2: Hadoop Administration and Cluster Management
Lecture 19 The Course Overview
Lecture 20 Navigation of GitBash
Lecture 21 Navigation of Vagrant
Lecture 22 Navigation of VirtualBox
Lecture 23 Planning a Single Node Setup
Lecture 24 Install Apache Hadoop
Lecture 25 Apache Hadoop Overview
Lecture 26 Hadoop Distributed File System (HDFS)
Lecture 27 YARN Overview
Lecture 28 MapReduce
Lecture 29 Planning Hadoop Services Placement
Lecture 30 Planning ZooKeeper Placement
Lecture 31 Planning HDFS Service Placement
Lecture 32 Planning YARN
Lecture 33 Planning Spark Services
Lecture 34 HDFS Concepts
Lecture 35 HDFS Data Movement
Lecture 36 HDFS Admin Commands
Lecture 37 MapReduce Jobs
Lecture 38 Spark Jobs
Lecture 39 Start/Stop Services
Lecture 40 Manage Cluster Using Ambari
Lecture 41 Hadoop Upgrade
Lecture 42 Scaling Cluster – Part 1
Lecture 43 Scaling Cluster – Part 2
Lecture 44 HDFS Masters
Lecture 45 HA Configuration
Lecture 46 YARN Masters
Lecture 47 Linux ACLs
Lecture 48 HDFS ACLs Security – Part 1
Lecture 49 HDFS ACLs Security – Part 2
Lecture 50 Hadoop Users and Groups
Lecture 51 NameNode UI
Lecture 52 Apache Hadoop Auditing
Lecture 53 Hadoop Metrics
Lecture 54 Hadoop Logs and Monitoring
Lecture 55 Hadoop Troubleshooting – Part 1
Lecture 56 Hadoop Troubleshooting – Part 2

Section 3: Solving 10 Hadoop'able Problems
Lecture 57 The Course Overview
Lecture 58 Hadoop Distributed File System (HDFS)
Lecture 59 Distributed Compute Capability YARN
Lecture 60 Apache Hive for ETL and SQL Like
Lecture 61 Message Queuing and Data Ingestion Kafka
Lecture 62 NoSQL Datastores – Hadoop HBase, Accumulo
Lecture 63 Machine Learning – Spark and Spark MLlib
Lecture 64 Stream Processing – Spark Streaming
Lecture 65 Processing Payment Data from an Event Stream
Lecture 66 Advanced Aggregations Using Streaming API – PaymentAnalyzer
Lecture 67 Storing Series Data in HBase
Lecture 68 Detecting BOT Traffic Using Spark Streaming
Lecture 69 Make Web Log Data Queryable – Hive Sink
Lecture 70 Investigating Customers Data in Hive
Lecture 71 Trending Supply Chain – Finding Top Seller Item in a Streaming Way
Lecture 72 Enriching Top Sellers with Additional Information
Lecture 73 Analyzing Customer Churn (Quantitative) Using DataFrame Queries
Lecture 74 Analyzing Customer Churn (Amounts) Using DataFrame Queries
Lecture 75 Storing Low Granularity Structured Sensor Data in HBase
Lecture 76 Consuming Sensor Data Stored in HBase – Scan and Count
Lecture 77 Building Summaries on Data Streaming from Devices
Lecture 78 Introducing Spark GraphX – How to Represent a Graph?
Lecture 79 Perform Graph Operations Using GraphX
Lecture 80 Counting Degree of Vertices
Lecture 81 Neighborhood Aggregations – Collecting Neighbors
Lecture 82 Structural Operators – Connected Components
Lecture 83 Page Rank Using Spark GraphX
Lecture 84 Anomaly Detection
Lecture 85 Analyzing Web Logs for Suspicious Activity and Loading into Spark
Lecture 86 Implementing Clustering – Choosing Number of Clusters
Lecture 87 Detecting Anomalies in Network Traffic
Lecture 88 Analyzing Post for an Author
Lecture 89 Extracting Information from Unstructured Text
Lecture 90 Extracting Information Via Spark DataFrame
Lecture 91 Sentiment Analysis of Posts Using Logistic Regression
Lecture 92 Finding an Author of a Post
Lecture 93 Downloading and Setting Cloudera Sandbox
Lecture 94 Finding What Products Users Wants to Buy Using Cloudera Sandbox Toolkit
Lecture 95 Using Movies History to Suggest Interesting Content
Lecture 96 Testing and Experimenting with Recommendation Engine

This course is perfect for budding data scientists and data analysts who have a firm understanding of Java and want to get started with Hadoop.