O'Reilly - Spark for Machine Learning
by Tomasz Lelek | Released September 2017 | ISBN: 9781786466594
About This Video
- Leverage Spark to distribute your machine learning processing and make it much faster than with a standard machine learning toolkit such as R or Python
- Use Natural Language Processing techniques to create a program that learns the structure of the posts in a forum
- Use the Gaussian Mixture Model and Logistic Regression from MLlib
In Detail
Spark lets you apply machine learning techniques to data in real time, giving users immediate insights based on what is happening right now. Using Spark, we can create machine learning models and programs that are distributed and much faster than standard machine learning toolkits such as R or Python.
In this course, you'll learn how to use Spark MLlib. You'll find out about supervised and unsupervised ML algorithms. You'll build classification models, extracting proper features from text using Word2Vec to achieve this. Next, we'll build a Logistic Regression model with Spark. Then we'll find clusters and correlations in our data using K-Means clustering. We'll learn how to validate models using cross-validation and the area under the ROC curve.
You'll also build an effective recommendation model using a distributed Spark algorithm. We will look at graph processing with the GraphX library. By the end of the course, you'll be able to focus on leveraging Spark to create fast and efficient machine learning programs.
- Chapter 1 : Advanced Text Processing and Building Classification Model
- The Course Overview 00:01:05
- Analyzing Text Input Data 00:03:45
- Extracting Features from Data 00:03:26
- Implementing Word2Vec Using Apache Spark 00:12:50
- Chapter 2 : Building a Regression Model with Spark
- Logistic Regression Explanation 00:02:43
- Writing Logistic Regression Model per Author 00:08:41
- Validate Models Using Cross-Validation 00:03:41
- Chapter 3 : Building a Clustering Model with Spark
- Analyzing Time of Post Using Clustering - GMM Explanation 00:03:17
- Implementing GMM in Apache Spark 00:02:01
- Measuring Accuracy Using Area Under ROC 00:02:10
- Chapter 4 : Dimensionality Reductions and Recommendation Engines
- Dimensionality Reduction 00:03:30
- Building Recommendation Engine 00:02:12
- Using Recommendation Engine to Get TOP Recommendations 00:10:25
- Chapter 5 : Graph Processing with GraphX
- What is a Graph? 00:06:45
- GraphX API 00:08:15
- Structural Operations on Graph 00:06:28
- Neighborhood Aggregation 00:04:23