Video: .mp4 (1280x720, 30 fps(r)) | Audio: aac, 48000 Hz, 2ch | Size: 348 MB
Genre: eLearning Video | Duration: 6 lectures (47 mins) | Language: English
Diabetes Prediction using Machine Learning in Apache Spark
Homepage: https://www.udemy.com/course/data-science-hands-on-diabetes-prediction-with-pyspark-mllib/
What you'll learn
Diabetes Prediction using Spark Machine Learning (Spark MLlib)
Learn PySpark fundamentals
Work with DataFrames in PySpark
Analyze and clean data
Process data with a machine learning model using Spark MLlib
Build and train a logistic regression model
Evaluate performance and save the model
Requirements
Basics of Python
Description
This is a hands-on, 1-hour machine learning project using PySpark. You learn by practice.
No unnecessary lectures. No unnecessary details.
A precise, to-the-point, and efficient course about machine learning in Spark.
About PySpark:
PySpark combines Apache Spark and Python. It is a tool used in big data analytics.
Apache Spark is an open-source cluster-computing framework built around speed, ease of use, and streaming analytics, whereas Python is a general-purpose, high-level programming language. It provides a wide range of libraries and is widely used for machine learning and real-time streaming analytics.
In other words, PySpark is a Python API for Spark that lets you harness the simplicity of Python and the power of Apache Spark in order to tame big data. We will use big data tools in this project.
You will learn more in this one hour of practice than in hundreds of hours of unnecessary theoretical lectures.
Learn the most important aspects of Spark machine learning (Spark MLlib):
PySpark fundamentals and implementing Spark machine learning
Importing and working with datasets
Process data with a machine learning model using Spark MLlib
Build and train a logistic regression model
Test and analyze the model
We will build a model to predict diabetes. This is a 1-hour project. In this hands-on project, we will complete the following tasks:
Task 1: Project overview
Task 2: Intro to the Colab environment & installing dependencies to run Spark on Colab
Task 3: Clone & explore diabetes dataset
Task 4: Data Cleaning
Check for missing values
Replace unnecessary values
Task 5: Correlation & feature selection
Task 6: Build and train Logistic Regression Model using Spark MLlib
Task 7: Performance evaluation & Test the model
Task 8: Save & load model
Who this course is for:
Anyone interested in Data analysis with Spark and ML
Anyone who wants to learn fundamentals of Apache Spark in Big Data Analytics