O'Reilly - Machine Learning with scikit-learn LiveLessons
by David Mertz | Released January 2019 | ISBN: 9780135474198
6+ Hours of Video Instruction

Learn the main concepts and techniques used in modern machine learning through numerous examples written in scikit-learn.

Overview

Machine Learning with scikit-learn LiveLessons is your guide to the scikit-learn library, which provides a wide range of machine learning algorithms unified under a common and intuitive Python API. Most of the dozens of classes provided for various kinds of models share most of the same calling interface. Quite often you can substitute one algorithm for another with very little or no change in your underlying code. This enables you to explore the problem space quickly and often to arrive at an optimal, or at least satisficing, approach to your problem domain or datasets.

The scikit-learn library is built on the foundations of the numeric Python stack. It uses NumPy for its fundamental data structures and optimized performance, and it plays well with pandas and matplotlib. It is free software under a BSD license. The great bulk of machine learning programming in Python is done with scikit-learn, at least outside the specialized domain of deep neural networks.

About the Instructor

David Mertz has been involved with the Python community for 20 years, with data science (under various previous names), and with machine learning since way back when it was more likely to be called “artificial intelligence.” He was a director of the Python Software Foundation for six years and continues to serve on, or chair, a variety of PSF working groups. He has also written quite a bit about Python: the column Charming Python for IBM developerWorks, for many years; Text Processing in Python (Addison-Wesley, 2003); and two short books for O’Reilly. He created the data science training program for Anaconda, Inc., and was a senior trainer for them.
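To illustrate the Overview's point that one algorithm can often be substituted for another with little or no change in code, here is a minimal sketch. It is not taken from the course: the iris dataset, the three classifiers, and the split parameters are assumptions chosen only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumed example data: the small iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Three unrelated algorithms (an arbitrary choice for this sketch).
models = [
    DecisionTreeClassifier(random_state=0),
    KNeighborsClassifier(n_neighbors=5),
    LogisticRegression(max_iter=1000),
]

# The loop body is identical for every model: fit on training data,
# then score (mean accuracy) on the held-out test data.
scores = {}
for model in models:
    model.fit(X_train, y_train)
    scores[type(model).__name__] = model.score(X_test, y_test)

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Swapping in another classifier only means adding it to the `models` list; the `fit` and `score` calls stay the same.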
Skill Level

Intermediate

Learn How To

- Use various machine learning techniques
- Explore a dataset
- Perform various types of classification
- Use regression, clustering, and hyperparameters
- Use feature engineering and feature selection
- Implement data pipelines
- Develop robust train/test splits

Who Should Take This Course

- Programmers and statisticians interested in using Python and the scikit-learn library to implement machine learning

Course Requirements

- Programming experience

Table of Contents

- Introduction
- Lesson 1: What Is Machine Learning?
- Lesson 2: Exploring a Dataset
- Lesson 3: Classification
- Lesson 4: Regression
- Lesson 5: Clustering
- Lesson 6: Hyperparameters
- Lesson 7: Feature Engineering and Feature Selection
- Lesson 8: Pipelines
- Lesson 9: Robust Train/Test Splits
- Summary

About Pearson Video Training

Pearson publishes expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. These professional and personal technology videos feature world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, Pearson IT Certification, Prentice Hall, Sams, and Que. Topics include: IT Certification, Network Security, Cisco Technology, Programming, Web Development, Mobile Development, and more. Learn more about Pearson Video Training at http://www.informit.com/video.

- Introduction
- Machine Learning with scikit-learn LiveLessons: Introduction 00:05:38
- Lesson 1: What is Machine Learning?
- Learning objectives 00:01:02
- 1.1 Install 00:09:45
- 1.2 Understand the ML Libraries (new lesson, title TBD) 00:07:35
- 1.3 Describe the techniques used in machine learning 00:06:08
- 1.4 Understand the difference between "deep learning" and other ML techniques 00:08:37
- 1.5 Understand classification versus regression versus clustering and over/underfitting 00:13:49
- 1.6 Perform dimensionality reduction, explain feature engineering, and utilize feature selection 00:06:29
- 1.7 Distinguish categorical versus ordinal versus continuous variables 00:05:18
- 1.8 Perform one-hot encoding 00:05:37
- 1.9 Utilize hyperparameters and grid search 00:04:26
- 1.10 Understand and choose metrics 00:12:08
- Lesson 2: Exploring a Data Set
- Learning objectives 00:00:25
- 2.1 Uncover anomalies and data integrity problems 00:11:59
- 2.2 Clean and massage your data 00:05:31
- 2.3 Choose features and a target 00:06:36
- 2.4 Implement a train/test split and choose model 00:09:30
- Lesson 3: Classification
- Learning objectives 00:01:09
- 3.1 Understand feature importances 00:04:41
- 3.2 Establish cut points in a decision tree 00:07:06
- 3.3 Utilize a common API 00:04:52
- 3.4 Use a more encouraging dataset 00:03:47
- 3.5 Compare multiple classifiers 00:08:45
- 3.6 Understand more about feature importances 00:06:02
- 3.7 Use multiclass classification 00:09:44
- 3.8 Understand prediction probabilities and decision boundaries 00:07:16
- Lesson 4: Regression
- Learning objectives 00:00:49
- 4.1 Sample data sets in scikit-learn 00:10:47
- 4.2 Compare a gaggle of regressors 00:08:42
- 4.3 Use linear models 00:05:49
- 4.4 Understand the pitfalls of linear models 00:08:27
- 4.5 Use non-linear regressors 00:06:58
- Lesson 5: Clustering
- Learning objectives 00:01:09
- 5.1 Compare clustering algorithms 00:05:49
- 5.2 Cluster to test a hypothesis 00:04:42
- 5.3 Cluster into N classes 00:14:02
- 5.4 Cluster into an unknown number of categories 00:12:44
- 5.5 Use density-based clustering: DBSCAN and HDBSCAN 00:08:20
- 5.6 Evaluate clustering 00:06:31
- Lesson 6: Hyperparameters
- Learning objectives 00:01:36
- 6.1 Explore one hyperparameter 00:07:40
- 6.2 Explore many hyperparameters 00:06:08
- 6.3 Use GridSearchCV 00:06:27
- Lesson 7: Feature Engineering and Feature Selection
- Learning objectives 00:01:20
- 7.1 Understand a synthetic example 00:10:16
- 7.2 Understand dimensionality reduction 00:02:21
- 7.3 Use principal component analysis (PCA) 00:09:57
- 7.4 Use other decompositions: NMF, LDA, ICA, t-dist 00:12:05
- 7.5 Implement feature selection: Univariate 00:08:19
- 7.6 Implement feature selection: Model-based 00:09:54
- 7.7 Understand dimensionality expansion (polynomial features) 00:10:18
- 7.8 Use one-hot encoding 00:06:04
- 7.9 Scale with StandardScaler, RobustScaler, MinMaxScaler, Normalizer, and others 00:14:41
- 7.10 Bin values with quantiles or binarize 00:05:06
- Lesson 8: Pipelines
- Learning objectives 00:00:46
- 8.1 Understand imperative sequential processing 00:10:32
- 8.2 Use pipelines 00:09:26
- 8.3 Do pipelines with grid search 00:07:23
- Lesson 9: Robust Train/Test Splits
- Learning objectives 00:00:38
- 9.1 Understand splitting 00:06:26
- 9.2 Understand multiple splitting: KFold, LeaveOneOut, StratifiedKFold, etc. 00:11:37
- 9.3 Use cross validation 00:07:00
- Summary
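
As a rough illustration of how the topics named in Lessons 6, 8, and 9 can fit together, the sketch below runs a grid search over a pipeline using stratified k-fold cross-validation. It is not taken from the course: the dataset, the scaler and classifier, the step names `scale` and `knn`, and the parameter grid are all assumptions made for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumed example data: the iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# A pipeline chains preprocessing and a model into one estimator.
# The step names "scale" and "knn" are arbitrary labels for this sketch.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Hyperparameters of a pipeline step are addressed as <step>__<parameter>.
param_grid = {"knn__n_neighbors": [1, 3, 5, 7, 9]}

# Stratified folds keep class proportions similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Each candidate is refit on every training fold, so the scaler
# never sees the corresponding validation fold.
search = GridSearchCV(pipe, param_grid, cv=cv)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("mean cross-validated accuracy:", round(search.best_score_, 3))
```

Putting the scaler inside the pipeline, rather than scaling the whole dataset up front, is a design choice that keeps information from validation folds out of the preprocessing step.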