R Complete Data Analysis Solutions

Last updated 7/2020

MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz

Language: English | Size: 567.32 MB | Duration: 5h 21m

Learn by doing - solve real-world data analysis problems using the most popular R packages

What you'll learn

Extract, transform, and load data from heterogeneous sources

Understand how easily R can confront probability and statistics problems

Get simple R instructions to quickly organize and manipulate large datasets

Predict user purchase behavior by adopting a classification approach

Implement data mining techniques to discover items that are frequently purchased together

Group similar text documents by using various clustering methods

Requirements

You are expected to know the basics of R programming. You should have R installed on your system, and your system should be connected to the Internet. That’s all really!

Description

If you are looking for that one course that includes everything about data analysis with R, this is it. Let’s get on this data analysis journey together.

This course is a blend of text, videos, code examples, and assessments, which together make your learning journey all the more exciting and truly rewarding. It includes sections that form a sequential flow of concepts covering a focused learning path presented in a modular manner. This helps you learn a range of topics at your own speed and also move towards your goal of solving data analysis problems with R.

The R language is a powerful open source functional programming language. R is becoming the go-to tool for data scientists and analysts. Its growing popularity is due to its open source nature and extensive development community. R is increasingly being used by experienced data science professionals instead of Python and it will remain the top choice for data scientists in 2017. Big companies continue to use R for their data science needs and this course will make you ready for when these opportunities come your way.

This course has been prepared using extensive research and curation skills. Each section adds to the skills learned and helps us to achieve mastery of data analysis. Every section is modular and can be used as a standalone resource.

This course is designed to cover every topic a data scientist is likely to need for data analysis with R, and it does so in a step-by-step and practical manner. It also adds an introduction to machine learning.

We will start off by learning how to prepare, process, and perform sophisticated ETL for heterogeneous data sources with R packages. An example of data manipulation will be provided, illustrating how to use the “dplyr” and “data.table” packages to efficiently process larger data structures. We will then understand how easily R can confront probability and statistics problems and look at R instructions to quickly organize and manipulate large datasets. We will then learn to predict user purchase behavior by adopting a classification approach and implement data mining techniques to discover items that are frequently purchased together. Finally, we will offer insight into time series analysis on financial data, after which there will be detailed information on the hot topic of machine learning, including data classification, regression, clustering, association rule mining, and dimension reduction.

This course has been authored by some of the best in their fields:

Yu-Wei, Chiu (David Chiu)

Yu-Wei, Chiu (David Chiu) is the founder of LaData, a start-up company that mainly focuses on providing big data and machine learning products. He specializes in using Spark and Hadoop to process big data and apply data mining techniques for data analysis. Yu-Wei is also a professional lecturer and has delivered lectures on big data and machine learning in R and Python, and given tech talks at a variety of conferences.

Selva Prabhakaran

Selva Prabhakaran is a data scientist with a large E-commerce organization. In his 7 years of experience in data science, he has tackled complex real-world data science problems and delivered production-grade solutions for top multinational companies.

Tony Fischetti

Tony Fischetti is a data scientist at College Factual, where he gets to use R every day to build personalized rankings and recommender systems.

Viswa Viswanathan

Viswa Viswanathan is an associate professor of Computing and Decision Sciences at the Stillman School of Business at Seton Hall University. In addition to teaching at the university, Viswa has conducted training programs for industry professionals. He has written several peer-reviewed research publications in journals such as Operations Research, IEEE Software, Computers and Industrial Engineering, and International Journal of Artificial Intelligence in Education.

Shanthi Viswanathan

Shanthi Viswanathan is an experienced technologist who, as a consultant, has helped several large organizations, such as Canon, Cisco, Celgene, Amway, Warner Cable, and GE, among others, in areas such as data architecture and analytics, master data management, service-oriented architecture, business process management, and modeling.

Romeo Kienzler

Romeo Kienzler is the Chief Data Scientist of the IBM Watson IoT Division and works as an Advisory Architect, helping clients worldwide to solve their data analysis problems. His current research focus is on cloud-scale data mining using open source technologies including R, Apache Spark, SystemML, Apache Flink, and DeepLearning4J.

This course is a blend of text, videos, and assessments, all packaged together keeping your journey in mind. It combines some of the best that Packt has to offer in one complete package. It includes content from the following Packt products:

R for Data Science Cookbook by Yu-Wei, Chiu (David Chiu)

R for Data Science Solutions [video] by Yu-Wei, Chiu (David Chiu)

Mastering R Programming [video] by Selva Prabhakaran

Data Analysis with R by Tony Fischetti

R Data Analysis Cookbook by Viswa Viswanathan and Shanthi Viswanathan

Learning Data Mining with R [video] by Romeo Kienzler

Overview

Section 1: Data Extracting, Transforming, and Loading

Lecture 1 About the course

Lecture 2 Downloading open data

Lecture 3 Reading and writing CSV files

Lecture 4 Scanning text files

Lecture 5 Working with Excel files

Lecture 6 Reading data from databases

Lecture 7 Scraping web data

Lecture 8 Accessing Facebook data

Lecture 9 Working with Twitter

Section 2: Data Preprocessing and Preparation

Lecture 10 Renaming the data variable

Lecture 11 Converting data types

Lecture 12 Working with the date format

Lecture 13 Adding new records

Lecture 14 Filtering data

Lecture 15 Dropping data

Lecture 16 Merging and sorting data

Lecture 17 Reshaping data

Lecture 18 Detecting missing data

Lecture 19 Imputing missing data

Section 3: Data Manipulation

Lecture 20 Enhancing a data.frame with a data.table

Lecture 21 Managing data with a data.table

Lecture 22 Performing fast aggregation with a data.table

Lecture 23 Merging large datasets with a data.table

Lecture 24 Subsetting and slicing data with dplyr

Lecture 25 Sampling data with dplyr

Lecture 26 Selecting columns with dplyr

Lecture 27 Chaining operations in dplyr

Lecture 28 Arranging rows with dplyr

Lecture 29 Eliminating duplicated rows with dplyr

Lecture 30 Adding new columns with dplyr

Lecture 31 Summarizing data with dplyr

Lecture 32 Merging data with dplyr

Section 4: Simulation from Probability Distributions

Lecture 33 Generating random samples

Lecture 34 Understanding uniform distributions

Lecture 35 Generating binomial random variates

Lecture 36 Generating Poisson random variates

Lecture 37 Sampling from a normal distribution

Lecture 38 Sampling from a chi-squared distribution

Lecture 39 Understanding Student's t-distribution

Lecture 40 Sampling from a dataset

Lecture 41 Simulating the stochastic process

Section 5: Statistical Inference in R

Lecture 42 Getting confidence intervals

Lecture 43 Performing Z-tests

Lecture 44 Performing Student's t-tests

Lecture 45 Conducting exact binomial tests

Lecture 46 Performing Kolmogorov-Smirnov tests

Lecture 47 Working with Pearson's chi-squared tests

Lecture 48 Understanding the Wilcoxon Rank Sum and Signed Rank tests

Lecture 49 Conducting one-way ANOVA

Lecture 50 Performing two-way ANOVA

Section 6: Rule and Pattern Mining with R

Lecture 51 Transforming data into transactions

Lecture 52 Displaying transactions and associations

Lecture 53 Mining associations with the Apriori rule

Lecture 54 Pruning redundant rules

Lecture 55 Visualizing association rules

Lecture 56 Mining frequent itemsets with Eclat

Lecture 57 Creating transactions with temporal information

Lecture 58 Mining frequent sequential patterns with cSPADE

Section 7: Time Series Mining with R

Lecture 59 Creating time series data

Lecture 60 Plotting a time series object

Lecture 61 Decomposing a time series

Lecture 62 Smoothing a time series

Lecture 63 Forecasting a time series

Lecture 64 Selecting an ARIMA model

Lecture 65 Creating an ARIMA model

Lecture 66 Forecasting with an ARIMA model

Lecture 67 Predicting stock prices with an ARIMA model

Section 8: Text Analytics In-depth

Lecture 68 Scraping web pages and processing texts

Lecture 69 Corpus, TDM, TF-IDF, and word cloud

Lecture 70 Cosine similarity and Latent Semantic Analysis

Lecture 71 Extracting topics with Latent Dirichlet Allocation

Lecture 72 Sentiment scoring with tidytext and Syuzhet

Lecture 73 Classifying texts with RTextTools

Section 9: Sources of Data

Lecture 74 Relational databases

Lecture 75 Using JSON

Lecture 76 XML

Lecture 77 Other data formats

Lecture 78 Online repositories

Section 10: Let's Do A Project: Social Network Analysis

Lecture 79 Downloading social network data using public APIs

Lecture 80 Creating adjacency matrices and edge lists

Lecture 81 Plotting social network data

Lecture 82 Computing important network metrics

Section 11: Supervised Machine Learning

Lecture 83 Fitting a linear regression model with lm

Lecture 84 Summarizing linear model fits

Lecture 85 Using linear regression to predict unknown values

Lecture 86 Measuring the performance of the regression model

Lecture 87 Performing a multiple regression analysis

Lecture 88 Selecting the best-fitted regression model with stepwise regression

Lecture 89 Applying the Gaussian model for generalized linear regression

Lecture 90 Performing a logistic regression analysis

Lecture 91 Building a classification model with recursive partitioning trees

Lecture 92 Visualizing a recursive partitioning tree

Lecture 93 Measuring model performance with a confusion matrix

Lecture 94 Measuring prediction performance using ROCR

Section 12: Unsupervised Machine Learning

Lecture 95 Clustering data with hierarchical clustering

Lecture 96 Cutting tree into clusters

Lecture 97 Clustering data with the k-means method

Lecture 98 Clustering data with the density-based method

Lecture 99 Extracting silhouette information from clustering

Lecture 100 Comparing clustering methods

Lecture 101 Recognizing digits using the density-based clustering method

Lecture 102 Grouping similar text documents with k-means clustering methods

Lecture 103 Performing dimension reduction with Principal Component Analysis (PCA)

Lecture 104 Determining the number of principal components using a scree plot

Lecture 105 Determining the number of principal components using the Kaiser method

Lecture 106 Visualizing multivariate data using biplot

Section 13: Extra Goodies: Cognitive Computing and Artificial Intelligence

Lecture 107 Introduction to neural networks and deep learning

Lecture 108 Using the H2O deep learning framework

Lecture 109 Real-time cloud-based IoT sensor data analysis

This course is useful whether someone is a hobbyist, an analyst, an aspiring or professional data scientist, or even learning data analysis for the first time. Those who are already familiar with the basics of R but want to learn to efficiently analyze real-world data problems will also find this course a match for their needs.

HomePage:

https://www.udemy.com/course/r-complete-data-analysis-solutions/
