Last updated 4/2017
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz
Language: English | Size: 465.21 MB | Duration: 5h 34m
Address Big Data challenges with the fast and scalable features of Spark.

What you'll learn
An introduction to Big Data and data science
Get to know the fundamentals of Spark 2
Understand Spark and its ecosystem of packages in data science
Consolidate, clean, and transform your data acquired from various data sources
Unlock the capabilities of various Spark components to perform efficient data processing, machine learning, and graph processing
Dive deeper and explore various facets of data science with Spark

Requirements
A basic knowledge of statistics and computational mathematics
Prior knowledge of Python and Scala would be beneficial

Description
Are you looking to expand your knowledge of performing data science operations in Spark? Or are you a data scientist who wants to understand how algorithms are implemented in Spark, or a newbie with minimal development experience who wants to learn about Big Data analytics? If yes, then this course is ideal for you. Let’s get on this data science journey together.

When people want a way to process Big Data at speed, Spark is often the solution. With its ease of development (in comparison to the relative complexity of Hadoop), it’s unsurprising that it’s becoming popular with data analysts and engineers everywhere. It is one of the most widely used large-scale data processing engines and runs extremely fast. The aim of the course is to make you comfortable and confident at performing real-time data processing using Spark.

What is included?
This course is meticulously designed and developed to give you the right and relevant information on Spark. However, I want to highlight that the road ahead may be bumpy on occasion, and some topics may be more challenging than others, but I hope that you will embrace this opportunity and focus on the reward. Throughout this course, we will add many powerful techniques to your arsenal that will help you solve problems. Let’s take a look at the learning journey.
The course begins with the basics of Spark 2 and covers the core data processing framework and API, installation, and application development setup. Then, you’ll be introduced to the Spark programming model through real-world examples. Next, you’ll learn how to collect, clean, and visualize the data coming from Twitter with Spark streaming. Then, you will get acquainted with Spark machine learning algorithms and different machine learning techniques. You will also learn to apply statistical analysis and mining operations on your dataset. The course will give you ideas on how to perform analysis, including graph processing. Finally, we will take up an end-to-end case study and apply all that we have learned so far. By the end of the course, you should be able to put your learnings into practice for faster, slicker Big Data projects.

Why should I choose this course?
Packt courses are carefully designed to deliver the best learning experience possible. This course is a blend of text, videos, code examples, and quizzes, which together make your learning journey all the more exciting and truly rewarding. This helps you learn a range of topics at your own speed and move towards your goal of learning the technology. We have prepared this course using extensive research and curation skills. Each section adds to the skills learned and helps you achieve mastery of Spark. The sections form a sequential flow of concepts, covering a focused learning path presented in a modular manner.
We have combined the best of the following Packt products:
Data Science with Spark by Eric Charles
Spark for Data Science by Bikramaditya Singhal and Srinivas Duvvuri
Apache Spark 2 for Beginners by Rajanarayanan Thottuvaikkatumana

Meet your expert instructors
For this course, we have combined the best works of these esteemed authors:

Eric Charles has 10 years of experience in the field of data science and is the founder of Datalayer, a social network for data scientists. He is passionate about using software and mathematics to help companies get insights from data.

Bikramaditya Singhal is a data scientist with about 7 years of industry experience. He is an expert in statistical analysis, predictive analytics, machine learning, Bitcoin, Blockchain, and programming in C, R, and Python. He has extensive experience in building scalable data analytics solutions in many industry sectors.

Srinivas Duvvuri is currently the senior vice president of development, heading the development teams for the fixed income suite of products at Broadridge Financial Solutions (India) Pvt Ltd. In addition, he leads the Big Data and Data Science COE and is the principal member of the Broadridge India Technology Council.

Rajanarayanan Thottuvaikkatumana, Raj, is a seasoned technologist with more than 23 years of software development experience at various multinational companies. He has worked on various technologies including major databases, application development platforms, web technologies, and Big Data technologies.
Overview

Section 1: Big Data and Data Science
Lecture 1 Course Introduction
Lecture 2 An introduction to Big Data

Section 2: The Spark Programming Model
Lecture 3 An overview of Apache Hadoop
Lecture 4 Understanding Apache Spark
Lecture 5 Install Spark on your laptop with Docker, or scale fast in the cloud
Lecture 6 Apache Zeppelin, a web-based notebook for Spark with matplotlib and ggplot2
Lecture 7 The RDD API

Section 3: Spark SQL and DataFrames
Lecture 8 Understanding the structure of data and the need for Spark SQL
Lecture 9 The DataFrame API and its operations

Section 4: Data Analysis on Spark
Lecture 10 Data analytics life cycle
Lecture 11 Basics of statistics
Lecture 12 Descriptive statistics
Lecture 13 Inferential statistics

Section 5: First Step with Spark Visualization
Lecture 14 Data visualization
Lecture 15 Manipulating data with the core RDD API
Lecture 16 Using DataFrame, dataset, and SQL – natural and easy!
Lecture 17 Manipulating rows and columns
Lecture 18 Dealing with file format
Lecture 19 Visualizing more – ggplot2, matplotlib, and Angular.js at the rescue
Lecture 20 References

Section 6: The Spark Machine Learning Algorithms
Lecture 21 An introduction to machine learning
Lecture 22 Discovering spark.ml and spark.mllib - and other libraries
Lecture 23 Wrapping up basic statistics and linear algebra
Lecture 24 Cleansing data and engineering the features
Lecture 25 Reducing the dimensionality
Lecture 26 Pipeline for a life
Lecture 27 References

Section 7: Collecting and Cleansing the Dirty Tweets
Lecture 28 Streaming tweets to disk
Lecture 29 Streaming tweets on a map
Lecture 30 Cleansing and building your reference dataset
Lecture 31 Querying and visualizing tweets with SQL

Section 8: Statistical Analysis on Tweets
Lecture 32 Indicators, correlations, and sampling
Lecture 33 Validating statistical relevance
Lecture 34 Running SVD and PCA
Lecture 35 Extending the basic statistics to your needs

Section 9: Extracting Features from the Tweets
Lecture 36 Analyzing free text from the tweets
Lecture 37 Dealing with stemming, syntax, idioms, and hashtags
Lecture 38 Detecting tweet sentiment
Lecture 39 Identifying topics with LDA

Section 10: Mine Data and Share Results
Lecture 40 Word cloudify your dataset
Lecture 41 Locating users and displaying heatmaps with GeoHash
Lecture 42 Collaborating on the same note with peers
Lecture 43 Create visual dashboards for your business stakeholders

Section 11: Classifying the Tweets
Lecture 44 Building the training and test datasets
Lecture 45 Training a logistic regression model
Lecture 46 Evaluating your classifier
Lecture 47 Selecting your model

Section 12: Clustering Users
Lecture 48 Clustering users by followers and friends
Lecture 49 Clustering users by location
Lecture 50 Running k-means on a stream

Section 13: Putting It All Together
Lecture 51 Case study

Section 14: Data Science Applications
Lecture 52 Building data science applications

Section 15: Your Next Data Challenges
Lecture 53 Recommending similar users
Lecture 54 Analyzing mentions with GraphX
Lecture 55 Where to go from here

This course is for anyone who wants to work with Spark on large and complex datasets. Data analysts, data scientists, and Big Data architects interested in exploring the data processing power of Apache Spark will find this course very useful.