O'Reilly - Troubleshooting Pandas
by Rudy Lai | Released February 2019 | ISBN: 9781789347760
Quick fixes for all your Pandas data wrangling frustrations

About This Video
- Practical solutions to common problems reported by data scientists like you.
- Each video follows a problem-solution format, making it easy to understand the problem and grasp the solution.
- Tried and tested solutions to common problems encountered while implementing data processing, cleaning, and wrangling.

In Detail
Pandas is a powerful and popular Python library for scientific computing, used for analyzing and manipulating data. It is used to tidy messy data, analyze groups within your data independently, perform time-series calculations, and create visualizations during exploratory data analysis.

However, these tasks come with their own set of problems. If you are facing issues in your analysis or visualization tasks, this course is for you. With clear, simple solutions, it will help you tackle the issues you face while working with Pandas.

The code files for this course are available at https://github.com/PacktPublishing/Troubleshooting-Pandas

Downloading the example code for this course: You can download the example code files for all Packt video courses you have purchased from your account at http://www.PacktPub.com. If you purchased this course elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

- Chapter 1 : Import All Sorts of Dirty Datasets into Pandas
- The Course Overview 00:01:57
- Dealing with Messy Excel Sheets and Misformatted CSV Files 00:08:40
- Coping with Unstructured HTML and JSON Formats 00:08:44
- Handling Too Much Data from HDF5 and SQL Sources 00:08:09
- Chapter 2 : Cutting through All the Irrelevant Data
- Dropping Useless Data with Indices 00:08:19
- Advanced Data Cleaning with Query and Where 00:09:18
- Untangling Chained Indices with Views and Copies 00:07:27
- Chapter 3 : Clean Data and Fix Missing Values to Create a High-Quality Dataset
- Working with Spelling Mistakes and Typos in Text Data 00:08:56
- Filling in Missing Data and NAs 00:09:55
- Parsing Stubborn Date Strings into Datetime Objects 00:09:53
- Chapter 4 : Slicing through Noisy Data to Find High-Level Groupings
- Splitting and Clustering Seemingly Random Data Points 00:07:31
- Resolving Incorrect Data Collection with Lambdas and Functions 00:07:41
- Cleaning up Misaggregated Statistics and Bad Pivot Tables 00:07:43
- Chapter 5 : Trouble with Time Series Data
- Fixing Messy Time Series Data with DatetimeIndex 00:09:23
- Segmenting and Offsetting Time Series Data to Find the Right Subset 00:06:42
- Repairing Misaligned Data with Shifting and Filling Operators 00:08:12
- Chapter 6 : Export Errors and Presentation Problems
- Keeping the Right Data and Formatting into Excel Sheets 00:07:37
- Dealing with Incompatible Data for HTML and JSON 00:07:12
- Purging Big Data to HDF5 and SQL Sources 00:07:17