O'Reilly - Getting Started with Python Web Scraping
by Charles Clayton | Released March 2017 | ISBN: 9781787283244
See the in-depth capabilities of Python's web scraping tools
About This Video
- Get hands-on solutions that will take your web scraping skills in Python to the next level
- This is your one-stop solution for common and not-so-common issues while performing web scraping with Python
- Understand a web page's structure and collect meaningful data from a website with ease
In Detail
Python is a high-level programming language used for general-purpose programming. Its design philosophy emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than is possible in languages such as C++ or Java.
This video course is a rich collection of recipes that will come in handy when you are scraping a website using Python. It addresses both usual and unusual scraping problems by diving deep into the capabilities of Python's web scraping tools, such as Selenium, BeautifulSoup, and urllib2. The video starts by showing how to use the Selenium module for scraping: setting up a web driver, debugging with the console, downloading files, and streamlining with a headless browser (PhantomJS). It then demonstrates parsing with BeautifulSoup, including an introduction to BeautifulSoup objects, nested selectors, regular expression basics, and UTF-8 encoding. Finally, it shows how to fetch content with urllib2: using the developer tools' Network tab, bypassing the browser, and retrieving files.
By the end of this video, you will understand the in-depth capabilities of Python's web scraping tools.
- Chapter 1 : Scraping with Selenium
- The Course Overview 00:02:44
- When to Web Scrape 00:02:57
- What Makes up a Website 00:09:50
- How to Interact with a Website 00:08:32
- Using the Selenium Module 00:12:12
- Ethical Web Scraping 00:04:39
- Chapter 2 : Parsing with BeautifulSoup
- Requesting HTML 00:09:14
- Using the BeautifulSoup Module 00:13:18
- Example: Parsing Wikipedia 00:11:22
- Chapter 3 : Fetching with urllib2 and APIs
- Bypassing the Browser 00:04:25
- Introduction to APIs 00:04:59
- Working with APIs 00:11:52