I wrote an article on web scraping last winter that has since been viewed almost 100,000 times. Clearly there are people who want to learn about this stuff, so I decided I’d write a book.
A few months later, I’m happy to announce: The Ultimate Guide to Web Scraping.
No prior knowledge of web scraping is necessary to follow along. The book is designed to walk you from beginner to expert, honing your skills and helping you become a master craftsman in the art of web scraping.
The book explains why web scraping is a valid way to harvest information, despite common complaints. It also examines the various ways information is sent from a website to your computer, and how you can intercept and parse it. We’ll also look at common traps and anti-scraping tactics, and how you might be able to thwart them.
There are code samples in both Ruby and Python (I had to learn Ruby just so I could write the code samples!). If anyone’s willing to translate the sample code into PHP or JavaScript, I’ll give you a free copy of the book. Get in touch.
—
Check out the table of contents:
Introduction to Web Scraping
Web Scraping as a Legitimate Data Collection Tool
Understand Web Technologies: A Brief Introduction to HTTP and the DOM
Finding The Data: Discovering Your “API”
Extracting the Data: Finding Structure in an HTML Document
Sample Code to Get You Started
Avoiding Common Scraping Traps
Being a Good Web Scraping Citizen
As a special deal for my blog subscribers, get 20% off with the code BLOGSUB. That coupon code is only good for a limited time, so order your copy today!
Source: http://blog.hartleybrody.com/web-scraping-guide/