Monday, 16 December 2013

The “Ultimate Guide to Web Scraping” is Now Available

I wrote an article on web scraping last winter that has since been viewed almost 100,000 times. Clearly there are people who want to learn about this stuff, so I decided I’d write a book.

A few months later, I’m happy to announce: The Ultimate Guide to Web Scraping.

No prior knowledge of web scraping is necessary to follow along. The book is designed to walk you from beginner to expert, honing your skills and helping you become a master craftsman in the art of web scraping.

The book explains why web scraping is a valid way to harvest information, despite common complaints. It also examines the various ways that information is sent from a website to your computer, and how you can intercept and parse it. We’ll also look at common traps and anti-scraping tactics and how you might be able to thwart them.
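To give a flavor of the "parse it" step, here is a minimal sketch in Python. It is not taken from the book; the sample HTML, the `LinkExtractor` class, and the `extract_links` helper are all made up for illustration, and it uses only the standard library's `html.parser` module. A real scraper would first download the page over HTTP instead of using a hardcoded string.

```python
from html.parser import HTMLParser

# Hypothetical sample page (an assumption for this sketch). In practice the
# HTML would come from an HTTP response, e.g. via urllib.request.urlopen.
SAMPLE_HTML = """
<html><body>
  <h1>Products</h1>
  <ul>
    <li class="item"><a href="/widgets/1">Red widget</a></li>
    <li class="item"><a href="/widgets/2">Blue widget</a></li>
  </ul>
</body></html>
"""


class LinkExtractor(HTMLParser):
    """Collects (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while we are inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Parse an HTML string and return the list of (href, text) pairs."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print(href, text)
```

The idea is simply that an HTML document has structure (tags, attributes, nesting) you can lean on to pull out the pieces you care about, rather than treating the page as one big string.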

There are code samples in both Ruby and Python. I had to learn Ruby just so I could write them! If anyone’s willing to translate the sample code into PHP or JavaScript, I’ll give you a free copy of the book. Get in touch.



Check out the table of contents:

    Introduction to Web Scraping

    Web Scraping as a Legitimate Data Collection Tool

    Understand Web Technologies: A Brief Introduction to HTTP and the DOM

    Finding The Data: Discovering Your “API”

    Extracting the Data: Finding Structure in an HTML Document

    Sample Code to Get You Started

    Avoiding Common Scraping Traps

    Being a Good Web Scraping Citizen

As a special deal for my blog subscribers, get 20% off with the code BLOGSUB. That coupon code is only good for a limited time, so order your copy today!

Source: http://blog.hartleybrody.com/web-scraping-guide/
