Tuesday, 2 September 2014

Does anyone know of an online tool that can scrape a page and create a REST API for the scraped data?


I'm looking for a SaaS solution that can log in to a platform, scrape data (reports), and then expose that data through an API. I use some reporting platforms that offer web reporting and email reporting but no API. Online reporting doesn't help, and email reporting, although it can be automated and scraped, isn't very reliable.

If you are willing to do the scraping through your own connection, have a look at import.io. They have a desktop application that you use to teach the system how to scrape a page. You then run the crawler from that application, and as far as I can tell you can run it for as long as you like.

You may then upload your data to the import.io cloud, where it is available through an API on the import.io servers. Useful data can be made public to donate it "to the commons" if you wish.


I did some more digging and found iMacros as a possible solution. It's Windows-based, which is a drawback in my case, but it lets you automate the scraping and then interact with it through common web scripting languages such as PHP and ASP.NET.


If you are familiar with jQuery, you can use Node.js with the Cheerio module to build a simple application that scrapes automatically. I have already built a site for online web scraping based on this stack, www.datafiddle.net; you can take a look at it.
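
Whichever tool does the scraping, the do-it-yourself pattern is the same: parse the report HTML into records, then serve those records as JSON. Below is a minimal sketch of that pattern in Python using only the standard library. It is not Cheerio or any of the tools named above. It assumes the report is a plain HTML table whose first row holds the column names; the sample markup, the `/reports` path, the port, and all function and class names are made up for illustration. Logging in and fetching the page are left out.

```python
import json
from html.parser import HTMLParser
from http.server import BaseHTTPRequestHandler, HTTPServer


class TableScraper(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed in this row


def scrape_report(html):
    """Use the first table row as field names and the rest as records."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def make_handler(records):
    """Build a handler class that serves `records` as JSON at /reports."""
    class ReportHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/reports":  # hypothetical endpoint path
                self.send_error(404)
                return
            payload = json.dumps(records).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)
    return ReportHandler


def serve(records, port=8000):
    """Serve the scraped records on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), make_handler(records)).serve_forever()


# Made-up report markup standing in for a scraped page.
SAMPLE_HTML = """
<table>
  <tr><th>date</th><th>clicks</th></tr>
  <tr><td>2014-09-01</td><td>120</td></tr>
  <tr><td>2014-09-02</td><td>95</td></tr>
</table>
"""
```

Calling `serve(scrape_report(SAMPLE_HTML))` should make the records available at `http://127.0.0.1:8000/reports`. A real setup would also need to fetch the page behind the login and re-run the scrape on a schedule.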


Source: http://stackoverflow.com/questions/19646028/anyone-knows-an-online-tool-that-can-scrape-a-page-and-create-a-rest-api-for-the
