That might be the case, but chasing these kinds of groups down is a tedious and expensive task.


I'll add this Irish group to the quality vendor options at https://scrapinghub.com and https://www.octoparse.com/Product


PhantomJS combined with CasperJS is pretty fantastic. It runs a full, headless WebKit browser, so it can operate against a real DOM, execute JavaScript properly, and even grab fully rendered screenshots of areas of the page, yet it is still easy to automate.


Apart from paid applications like Visual Web Ripper, Mozenda, Web Data Extractor, etc., there are plenty of open source tools that enable customized web scraping, such as Selenium, Jsoup, and HtmlUnitDriver (browser-less automation). Some of these have a built-in web driver or browser that can imitate manual browsing to provide efficient web scraping.
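
For anyone wondering what the browser-less, parse-only approach looks like in practice, here is a rough sketch. It does not use any of the tools named above; it swaps in Python's standard-library `html.parser` just to show the idea of pulling links out of markup without a browser. The `LinkExtractor` class, the `extract_links` helper, the sample HTML, and the `https://example.com` base URL are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect (link text, absolute URL) pairs for every <a href> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # URL of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the (hypothetical) base URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html, base_url):
    """Parse an HTML string and return its links; no browser, no JavaScript."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample page, used only to exercise the sketch.
SAMPLE_HTML = """
<ul>
  <li><a href="/items/1">Item one</a></li>
  <li><a href="https://example.com/items/2">Item two</a></li>
</ul>
"""

if __name__ == "__main__":
    for text, url in extract_links(SAMPLE_HTML, "https://example.com"):
        print(text, url)
```

This only sees the HTML as delivered, so anything a page builds with JavaScript will be missing; that is the case where a headless browser or web driver earns its keep.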


Have a look at "Why web scrape in .net", which argues the case for scraping in one particular language. You should find that most of its arguments apply to your language of choice too.