I only use their results and don't work directly on the scraping side of data acquisition, but my coworkers rely extensively on Scrapy. They also use the related http://scrapinghub.com/ services for distributed proxies and the like, to work around blacklisting issues, but you can run Scrapy on your own infrastructure for low volumes.