Wednesday, 14 August 2019

Design patterns for a web-scraping application

I'm trying to come up with a class hierarchy for a fairly large web-scraping application. There may be hundreds of scrapers for various websites, but the scraped data is fairly consistent (say, whatever the page, I only want to scrape e-mail addresses and phone numbers). The scrapers are written in Python and use BeautifulSoup, plus Selenium when soup is not enough. What design patterns would you use in such a scenario?

I was thinking of using the decorator design pattern, which might make some parts of the code more reusable, but I was told that it is a bit of overengineering, which may be true.
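For concreteness, here is a minimal sketch of what the decorator idea could look like. It is only an illustration under stated assumptions: every class name (Scraper, BaseScraper, WithEmails, WithPhones) is hypothetical, the regular expressions are deliberately simplistic, and the input is assumed to be page text already obtained elsewhere (for example from BeautifulSoup's get_text() or Selenium's page_source).

```python
import re
from abc import ABC, abstractmethod
from typing import Dict, Set

# Simplistic patterns for illustration only; real-world matching needs more care.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


class Scraper(ABC):
    """Hypothetical common interface shared by all scrapers and decorators."""

    @abstractmethod
    def scrape(self, text: str) -> Dict[str, Set[str]]:
        ...


class BaseScraper(Scraper):
    """Innermost component: extracts nothing on its own."""

    def scrape(self, text: str) -> Dict[str, Set[str]]:
        return {}


class ScraperDecorator(Scraper):
    """Wraps another Scraper; subclasses add one extraction step each."""

    def __init__(self, inner: Scraper) -> None:
        self.inner = inner


class WithEmails(ScraperDecorator):
    def scrape(self, text: str) -> Dict[str, Set[str]]:
        result = self.inner.scrape(text)
        result.setdefault("emails", set()).update(EMAIL_RE.findall(text))
        return result


class WithPhones(ScraperDecorator):
    def scrape(self, text: str) -> Dict[str, Set[str]]:
        result = self.inner.scrape(text)
        result.setdefault("phones", set()).update(PHONE_RE.findall(text))
        return result


# Compose the extraction steps around the base scraper.
scraper = WithPhones(WithEmails(BaseScraper()))
result = scraper.scrape("Contact: info@example.com or +1 555 123 4567.")
print(result)
```

Each wrapper adds one extraction step and delegates the rest to the object it wraps, so a site that only needs e-mail addresses could use WithEmails(BaseScraper()) alone. Whether this is worth the extra indirection compared with a plain base class and shared helper functions is exactly the overengineering question raised above.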
