I am building a web scraping app that pulls from multiple book sources.
Each source exposes similar fields that I am looking for (title, author, views, etc.). For each field I have a unique CSS selector, and sometimes I use a callback to transform the data returned by the selector.
Currently, I am using a nested array: the first level is keyed by platform, and the second level by field.
But I think this could become too wordy, and once some platforms' callbacks span multiple lines it would be a nightmare to read.
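To make the layout concrete, here is a minimal sketch of the nested structure described above, assuming Python. The platform names, selectors, and the `apply_field` helper are hypothetical placeholders for illustration, not the real configuration.

```python
from typing import Callable, Optional, TypedDict


class FieldSpec(TypedDict):
    selector: str                                # CSS selector for the field
    callback: Optional[Callable[[str], object]]  # optional transform


# Hypothetical platforms and selectors, for illustration only.
SOURCES: dict[str, dict[str, FieldSpec]] = {
    "platform_a": {
        "title": {"selector": "h1.book-title", "callback": None},
        "author": {"selector": "span.author", "callback": str.strip},
        "views": {
            "selector": "div.stats .views",
            # Inline callbacks stay readable only while they fit on one line.
            "callback": lambda raw: int(raw.replace(",", "")),
        },
    },
    "platform_b": {
        "title": {"selector": "#title", "callback": None},
        "author": {"selector": ".by-line a", "callback": None},
        "views": {"selector": ".counter", "callback": int},
    },
}


def apply_field(platform: str, field: str, raw: str) -> object:
    """Run the optional callback for one field on the raw scraped text."""
    callback = SOURCES[platform][field]["callback"]
    return callback(raw) if callback else raw
```

Every new platform adds another block of this shape, which is where the verbosity comes from.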
I am considering splitting each platform into its own class that extends an abstract base class holding all the default fields, but I am not sure how to handle the instances of each class. Should the classes expose static properties, or should I opt for a more classic Model?
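As a rough illustration of the class-based idea, here is a minimal sketch assuming Python. The names `BookSource`, `PlatformA`, and the `transform_<field>` hook convention are hypothetical. It shows the "static properties" variant: selectors live on the class, and each field that needs a transform gets its own method.

```python
from abc import ABC
from typing import ClassVar


class BookSource(ABC):
    """Hypothetical base class holding the default selectors."""

    # Class-level defaults; subclasses override only what differs.
    selectors: ClassVar[dict[str, str]] = {
        "title": "h1",
        "author": ".author",
        "views": ".views",
    }

    def transform(self, field: str, raw: str) -> object:
        # Use the optional per-field hook named transform_<field> if present.
        hook = getattr(self, f"transform_{field}", None)
        return hook(raw) if hook else raw


class PlatformA(BookSource):
    # Merge the defaults with this platform's overrides.
    selectors = {**BookSource.selectors, "title": "h1.book-title"}

    def transform_views(self, raw: str) -> int:
        # A multi-line callback becomes an ordinary method.
        cleaned = raw.strip().replace(",", "")
        return int(cleaned)
```

In this arrangement the instances carry no state of their own, so one instance per platform is enough. A more classic Model would instead hold the scraped values for a single book on each instance.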