I am self-learning object-oriented Python and have a classic "inherit or compose" question in my project. I read through many of the general guidelines and answers to this question, but it's hard for me to apply them to my specific case, so I decided to ask.
My program is a web scraper that scrapes a website with different types of pages (and different data objects from these pages). I use BeautifulSoup (bs4) to parse the data from the raw request response. As there are many different data objects wanted, I need about 10 different parser functions.
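For illustration, one such parser function might look roughly like the sketch below. The HTML snippet, the `parse_item` name, and the CSS classes are invented for the example and are not taken from the real site; only the bs4 calls (`BeautifulSoup(..., "html.parser")`, `find`, `get_text`) are the library's own.

```python
from bs4 import BeautifulSoup

# Hypothetical page source; the real site's markup will differ.
PAGE_SOURCE = """
<html><body>
  <h1 class="title">Example item</h1>
  <span class="price">19.99</span>
</body></html>
"""


def parse_item(soup):
    """Extract one (hypothetical) data object from a parsed page as a dict."""
    title = soup.find("h1", class_="title")
    price = soup.find("span", class_="price")
    return {
        # Fall back to None when the element is missing on the page.
        "title": title.get_text(strip=True) if title else None,
        "price": float(price.get_text(strip=True)) if price else None,
    }


soup = BeautifulSoup(PAGE_SOURCE, "html.parser")
data = parse_item(soup)
```

Each of the roughly 10 parsers would follow the same shape: take the parsed page, look up a few elements, and return a dict.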
My question - what would be the best architecture for the parser-object?
A: Create a class 'mySoup' inheriting from BeautifulSoup and add the parser-functions as methods.
B: Create a class 'Parser' on its own, that takes the BeautifulSoup-object as argument.
I already changed back and forth between the two solutions, and my personal feeling is that I would rather pick the inheritance solution, simply because it makes calling the parser clean and simple. Also, 'Parser' doesn't look like a necessary object to me, more a collection of functions.
    class mySoup(BeautifulSoup):
        def parseData1(self):
            ...
            return data_as_dict

        def parseData2(self):
            ...
            return data_as_dict

    soup = mySoup(page_source)
    data = soup.parseData1()
vs.
    class Parser:
        def __init__(self, soup):
            self.soup = soup

        def parseData1(self):
            ...
            return data_as_dict

        def parseData2(self):
            ...
            return data_as_dict

    soup = BeautifulSoup(page_source)
    data = Parser(soup).parseData1()
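Since 'Parser' feels more like a collection of functions than an object, a third variant worth comparing against A and B is a set of plain module-level functions that take the soup as an argument. This is only a minimal sketch under assumptions: the names `parse_data1`, `parse_data2`, and `PARSERS`, as well as the sample HTML and what each function extracts, are hypothetical placeholders.

```python
from bs4 import BeautifulSoup


def parse_data1(soup):
    """Hypothetical parser: collect all link targets on the page."""
    return {"links": [a["href"] for a in soup.find_all("a", href=True)]}


def parse_data2(soup):
    """Hypothetical parser: read the page title."""
    title = soup.find("title")
    return {"title": title.get_text(strip=True) if title else None}


# Optional lookup table from a (made-up) page type to its parser function.
PARSERS = {"links": parse_data1, "title": parse_data2}

# Made-up page source standing in for a real response body.
page_source = (
    "<html><head><title>Demo</title></head>"
    "<body><a href='/a'>A</a><a href='/b'>B</a></body></html>"
)
soup = BeautifulSoup(page_source, "html.parser")

data1 = parse_data1(soup)        # direct call
data2 = PARSERS["title"](soup)   # call chosen by page type
```

The call site stays about as short as in variant A (`parse_data1(soup)` instead of `soup.parseData1()`), while the soup remains an ordinary `BeautifulSoup` object as in variant B.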