I'm building a web scraper in Ruby that gathers product information from different stores. Right now it works like so:
- A `Request` object (we'll call this `request`) is created with the args `store` and `sku`.
- When the `process` method is called on the `request`, the following happens:
  A. An instance of the scraping/parsing library (Nokogiri) is created, and the HTML page for the item belonging to the `store` and `sku` from above is received as a new object (we'll call this object `nokogiri_response`).
  B. An instance of `Response` (we'll call this `response`) is created with `nokogiri_response` injected as a dependency by calling `Response.new(nokogiri_response)`.
  C. The `to_h` method is called on `response`, which creates a hash by calling the following methods:

        def title
          @nokogiri_response.at('div[@id="some-store-title"]')
        end

        def price
          @nokogiri_response.at('div[@id="bby-price-main"]')
        end

        def in_stock
          @nokogiri_response.at('div[@id="add-to-cart"]')
        end

and returns a hash that looks like the following:
{ title: 'Design Patterns: Elements of Reusable Object-Oriented Software', price: 37.48, in_stock: true }
This hash is then returned by the `process` method called on `request`.
My question is: how can I design this in the best possible way so that it works for more than 300 stores? Remember that each store will have different CSS selectors for the `title`, `price`, and `in_stock` methods on the `response` object.
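One way to picture the per-store variation is a lookup table of selectors keyed by store, with a single `Response` class reading from it. The sketch below only illustrates the shape of the problem and is not a recommended design: the `STORE_SELECTORS` constant, the `selectors_for` helper, the store keys, and the second store's selectors are all hypothetical, and `document` is assumed to be anything that responds to `at_css`, such as a Nokogiri document.

```ruby
# Hypothetical selector table: one entry per store.
STORE_SELECTORS = {
  'some-store' => {
    title: 'div#some-store-title',
    price: 'div#bby-price-main',
    in_stock: 'div#add-to-cart'
  },
  'other-store' => {
    title: 'h1.product-name',
    price: 'span.price',
    in_stock: 'button.buy-now'
  }
}.freeze

# Looks up the selectors for a store and fails loudly for unknown stores.
def selectors_for(store)
  STORE_SELECTORS.fetch(store) do
    raise ArgumentError, "no selectors configured for #{store}"
  end
end

class Response
  def initialize(document, selectors)
    @document = document
    @selectors = selectors
  end

  # Defines title, price, and in_stock; each one queries the document
  # with the selector configured for the current store.
  %i[title price in_stock].each do |field|
    define_method(field) { @document.at_css(@selectors.fetch(field)) }
  end
end
```

With something like this, `Response.new(nokogiri_response, selectors_for(store))` would replace the hard-coded selectors, and adding a store would mean adding a table entry. Stores whose pages need custom parsing logic would not fit a plain table.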