Wednesday, 19 April 2023

Python code patterns: elegant way of trying different methods until one succeeds?

I'm trying to extract information from an HTML page, such as its last modification date, in a context where there is more than one way of declaring it and those ways use non-uniform data (meaning a simple loop over the fetched data is not possible).

The ugly version is as follows:

def get_date(html):
  date = None
  # Approach 1
  time_tag = html.find("time", {"datetime": True})
  if time_tag:
    date = time_tag["datetime"]
  if date:
    return date

  # Approach 2
  mod_tag = html.find("meta", {"property": "article:modified_time", "content": True})
  if mod_tag:
    date = mod_tag["content"]
  if date:
    return date
  # Approach n
  # ...

  return date

I wonder whether Python has a concise and elegant way of achieving this with `while` logic, so that it runs fast and stays legible and easy to maintain:

def method_1(html):
  test = html.find("time", {"datetime": True})
  return test["datetime"] if test else None

def method_2(html):
  test = html.find("meta", {"property": "article:modified_time", "content": True})
  return test["content"] if test else None


def get_date(html):
  date = None
  bag_of_methods = [method_1, method_2]  # ... and so on

  i = 0
  while not date and i < len(bag_of_methods):
    date = bag_of_methods[i](html)
    i += 1

  return date

I can make that work right now by turning each approach from the first snippet into a function, appending all the functions to the bag_of_methods iterable and running them in order until one works.

However, those functions would be 2 lines each and will not be reused later in the program, so it just seems like it's adding more lines of code and polluting the namespace for nothing.
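One possible middle ground, given only as a sketch: nest the small helpers inside get_date so they stay out of the module namespace, and replace the index bookkeeping with a plain for loop that returns on the first hit. The names first_truthy, from_time_tag and from_meta_tag are hypothetical, and html is assumed to be any object whose find(name, attrs) method returns either a dict-like tag or None.

```python
def first_truthy(funcs):
    """Call each function in order; return the first truthy result, else None."""
    for func in funcs:
        result = func()
        if result:
            return result
    return None


def get_date(html):
    # The helpers are nested, so they are not visible at module level.
    # `html` is assumed to expose find(name, attrs) -> tag or None.
    def from_time_tag():
        tag = html.find("time", {"datetime": True})
        return tag["datetime"] if tag else None

    def from_meta_tag():
        tag = html.find("meta", {"property": "article:modified_time", "content": True})
        return tag["content"] if tag else None

    # Later helpers are never called once an earlier one succeeds.
    return first_truthy([from_time_tag, from_meta_tag])
```

This still costs a few lines per approach, but adding an approach means adding one nested function and one list entry, and nothing leaks outside get_date.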

Is there a better way of doing this?
