Wednesday, January 25, 2023

Is there a design pattern that addresses creation of the same product in multiple different ways (requiring pipeline-like pre-creation steps)?

I am currently working on a machine learning project and would like my Python program to process/convert measurement data from various measurement formats into a PyTorch-compatible dataset class. This essentially means that I need to extract samples and labels from these measurements so that I can instantiate my dataset class.

Right now: I am mainly using a single libraryA, which provides all the functions I need to load and preprocess the data. Extracting the samples and labels involves several processing steps, which is why I decided to encapsulate that logic in a simple factory class.
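For context, the current setup looks roughly like this. It is a minimal sketch only: the `Dataset` class, the step names, and the stubbed `_load`/`_preprocess`/`_extract` helpers are hypothetical stand-ins for the actual libraryA calls.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Stand-in for the PyTorch-compatible dataset class."""
    samples: list
    labels: list


class DatasetFactory:
    """Simple factory wrapping the libraryA processing pipeline (calls are stubbed)."""

    def create(self, path: str) -> Dataset:
        raw = self._load(path)                     # step 1: read the measurement file
        cleaned = self._preprocess(raw)            # step 2: library-specific preprocessing
        samples, labels = self._extract(cleaned)   # step 3: split into samples and labels
        return Dataset(samples, labels)

    # The helpers below stand in for the actual libraryA calls.
    def _load(self, path):
        return [(0.1, 0), (0.2, 1)]  # dummy (value, label) pairs

    def _preprocess(self, raw):
        return raw

    def _extract(self, cleaned):
        samples = [s for s, _ in cleaned]
        labels = [lbl for _, lbl in cleaned]
        return samples, labels
```

The point is that callers only ever see `create()`; all libraryA-specific steps stay inside the factory.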

My concern: what if I need to handle a data format that is not supported by libraryA but is supported by another libraryB? That libraryB, however, takes a very different approach to extracting the samples and labels and requires different processing steps.

My initial thought: use an abstract factory or a factory method to create the dataset object and let the subclass decide how to do it. But apart from the fact that I would not be following the intent of the abstract factory / factory method patterns as stated by the GoF (I always want the same single product), the signatures of the abstract methods won't match, because the libraries require very different inputs.

My question: is there a suitable design pattern that standardizes the creation of the same product with very different pre-creation steps?
Or should I stick to concrete simple factories that are tightly coupled to each library (e.g. LibADatasetFactory and LibBDatasetFactory)?
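One way the second option can still keep a uniform creation call is to bind the library-specific inputs at construction time, so both factories produce the same product through the same `create(path)` call. This is a sketch under assumptions: the constructor parameters and pipeline steps are hypothetical, and the bodies are stubs.

```python
from typing import NamedTuple


class Dataset(NamedTuple):
    """Stand-in for the PyTorch-compatible dataset class (the shared product)."""
    samples: list
    labels: list


class LibADatasetFactory:
    """Tightly coupled to libraryA (parameter names are hypothetical)."""

    def __init__(self, sampling_rate: int):
        # Library-specific configuration is bound up front, so the
        # creation call itself stays uniform across both factories.
        self.sampling_rate = sampling_rate

    def create(self, path: str) -> Dataset:
        # Stub for the libraryA pipeline: load -> resample -> extract.
        return Dataset(samples=[0.0], labels=[0])


class LibBDatasetFactory:
    """Tightly coupled to libraryB (parameter names are hypothetical)."""

    def __init__(self, channel_map: dict, window_size: int):
        self.channel_map = channel_map
        self.window_size = window_size

    def create(self, path: str) -> Dataset:
        # Stub for the libraryB pipeline: parse -> window -> extract.
        return Dataset(samples=[1.0], labels=[1])
```

Downstream code then depends only on `Dataset`, and the coupling to each library is isolated inside one class.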
