I am currently working on a machine learning project and would like my Python program to convert measurement data from various measurement data formats into a PyTorch-compatible dataset class. This essentially means that I need to extract samples and labels from these measurements so that I can instantiate my dataset class.
Right now: I am mainly using a single library, libraryA, which provides all the functions I need to load and preprocess the data. To extract the samples and labels I have to follow several processing steps, which is why I decided to encapsulate that logic in a simple factory class.
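To make the current setup concrete, here is a minimal sketch of such a simple factory. Everything in it is a hypothetical placeholder: MeasurementDataset is a plain-Python stand-in for a map-style torch.utils.data.Dataset, and the _load, _split and _label steps only imitate the kind of processing libraryA would do; none of them are real libraryA calls.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class MeasurementDataset:
    """Hypothetical stand-in for a map-style torch.utils.data.Dataset."""
    samples: List[List[float]]
    labels: List[int]

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, idx: int) -> Tuple[List[float], int]:
        return self.samples[idx], self.labels[idx]


class LibADatasetFactory:
    """Simple factory hiding the libraryA-specific steps (all placeholders)."""

    def __init__(self, window: int) -> None:
        self.window = window

    def _load(self, raw: Sequence[float]) -> List[float]:
        # Placeholder for whatever loading call libraryA provides.
        return list(raw)

    def _split(self, signal: List[float]) -> List[List[float]]:
        # Cut the signal into non-overlapping windows, dropping the remainder.
        n = len(signal) // self.window
        return [signal[i * self.window:(i + 1) * self.window] for i in range(n)]

    def _label(self, sample: List[float]) -> int:
        # Placeholder labelling rule: 1 if the window mean is positive.
        return int(sum(sample) / len(sample) > 0)

    def create(self, raw: Sequence[float]) -> MeasurementDataset:
        samples = self._split(self._load(raw))
        return MeasurementDataset(samples, [self._label(s) for s in samples])
```

The caller only sees create(...) and gets a ready-to-use dataset back; the order and details of the processing steps stay inside the factory.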
My concern is: what if I need to handle a data format that is not supported by libraryA but by another library, libraryB? That libraryB, however, takes a very different approach to extracting the samples and labels and requires different processing steps.
My initial thought: use an abstract factory or a factory method to create the dataset object and let the subclasses decide how to do it. However, apart from the fact that this does not follow the intent of abstract factory / factory method as stated by the GoF (I always want the same single product), the signatures of the abstract methods won't match, because the libraries require very different inputs.
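The signature problem can be sketched as follows. The parameter names (path, channels, annotations) are assumptions chosen purely for illustration, and Dataset is a type alias standing in for my real dataset class; the point is only that the two create methods cannot share one abstract signature.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple

# Stand-in for the real dataset class: a list of (sample, label) pairs.
Dataset = List[Tuple[List[float], int]]


class DatasetFactory(ABC):
    @abstractmethod
    def create(self, path: str) -> Dataset:
        """Abstract creation method, shaped after what libraryA needs."""


class LibADatasetFactory(DatasetFactory):
    def create(self, path: str) -> Dataset:
        # Placeholder: libraryA would load and preprocess the file at `path`.
        return [([0.0], 0)]


class LibBDatasetFactory(DatasetFactory):
    # Hypothetical libraryB inputs: already-parsed channels plus annotations.
    # Python accepts this override at runtime, but it no longer matches the
    # abstract signature, so callers cannot treat both factories uniformly.
    def create(self, channels: Dict[str, List[float]],
               annotations: List[int]) -> Dataset:
        return [([values[i] for values in channels.values()], annotations[i])
                for i in range(len(annotations))]
```

Code written against DatasetFactory.create(path) would break as soon as it receives a LibBDatasetFactory, and a static type checker such as mypy would typically report the override as incompatible with the supertype.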
My question: is there a suitable design pattern that standardizes the creation of the same product when the pre-creation steps are very different?
Or should I stick to concrete simple factories that are tightly coupled to the library (e.g. LibADatasetFactory and LibBDatasetFactory)?