Sunday, February 28, 2021

How to think about OOP design choices for code re-usability and extensibility in Python?

This question is messy, and I know it needs cleaning up. Rather than vote to close and move on, please help me structure it correctly. I know the rules, but I don't know how to get my question answered by following them.

Context:

I've got this composition (new vocabulary word for me) of a OneHotEncoder object:

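import numpy as np
from sklearn.preprocessing import OneHotEncoder

# `cn` is a separate module, not shown here, that provides ALL_LETTERS_ARRAY.


class CharEncoder:

    characters = cn.ALL_LETTERS_ARRAY

    def __init__(self):
        # Note: scikit-learn 1.2+ renames `sparse` to `sparse_output`.
        self.encoder = OneHotEncoder(sparse=False).fit(self.characters.reshape(-1, 1))
        self.categories = self.encoder.categories_[0].tolist()

    def transform(self, word):
        # Split the word into a column of single characters, then one-hot encode it.
        word = np.array(list(word)).reshape(-1, 1)
        return self.encoder.transform(word)

    def inverse_transform(self, X):
        # Decode each row back to a character and join them into a string.
        word_arr = self.encoder.inverse_transform(X).reshape(-1,)
        return ''.join(word_arr)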

As you can see, it has a class attribute, characters, which is essentially an array of all the ASCII characters plus some punctuation.

I want to make this CharEncoder class useful for more than just ASCII. Maybe someone else would really like to use a different character set, and I want to allow them to do so. Or maybe they want to encode entire words instead of individual letters... who knows!?

My problem:

I feel like there are so many design choices here that could make this code re-usable for a slightly different task. I feel overwhelmed.

  1. Do I make the character set a class attribute or an instance attribute?
  2. Do I write getters and setters for the character set?
  3. Do I instead write a parent class, with sub-classes for different character sets?
  4. Or do I make users pass their own OneHotEncoder object to my class, and not worry about it myself?
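
To make options 1 and 4 concrete, here is a minimal sketch of how they might be combined. It is only one possible arrangement, not a recommendation. `SimpleOneHot` is a hypothetical, dependency-free stand-in for `OneHotEncoder`, and `default_characters` plus the `characters` and `encoder` parameters are names assumed for illustration; none of them are in the class above.

```python
import string


class SimpleOneHot:
    """Hypothetical stand-in for sklearn's OneHotEncoder, kept dependency-free."""

    def __init__(self, categories):
        # Sort and de-duplicate so the column order is deterministic.
        self.categories = sorted(set(categories))
        self._index = {c: i for i, c in enumerate(self.categories)}

    def transform(self, items):
        rows = []
        for item in items:
            row = [0] * len(self.categories)
            row[self._index[item]] = 1  # raises KeyError for unknown items
            rows.append(row)
        return rows

    def inverse_transform(self, rows):
        return [self.categories[row.index(1)] for row in rows]


class CharEncoder:
    # Class attribute: the default shared by every instance (option 1).
    default_characters = string.ascii_letters

    def __init__(self, characters=None, encoder=None):
        # Instance attribute: a caller may override the default per object.
        self.characters = characters if characters is not None else self.default_characters
        # Injection (option 4): a caller may pass any object that offers
        # transform/inverse_transform; otherwise one is built here.
        self.encoder = encoder if encoder is not None else SimpleOneHot(self.characters)

    def transform(self, word):
        return self.encoder.transform(list(word))

    def inverse_transform(self, rows):
        return "".join(self.encoder.inverse_transform(rows))


ascii_enc = CharEncoder()                     # uses the class-level default
greek_enc = CharEncoder(characters="αβγδε")   # per-instance character set
round_trip = greek_enc.inverse_transform(greek_enc.transform("βαδ"))
```

In this sketch the class attribute only supplies a default, so changing the character set does not require a sub-class. If a caller passes both `characters` and `encoder`, nothing checks that the two agree, which is one of the trade-offs of allowing injection.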

My question:

What are some considerations that might help guide my design choice here?
