This question is messy, and I know it needs cleaning up. Rather than vote to close and move on, please help me structure it correctly. I know the rules, but I don't know how to get my question answered by following them.
Context:
I've got this composition (new vocabulary word for me) of a OneHotEncoder object:
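    import numpy as np
    from sklearn.preprocessing import OneHotEncoder

    # cn: module holding ALL_LETTERS_ARRAY (its import is not shown here)
    class CharEncoder:
        characters = cn.ALL_LETTERS_ARRAY

        def __init__(self):
            # Fit on the character set as a single-feature column.
            # Note: sparse=False was renamed sparse_output=False in scikit-learn 1.2.
            self.encoder = OneHotEncoder(sparse=False).fit(self.characters.reshape(-1, 1))
            self.categories = self.encoder.categories_[0].tolist()

        def transform(self, word):
            word = np.array(list(word)).reshape(-1, 1)
            word_vect = self.encoder.transform(word)
            return word_vect

        def inverse_transform(self, X):
            word_arr = self.encoder.inverse_transform(X).reshape(-1,)
            return ''.join(word_arr)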
As you can see, it has a class attribute characters, which is essentially an array of all the ASCII characters plus some punctuation.
I want to make this CharEncoder class useful for more than just ASCII. Maybe someone else would really like to use a different character set, and I want to allow them to do so. Or maybe they want to encode entire words instead of individual letters... who knows!?
My problem:
I feel like there are so many design choices here that could make this code re-usable for a slightly different task. I feel overwhelmed.
- Do I make the character set a class attribute or an instance attribute?
- Do I write getters and setters for the character set?
- Do I instead write some parent class, with sub-classes for different character sets?
- Or do I make users pass their own OneHotEncoder object to my class, and not worry about it myself?
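To make the instance-attribute option concrete, here is a rough sketch of what I imagine it could look like. It is only an illustration, not a settled design: `DEFAULT_CHARACTERS` is a made-up stand-in for `cn.ALL_LETTERS_ARRAY`, and the `sep` argument is my own addition, not anything from scikit-learn. I also left the encoder on its default sparse output and densify in `transform`, on the assumption that this sidesteps the `sparse` / `sparse_output` keyword difference between scikit-learn versions.

```python
import string

import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical default vocabulary, standing in for cn.ALL_LETTERS_ARRAY.
DEFAULT_CHARACTERS = np.array(list(string.ascii_letters + " .,;'-"))


class CharEncoder:
    def __init__(self, characters=DEFAULT_CHARACTERS):
        # The vocabulary lives on the instance, so two encoders can differ.
        self.characters = np.asarray(characters)
        # Fit on the vocabulary as a single-feature column.
        self.encoder = OneHotEncoder().fit(self.characters.reshape(-1, 1))
        self.categories = self.encoder.categories_[0].tolist()

    def transform(self, tokens):
        # tokens may be a string (one token per character) or a list of tokens.
        tokens = np.array(list(tokens)).reshape(-1, 1)
        # The default output is a SciPy sparse matrix; convert it to a dense array.
        return self.encoder.transform(tokens).toarray()

    def inverse_transform(self, X, sep=""):
        tokens = self.encoder.inverse_transform(X).reshape(-1)
        return sep.join(tokens)


ascii_encoder = CharEncoder()                              # default vocabulary
greek_encoder = CharEncoder(characters=list("αβγδε"))      # another character set
word_encoder = CharEncoder(characters=["hello", "world"])  # whole words as tokens
```

With this shape, the same class would cover both a different character set and whole-word encoding, at the cost of refitting a OneHotEncoder for every instance.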
My question:
What are some considerations that might help guide my design choice here?