I am trying to develop a class that tokenizes a sentence with a pre-defined pattern. My class is defined as below:
class Token:
    def __init__(self, text: str, begin: int, end: int, tid: int):
        self.__text = text
        self.__length = len(text)
        self.__begin = begin
        self.__end = end
        self.__tid = tid

    @property
    def text(self) -> str:
        return self.__text

    @property
    def begin(self) -> int:
        return self.__begin

    @property
    def end(self) -> int:
        return self.__end

    @property
    def tid(self) -> int:
        return self.__tid

    def __len__(self):
        return self.__length
Since this class is used to process large documents with lots of tokens, many instances are created. I have defined my own tokenizer function as follows:
def tokenizer(sentence):
    list_of_tokens = []
    # tokenizing algorithm
    # creating an instance of Token
    # adding each instance into list_of_tokens
    return list_of_tokens
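For concreteness, here is a minimal runnable sketch of that module-level function. The pattern (runs of non-whitespace characters) and the name TOKEN_PATTERN are assumptions made only for illustration, not my real tokenizing algorithm, and the Token class is a compact stand-in with the same constructor signature as the one above.

```python
import re
from typing import List


class Token:
    """Compact stand-in for the Token class above (same constructor signature)."""

    def __init__(self, text: str, begin: int, end: int, tid: int):
        self.text = text
        self.begin = begin
        self.end = end
        self.tid = tid


# Assumed pattern for illustration: a token is a run of non-whitespace characters.
TOKEN_PATTERN = re.compile(r"\S+")


def tokenizer(sentence: str) -> List[Token]:
    """Return one Token per pattern match, numbered in order of appearance."""
    list_of_tokens = []
    for tid, match in enumerate(TOKEN_PATTERN.finditer(sentence)):
        # match.start() / match.end() are the character offsets in the sentence
        list_of_tokens.append(Token(match.group(), match.start(), match.end(), tid))
    return list_of_tokens


tokens = tokenizer("Hello big world")
print([(t.text, t.begin, t.end, t.tid) for t in tokens])
# [('Hello', 0, 5, 0), ('big', 6, 9, 1), ('world', 10, 15, 2)]
```

Written this way, the function has to name Token explicitly, so it always builds instances of that one class.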
Now my question is whether it is correct to use @classmethod in Python (as below) for this, or whether I should just write a separate function outside the class that does the same thing.
@classmethod
def tokenizer(cls, sentence):
    list_of_tokens = []
    # tokenization
    # create an instance of Token with cls(*args)
    # add each token to list_of_tokens
    return list_of_tokens  # holds Token objects
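To make the comparison concrete, this is one way the @classmethod variant might look when filled in. Again the whitespace pattern, the _pattern attribute and the WordToken subclass are hypothetical names introduced only for this sketch, and Token is reduced to plain attributes. The part that seems to differ from a plain function is that cls refers to whichever class the method is called on, so a subclass gets instances of itself without the method being rewritten.

```python
import re


class Token:
    # Assumed pattern for illustration: runs of non-whitespace characters.
    _pattern = re.compile(r"\S+")

    def __init__(self, text: str, begin: int, end: int, tid: int):
        self.text = text
        self.begin = begin
        self.end = end
        self.tid = tid

    @classmethod
    def tokenizer(cls, sentence: str) -> list:
        """Alternative constructor: build a list of cls instances from a sentence."""
        return [
            cls(match.group(), match.start(), match.end(), tid)
            for tid, match in enumerate(cls._pattern.finditer(sentence))
        ]


class WordToken(Token):
    # Hypothetical subclass that only overrides the pattern (word characters only).
    _pattern = re.compile(r"\w+")


tokens = Token.tokenizer("Hi, there")
words = WordToken.tokenizer("Hi, there")

print([t.text for t in tokens])  # ['Hi,', 'there']
print([w.text for w in words])   # ['Hi', 'there']
print(type(words[0]).__name__)   # WordToken
```

If no subclassing is ever needed, both versions build the same tokens from the same pattern, and the choice looks mostly like a matter of where the code should live.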