Friday, 3 September 2021

How to update models / vectors at runtime, on a daily basis?

I have a simple web app which uses sklearn vectorizers (tfidf / count / normalizer) and PyTorch transformer models. I usually dump these models via joblib. The app serves these models through FastAPI-based APIs. Up to this point everything works fine.

  • But the vectors and models listed above are updated daily, at runtime, by a part of the same application that uses them. So whenever new files are ready, we start using them.
  • Whenever we get an API call, we look for today's model, call joblib.load, and then respond. As a result, when we get too many calls, we run joblib.load many times and eventually start getting a "Too many open files" OSError.
  • If we weren't updating these models daily, I could load them once into global variables. But now I don't have a clear idea of how to design this so that we can update the models daily and start using today's models as soon as they are available.
  • One more constraint: until today's models are available, we use yesterday's models to serve requests.
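
To make the constraints above concrete, here is a minimal sketch of one possible design: keep a single loaded model in memory, reload only when a newer daily file appears, and fall back to yesterday's file otherwise. The class name `DailyModelCache`, the `model_YYYY-MM-DD.pkl` naming scheme, and the `load_count` counter are all hypothetical, and `pickle` is used as a stand-in loader so the example runs with the standard library alone; in the app described here, `joblib.load` would presumably be passed as `loader` instead.

```python
import pickle
import threading
from datetime import date, timedelta
from pathlib import Path
from typing import Any, Callable, Optional


def _pickle_load(path: Path) -> Any:
    # Stand-in loader (assumption); the real app would pass joblib.load.
    with path.open("rb") as fh:
        return pickle.load(fh)


class DailyModelCache:
    """Hold one loaded model and reload only when a newer daily file exists."""

    def __init__(self, model_dir: Path, loader: Callable[[Path], Any] = _pickle_load):
        self.model_dir = Path(model_dir)
        self.loader = loader
        self._lock = threading.Lock()  # serialises reloads across threads
        self._model: Any = None
        self._model_date: Optional[date] = None
        self.load_count = 0  # demo only: counts how often a file was loaded

    def _path_for(self, day: date) -> Path:
        # Hypothetical naming scheme, e.g. model_2021-09-03.pkl
        return self.model_dir / f"model_{day.isoformat()}.pkl"

    def _load(self, day: date) -> None:
        self._model = self.loader(self._path_for(day))
        self._model_date = day
        self.load_count += 1

    def get(self, today: Optional[date] = None) -> Any:
        today = today or date.today()
        with self._lock:
            # Prefer today's model, then yesterday's.
            for day in (today, today - timedelta(days=1)):
                if self._model_date == day:
                    return self._model  # already in memory, no file access
                if self._path_for(day).exists():
                    self._load(day)
                    return self._model
            if self._model is None:
                raise FileNotFoundError("no model file for today or yesterday")
            # Nothing newer on disk: keep serving the model already loaded.
            return self._model
```

With this shape, an API handler would call `cache.get()` on a single module-level `DailyModelCache` instance, so each daily file is loaded at most once rather than once per request. One caveat the sketch does not handle: if the daily job writes the file in place, a request could see a half-written file, so writing to a temporary name and renaming it once complete is probably safer.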
