I have multiple Python modules that use the same input data, and the variables have the same names in each of them.
I created a module, data_loading.py, where the variables are instantiated. I then import the variables I need in the data_analysis_xx modules.
For example,
" Module data_analysis_1 "
from data_loading import var_1, var_2,…, var_k
" Module data_analysis_2 "
from data_loading import var_1, var_3
This way I avoid copying and pasting the same 200 lines of code into every module to load the same, or partially the same, set of data.
First question:
Is using a single source module for data loading the right approach? Is there a standard way, or at least a better way, to import the same variables into multiple modules?
In data_loading I also do some basic data manipulation, for example integrity checks, splitting, cutting, sorting, etc. Problem: this can be time consuming. When I import data_loading, all the variables in it are loaded and processed, even if I need only one or a few of them.
Second question:
How can I make the data_loading module work so that only the variables that are actually needed get loaded and processed?
Possible solutions
- Split data_loading into multiple sub-modules --> slightly reduces the problem but increases the number of files to load from: more complexity, chaos, and room for error. Not good.
- Create a class that deals with the data loading and load only the needed variables via the class? How do I do this in practice? How do I import the variables then? "from data_loading import Loader.var_1 as var_1 …, Loader.var_k as var_k"?
- Implement lazy loading --> Many of my variables are classes that deal with the actual data loading (retrieving the data from a file). Hence, lazy loading would help reduce the total cost in terms of time.
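For the class-based option, a minimal sketch, assuming Python 3.8+ for `functools.cached_property`. The import statement cannot name an attribute of a class, so "from data_loading import Loader.var_1 as var_1" is not valid syntax; what can be imported is a single object whose attributes are computed on first access. The `Loader` body, the `load_log` attribute and the toy data are assumptions made purely for illustration.

```python
# data_loading.py (sketch; the class body and data are illustrative assumptions)
from functools import cached_property


class Loader:
    def __init__(self):
        # Records which properties were actually computed, for demonstration only.
        self.load_log = []

    @cached_property
    def var_1(self):
        # Runs on first access only; the result is then stored on the instance.
        self.load_log.append("var_1")
        return sorted([3, 1, 2])

    @cached_property
    def var_2(self):
        # Never runs unless some module actually touches loader.var_2.
        self.load_log.append("var_2")
        return [x * x for x in range(5)]


# Creating the instance is cheap: nothing is loaded or processed yet.
loader = Loader()
```

A data_analysis_xx module would then do `from data_loading import loader` and read `loader.var_1`. The trade-off is that the analysis modules refer to `loader.var_1` instead of a bare `var_1`, unless they rebind it locally with `var_1 = loader.var_1`.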
References:
- How to lazy load a data structure (python)
- cached functions via descriptor (descriptor: https://docs.python.org/3/howto/descriptor.html#descriptor-protocol)
- https://github.com/pallets/werkzeug/blob/10b4b8b6918a83712170fdaabd3ec61cf07f23ff/werkzeug/utils.py
- https://stackoverflow.com/a/6849299/7074426