I'm writing a simulation in Python. The simulation reads its inputs from large CSV files, one file per simulated day. For each simulated day I need to run many queries against that day's file; the queries are keyed on time, which is stored in columns of the CSV file.
I'm thinking of using pandas.read_csv to read each file into a DataFrame, store the result, and then query that DataFrame. One coding requirement is that I don't want the DataFrame stored at the site that issues the queries.
I think the easiest way to do this is with a class, e.g.,
import pandas as pd
class DailyCSVLoader:
    def __init__(self, filepath):
        # read the day's CSV once and keep it on the instance
        self.df = pd.read_csv(filepath)

    def query(self, time):
        # return the rows corresponding to time
        pass
with usage:
import datetime
path = "/path/to/csv/file/filename.csv"
time = datetime.datetime(year=2020, month=1, day=1, hour=12, minute=2, second=0)
loader = DailyCSVLoader(path)
loader.query(time)
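To make the class idea concrete, here is a minimal sketch of what the query could look like. It assumes the CSV has a single timestamp column named `time` (a hypothetical name, as are the `time_column` parameter and the sample data), and that a query means "all rows whose timestamp equals the given time". It uses an in-memory buffer instead of a real file so it can be run as-is.

```python
import datetime
import io

import pandas as pd


class DailyCSVLoader:
    """Loads one day's CSV once and answers time-based queries against it."""

    def __init__(self, filepath_or_buffer, time_column="time"):
        # "time" is an assumed column name; replace it with the real one.
        df = pd.read_csv(filepath_or_buffer, parse_dates=[time_column])
        # Index by timestamp and sort so the rows are in time order.
        self._df = df.set_index(time_column).sort_index()

    def query(self, time):
        """Return all rows whose timestamp equals `time` (possibly empty)."""
        return self._df[self._df.index == time]


# Small in-memory CSV standing in for a real daily file.
sample_csv = io.StringIO(
    "time,value\n"
    "2020-01-01 12:01:00,1.0\n"
    "2020-01-01 12:02:00,2.0\n"
    "2020-01-01 12:02:00,3.0\n"
)
loader = DailyCSVLoader(sample_csv)
rows = loader.query(datetime.datetime(2020, 1, 1, 12, 2, 0))
```

With this shape the DataFrame lives only on the loader instance, so the calling code holds a `loader` reference and never the DataFrame itself.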
However, for my particular codebase, it might be slightly easier to do this outside of a class and with just a function and perhaps a static variable that holds the dataframe, e.g.,
import pandas as pd
# because I don't want the calling site to store df, I decided to keep it as a static variable here
def daily_csv_loader(filepath):
    # attach the DataFrame to the function object itself
    daily_csv_loader.df = pd.read_csv(filepath)

def query(time, df):
    # return rows from df corresponding to time
    pass
with usage
import datetime
path = "/path/to/csv/file/filename.csv"
time = datetime.datetime(year=2020, month=1, day=1, hour=12, minute=2, second=0)
daily_csv_loader(path)
query(time, daily_csv_loader.df)
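For comparison, a runnable sketch of this function-based variant might look like the following. As before, the `time` column name, the `time_column` parameter, and the sample data are assumptions made only for illustration.

```python
import datetime
import io

import pandas as pd


def daily_csv_loader(filepath_or_buffer, time_column="time"):
    # Store the DataFrame as an attribute on the function object.
    # "time" is an assumed column name; replace it with the real one.
    daily_csv_loader.df = pd.read_csv(filepath_or_buffer, parse_dates=[time_column])


def query(time, df, time_column="time"):
    # Return the rows of df whose timestamp equals `time`.
    return df[df[time_column] == time]


# Small in-memory CSV standing in for a real daily file.
sample_csv = io.StringIO(
    "time,value\n"
    "2020-01-01 12:01:00,1.0\n"
    "2020-01-01 12:02:00,2.0\n"
)
daily_csv_loader(sample_csv)
result = query(datetime.datetime(2020, 1, 1, 12, 2, 0), daily_csv_loader.df)
```

One thing this makes visible: the function attribute behaves like a module-level global, so loading a second day's file silently replaces the first, and calling `query` before `daily_csv_loader` raises an `AttributeError` on `daily_csv_loader.df`.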
Are there any other approaches here?