Imagine we have several pandas dataframes with different column structures:
import pandas as pd

# creating the first dataframe
df1 = pd.DataFrame({
    "width": [1, 5, 7, 8],
    "height": [5, 8, 4, 3]})
# creating the second dataframe
df2 = pd.DataFrame({
    "a": [7, 8, 9, 10],
    "b": [11, 23, 1, 5],
    "c": [1, 3, 4, 5]})
In general, there might be more than two dataframes. I want to write logic that maps sets of column names to specific functions in order to create a new column "metric" (think of it as the area for two columns and the volume for three columns). I want to specify the column name ensembles like this:
column_name_ensembles = {
    "1": {
        "ensemble": ['height', 'width'],
        "method": area},
    "2": {
        "ensemble": ['a', 'b', 'c'],
        "method": volume}}
Here, the area function creates a new column for the first dataframe: df1['metric'] = df1['height'] * df1['width']
and the volume function creates a new column for the second dataframe: df2['metric'] = df2['a'] * df2['b'] * df2['c']
Note that the functions can have an arbitrary form, but each one takes its ensemble as parameters. The ideal solution would be a function that takes a dataframe and column_name_ensembles as parameters and returns the dataframe with the appropriate 'metric' column added to it.
I know this can be achieved by multiple if and else statements, but this does not seem to be the most intelligent solution. Maybe there is a design pattern that can solve this problem, but I am not an expert at design patterns.
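To make the desired behaviour concrete, here is a minimal sketch of how such a function might look if the dictionary itself is used as a dispatch table. Everything beyond the names already given above is an assumption for illustration: the function name `add_metric`, the signature of `area` and `volume` (each receiving the dataframe and the list of column names), the rule that the first ensemble whose columns are all present in the dataframe is applied, and the choice to raise a `ValueError` when nothing matches.

```python
import pandas as pd


def area(df, columns):
    # Hypothetical signature: product of exactly two columns.
    first, second = columns
    return df[first] * df[second]


def volume(df, columns):
    # Hypothetical signature: product of exactly three columns.
    a, b, c = columns
    return df[a] * df[b] * df[c]


column_name_ensembles = {
    "1": {"ensemble": ["height", "width"], "method": area},
    "2": {"ensemble": ["a", "b", "c"], "method": volume},
}


def add_metric(df, ensembles):
    """Return a copy of df with a 'metric' column from the first matching ensemble."""
    for spec in ensembles.values():
        columns = spec["ensemble"]
        # An ensemble matches when all of its columns exist in the dataframe.
        if set(columns).issubset(df.columns):
            result = df.copy()
            result["metric"] = spec["method"](df, columns)
            return result
    # Assumed behaviour: fail loudly instead of returning df unchanged.
    raise ValueError(f"no ensemble matches columns {list(df.columns)}")


df1 = pd.DataFrame({"width": [1, 5, 7, 8], "height": [5, 8, 4, 3]})
df2 = pd.DataFrame({"a": [7, 8, 9, 10], "b": [11, 23, 1, 5], "c": [1, 3, 4, 5]})

df1 = add_metric(df1, column_name_ensembles)
df2 = add_metric(df2, column_name_ensembles)
```

With this shape, supporting a new dataframe layout would only require a new dictionary entry and its function, with no additional if/else branches.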
Thank you for reading my question! I am looking forward to your answers.