I have an application with a single entry point. It is a library that automates some data engineering tasks.
import org.apache.spark.sql.{DataFrame, SparkSession}

case class DeltaContextConfig(
  primaryKey: List[String],
  columnToOrder: String,
  filesCountFirstBatch: Int,
  destinationPath: String,
  sparkDf: DataFrame,
  sparkContext: SparkSession,
  operationType: String,
  partitionColumn: Option[String] = None,
  tableName: String,
  databaseName: String,
  autoCompaction: Option[Boolean] = Option(true),
  idealFileSize: Option[Int] = Option(128),
  deduplicationColumn: Option[String] = None,
  compactionIntervalTime: Option[Int] = Option(180),
  updateCondition: Option[String] = None,
  setExpression: Option[String] = None
)
This case class is my single entry point.
After that, all of these parameters are passed to other objects: I have objects that write to the data lake, compact files, and so on. Each of these objects uses only some of the parameters. For example, I have a DeltaWriterConfig object:
DeltaWriterConfig(
  sparkDf = deltaContextConfig.sparkDf,
  columnToOrder = deltaContextConfig.columnToOrder,
  destinationPath = deltaContextConfig.destinationPath,
  primaryKey = deltaContextConfig.primaryKey,
  filesCountFirstBatch = deltaContextConfig.filesCountFirstBatch,
  sparkContext = deltaContextConfig.sparkContext,
  operationType = deltaContextConfig.operationType,
  partitionColumn = deltaContextConfig.partitionColumn,
  updateCondition = deltaContextConfig.updateCondition,
  setExpression = deltaContextConfig.setExpression
)
I use DeltaWriterConfig to pass these parameters to my DeltaWriter class. I was creating all of these config objects in main, but I don't think that is a good approach: I have three config objects to populate, so main ends up with three large constructor calls.
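To make the setup concrete, here is a minimal sketch of one possible direction: moving the field-by-field mapping into a factory method on the companion object of DeltaWriterConfig, so that main only has to build DeltaContextConfig. The method name `from`, the `Demo` object, and all sample values are hypothetical. `DataFrame` and `SparkSession` are replaced by empty stand-in traits so the snippet compiles without Spark; in the real library they would be the `org.apache.spark.sql` types. The entry-point case class is trimmed to the fields the writer needs.

```scala
// Stand-ins so the sketch compiles without Spark on the classpath (assumption).
// In the real library: import org.apache.spark.sql.{DataFrame, SparkSession}
trait DataFrame
trait SparkSession

// Trimmed copy of the entry-point case class: only the fields the writer uses.
case class DeltaContextConfig(
  primaryKey: List[String],
  columnToOrder: String,
  filesCountFirstBatch: Int,
  destinationPath: String,
  sparkDf: DataFrame,
  sparkContext: SparkSession,
  operationType: String,
  partitionColumn: Option[String] = None,
  updateCondition: Option[String] = None,
  setExpression: Option[String] = None
)

case class DeltaWriterConfig(
  sparkDf: DataFrame,
  columnToOrder: String,
  destinationPath: String,
  primaryKey: List[String],
  filesCountFirstBatch: Int,
  sparkContext: SparkSession,
  operationType: String,
  partitionColumn: Option[String],
  updateCondition: Option[String],
  setExpression: Option[String]
)

object DeltaWriterConfig {
  // Hypothetical factory: the mapping lives next to the type it builds,
  // instead of being spelled out in main.
  def from(ctx: DeltaContextConfig): DeltaWriterConfig =
    DeltaWriterConfig(
      sparkDf = ctx.sparkDf,
      columnToOrder = ctx.columnToOrder,
      destinationPath = ctx.destinationPath,
      primaryKey = ctx.primaryKey,
      filesCountFirstBatch = ctx.filesCountFirstBatch,
      sparkContext = ctx.sparkContext,
      operationType = ctx.operationType,
      partitionColumn = ctx.partitionColumn,
      updateCondition = ctx.updateCondition,
      setExpression = ctx.setExpression
    )
}

object Demo {
  // Made-up sample values, for illustration only.
  val ctx: DeltaContextConfig = DeltaContextConfig(
    primaryKey = List("id"),
    columnToOrder = "updated_at",
    filesCountFirstBatch = 4,
    destinationPath = "/tmp/delta/events",
    sparkDf = new DataFrame {},
    sparkContext = new SparkSession {},
    operationType = "upsert"
  )

  // main now needs a single line per derived config.
  val writerConfig: DeltaWriterConfig = DeltaWriterConfig.from(ctx)
}
```

With this shape, each of the three config objects could get its own `from` method, and main would shrink to one constructor call plus three one-line conversions. Whether this is preferable to, say, passing DeltaContextConfig directly to each class is a design choice the sketch does not settle.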
Is there a pattern that solves this?
Thank you.