I have a design issue that I hope you can help me resolve. Below is a simplified version of the code:
    import java.util.ArrayList;
    import java.util.List;

    class DataProcessor {
        // Processes the whole file and returns one Record per line.
        public List<Record> processData(DataFile file) {
            List<Record> recordsList = new ArrayList<Record>();
            for (String line : file.getLines()) {
                String processedData = processData(line);
                recordsList.add(new Record(processedData));
            }
            return recordsList;
        }

        private String processData(String rawLine) {
            // code to process line
            return rawLine; // placeholder so the simplified example compiles
        }
    }

    class DatabaseManager {
        public void saveRecords(List<Record> recordsList) {
            // code to insert Record objects in database
        }
    }

    class Manager {
        public static void main(String[] args) {
            DatabaseManager dbManager = new DatabaseManager("e:\\databasefile.db");
            DataFile dataFile = new DataFile("e:\\hugeRawFile.csv");
            DataProcessor dataProcessor = new DataProcessor();
            // The full list of records is held in memory before anything is saved.
            dbManager.saveRecords(dataProcessor.processData(dataFile));
        }
    }
As you can see, the "processData" method of class "DataProcessor" takes a DataFile object, processes the whole file, creates a Record object for each line, and then returns a list of "Record" objects.
My problem with the "processData" method: when the raw file is really huge, the list of "Record" objects takes a lot of memory, and sometimes the program fails. I need to change the current design so that memory usage is minimized. "DataProcessor" should not have direct access to "DatabaseManager". I was thinking of passing a queue to the "processData" method, where one thread runs "processData" and inserts "Record" objects into the queue, while another thread removes "Record" objects from the queue and inserts them into the database. But I'm not sure about the performance implications of this.
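The queue idea described above could be sketched roughly as follows. This is only an illustration under several assumptions: the class name QueuePipelineSketch, the nested Record stand-in, the upper-casing "processing" step, the batch size of 2, and the in-memory list that plays the role of the database are all hypothetical and not part of the original code. The producer only knows about the queue, so it never touches the database side directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class QueuePipelineSketch {
    // Hypothetical stand-in for the Record class from the question.
    static final class Record {
        final String data;
        Record(String data) { this.data = data; }
    }

    // Sentinel ("poison pill") telling the consumer that no more records will arrive.
    private static final Record END = new Record(null);

    // Producer side (role of DataProcessor): processes one line at a time.
    // put() blocks while the queue is full, so at most `capacity` records wait in memory.
    static void produce(Iterable<String> lines, BlockingQueue<Record> queue)
            throws InterruptedException {
        try {
            for (String line : lines) {
                queue.put(new Record(line.trim().toUpperCase())); // placeholder processing
            }
        } finally {
            queue.put(END); // always signal the end so the consumer does not wait forever
        }
    }

    // Consumer side (role of DatabaseManager): drains the queue and saves in small batches.
    static List<String> consume(BlockingQueue<Record> queue, int batchSize)
            throws InterruptedException {
        List<String> saved = new ArrayList<>(); // assumption: stands in for the database
        List<Record> batch = new ArrayList<>(batchSize);
        while (true) {
            Record record = queue.take(); // blocks until a record is available
            if (record == END) {
                break;
            }
            batch.add(record);
            if (batch.size() == batchSize) {
                flush(batch, saved);
            }
        }
        flush(batch, saved); // save whatever is left in the last partial batch
        return saved;
    }

    // Placeholder for a batched database insert.
    private static void flush(List<Record> batch, List<String> saved) {
        for (Record record : batch) {
            saved.add(record.data);
        }
        batch.clear();
    }

    // Wires both sides together: producer on its own thread, consumer on the caller's thread.
    static List<String> run(List<String> lines, int capacity) {
        BlockingQueue<Record> queue = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                produce(lines, queue);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            List<String> saved = consume(queue, 2);
            producer.join();
            return saved;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while saving records", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(" a ", "b", " c"), 2)); // prints [A, B, C]
    }
}
```

With a bounded queue, memory use is capped at roughly the queue capacity plus one batch, regardless of file size. Throughput is then limited by the slower of the two sides, which in a setup like this is often the database inserts, so batching on the consumer side tends to matter more than the queue itself.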