I'm working on a C++ data analysis project. My workflow goes like this:
- Analyze the data and build models
- Optimize the code for latency, to deploy for production
- goto 1
Step 1 involves many machine-learning parameters, which I use to test minor variations of algorithms. In step 2, I strip out the unused parts of the code (the non-optimal parameter settings), optimize the code for latency (for example, changing maps to arrays), and deploy it. These modifications are made directly on step 1's code; no separate branch is maintained.
When new data arrives and step 1 has to be repeated, I have lost the ability to test minor variations of an algorithm. One way to solve this is to maintain two branches: an experimental branch, which keeps all the parameters for the minor algorithm variations, and a latency-optimized branch. The problem is that any small change in the experimental branch must be repeated by hand in the latency-optimized branch, because the two branches cannot be merged: the differences between them are huge (entire files appear in one and not the other), which prevents direct merging.
Is there any other way to solve this?