Saturday, April 17, 2021

Design Question - Looking for OO help and best way forward

Problem


I'm trying to build an elegant solution to a complex problem in an OO language (Java). I have some thoughts about how to structure the code, but I haven't yet put together anything I really like. I'm looking for ideas on a better solution.

Task

The problem is straightforward. I need to create a library that does the following:

  1. Pulls data down from some source into a local object.
  2. Passes that local object to a processor, which does some computations on the data.
  3. Writes the results of those computations to some data store.
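
The three steps above can be sketched as three small interfaces and a runner. This is only one possible reading of the task, not a settled design: `SourceData`, `Retriever`, `Processor`, `Exporter` and `Pipeline` are hypothetical names, and the type parameter `R` is an assumption standing in for whatever a given processor produces.

```java
import java.util.List;

// Hypothetical stand-in for the shared POJO returned by the common retrieval library.
final class SourceData {
    final List<Integer> values;
    SourceData(List<Integer> values) { this.values = values; }
}

// Step 1: pull data from some source into a local object.
interface Retriever { SourceData retrieve(); }

// Step 2: compute something from that object; R is whatever the processor produces.
interface Processor<R> { R process(SourceData data); }

// Step 3: write the computed result to some data store.
interface Exporter<R> { void export(R result); }

final class Pipeline {
    // Runs the three steps in order for one processor/exporter pair.
    static <R> void run(Retriever retriever, Processor<R> processor, Exporter<R> exporter) {
        SourceData data = retriever.retrieve();
        R result = processor.process(data);
        exporter.export(result);
    }
}
```

In this sketch the processor returns its whole result at once, so it does not yet deal with results that are too large for memory.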
What is likely to change

As you can see above, I've split the task into logical groups: a data source (retriever), a processor and an exporter. This split has certainly shaped my thinking about the domain model, but perhaps these aren't the logical groupings I should use. Here is how each piece is likely to change.

  • data retriever - This piece is unlikely to change. We have a common library that is passed parameters describing where to retrieve the data, and it returns a POJO holding that data. We are heavily invested in this POJO model and would expect a large refactoring if it were to change.
  • processor - This piece is likely to change, probably at runtime based on arguments. What processors have in common is that they all operate on that common POJO. However, what they do could be completely different, and the output they produce could be completely different.
  • exporter - Each processor will create new data and will need to write it somewhere. Each processor will probably produce data with its own model, and the exporter needs to know how to interpret and store that data. There could be multiple exporters for each processor, probably chosen at runtime, for example depending on whether the data should be stored in a local file or written to a remote SQL database.
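
As a rough illustration of choosing one or more exporters at runtime, the sketch below maps argument strings to exporter factories. Everything in it is hypothetical: `LineExporter`, `ExporterRegistry` and the idea of keying exporters by name are assumptions made for the example, and a list of strings stands in for one processor's output model.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical exporter for one processor's output model (here just lines of text).
interface LineExporter {
    void export(List<String> lines);
}

// Hypothetical registry mapping a runtime argument to an exporter factory.
final class ExporterRegistry {
    private final Map<String, Supplier<LineExporter>> factories = new HashMap<>();

    void register(String name, Supplier<LineExporter> factory) {
        factories.put(name, factory);
    }

    // Builds one exporter per requested name, failing fast on unknown names.
    List<LineExporter> select(List<String> names) {
        List<LineExporter> selected = new ArrayList<>();
        for (String name : names) {
            Supplier<LineExporter> factory = factories.get(name);
            if (factory == null) {
                throw new IllegalArgumentException("Unknown exporter: " + name);
            }
            selected.add(factory.get());
        }
        return selected;
    }
}
```

A runner could call `select` with the names passed on the command line and hand the processor's output to each returned exporter in turn.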
Constraints
  1. Data created by a processor could be too large to hold in memory; a processor may produce several gigabytes. The processor knows how much data it is creating, so it may want to write to the data store (whatever it is) in chunks, or all at once if the amount is small.
  2. If the code is separated into the three logical groupings above, it's not clear to me how the processor and exporter would exchange data. There is no guarantee that every processor would produce a subtype of some common supertype; each processor would probably operate on its own model.
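
To make the two constraints concrete, here is a minimal sketch of one direction that might be worth exploring: the processor pushes its output into a sink in chunks, and a type parameter ties each processor to sinks that understand its model. This is an assumption for illustration, not something from the attempts described in this post. `Sink`, `ChunkingProcessor` and `SquaringProcessor` are hypothetical names, and a list of integers stands in for the shared POJO.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical destination for records of type T; an exporter would implement this.
interface Sink<T> {
    void write(List<T> chunk);
}

// Hypothetical processor that pushes its output into a sink instead of returning it.
interface ChunkingProcessor<T> {
    void process(List<Integer> input, Sink<T> sink);
}

// Example processor: squares each input and flushes every chunkSize results.
final class SquaringProcessor implements ChunkingProcessor<Long> {
    private final int chunkSize;

    SquaringProcessor(int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        this.chunkSize = chunkSize;
    }

    @Override
    public void process(List<Integer> input, Sink<Long> sink) {
        List<Long> buffer = new ArrayList<>();
        for (int value : input) {
            buffer.add((long) value * value);
            if (buffer.size() == chunkSize) {
                sink.write(new ArrayList<>(buffer)); // hand off a copy, then reuse the buffer
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {
            sink.write(new ArrayList<>(buffer)); // flush the final partial chunk
        }
    }
}
```

Under this sketch, a processor that produces only a small amount of data could pick a chunk size larger than its output and end up writing everything in a single call.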
What I've tried:

Two different solutions so far:

  1. domain model - I tried creating a domain model in which the retriever, processor and exporter are all interfaces called by some runner. The runner calls a factory that returns the correct retriever, processor and exporter based on incoming arguments. The downside is that passing different objects between the processor and the exporter is non-trivial: it essentially comes down to the processor returning a plain Object, the exporter taking an Object, and relying on the factory to wire them up correctly, because Object would be the only true supertype for all the data.
  2. transaction script - Isolate the logic by thinking of each task as a script. The processor then handles both the creation and the export of its data. This way the processor knows how it wants to write the data and can do so in chunks or all at once. The export logic could be isolated into its own class, but wouldn't have to be. This gives less freedom but is much more straightforward. Adding another script would be simple, and all of its logic would be isolated from any other script's logic.
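
The two attempts above can be contrasted in a few lines. This is a simplified sketch rather than my actual code: all class names are hypothetical, a list of integers stands in for the shared POJO, and an in-memory list stands in for the data store.

```java
import java.util.ArrayList;
import java.util.List;

// Attempt 1 (domain model): Object is the only type shared by every processor's output.
interface UntypedProcessor { Object process(List<Integer> input); }
interface UntypedExporter { void export(Object result); }

final class SumProcessor implements UntypedProcessor {
    @Override
    public Object process(List<Integer> input) {
        int sum = 0;
        for (int v : input) sum += v;
        return sum; // boxed to Integer
    }
}

final class IntegerExporter implements UntypedExporter {
    final List<Integer> store = new ArrayList<>(); // stand-in for a file or database

    @Override
    public void export(Object result) {
        // Throws ClassCastException if the factory paired this with the wrong processor.
        store.add((Integer) result);
    }
}

// Attempt 2 (transaction script): one class owns both the computation and the export.
final class SumScript {
    final List<Integer> store = new ArrayList<>(); // stand-in for a file or database

    void run(List<Integer> input) {
        int sum = 0;
        for (int v : input) sum += v;
        store.add(sum); // the script decides how and when to write
    }
}
```

In the first form a wiring mistake only shows up at runtime as a failed cast; in the second there is nothing to wire, but the export logic cannot be swapped without touching the script.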
