I'm currently working on a computation done in different steps:
- Extract data from DB (data class may be different depending on the request)
- Add static data to each record (the static data is held in memory). After this step, all the data is represented by the same class.
- Call a REST endpoint to perform the actual calculation
- Post-process the result
Since the amount of data can be very large, each step has its own thread pool, so that different requests can be processed in parallel. Before a request is sent to the pipeline, the number of records is calculated, and if it exceeds a certain threshold, the request is paged.
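To make the paging step concrete, here is a minimal sketch of how a request could be split into pages before it enters the pipeline. The names `PageDescriptor` and `RequestPager`, the zero-based page index, and the separate `threshold` and `pageSize` parameters are all assumptions made for illustration; they are not part of my actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical value object describing one page of a request.
final class PageDescriptor {
    final UUID requestId;
    final int totalPages;
    final int page;   // zero-based page index (an assumption)
    final int size;   // number of records in this page

    PageDescriptor(UUID requestId, int totalPages, int page, int size) {
        this.requestId = requestId;
        this.totalPages = totalPages;
        this.page = page;
        this.size = size;
    }
}

// Hypothetical helper that decides whether a request has to be paged.
final class RequestPager {
    private RequestPager() {}

    // A request at or below the threshold travels as a single page;
    // a larger one is cut into pages of at most pageSize records.
    static List<PageDescriptor> split(UUID requestId, int recordCount, int threshold, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<PageDescriptor> pages = new ArrayList<>();
        if (recordCount <= threshold) {
            pages.add(new PageDescriptor(requestId, 1, 0, recordCount));
            return pages;
        }
        int totalPages = (recordCount + pageSize - 1) / pageSize; // ceiling division
        for (int page = 0; page < totalPages; page++) {
            int remaining = recordCount - page * pageSize;
            pages.add(new PageDescriptor(requestId, totalPages, page, Math.min(pageSize, remaining)));
        }
        return pages;
    }
}
```

Because every page carries the request Id and the total number of pages, a later step can tell when it has seen all of them.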
I started to design this as a pipeline where each pipe is a step.
My idea was to separate the management of the request from the actual work each pipe does on the data, so that I can swap a pipe depending on the data I'm extracting while still using the same logic to process paged and non-paged requests.
The problem is how to keep the request management generic and the data processing specific, and how to switch between the different implementations based on the data.
Paging a request and collecting the page results is done by adding to each request the total number of pages, the page number and the size. At some point (in step 3), I collect all the pages belonging to the same request (each request has an Id) and wait until all of them have arrived.
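The collection step described above could look roughly like the following sketch. `PageCollector` is a hypothetical name, and the sketch assumes that no page is delivered twice and that the order of the pages does not matter. Instead of blocking a thread until all pages are in, the worker that delivers the last page gets the complete list back and can carry on.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical collector: gathers the pages of each request by request Id.
final class PageCollector<T> {
    private final Map<UUID, List<T>> pending = new HashMap<>();

    // Stores one page. Returns all pages of the request once the last one
    // has arrived, and an empty Optional while pages are still missing.
    // synchronized, because pages arrive from several pool threads.
    synchronized Optional<List<T>> add(UUID requestId, int totalPages, T page) {
        List<T> pages = pending.computeIfAbsent(requestId, id -> new ArrayList<>());
        pages.add(page);
        if (pages.size() < totalPages) {
            return Optional.empty();
        }
        pending.remove(requestId); // request complete, free the slot
        return Optional.of(pages);
    }
}
```

A non-paged request is simply the case where totalPages is 1, so the same code path covers both.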
At the moment I have 3 pieces:
- Pipeline
- Pipe
- Request
And I want to keep these pieces as generic as possible, while the underlying data should be specific.
Now, the first question: should I write an implementation of each pipe for each type of data I have and switch between them depending on the request? Or should I keep a single implementation of the pipe and let it decide internally what to do (for example, by using different objects, each capable of handling one type of data)? The second question: how do I implement the decision of which implementation to use depending on the data type?
So far I have tried adding a type to the request, but the problem remains, since the pipe only knows about the generic Request interface. I also tried parameterizing the request by the data type, but again the only type the pipe knows is the Request interface.
interface Request {
    UUID getId();
    int getTotalPages();
}

interface Pipe {
    void execute(Request request);
    void addNextPipe(Pipe nextPipe);
}

interface Pipeline {
    void execute(Request request);
    void addPipe(Pipe pipe);
}

class PagedRequest implements Request {
    private UUID id;
    private int totalPages;
    private int page;
    private int size;
    private List<Data> data;

    public UUID getId() {
        ...
    }

    public int getTotalPages() {
        ...
    }
}

class Step1 implements Pipe {
    private Pipe nextPipe;
    private ThreadPool pool;

    public void addNextPipe(Pipe nextPipe) {
        this.nextPipe = nextPipe;
    }

    public void execute(Request request) {
        // hand the request to this step's own thread pool
        pool.execute(new Worker(request));
    }

    private class Worker implements Runnable {
        private final Request request;

        Worker(Request request) {
            this.request = request;
        }

        public void run() {
            // do specific work
            nextPipe.execute(request);
        }
    }
}
What I want to achieve is something like:
- Receive the request
- Lookup the type/data
- Dispatch the request to the correct handler, i.e. the one that is capable of working with the data in the request
- Pass the generic request to the next pipe
Assuming we have three types of data, A, B and C:
class StepN implements Pipe {
    public void execute(Request request) {
        // pseudocode: pick the handler that can work with the data in the request
        if (handlerForDataA can handle) {
            handlerForDataA.handle(request);
        } else if (handlerForDataB can handle) {
            handlerForDataB.handle(request);
        } else if (handlerForDataC can handle) {
            handlerForDataC.handle(request);
        }
    }
}
The handler could also be the actual worker.
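To illustrate the kind of lookup and dispatch I have in mind without the if/else chain, here is a rough sketch that keeps the handlers in a map keyed by data class. `TypedRequest`, `Handler`, `DispatchingStep` and `getDataType()` are hypothetical names invented for this example, and it assumes that the request can report the class of the data it carries, which my current Request interface cannot do.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical: a request that can report the class of the data it carries.
interface TypedRequest {
    Class<?> getDataType();
}

// Hypothetical: one handler per data type (A, B, C, ...).
interface Handler {
    void handle(TypedRequest request);
}

// The step stays generic; only the registered handlers know the data.
final class DispatchingStep {
    private final Map<Class<?>, Handler> handlers = new HashMap<>();

    void register(Class<?> dataType, Handler handler) {
        handlers.put(dataType, handler);
    }

    void execute(TypedRequest request) {
        // look up the handler by the exact data class of the request
        Handler handler = handlers.get(request.getDataType());
        if (handler == null) {
            throw new IllegalStateException("No handler for " + request.getDataType().getName());
        }
        handler.handle(request);
        // a real step would now pass the generic request on to the next pipe
    }
}
```

With something like this, supporting a new data type would mean registering one more handler rather than touching the pipe, but I am not sure whether this is a sound design or whether there is a better way.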
Thank you