I would like to write a data processing toolbox following the pipeline pattern. Because two elements of the pipeline, Process1 and Process2, could make use of the same data from the DataGenerator (a non-linear pipeline), I created an element, DataProcessor, that consumes the generator created in DataGenerator and executes the different processes.

I am not really happy with that, because a subsequent element that only needs the data processed by one of Process1 or Process2 has to wait for both to be executed.

I am not an expert in design patterns, so maybe I'm using the wrong pattern for my task. Any suggestions?

import numpy


class DataGenerator:
    """
    Dummy data generator
    """

    def __init__(self, size):
        self.size = size

    def execute(self):
        # Lazily yield 1000 random frames of length `size`
        for i in range(1000):
            yield numpy.random.random(self.size)


class Process1:
    """
    Some process
    """

    def __init__(self, config):
        self.config = config  # not used

    def execute(self, frames):
        for frame in frames:
            yield self.process(frame)

    def process(self, frame):
        # process data (placeholder: the frame is returned unchanged)
        processed_frame = frame
        return processed_frame


class Process2:
    """
    Some process
    """

    def __init__(self, config):
        self.config = config  # not used

    def execute(self, frames):
        for frame in frames:
            yield self.process(frame)

    def process(self, frame):
        # process data (placeholder: the frame is returned unchanged)
        processed_frame = frame
        return processed_frame


class DataProcessor:
    """
    Execute all processes
    """

    def __init__(self):
        # Per-instance list; a class-level list would be shared by all instances
        self.processes = []

    def add_process(self, process_instance):
        self.processes.append(process_instance)

    def execute(self, frames):
        for frame in frames:
            # Run every registered process on the frame before yielding
            processed_frame = []
            for p in self.processes:
                processed_frame.append(p.process(frame))
            yield processed_frame
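
To make the problem concrete, here is a minimal sketch of how I drive these classes. It is only an illustration under a few assumptions: CountingProcess is a hypothetical stand-in for Process1 and Process2 that just counts how often it is called, the n_frames parameter is my addition to keep the run short, and frames are plain Python lists instead of NumPy arrays so that the sketch has no dependencies.

```python
import random


class DataGenerator:
    """Dummy data generator (n_frames is a hypothetical parameter for the demo)."""

    def __init__(self, size, n_frames=3):
        self.size = size
        self.n_frames = n_frames

    def execute(self):
        for _ in range(self.n_frames):
            # Plain list of random floats standing in for a NumPy array
            yield [random.random() for _ in range(self.size)]


class CountingProcess:
    """Hypothetical stand-in for Process1/Process2 that counts its calls."""

    def __init__(self, name):
        self.name = name
        self.calls = 0

    def process(self, frame):
        self.calls += 1
        return sum(frame)  # placeholder computation


class DataProcessor:
    """Execute all registered processes on every frame."""

    def __init__(self):
        self.processes = []

    def add_process(self, process_instance):
        self.processes.append(process_instance)

    def execute(self, frames):
        for frame in frames:
            yield [p.process(frame) for p in self.processes]


def run_demo():
    p1 = CountingProcess("Process1")
    p2 = CountingProcess("Process2")
    processor = DataProcessor()
    processor.add_process(p1)
    processor.add_process(p2)

    frames = DataGenerator(size=4).execute()
    # A downstream element that only needs the output of Process1
    only_p1 = [results[0] for results in processor.execute(frames)]
    return len(only_p1), p1.calls, p2.calls


print(run_demo())  # prints (3, 3, 3)
```

Even though the consumer only reads results[0], the second process is still called once per frame, which is exactly the coupling I would like to avoid.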