I’ve got a Python program architecture/design question that I’d love some input on.
Background
I’m a solid intermediate PY dev with 1.5 years of PY 3.7 coding experience, plus over 10 years of separate architecture/design experience in other languages.
I’ve had this PY architecture problem going round in my head for a few weeks, and researched/googled/puzzled over what’s the best way to handle it, but I’m still not sure.
So basically I’m not a newbie asking how to do my homework, and I’ve done a bunch of thinking about the problem already and I’m stumped.
I've got the problem below working perfectly in prod right now - I just need to re-architect it to use common code for entry points and environment handling, and I don't know the best way to do that in PY.
Problem Background
- I have code that analyses groups of images. There are different types of groups - call them red, blue and green types of images. Each type of group has its own particular method of analysis on the images - call them the red, green and blue tests.
- Though the specifics of each type of test on the image groups are different, the pre-processing and post-processing work is the same.
- Currently I have a separate PY prog for each type of group - call them red.py, blue.py etc. I run these tests either locally or on AWS as Lambda calls.
- So each test has two different entry points: main() if local, and lambda_handler() if in AWS.
- These entry points do the usual entry point stuff - process arguments, set up logging, initialisation etc.
- The entry points then call group-specific code to do the actual image processing - so conceptually something like test_red(group_of_images) in red.py etc. These calls are different for each group type.
- The entry point code is common across each of the different image types (and so each of the test files) - what varies is the code that actually tests the images.
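To make the shape concrete, here is a minimal sketch of what that shared entry-point code might look like once factored out. All the names here (common.py, run_group_test, lambda_handler_for) are my assumptions for illustration, not the actual code:

```python
# common.py - hypothetical sketch of the shared entry-point logic.
# All names here are illustrative assumptions, not the poster's real code.
import argparse
import logging


def run_group_test(test_fn, images):
    """Shared pre/post-processing wrapped around a group-specific test."""
    logging.info("pre-processing %d images", len(images))
    results = test_fn(images)  # e.g. test_red(images) or test_blue(images)
    logging.info("post-processing results")
    return results


def main(test_fn):
    """Local entry point: parse args, set up logging, run the test."""
    parser = argparse.ArgumentParser()
    parser.add_argument("images", nargs="+")
    args = parser.parse_args()
    logging.basicConfig(level=logging.INFO)
    return run_group_test(test_fn, args.images)


def lambda_handler_for(test_fn):
    """Build an AWS Lambda entry point around a given test function."""
    def lambda_handler(event, context):
        return run_group_test(test_fn, event["images"])
    return lambda_handler
```

Each of red.py, blue.py etc. would then only contain its own test function plus a couple of one-liners wiring it to the common entry points.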
For speed, all the individual images are tested in parallel:
- In a local environment, the testing of each single image is done in parallel via multiprocessing using futures. So the test_red(images) group-type-specific function submits an MP call for each image - something like test_red(images) -> [test_red_image(single_red_image)].
- In the AWS env, the MP is handled by spawning a separate AWS Lambda call per image. This introduces a new intermediate AWS Lambda handler function that needs to sit ‘around’ the test-an-individual-image function: test_red(group_of_images) -> [lambda_worker(single_red_image) -> test_red_image(single_red_image)]. lambda_worker() is a wrapper around the single-image test function to provide an AWS Lambda entry point.
- In the AWS environment, each of the calls to the Lambdas is run in parallel via multithreading using futures.
Problem
In each of my code files for each group type - so red.py, blue.py etc - the wrapper code for the entry points (main(), lambda_handler() and lambda_worker()) is the same for every test.
Right now, I have that code written separately in each test file (red.py, green.py etc) (...not good!)
I would (obviously) like to move that code to a separate common file that gets used by each test file.
I’m not sure how best to do that.
Questions
- I assume decoration is the way to go - at a high level, is that correct? Is there a better way to handle this than decoration?
- If so, then using PY decorators (@) would be the obvious route - except that @ decoration replaces your undecorated function with a single wrapped function. In my case this creates two problems: a) for the main() and lambda_handler() top-level wrappers, I need two separate functions to come out of the decoration, not a single wrapped function; b) for the actual testing case, I need a new decorated function (lambda_worker()), but I also still need access to the original undecorated function.
- So is it possible to handle this case via the standard Python decorator operator (@), or, because of (a) and (b) above, do I need to do the decoration “by hand”?