Wednesday, 25 July 2018

How to determine the most efficient and adaptive/flexible module architecture in Python?

The overall objective of the task is to select a subgroup of people from an entire group of people. The selection will be based on some criteria.

Say, the entire group of people comes from 2 different databases.

1) Data Hosp – contains all hospitalization records of patients from 2000 to 2010. Each row represents one hospitalization record, and one patient can have multiple records (multiple rows on different occasions).

2) Data Pharm – contains all drug prescription information of patients from 2000 to 2010. Each row represents one prescription record, and one patient can have multiple prescription records.

I want to use different rules for the selection process at different times. For example, we can select the subgroup if patients meet any of the following rule-based conditions:

1) Select if a patient has three or more hospitalizations in 2000-2010,

2) Select if a patient has three or more prescriptions in 2000-2010,

3) Select if a patient has 1) but not 2),

4) Select if a patient has 1) and 2),

5) Select if a patient has 2) but not 1)… etc
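To make the rules above concrete, here is a minimal sketch of how I picture them, assuming each database can be reduced to (patient_id, year) pairs. The toy records and the helper name patients_with_at_least are hypothetical, made up only for illustration; the real data has many more columns.

```python
from collections import Counter

# Hypothetical toy data: one tuple per record, (patient_id, year).
hosp_records = [("A", 2001), ("A", 2003), ("A", 2009), ("B", 2005),
                ("C", 2002), ("C", 2004), ("C", 2006)]
pharm_records = [("A", 2000), ("A", 2001), ("A", 2002), ("B", 2003),
                 ("B", 2004), ("B", 2007), ("C", 2010)]


def patients_with_at_least(records, minimum=3, start=2000, end=2010):
    """Return the set of patient ids with at least `minimum` records
    whose year falls in [start, end]."""
    counts = Counter(pid for pid, year in records if start <= year <= end)
    return {pid for pid, n in counts.items() if n >= minimum}


rule1 = patients_with_at_least(hosp_records)   # 1) three or more hospitalizations
rule2 = patients_with_at_least(pharm_records)  # 2) three or more prescriptions
rule3 = rule1 - rule2                          # 3) 1) but not 2)
rule4 = rule1 & rule2                          # 4) 1) and 2)
rule5 = rule2 - rule1                          # 5) 2) but not 1)
```

Once rules 1) and 2) are sets of patient ids, rules 3) to 5) reduce to set difference and intersection.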

I am simplifying the rules drastically here. In practice, I would like to create my own Python module called “select_patient” so that I don’t have to rewrite similar code many times.

I have a fairly good understanding of basic and intermediate Python concepts, including creating functions and simple modules, but I haven’t created very complex modules yet, so I don’t know the best way to construct this one. I may also be unaware of some of the more advanced Python concepts that are necessary to create what I want.

Also, how do I create a module that’s adaptive to slight variations? For example, in the above conditions, criteria 1) and 2) each involve only one database, while criteria 3) to 5) involve both. Furthermore, say I want to modify condition 1) by adding another condition: the hospitalization has to occur in an urban (not rural) area.
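As an illustration of the kind of variation I mean, here is a rough sketch in which the extra urban condition is passed in as a filter function instead of being hard-coded. The record layout (dicts with patient_id, year and area keys) and the names select and is_urban are assumptions made up for this example, not the real schema.

```python
from collections import Counter


def select(records, keep=lambda record: True, minimum=3):
    """Return ids of patients with at least `minimum` records passing `keep`."""
    counts = Counter(r["patient_id"] for r in records if keep(r))
    return {pid for pid, n in counts.items() if n >= minimum}


def is_urban(record):
    """Hypothetical extra condition: the hospitalization occurred in an urban area."""
    return record["area"] == "urban"


# Hypothetical toy hospitalization records.
hosp = [
    {"patient_id": "A", "year": 2001, "area": "urban"},
    {"patient_id": "A", "year": 2004, "area": "urban"},
    {"patient_id": "A", "year": 2008, "area": "urban"},
    {"patient_id": "B", "year": 2002, "area": "urban"},
    {"patient_id": "B", "year": 2005, "area": "rural"},
    {"patient_id": "B", "year": 2009, "area": "urban"},
]

plain_rule1 = select(hosp)                 # original condition 1)
urban_rule1 = select(hosp, keep=is_urban)  # condition 1) restricted to urban stays
```

Here patient B meets the plain condition 1) but drops out of the urban-only variant, because one of the three hospitalizations was rural.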

One design approach I am currently trying is to create one big module whose parameters spell out all the conditions. For example:

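class select_patients:
    def __init__(self, param1, param2, param3, param4, *more_params):  # and so on, up to param20
        ...  # store the parameters that switch each condition on or off
    def test_cond1(self):
        ...  # check condition 1
    def test_cond2(self):
        ...  # check condition 2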

The vast number of parameters allows me to specify under which user-driven circumstances each condition applies. But I find it very burdensome to specify so many parameters right from the start.

Another approach I am thinking of is a piecemeal approach, where I break down the task and create smaller functions first. For example:

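def test_cond1():
    ...  # check condition 1
def test_cond2():
    ...  # check condition 2
def select_patients(param1, param2, *more_params):
    if XYZ:  # XYZ is a placeholder for some user-driven check
        return test_cond1()
    else:
        return test_cond2()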
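To show what I have in mind with the piecemeal idea, here is a rough sketch in which each condition is a small function and the combined rules are built from them. Everything in it is hypothetical: the data layout (a dict mapping a source name to a list of patient ids, one entry per record) and the helper names at_least, both and negate are invented for this example.

```python
def at_least(source, minimum):
    """Build a rule: the patient has at least `minimum` records in `source`."""
    def rule(patient_id, data):
        return sum(1 for pid in data[source] if pid == patient_id) >= minimum
    return rule


def both(rule_a, rule_b):
    """Build a rule that holds when both given rules hold."""
    return lambda patient_id, data: rule_a(patient_id, data) and rule_b(patient_id, data)


def negate(rule_a):
    """Build a rule that holds when the given rule does not."""
    return lambda patient_id, data: not rule_a(patient_id, data)


def select_patients(data, rule):
    """Return every patient id appearing in any source that satisfies `rule`."""
    patients = {pid for records in data.values() for pid in records}
    return {pid for pid in patients if rule(pid, data)}


# Hypothetical toy data: one patient id per record.
data = {
    "hosp": ["A", "A", "A", "B", "C", "C", "C"],
    "pharm": ["A", "A", "A", "B", "B", "B", "C"],
}

cond1 = at_least("hosp", 3)
cond2 = at_least("pharm", 3)
only_hosp = select_patients(data, both(cond1, negate(cond2)))   # rule 3)
hosp_and_pharm = select_patients(data, both(cond1, cond2))      # rule 4)
only_pharm = select_patients(data, both(cond2, negate(cond1)))  # rule 5)
```

In this sketch select_patients takes a single rule argument, so new combinations are built by composing small functions instead of adding more parameters.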

Is one approach preferable to the other?

Since I haven’t used inheritance or iterators/generators (or other advanced Python concepts) before, are they potentially useful for what I am trying to do?
