Tuesday, 14 June 2016

Django model for sparse data

I am developing a Django app containing a number of forms that will be used to enter clinical data on cancer tissue samples (10-20 fields per form, mostly CharField, FloatField and some multiple-choice text dropdowns).

My challenge is that I need a form that displays different fields depending on the diagnosis, for 150+ diagnoses. I can programmatically read the list of diagnoses, the fields required for each diagnosis, and the corresponding field types. The set of all unique fields across all diagnoses is large (much larger than the number of fields needed for any single diagnosis).

e.g.

                                                                                  disease_specific_fields         field_type
diagnosis
B-lymphoblastic leukemia/lymphoma NOS                                                        EBV-positive  Pull down: Yes/No
B-lymphoblastic leukemia/lymphoma with recurrent genetic abnormalities(TCF3-PBX1)            EBV-positive  Pull down: Yes/No
Monoclonal B lymphocytosis(CLL/SLL spectrum)                                                 EBV-positive  Pull down: Yes/No
Peripheral T cell lymphoma NOS                                                               EBV-positive  Pull down: Yes/No
AML with recurrent cytogenetic abnormalities(t(6;9) DEK-NUP214)                              EBV-positive  Pull down: Yes/No
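To make the structure concrete, here is a minimal sketch of how the lookup shown above might be held in plain Python before any Django code touches it. The names `FieldSpec`, `parse_field_type` and `build_lookup`, and the kind labels "choice", "char" and "float", are assumptions made for illustration; only the "Pull down: Yes/No" format comes from my actual data.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class FieldSpec:
    name: str                      # e.g. "EBV-positive"
    kind: str                      # hypothetical labels: "choice", "char", "float"
    choices: Tuple[str, ...] = ()  # only filled in when kind == "choice"


def parse_field_type(name: str, field_type: str) -> FieldSpec:
    """Turn a type string such as 'Pull down: Yes/No' into a FieldSpec."""
    prefix = "Pull down:"
    if field_type.startswith(prefix):
        options = tuple(opt.strip() for opt in field_type[len(prefix):].split("/"))
        return FieldSpec(name, "choice", options)
    # Assumption: any other type string is already a plain kind such as "char".
    return FieldSpec(name, field_type.strip().lower())


def build_lookup(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, List[FieldSpec]]:
    """Group (diagnosis, field name, field type) rows by diagnosis."""
    lookup: Dict[str, List[FieldSpec]] = {}
    for diagnosis, field_name, field_type in rows:
        lookup.setdefault(diagnosis, []).append(parse_field_type(field_name, field_type))
    return lookup


# Two rows copied from the table above.
ROWS = [
    ("B-lymphoblastic leukemia/lymphoma NOS", "EBV-positive", "Pull down: Yes/No"),
    ("Peripheral T cell lymphoma NOS", "EBV-positive", "Pull down: Yes/No"),
]
FIELDS_BY_DIAGNOSIS = build_lookup(ROWS)
```

Whatever model layout is chosen, a lookup of this shape is what the form would consult to decide which fields to show for a given diagnosis.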

So far, I thought of the following approaches:

  1. Create a single huge model that will contain mostly sparse data, and handle the irrelevant fields in Django forms. CONS: inefficient storage and a lot of overhead code tied to forms.

  2. Create a model for each diagnosis. CONS: complicates migrations and maintenance, I think.

  3. Create one small model for all diagnoses that contains several 'generic' fields of each type ('CharField', 'FloatField', etc), and render respective field names dynamically in forms / views.
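As a rough illustration of option 3, the sketch below assigns each diagnosis-specific field to a generic column of the matching type. It is plain Python rather than Django model code, and everything in it is hypothetical: the slot counts, the column naming scheme (`char_1`, `float_1`, ...), the helper `assign_slots`, and the field labels other than "EBV-positive".

```python
from collections import defaultdict

# Hypothetical number of generic columns of each type on the shared model.
SLOTS_PER_KIND = {"char": 10, "float": 10}


def assign_slots(fields):
    """Map each (label, kind) pair to a generic column name such as 'char_1'."""
    used = defaultdict(int)
    mapping = {}
    for label, kind in fields:
        if kind not in SLOTS_PER_KIND:
            raise ValueError(f"unknown field kind: {kind!r}")
        used[kind] += 1
        if used[kind] > SLOTS_PER_KIND[kind]:
            raise ValueError(f"not enough {kind} slots for {label!r}")
        mapping[label] = f"{kind}_{used[kind]}"
    return mapping


# Field labels other than "EBV-positive" are made up for the example.
example = assign_slots([
    ("EBV-positive", "char"),
    ("Blast percentage", "float"),
    ("Karyotype", "char"),
])
```

One consequence of this layout is that the label-to-column mapping for a diagnosis has to stay fixed once data has been saved, otherwise stored values would be read back under the wrong label.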

I am looking for constructive suggestions on how to implement a model (or models) that captures the above data. Efficiency and storage are secondary concerns; mostly I want a clean and intuitive solution. Answers tailored to Django will be especially helpful.
