Saturday, October 13, 2018

Python 3: How to redesign a nested class hierarchy for better serialization and/or better model/view decoupling

A friend of mine and I are stuck developing the code for an optical raytracer. The raytracer consists of an object hierarchy in which certain classes contain variables that are implemented as more or less complicated stateful objects. These classes may also contain instances of other, similar classes, which in turn contain such variable objects or further class instances.

First of all we will provide a "minimal" working example:

class OptVar:
    """
    Complicated stateful variable
    """
    def __init__(self, **kwargs):
        self.parameters = kwargs


class OptVarContainer:
    """
    Class which contains several OptVar objects and nested OptVarContainer
    classes. Is responsible for OptVar management of its sub-OptVarContainers
    with their respective OptVar objects.
    """
    def __init__(self, **kwargs):
        # Each keyword maps an attribute name to the parameter dict of one OptVar.
        for (key, value_dict) in kwargs.items():
            setattr(self, key, OptVar(**value_dict))


class C(OptVarContainer):
    """
    Specific implementation of class OptVarContainer
    """
    def __init__(self):
        super(C, self).__init__(
                **{"my_c_a": {"c1": 1, "c2": 2},
                   "my_c_b": {"c3": 3, "c4": 4}})


class B(OptVarContainer):
    """
    Specific implementation of class OptVarContainer
    """
    def __init__(self):
        super(B, self).__init__(**{"b": {"1": 1, "2": 2}})
        self.c_obj = C()


class A(OptVarContainer):
    """
    Specific implementation of class OptVarContainer
    """
    def __init__(self):
        super(A, self).__init__(
                **{"a1": {"1": 1, "2": 2},
                   "a2": {"a": "a", "b": "b"}})
        self.b_obj = B()


def main():
    # creating OptVarContainer with some nested OptVarContainers.
    my_a_obj = A()
    # It is intended behaviour to access the OptVar objects via
    # scoping within the class hierarchy.
    print(my_a_obj.b_obj.b.parameters)
    my_a_obj.b_obj.b.parameters["2"] = 3
    print(my_a_obj.b_obj.b.parameters)
    print(my_a_obj.b_obj.c_obj.my_c_a.parameters["c1"])
    my_a_obj.b_obj.c_obj.my_c_a.parameters["c1"] = 6
    print(my_a_obj.b_obj.c_obj.my_c_a.parameters)
    # Two major problems:
    # a) Serialization (with compatibility between different versions)
    # b) Access to all OptVar objects at once


if __name__ == "__main__":
    main()

In the end, the optical system that is to be optimized later is of type OptVarContainer and contains several objects arranged hierarchically.

There are two major problems:

  • Serialization of the objects without too much management effort, and with version stability, for interplay with a user interface or for later reconstruction
  • Easy access to and management of the variable objects, which are distributed throughout the object hierarchy (for later integration into a user interface)

For the later optimization, the OptVars are collected and their state is put into a numpy array, which is handed over to an optimizer.

The objects (A, B, C) may be connected recursively, so serialization and easy access to the OptVar objects are quite complicated. For part a) there is no solution in the code yet. The current solution for part b) is to traverse the object hierarchy from my_a_obj through b_obj to c_obj, collecting all OptVar objects (avoiding duplicates by keeping a list of ids) and returning them as a dict. This works, but for us it is only a temporary solution, and it is quite ugly.
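The traversal described above might look roughly like the following sketch. The function name collect_optvars, the dotted-path keys and the parent back-reference in the demo are assumptions made for illustration; they are not taken from our actual code.

```python
class OptVar:
    def __init__(self, **kwargs):
        self.parameters = kwargs


class OptVarContainer:
    def __init__(self, **kwargs):
        for key, value_dict in kwargs.items():
            setattr(self, key, OptVar(**value_dict))


def collect_optvars(root):
    """Return {dotted_path: OptVar} for every OptVar reachable from root.

    Hypothetical helper: objects are remembered by id(), so shared or
    cyclic references are visited only once.
    """
    found = {}
    seen = {id(root)}

    def walk(obj, path):
        for name, value in vars(obj).items():
            if id(value) in seen:
                continue  # already collected or currently being walked
            child_path = f"{path}.{name}" if path else name
            if isinstance(value, OptVar):
                seen.add(id(value))
                found[child_path] = value
            elif isinstance(value, OptVarContainer):
                seen.add(id(value))
                walk(value, child_path)

    walk(root, "")
    return found


class C(OptVarContainer):
    def __init__(self):
        super().__init__(my_c_a={"c1": 1, "c2": 2})


class B(OptVarContainer):
    def __init__(self):
        super().__init__(b={"x": 1, "y": 2})
        self.c_obj = C()


b_obj = B()
b_obj.c_obj.parent = b_obj  # assumed back-reference that creates a cycle
for path, var in collect_optvars(b_obj).items():
    print(path, var.parameters)
```

The returned dict maps a path such as c_obj.my_c_a to the OptVar itself, not to a copy, so changing parameters through the dict changes the object in the hierarchy.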

The overall goal is to make serialization and later interaction with a GUI easy by passing around nothing but dicts. (Is there a better alternative for solving these two tasks?) This would also simplify the interaction between the OptVarContainers and the optimizer, since the interfaces would consist of dicts only.

We did some research for a) and found pickle, json and yaml for serialization. We also ran some tests with jsonpickle, which copes with the recursive object hierarchy without problems. Since jsonpickle does all the serialization automagically, we don't have to take care of it ourselves. For us, the main drawback is that jsonpickle'd objects cannot be reconstructed once the underlying classes change, so there is no version stability. Furthermore, it seems quite complicated to write a serializer backend without drowning in management code. So our decision is to end up with a fairly simple yaml "dictionary", and every object in the class hierarchy should provide its own dict on request (in an ideal world this dict could also be used to reconstruct the object). The question here is: how can this be achieved in a fairly general way? Is the memento design pattern appropriate for this? Can it be used in the A, B, C context given above, where the classes are only similar rather than identical and are interconnected in this complicated manner?
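To make the "every object provides its own dict" idea concrete, here is a minimal sketch. The names to_dict, from_dict, REGISTRY and SCHEMA_VERSION are hypothetical, and json from the standard library stands in for yaml so that the example has no third-party dependency. It only handles tree-shaped hierarchies; shared or recursive references would need extra bookkeeping (for example ids in the dict), which is left out here.

```python
import json

SCHEMA_VERSION = 1  # hypothetical version tag stored next to the data


class OptVar:
    def __init__(self, **kwargs):
        self.parameters = kwargs

    def to_dict(self):
        return {"kind": "OptVar", "parameters": dict(self.parameters)}

    @classmethod
    def from_dict(cls, data):
        return cls(**data["parameters"])


class OptVarContainer:
    def __init__(self, **kwargs):
        for key, value_dict in kwargs.items():
            setattr(self, key, OptVar(**value_dict))

    def to_dict(self):
        # Only OptVars and nested containers are exported.
        members = {name: value.to_dict()
                   for name, value in vars(self).items()
                   if isinstance(value, (OptVar, OptVarContainer))}
        return {"kind": type(self).__name__, "members": members}

    @classmethod
    def from_dict(cls, data):
        obj = cls.__new__(cls)  # skip __init__; the state comes from data
        for name, member in data["members"].items():
            if member["kind"] == "OptVar":
                setattr(obj, name, OptVar.from_dict(member))
            else:
                setattr(obj, name, REGISTRY[member["kind"]].from_dict(member))
        return obj


class C(OptVarContainer):
    def __init__(self):
        super().__init__(my_c_a={"c1": 1, "c2": 2})


class B(OptVarContainer):
    def __init__(self):
        super().__init__(b={"x": 1, "y": 2})
        self.c_obj = C()


# Maps the "kind" stored in a dict back to the class that rebuilds it.
REGISTRY = {"B": B, "C": C}


def dump(root):
    return json.dumps({"version": SCHEMA_VERSION, "root": root.to_dict()})


def load(text):
    doc = json.loads(text)
    if doc["version"] != SCHEMA_VERSION:
        # A real implementation could migrate old dicts here instead.
        raise ValueError("unsupported version: %r" % doc["version"])
    return REGISTRY[doc["root"]["kind"]].from_dict(doc["root"])


b_obj = B()
b_obj.c_obj.my_c_a.parameters["c1"] = 6
restored = load(dump(b_obj))
print(restored.c_obj.my_c_a.parameters)
```

Because the stored form is a plain dict with an explicit version number, a change in the classes can be handled by a migration step on the dict before reconstruction. Whether this still counts as the memento pattern is debatable; it is at least close in spirit.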

We also did some thinking on b) and thought that some kind of pool of OptVars would be nice, since a pool is easily traversable and can hold the complicated stateful variable objects. The classes A, B, C would then only provide some "links" to the OptVars. Nevertheless, there are three major open questions with this approach:

  • How can such an object pool be realized? I read about a design pattern for this that is implemented as a singleton, but in our case a singleton is probably not the best choice, because there should be more than one pool at a time (maybe for different versions of the class hierarchy, or for loading and saving pools). The pool is also not just a data sink, and it does not have a single global state.
  • How can such a pool be managed and implemented in Python? We don't want a global variable, but we also don't want to pass in the pool every time an A, B, C object is created. Are there better possibilities for putting OptVars into a given pool automatically?
  • How should the link between the pool's OptVars and the OptVars of A, B, C be implemented? With the proxy design pattern? Any other design patterns?

I am sorry if these questions are not appropriate for Stack Exchange or if the format is confusing. I know that this is not a problem with a single, specific solution, but we really need some input on it. I hope the example above shows the structure of the problem. If any questions arise, do not hesitate to ask me.

Thanks for your help!

Best regards Johannes
