vendredi 21 décembre 2018

Django multi-table/concrete inheritance alternatives for basic data model pattern

tl;dr

Is there a simple alternative to multi-table inheritance for implementing the basic data-model pattern depicted below, in Django?

Premise

Please consider the very basic data-model pattern in the image below, based on e.g. Hay, 1996.

Simply put: Organizations and Persons are Parties, and all Parties have Addresses. A similar pattern may apply to many other situations.

The important point here is that the Address has an explicit relation with Party, rather than explicit relations with the individual sub-models Organization and Person.

diagram showing basic data model

This specific example has several obvious shortcomings, but that is beside the point. For the sake of this discussion, suppose the pattern perfectly describes what we wish to achieve, so the only question that remains is how to implement the pattern in Django.

Implementation

The most obvious implementation, I believe, would use multi-table-inheritance (a.k.a. concrete inheritance):

class Party(models.Model):
    """ Note this is a concrete model, not an abstract one. """
    name = models.CharField(max_length=20)


class Organization(Party):
    """ 
    Note that a one-to-one relation 'party_ptr' is automatically added, 
    and this is used as the primary key (the actual table has no 'id' 
    column). The same holds for Person.
    """
    type = models.CharField(max_length=20)


class Person(Party):
    favorite_color = models.CharField(max_length=20)


class Address(models.Model):
    """ 
    Note that, because Party is a concrete model, rather than an abstract
    one, we can reference it directly in a foreign key.

    Since the Person and Organization models have one-to-one relations 
    with Party which act as primary key, we can conveniently create 
    Address objects setting either party=party_instance,
    party=organization_instance, or party=person_instance.

    """
    party = models.ForeignKey(to=Party, on_delete=models.CASCADE)

This seems to match the pattern perfectly. It almost makes me believe this is what multi-table-inheritance was intended for in the first place.

However, multi-table-inheritance appears to be frowned upon, especially from a performance point-of-view, although it depends on the application. Especially this scary, but ancient, post from one of Django's founders seems cause for discouragement:

In nearly every case, abstract inheritance is a better approach for the long term. I’ve seen more than few sites crushed under the load introduced by concrete inheritance, so I’d strongly suggest that Django users approach any use of concrete inheritance with a large dose of skepticism.

Despite this scary warning, I guess the main point in that post is the following observation regarding multi-table inheritance:

These joins tend to be "hidden" — they’re created automatically — and mean that what look like simple queries often aren’t.

Alternatives

Abstract inheritance does not seem like a viable alternative to me, because we cannot set a foreign key to an abstract model, which makes sense, because it has no table. I guess this implies that we would need a foreign key for every "child" model plus some extra logic to simulate this.

As another alternative, it is often suggested to use explicit one-to-one relations (eoto for short, here) instead of multi-table-inheritance (so Party, Person and Organization would all just be subclasses of models.Model).

Both approaches, multi-table-inheritance (mti) and explicit one-to-one relations (eoto), result in three database tables. So, depending on the type of query, of course, some form of JOIN is often inevitable when retrieving data.

By inspecting the resulting tables in the database, it becomes clear that the only difference between the mti and eoto approaches, on the database level, is that an eoto Person table has an id column as primary-key, and a separate foreign-key column to Party.id, whereas an mti Person table has no separate id column, but instead uses the foreign-key to Party.id as its primary-key.

Question(s)

I don't think the behavior from the example (especially the single direct relation to the parent) can be achieved with abstract inheritance, can it? If it can, then how would you achieve that?

Is an explicit one-to-one relation really that much better than multi-table-inheritance, except for the fact that it forces us to make our queries more explicit? To me the convenience and clarity of the multi-table approach outweighs the explicitness argument.

Note that this SO question is very similar, but does not quite answer my questions. Moreover, the latest answer there is almost nine years old now, and Django has changed a lot since.

[1]: Hay 1996, Data Model Patterns

Aucun commentaire:

Enregistrer un commentaire