Adding type hints to the Django ORM
It occurred to me that Django's ORM could do with a bit of a revamp to make use of recent developments in the Python language.
The main area where I think Django's models are missing out is the lack of type hinting (hardly surprising since Django pre-dates type hints). Adding type hints allows Mypy to detect bugs before you even run your code. It may only save you minutes each time, but multiply that by the number of code + run iterations you do each day, and it can save hours of development time. Multiply that by the lifetime of your project, and it could save weeks or months. A clear win.
Typing Django Models
I'd love to be able to use type hints with the Django ORM, but it seems that the magic required to create Django models is just too dynamic and would defy any attempts to use typing. Fortunately that may not necessarily be the case. Type hints can be inspected at runtime, and we could use this information when building the model, while still allowing Mypy to analyze our code. Take the following trivial Django model:
from django.db import models

class Foo(models.Model):
    count = models.IntegerField(default=0)
The same information could be encoded in type hints as follows:
class Foo(TypedModel):
    count: int = 0
The TypedModel class could inspect the type hints and create the integer field in the same way as models.Model uses IntegerField and friends. But this would also tell Mypy that instances of Foo have an integer attribute called count.
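As a rough illustration of that inspection step, the sketch below reads a class's annotations with typing.get_type_hints and maps each one to the name of a Django field class. It deliberately avoids importing Django; FIELD_FOR_TYPE and describe_fields are hypothetical names invented for this example, and a real TypedModel would build actual field objects, most likely inside a metaclass.

```python
from typing import Any, Dict, Tuple, get_type_hints

# Hypothetical mapping from Python types to Django field class names.
FIELD_FOR_TYPE = {int: "IntegerField", str: "TextField", bool: "BooleanField"}


def describe_fields(cls: type) -> Dict[str, Tuple[str, Any]]:
    """Return {attribute: (field class name, default)} for an annotated class."""
    described = {}
    for name, hint in get_type_hints(cls).items():
        # The value assigned in the class body doubles as the field default.
        # None is used here when no default was given (a simplification).
        default = getattr(cls, name, None)
        described[name] = (FIELD_FOR_TYPE[hint], default)
    return described


class Foo:
    count: int = 0


fields_for_foo = describe_fields(Foo)
```

Here fields_for_foo pairs count with "IntegerField" and the default 0, which is the information models.IntegerField(default=0) carries in the traditional definition.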
But what of nullable fields? How can we express those in type hints? The following would cover it:
from typing import Optional

class Foo(TypedModel):
    count: Optional[int] = 0
The Optional type hint tells Mypy that the attribute could be None, which could also be used to instruct TypedModel to create a nullable field.
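A minimal sketch of how the nullable case might be detected at runtime, using typing.get_origin and typing.get_args (Python 3.8+). The helper name unwrap_optional is an assumption made for this example, and it only handles the typing.Optional spelling, not the newer `int | None` syntax.

```python
from typing import Optional, Tuple, Union, get_args, get_origin, get_type_hints


def unwrap_optional(hint: object) -> Tuple[object, bool]:
    """Return (inner type, nullable) for a hint such as Optional[int]."""
    if get_origin(hint) is Union:
        args = get_args(hint)
        non_none = [arg for arg in args if arg is not type(None)]
        # Optional[X] is Union[X, None]: exactly one real type plus None.
        if len(args) == 2 and len(non_none) == 1:
            return non_none[0], True
    return hint, False


class Foo:
    count: Optional[int] = 0


inner_type, nullable = unwrap_optional(get_type_hints(Foo)["count"])
```

With the inner type and the nullable flag in hand, a TypedModel could pick the field class from the inner type and pass null=True when the flag is set.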
So type hints contain enough information to set the type of the field, the default value, and whether the field is nullable. But there are other pieces of information associated with fields in Django models; a CharField has a max_length attribute, for instance:
class Bar(models.Model):
    name = models.CharField(max_length=30)
There's nowhere in the type hinting to express the maximum length of a string, so we would have to use a custom object in addition to the type hints. Here's how that might be implemented:
class Bar(TypedModel):
    name: str = String(max_length=30)
The String class contains the maximum length information and additional meta information for the field. This class would have to be a subclass of the type specified in the hint, i.e. str, or Mypy would complain. Here's an example implementation:
class String(str):
    def __new__(cls, max_length=None):
        # Create an empty str instance, then attach the field metadata to it.
        obj = super().__new__(cls)
        obj.max_length = max_length
        return obj
The above class creates an object that acts like a str, but has properties that could be inspected by the TypedModel class.
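To make that inspection concrete, here is a small Django-free sketch. It repeats the String class so the snippet is self-contained; field_options is a hypothetical helper standing in for whatever TypedModel would do internally.

```python
from typing import Any, Dict


class String(str):
    """A str subclass that carries field metadata alongside an empty value."""

    def __new__(cls, max_length=None):
        obj = super().__new__(cls)
        obj.max_length = max_length
        return obj


def field_options(cls: type) -> Dict[str, Dict[str, Any]]:
    """Collect the metadata carried by String defaults in a class body."""
    options = {}
    for name, value in vars(cls).items():
        if isinstance(value, String):
            options[name] = {"max_length": value.max_length}
    return options


class Bar:
    name: str = String(max_length=30)


options_for_bar = field_options(Bar)  # maps "name" to its max_length
```

One design point worth noting: String(max_length=30) is itself an empty string, so an implementation would need to decide whether such an object means "no default" or "default to the empty string".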
The entire model could be built using these techniques. Here's a larger example of what the proposed changes might look like:
from datetime import datetime
from typing import List, Optional

class Student(TypedModel):
    name: str = String(max_length=30)  # CharField
    notes: str = ""  # TextField with empty default
    birthday: datetime  # DateTimeField
    teacher: Optional[Staff] = None  # Nullable ForeignKey to Staff table
    classes: List[Subject]  # ManyToMany
It's more terse than a typical Django model, which is a nice benefit, but the main advantage is that Mypy can detect errors (VSCode will even highlight such errors right in the editor).
For instance there is a bug in this line of code:
return {"teacher_name": student.teacher.name}
If the teacher field is ever null, that line will throw something like NoneType has no attribute "name". A silly error which may go unnoticed, even after a code review and 100% unit test coverage. No doubt only occurring in production at the weekend when your boss/client is giving a demo. But with typing, Mypy would catch that.
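The same situation can be reproduced without Django. In the sketch below, plain dataclasses stand in for the models (an assumption made purely for illustration). If the function returned student.teacher.name without the None check, Mypy should report an error along the lines of: Item "None" of "Optional[Staff]" has no attribute "name".

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Staff:
    name: str


@dataclass
class Student:
    name: str
    teacher: Optional[Staff] = None


def teacher_summary(student: Student) -> Dict[str, Optional[str]]:
    # The explicit None check narrows Optional[Staff] to Staff, so the
    # attribute access below is accepted by the type checker.
    if student.teacher is None:
        return {"teacher_name": None}
    return {"teacher_name": student.teacher.name}
```

The check also makes the null case an explicit decision in the code rather than a surprise at runtime.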
Specifying Meta
Another area where I think modern Python could improve Django models is specifying the model's meta information.
This may be subjective, but I've never been a huge fan of the way Django uses an inner class (a class defined in a class) to store additional information about the model. Python 3 gives us another option: we can add keyword args to the class statement (where you would specify the metaclass). This feels like a better place to add additional information about the model. Let's compare...
Here's an example taken from the docs:
class Ox(models.Model):
    horn_length = models.IntegerField()

    class Meta:
        ordering = ["horn_length"]
        verbose_name_plural = "oxen"
Here's the equivalent, using class keyword args:
class Ox(TypedModel, ordering=["horn_length"], verbose_name_plural="oxen"):
    horn_length: int
The extra keyword args may result in a long line, but these could be formatted differently (in the style preferred by Black):
class Ox(
    TypedModel,
    ordering=["horn_length"],
    verbose_name_plural="oxen",
):
    horn_length: int
I think the class keyword args are neater, but YMMV.
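For what it's worth, plain Python can already collect class keyword args through __init_subclass__ (PEP 487), without a custom metaclass. The sketch below uses a stand-in TypedModel with a hypothetical _meta_options attribute; a real implementation would have to cooperate with Django's own model metaclass, which is not attempted here.

```python
from typing import Any, Dict


class TypedModel:
    """Stand-in base class; it does not touch Django at all."""

    _meta_options: Dict[str, Any] = {}

    def __init_subclass__(cls, **options: Any) -> None:
        super().__init_subclass__()
        # Start from the parent's options, then apply this class's keywords.
        cls._meta_options = {**cls._meta_options, **options}


class Ox(TypedModel, ordering=["horn_length"], verbose_name_plural="oxen"):
    horn_length: int


ox_meta = Ox._meta_options  # holds the ordering and verbose_name_plural keywords
```

The collected dictionary holds the same values that the inner Meta class would, so it could in principle be translated into one before the model is built.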
Code?
I'm sorry to say that none of this exists in code form (unless somebody else has come up with the same idea). I do think it could be written in such a way that the TypedModel and traditional models.Model definitions would be interchangeable, since all I'm proposing is a little syntactical sugar and no changes in functionality.
It did occur to me to start work on this, but then I remembered I have plenty of projects and other commitments to keep me busy for the near future. I'm hoping that this will be picked up by somebody strong on typing who understands metaclasses enough to take this on.
Model fields are not only a representation of the data type, but also the logic that determines in which formats Django reads and writes the data.
For example, you might store a comma-separated string in the database, but your field will have a custom to_python method that automatically splits it into a list of values. Maybe it would make sense to add type hinting to the to_python method.
Thanks for the interesting food for thought.
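A Django-free sketch of that suggestion, assuming a hypothetical CommaSeparatedField class. The method names to_python and get_prep_value mirror the hooks on Django's Field, but nothing here subclasses or imports Django.

```python
from typing import List, Union


class CommaSeparatedField:
    """Stand-in for a custom field that stores a list as one string."""

    def to_python(self, value: Union[str, List[str], None]) -> List[str]:
        # Convert the stored representation into the Python-side type.
        if value is None:
            return []
        if isinstance(value, list):
            return value
        return [item.strip() for item in value.split(",") if item.strip()]

    def get_prep_value(self, value: List[str]) -> str:
        # Convert the Python-side type back into the stored string.
        return ",".join(value)
```

The return annotation on to_python is where the Python-side type (List[str]) would come from, while the stored type stays str.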
There is a typo in the Student model: you wrote TypeModel instead of TypedModel.
Fixed, thanks.
And we could finally get ORM/framework agnostic models :-) Great post!
I wrote a tiny proof of concept: https://gist.github.com/Naddiseo/d611bbd50388f267720e280de5643b90
Nice work. Anything tricky in the implementation?
Nothing super tricky. The only thing that bit me initially was the "inherited metaclasses must be a strict subclass of parent metaclasses" thing that Python has; it was easily fixed by making the metaclass extend from the python base model metaclass. The only things left are to extend the mapping from Python types to Django model fields, which I haven't looked too far into for things like foreign keys and optional. And the second thing is that Django currently issues a warning about redefining the model, which I haven't looked too far into. The idea could be extended to use annotated types, which would make it easier to provide options to the Django models.
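A minimal sketch of what annotated types might look like here, assuming Python 3.9+ for typing.Annotated. The MaxLength marker is hypothetical (it is not part of Django or typing), and this is an illustration rather than the code from the gist.

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_type_hints


@dataclass(frozen=True)
class MaxLength:
    """Hypothetical option marker attached to a type hint."""

    value: int


class Bar:
    name: Annotated[str, MaxLength(30)]


# include_extras=True keeps the Annotated metadata instead of stripping it.
hints = get_type_hints(Bar, include_extras=True)
base_type, *extras = get_args(hints["name"])
```

Mypy treats the attribute as a plain str, while the runtime code can read the extra options from the annotation, so no str subclass is needed as a carrier.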
I think this would definitely help speed up development if errors can be spotted before the code is run.
It might be worth taking this idea to the development community of the main Django project for feedback.
Nice article and nice comments. A couple of points:
The pydantic library has a very mature, extensible system for validation based on type hinting. It could probably be the inspiration for a lot of the extras you mention, such as constraints.
Consider using dataclasses to do the modeling. You get the type hint information, but you also get a "field" where you can do inner-Config-class things, such as shipping along "metadata". In my system I have certain frequent metadata as custom fields, making it easy to import and say what you mean. The __post_init__ hook could also be useful for doing Django-specific things.
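A short sketch of the dataclasses approach, using only the standard library. The max_length metadata key and the length check in __post_init__ are assumptions chosen for illustration; they are not something dataclasses or Django define.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Student:
    # metadata is an arbitrary mapping that dataclasses stores but ignores.
    name: str = field(default="", metadata={"max_length": 30})
    notes: str = ""

    def __post_init__(self) -> None:
        # Runs after the generated __init__; enforce any declared max_length.
        for f in fields(self):
            limit = f.metadata.get("max_length")
            if limit is not None and len(getattr(self, f.name)) > limit:
                raise ValueError(f"{f.name} exceeds {limit} characters")


student = Student(name="Ann")
```

The same fields() call that drives the validation could also drive a translation of each dataclass field into a Django model field.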
Consider using django-stubs: it provides type specs and custom mypy plugins to do typechecking with Django correctly. I wrote an article about it: https://sobolevn.me/2019/08/typechecking-django-and-drf
Wow, there's no "edit". But I've made several typos and my link is not active. Here you go: https://sobolevn.me/2019/08/typechecking-django-and-drf
These ideas are great and definitely the way to go. Unfortunately, old projects like Django and SQLAlchemy will not move fast enough. Too much legacy, too many users used to the old ways. It is necessary for a new project to be born and kick the butts of the old guys so they can pay attention and start moving (and they probably won't ever catch up).
See how FastAPI makes use of type hints, how it makes total sense, and how it leads to a superior framework.