Sanity When ALTERing Database Tables in Django

Intro

Databases are an integral part of many web applications - if you have used a website that has any user-specific functionality, that information has to be stored somewhere. SQL databases provide the behavior most people think of when they hear the word database: tables of objects, columns of typed and constrained values, rows of records, relations between tables, and a strictness that requires data to be entered and manipulated in exactly the expected format. Because these databases are common and full-featured enough to cover a wide range of business domains, they have first-class support in Django.

Understanding your data models and managing data persistence is a necessity for most modern websites. How your project code interacts with the database (often a single point of failure, and an entirely separate service) is something effective web engineering cannot avoid. One aspect of managing code alongside a database that is often overlooked - at least until your database gets into an inconsistent state and takes down your website - is tracking the schema changes that occur when your project requirements cause you to modify your ORM models. Django manages these changes via migrations, offering a straightforward, Python-first syntax for managing database changes in a Django application.

Working With Migrations

1. Django migrations are created by running `makemigrations`, which does nothing if no models have changed.
~ python manage.py makemigrations --name remove_testmodel_tz_2
Migrations for 'testapp':
  testapp/migrations/0003_remove_testmodel_tz_2.py
    - Remove field tz_2 from testmodel
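
Conceptually, `makemigrations` compares the state recorded by your existing migrations with your current models and writes operations for the difference. The following is only a toy sketch of that idea, not Django's actual autodetector; the `detect_changes` helper and the field definitions are made up for illustration.

```python
# Toy sketch only: Django's real change detection is far more involved.
def detect_changes(migration_state, model_fields):
    """Return the operations needed to move from the recorded state to the models."""
    operations = []
    # Fields known to the migrations but missing from the models were removed.
    for name in migration_state:
        if name not in model_fields:
            operations.append(('RemoveField', name))
    # Fields in the models are either new, changed, or untouched.
    for name, definition in model_fields.items():
        if name not in migration_state:
            operations.append(('AddField', name))
        elif migration_state[name] != definition:
            operations.append(('AlterField', name))
    return operations


# State recorded by existing migrations (field types are hypothetical).
recorded = {'id': 'AutoField', 'tz_1': 'TextField', 'tz_2': 'TextField'}

# The model after tz_2 was deleted from TestModel.
current = {'id': 'AutoField', 'tz_1': 'TextField'}

changes = detect_changes(recorded, current)    # [('RemoveField', 'tz_2')]
no_changes = detect_changes(current, current)  # [] -> nothing to generate
```

An empty result corresponds to the case where `makemigrations` has nothing to do.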

But migrations are not magic so…

2. READ YOUR MIGRATIONS! Even though migrations for many model changes are generated automatically, it is still your responsibility to make sure they are correct and track changes properly. Models are not migrations and migrations are not models. Models are what you want to write to make data modeling quick and easy in the ORM - migrations are what you need to write because you wanted to use a database in the first place. They will be run by other developers, so they should make sense before you run `migrate` and subsequently commit them.

~ python manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, sessions, testapp
Running migrations:
  Applying testapp.0003_remove_testmodel_tz_2... OK

Certainly we couldn’t remove something that wasn’t previously added, so here’s how we can see that in our migrations:

New Migration:

from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        ('testapp', '0002_auto_20191209_0636'),
    ]

    operations = [
        migrations.RemoveField(
            model_name='testmodel',
            name='tz_2',
        ),
    ]

Initial Migration:

from django.db import migrations, models


class Migration(migrations.Migration):

    initial = True

    dependencies = [
    ]

    operations = [
        migrations.CreateModel(
            name='TestModel',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('tz_1', models.TextField(choices=[('Africa/Abidjan', 'Africa/Abidjan'), ('Africa/Accra', 'Africa/Accra'),
...

Here we can see a prior migration created the model that has the field we now want to delete.

3. Migrations are transactional in the sense that they run in a defined order and each one assumes the state left by those before it. This is analogous to reading commits in a VCS such as git (a practice competent developers should be comfortable with). Understanding how one state leads to and depends on another will help you troubleshoot failures and inconsistencies. `showmigrations` gives you a quick view of which migrations have been applied and which are still waiting to be applied (the application of migrations is tracked in your database, in the `django_migrations` table).

admin
 [X] 0001_initial
 [X] 0002_logentry_remove_auto_add
 [X] 0003_logentry_add_action_flag_choices
auth
 [X] 0001_initial
 ...
testapp
 [X] 0001_initial
 [X] 0002_auto_20191209_0636
 [X] 0003_remove_testmodel_tz_2
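
To make the ordering idea concrete, here is a small sketch of how a chain of dependencies determines what still needs to run. It is a toy model, not Django's loader or executor; the `plan` helper is hypothetical, and the dependency of `0002` on `0001_initial` is assumed rather than shown above.

```python
# Toy model of migration ordering; not Django's actual implementation.
# Each migration lists the migrations that must already be applied.
MIGRATIONS = {
    '0001_initial': [],
    '0002_auto_20191209_0636': ['0001_initial'],  # assumed dependency
    '0003_remove_testmodel_tz_2': ['0002_auto_20191209_0636'],
}


def plan(target, applied, graph=MIGRATIONS):
    """Return the unapplied migrations needed to reach target, in run order."""
    ordered = []

    def visit(name):
        # Skip anything already recorded as applied or already planned.
        if name in applied or name in ordered:
            return
        # Dependencies must run before the migration that needs them.
        for dependency in graph[name]:
            visit(dependency)
        ordered.append(name)

    visit(target)
    return ordered


# Fresh database: nothing recorded as applied yet, so all three run in order.
fresh_plan = plan('0003_remove_testmodel_tz_2', applied=set())

# Database where the first two migrations are already recorded: only 0003 runs.
partial_plan = plan(
    '0003_remove_testmodel_tz_2',
    applied={'0001_initial', '0002_auto_20191209_0636'},
)
```

The `applied` set plays the role of the record the database keeps of which migrations have already run.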

As we saw in (2), the content of a migration is easy to read to determine what is being changed in our database schema. What if we need to go backwards? Maybe we didn’t actually want to delete that field but wanted to modify it? What will happen if we hop back into our project right now and add a small modification? Remember, migrations change state in a particular order, each building on the last. If we add a 0004 migration that adds the field back with a slight change, the database treats it as a brand-new field, so if you were to check in these migrations you’d wipe out the existing column data wherever they are applied! Instead, migrations can be reversed so that we can reset the state and make the change we actually want…

~ python manage.py migrate testapp 0002_auto_20191209_0636
Operations to perform:
  Target specific migration: 0002_auto_20191209_0636, from testapp
Running migrations:
  Rendering model states... DONE
  Unapplying testapp.0003_remove_testmodel_tz_2... OK
~ rm testapp/migrations/0003_remove_testmodel_tz_2.py
~ python manage.py makemigrations --name testmodel_tz_2_allow_null
Migrations for 'testapp':
  testapp/migrations/0003_testmodel_tz_2_allow_null.py
    - Alter field tz_2 on testmodel
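
To see why the remove-then-re-add route would have lost data, here is a rough sketch at the database level using Python's built-in `sqlite3` module. The SQL is a hand-written approximation of what a `RemoveField` followed by a hypothetical 0004 `AddField` would amount to, not the statements Django actually emits; the column types and sample row are assumptions.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE testapp_testmodel '
    '(id INTEGER PRIMARY KEY, tz_1 TEXT, tz_2 TEXT)'
)
conn.execute(
    "INSERT INTO testapp_testmodel (tz_1, tz_2) "
    "VALUES ('Africa/Accra', 'Africa/Abidjan')"
)

# Rough equivalent of RemoveField: rebuild the table without tz_2.
# (Done by copying because older SQLite versions lack DROP COLUMN.)
conn.execute('CREATE TABLE new_testmodel (id INTEGER PRIMARY KEY, tz_1 TEXT)')
conn.execute(
    'INSERT INTO new_testmodel (id, tz_1) '
    'SELECT id, tz_1 FROM testapp_testmodel'
)
conn.execute('DROP TABLE testapp_testmodel')
conn.execute('ALTER TABLE new_testmodel RENAME TO testapp_testmodel')

# Rough equivalent of a hypothetical 0004 AddField: a brand-new column.
conn.execute('ALTER TABLE testapp_testmodel ADD COLUMN tz_2 TEXT')

# The column is back, but the value that used to live in it is gone.
rows = conn.execute('SELECT tz_1, tz_2 FROM testapp_testmodel').fetchall()
```

Unapplying 0003 and replacing it with an `AlterField` migration, as in the console session above, avoids dropping the column in the first place.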

What if you need to manipulate the data that is stored in your database between code changes? These are referred to as data migrations, and they provide a clear example of how manually editing migrations can be useful. Let’s say we found a calculation error in our code and now need to multiply all values in a particular column by 2:

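from django.db import migrations


def multiply_score(apps, schema_editor):
    # Use the historical model from the migration state rather than importing
    # it directly, so the code matches the schema at this point in history.
    Quiz = apps.get_model('testapp', 'Quiz')
    for quiz in Quiz.objects.all():
        quiz.score *= 2
        quiz.save()


class Migration(migrations.Migration):

    dependencies = [
        # Name the app that owns Quiz and the migration this one follows.
        ('testapp', '0001_initial'),
    ]

    operations = [
        migrations.RunPython(multiply_score),
    ]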
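
As written, a `RunPython` operation with only a forward function cannot be unapplied. One way to keep a data migration reversible is to pair it with a reverse function and pass both, as in `migrations.RunPython(multiply_score, reverse_code=halve_score)`. The sketch below shows such a pair; `halve_score` and the `Fake*` stand-ins are hypothetical, exist only so the functions can be dry-run without a database, and assume `score` is a plain numeric field.

```python
def multiply_score(apps, schema_editor):
    """Forward step: double every stored score."""
    Quiz = apps.get_model('testapp', 'Quiz')
    for quiz in Quiz.objects.all():
        quiz.score *= 2
        quiz.save()


def halve_score(apps, schema_editor):
    """Reverse step: undo the doubling so the migration can be unapplied."""
    Quiz = apps.get_model('testapp', 'Quiz')
    for quiz in Quiz.objects.all():
        quiz.score /= 2
        quiz.save()


# --- Stand-ins so the functions can be exercised without Django ---
class FakeQuiz:
    def __init__(self, score):
        self.score = score
        self.saved = 0

    def save(self):
        self.saved += 1


class FakeManager:
    def __init__(self, rows):
        self._rows = rows

    def all(self):
        return list(self._rows)


class FakeApps:
    """Mimics only the get_model() call used by the functions above."""

    def __init__(self, rows):
        self._model = type('Quiz', (), {'objects': FakeManager(rows)})

    def get_model(self, app_label, model_name):
        return self._model


rows = [FakeQuiz(10), FakeQuiz(35)]
apps = FakeApps(rows)

multiply_score(apps, None)
doubled = [quiz.score for quiz in rows]

halve_score(apps, None)
restored = [quiz.score for quiz in rows]
```

Because the functions only rely on `apps.get_model`, they can be checked this way before being run against real data.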

Conclusion

Databases are robust systems that expose a wealth of functionality necessary for proper data management in modern applications. Developers must conceptualize their data as it will be modeled in the database and in their code. The Django ORM and migrations are powerful abstractions to help make this interplay between code and database easier to work with, and warrant an educated understanding of their operation.
