Most people find migration tricky. Some frameworks like Django and Rails generate migration scripts from the object definitions, and you are expected to run the scripts as part of the deployment process, but I never found them complete or as optimized as doing it yourself, especially if the migration is complex like when the data model has undergone significant change.
In the past, I have constructed the migration scripts by hand (since I did not use an ORM like Django). The deployments had 3 flavors:
- New deployment: we'd run the previous version's init and the current version's migration.
- Upgrade: we'd run the current version's migration alone.
- Development / testing: we'd split the migration script on intra-release build markers and run snippets.
I found the following issues in that method:
- Team members often stepped on each others' toes due to the single migration script.
- Running migration required downtime -- sometimes this went into hours, and the users were not happy.
- Rolling back is too complex, and never practical. I always kept a backup before initiating an upgrade, but only ever used it to pull out old data for any fix, but the fix was always forward looking.
If I had to design this all over again, I think I'd do the following:
- Instead of a single large migration script, have multiple small ones with some naming convention that can be linked to a software micro-version and/or commit version. Put the individual snippets together during migration, automatically figure out which snippets need to be run.
- Make it so that down time is minimized. All scripts should have a pre-deployment, deployment and a post-deployment component. Only the second component should require downtime, and it should do very minimal stuff. For example, if a column is being renamed, the pre-deployment should add a new column and copy the contents from the old one, the deployment component should do the same for records added after the web server is brought down, post-deployment should delete the old column, after the web server is up and running again.
Liquibase does seem promising, but I am not sure it can handle complex update queries that not only changes the schema, but also migrates data to adhere to data model changes.