This is a repost. You can find the original here
Database constraints are essential to ensuring data integrity, and you should use them. Allowing them to be deferrable during transactions makes them even more convenient. A common scenario in which the database can help us is in a sortable list implementation. This post outlines the how and why of deferring database constraints, using a sortable list domain as an example.
Modeling lists of sortable items
Imagine that you have an application with multiple lists. Each list has items that can be reordered with a drag-and-drop interaction. This can be modelled in a fairly straightforward manner.
Each list has_many list items, which are ordered by the position column. Each list’s items have a position beginning with 1 and incrementing with each subsequent item.
# app/models/list.rb
class List < ActiveRecord::Base
  has_many :items, -> { order :position }, class_name: "ListItem"

  validates_presence_of :name
end

# app/models/list_item.rb
class ListItem < ActiveRecord::Base
  belongs_to :list

  validates_presence_of :name, :list, :position

  before_validation :ensure_position

  def self.update_positions(ids)
    ids.each_with_index do |id, index|
      where(id: id).update_all(position: index + 1)
    end
  end

  private

  def ensure_position
    self.position ||= self.class.where(list_id: list_id).maximum(:position).to_i + 1
  end
end
A couple of things are worth noting about the ListItem class. Firstly, we have update_positions, a class method that accepts an ordered array of ids and updates the position of each corresponding record. This method will be called in a sort controller action like so:
class ItemsController < ApplicationController
  expose(:list)

  def sort
    # list item ids is an ordered array of ids
    list.items.update_positions(params[:list_item_ids])
    head :ok
  end
end
Secondly, new items don’t necessarily know what position they should have, so we put list items that don’t have a position at the end of their respective list, just before validating that the position is present.
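As a quick illustration of that before_validation hook, here is roughly how positions get assigned; this is a hypothetical console session with made-up names:

list = List.create!(name: "Groceries")

list.items.create!(name: "Eggs").position   # => 1
list.items.create!(name: "Milk").position   # => 2

# An explicit position is left alone; only missing positions are filled in.
list.items.create!(name: "Bread", position: 5).position  # => 5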
Here are the migrations that we used to create the models’ database tables:
class CreateLists < ActiveRecord::Migration
  def change
    create_table :lists do |t|
      t.string :name

      t.timestamps
    end
  end
end
class CreateListItems < ActiveRecord::Migration
  def change
    create_table :list_items do |t|
      t.belongs_to :list
      t.integer :position
      t.string :name

      t.timestamps
    end
  end
end
Notice anything missing? If you said database constraints, you’re correct! Our application is enforcing presence for most attributes, but our corresponding columns are missing NOT NULL constraints. Also, the list_id column on list_items is missing a foreign key constraint.
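Fixing those is straightforward but outside the focus of this post; a migration along these lines would close the gap (a sketch with a made-up name, assuming a Rails version that ships change_column_null and add_foreign_key; older versions can use execute with the equivalent ALTER TABLE statements):

class AddMissingConstraints < ActiveRecord::Migration
  def change
    change_column_null :lists, :name, false

    change_column_null :list_items, :list_id, false
    change_column_null :list_items, :position, false
    change_column_null :list_items, :name, false

    add_foreign_key :list_items, :lists
  end
end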
But I’d like to focus on another missing constraint. Our domain model has an implicit requirement that we haven’t enforced with either validations or database constraints: each list item’s position should be unique per list. No two list items in a list should have the same position. That would make the ordering non-deterministic.
We could add a uniqueness validation for position, scoped to the list_id. However, as thoughtbot recently warned, application-level uniqueness validations are insufficient at best, and fail completely in concurrent environments. The position column needs a database-level constraint.
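For reference, the application-level version is a one-liner like the one below; it is fine to keep around for friendlier error messages, it just can’t be the only line of defense:

class ListItem < ActiveRecord::Base
  # ...
  validates_uniqueness_of :position, scope: :list_id
end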
Adding constraints
Adding the uniqueness constraint to position is fairly straightforward in PostgreSQL. We’ll just create a new migration with the following:
class AddUniquenessValidationOnListItems < ActiveRecord::Migration
  def up
    execute <<-SQL
      alter table list_items
        add constraint list_item_position unique (list_id, position);
    SQL
  end

  def down
    execute <<-SQL
      alter table list_items
        drop constraint if exists list_item_position;
    SQL
  end
end
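With that constraint in place, any statement that would leave two items of the same list at the same position is rejected by PostgreSQL, which ActiveRecord surfaces as something like the following (hypothetical console session):

list = List.create!(name: "Groceries")
list.items.create!(name: "Eggs", position: 1)
list.items.create!(name: "Milk", position: 1)
# => ActiveRecord::RecordNotUnique: PG::UniqueViolation: ERROR:  duplicate key value
#    violates unique constraint "list_item_position"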
Let’s wrap our UPDATE statements in a transaction so that any failed UPDATE of the position column will result in none of them being applied:
class ListItem < ActiveRecord::Base
  # ...

  def self.update_positions(ids)
    transaction do
      ids.each_with_index do |id, index|
        where(id: id).update_all(position: index + 1)
      end
    end
  end
end
This ensures at the database level that positions of items are unique per list; no two items in the same list can occupy the “1” position. With regard to data integrity, this is a huge improvement over our initial implementation. But it has one drawback: it doesn’t work.
To illustrate why, imagine a list with the following items:
id | position | name
13 | 1 | Eggs
18 | 2 | Milk
35 | 3 | Bread
To move Bread to the top of the list, we would pass an array of ids, [35, 13, 18], to the update_positions method. This method does a series of UPDATE statements to the database. For the first id, the one for Bread, we end up sending an update statement that looks like the following:
UPDATE list_items SET position=1 WHERE id=35;
After this statement is executed in the database, but before we move on to the next id in the list, Postgres will fail its constraint check. At the moment that the UPDATE happens, the data would be:
id | position | name
13 | 1 | Eggs
18 | 2 | Milk
35 | 1 | Bread
With both Eggs and Bread occupying the same position, the UPDATE fails. Of course, we know that we want to change the position of Eggs as well, so that its position would be “2”, and that collision would not happen. But at the time that the constraint check happens, the database doesn’t know this.
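So with the constraint defined this way, the reorder we actually want blows up on that very first statement, even though the finished transaction would have been perfectly valid (hypothetical console session):

list.items.update_positions([35, 13, 18])
# => ActiveRecord::RecordNotUnique: PG::UniqueViolation: ERROR:  duplicate key value
#    violates unique constraint "list_item_position"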
Even within a transaction, database uniqueness constraints are enforced immediately per row. It seems our dreams of data integrity are smashed. If only there were a way to enforce uniqueness constraints at the end of the transaction, rather than the end of each statement…
Deferrable Constraints
As mentioned before, constraints are enforced immediately. This behavior can be changed within a transaction by changing a constraint’s deferrable characteristics. In PostgreSQL, constraints are NOT DEFERRABLE by default. However, constraints can also behave as deferrable in one of two ways: DEFERRABLE INITIALLY IMMEDIATE or DEFERRABLE INITIALLY DEFERRED.
The first part, DEFERRABLE, is what allows the database constraint behavior to change within transactions. The second part describes what the default behavior will be within a transaction.
With a constraint that is deferrable, but initially immediate, the constraint will by default behave just like a non-deferrable constraint, checking every statement immediately. A constraint that is initially deferred will, by default, defer its checks until the transaction is committed. Both of these can change their behavior per-transaction with a call to SET CONSTRAINTS (documentation).
With that information, let’s change the definition of the constraint we defined before:
class AddUniquenessValidationOnListItems < ActiveRecord::Migration
  def up
    execute <<-SQL
      alter table list_items
        add constraint list_item_position unique (list_id, position)
        DEFERRABLE INITIALLY IMMEDIATE;
    SQL
  end

  def down
    execute <<-SQL
      alter table list_items
        drop constraint if exists list_item_position;
    SQL
  end
end
The only thing we’ve changed from before is the DEFERRABLE INITIALLY IMMEDIATE bit. I think it is a good idea to use the INITIALLY IMMEDIATE option. This ensures that other parts of our app, and other consumers of the database, will not be surprised by the behavior of the constraint; it will continue to act like a normal, non-deferrable constraint until we explicitly opt in to the deferral.
We now need to change our transaction block. In our case, the first database statement within the transaction must be the SET CONSTRAINTS statement:
class ListItem < ActiveRecord::Base
  # ...

  def self.update_positions(ids)
    transaction do
      connection.execute "SET CONSTRAINTS list_item_position DEFERRED"

      ids.each_with_index do |id, index|
        where(id: id).update_all(position: index + 1)
      end
    end
  end
end
Having opted in to deferring our uniqueness constraint, reordering the items now works as expected. The constraint still ensures that no two items in a list occupy the same position, but it waits until the end of the transaction to do that check. We can have our data integrity cake and eat it too.
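Picking up the earlier example, moving Bread to the top now succeeds, because the duplicate positions that exist mid-transaction are only checked at commit time (hypothetical console session):

list.items.update_positions([35, 13, 18])

list.items.map { |item| [item.position, item.name] }
# => [[1, "Bread"], [2, "Eggs"], [3, "Milk"]]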
Bonus: deferrable introspection
Having to name the constraint in two places is a bit of a bummer, and introduces a coupling that could bite us if the constraint name ever changed. Knowing that, we leverage PostgreSQL’s introspective abilities to query the constraint names instead.
For example, we can add the following module to our codebase:
# lib/deferrable.rb
module Deferrable
  def deferrable_uniqueness_constraints_on(column_name)
    usage = Arel::Table.new 'information_schema.constraint_column_usage'
    constraint = Arel::Table.new 'pg_constraint'

    arel = usage.project(usage[:constraint_name])
      .join(constraint).on(usage[:constraint_name].eq(constraint[:conname]))
      .where(
        (constraint[:contype].eq('u'))
        .and(constraint[:condeferrable])
        .and(usage[:table_name].eq(table_name))
        .and(usage[:column_name].eq(column_name))
      )

    connection.select_values arel
  end

  def transaction_with_deferred_constraints_on(column_name)
    transaction do
      constraints = deferrable_uniqueness_constraints_on(column_name).join ","
      connection.execute("SET CONSTRAINTS %s DEFERRED" % constraints)
      yield
    end
  end
end
And now change our model to use it:
class ListItem < ActiveRecord::Base
  extend Deferrable

  # ...

  def self.update_positions(ids)
    transaction_with_deferred_constraints_on(:position) do
      ids.each_with_index do |id, index|
        where(id: id).update_all(position: index + 1)
      end
    end
  end
end
And, boom! Less coupling.
NOTE: That’s a lot of Arel! Use at your own risk. ;-)
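If the Arel feels like too much magic, a plain-SQL version of the same lookup is a reasonable substitute. Here is one possible sketch; transaction_with_deferred_constraints_on stays exactly as above:

# lib/deferrable.rb
module Deferrable
  # Same catalog query as the Arel version, written as a heredoc. Values are
  # quoted through the connection even though they come from our own code.
  def deferrable_uniqueness_constraints_on(column_name)
    connection.select_values(<<-SQL)
      select usage.constraint_name
      from information_schema.constraint_column_usage usage
      join pg_constraint on pg_constraint.conname = usage.constraint_name
      where pg_constraint.contype = 'u'
        and pg_constraint.condeferrable
        and usage.table_name = #{connection.quote(table_name)}
        and usage.column_name = #{connection.quote(column_name.to_s)}
    SQL
  end

  # transaction_with_deferred_constraints_on(column_name) is unchanged.
end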
Example application
While writing this post, I created a sample Rails app to iterate quickly. I used TDD to write the initial approach, and reused the specs while I “refactored” the implementation to the subsequent approaches. Each commit on the master branch more or less follows the sections above.