Problem statement
I have a table with a column whose values are foreign keys, but the target table of the foreign key differs from row to row. The relevant table can be determined from the key value alone, and there is a small, fixed set of such tables.
I'd like to add a foreign key constraint here so that my DBMS can ensure referential integrity. Of course, I can't do this directly, but I have a proposed solution that involves an intermediate "forwarding table" with incoming and outgoing foreign key constraints. I'm looking for review on:
- whether this solution in fact solves the problem, or if I missed an edge case;
- how this solution may fare in the face of changes to the data model (e.g., new referent tables);
- whether this use of Postgres `GENERATED ALWAYS AS ... STORED` columns is reasonable or suspect;
- whether this solution is likely to introduce concurrency issues.
Proposed solution
To illustrate the solution, consider a simple database that stores "users" and "groups". Users and groups are each keyed by integer IDs, and some bits of the ID are reserved to tell what kind of ID it is:

    -- User and group IDs are both integers, but are in disjoint subsets of the key
    -- space, distinguished by the low 8 bits.
    CREATE DOMAIN userid AS int8 CHECK ((VALUE & 255) = 1);
    CREATE DOMAIN groupid AS int8 CHECK ((VALUE & 255) = 2);

    CREATE TABLE users(
        user_id userid PRIMARY KEY,
        name text NOT NULL
    );
    CREATE TABLE groups(
        group_id groupid PRIMARY KEY,
        admin userid NOT NULL REFERENCES users
    );

    INSERT INTO users(user_id, name) VALUES (1, 'alice'), (257, 'bob');
    INSERT INTO groups(group_id, admin) VALUES (2, 1), (258, 1);

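As a quick sanity check of the encoding, the low 8 bits act as a type tag, so the sample IDs above should decode as shown below. This is only a sketch that assumes the two domains above have been created; nothing here is part of the schema itself.

```sql
-- The low 8 bits select the kind of ID: 1 = user, 2 = group.
SELECT id, id & 255 AS tag
FROM (VALUES (1::int8), (257), (2), (258)) AS t(id);
-- tags for 1, 257, 2, 258 are 1, 1, 2, 2 respectively

-- Casting to a domain runs its CHECK, so a mismatched tag is rejected.
SELECT 257::userid;   -- succeeds: 257 & 255 = 1
SELECT 258::userid;   -- fails: 258 & 255 = 2 violates the userid check
```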
Now, both users and groups can create invoices. Invoices have entirely the same data whether they're created by a user or a group, so we just use a single table that stores the ID of the "actor" (user or group) that created the invoice along with the extra data:

    -- Invoices can be created by either users or groups: collectively, "actors".
    CREATE DOMAIN actorid AS int8 CHECK ((VALUE & 255) IN (1, 2));

    CREATE TABLE invoices(
        actor actorid NOT NULL,
        create_time timestamptz NOT NULL,
        amount_cents int NOT NULL
    );

Now, semantically, `invoices.actor` is a foreign key onto either `users` or `groups`, depending on the value of `actor & 255`. There's no way to directly write a `REFERENCES` constraint for that. We can imagine defining a view of all the actor IDs:

    CREATE VIEW all_actor_ids AS (
        SELECT user_id AS actor FROM users
        UNION ALL
        SELECT group_id AS actor FROM groups
    );

such that, in principle, we could declare `actor actorid REFERENCES all_actor_ids`, but Postgres does not actually allow referring to views in foreign keys.
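To make the gap concrete: with only the definitions so far, the domain checks the tag bits but nothing checks that the actor exists. The following is a minimal sketch, assuming the tables above and that no user with ID 513 exists (513 is an arbitrary illustrative value). The transaction is rolled back so nothing is kept.

```sql
BEGIN;

-- 513 & 255 = 1, so this passes the actorid domain check even though
-- there is no row in users with user_id = 513.
INSERT INTO invoices(actor, create_time, amount_cents)
VALUES (513, now(), 100);

-- List invoices whose actor is user-tagged but has no matching users row;
-- the row inserted above is returned.
SELECT i.actor
FROM invoices i
LEFT JOIN users u ON u.user_id = i.actor
WHERE (i.actor & 255) = 1
  AND u.user_id IS NULL;

ROLLBACK;
```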
To work around this, we basically materialize `all_actor_ids` into a table that itself has foreign key constraints to ensure its own integrity:

    CREATE TABLE actors(
        actor actorid PRIMARY KEY,
        user_id userid REFERENCES users
            GENERATED ALWAYS AS (CASE WHEN (actor & 255) = 1 THEN actor END) STORED,
        group_id groupid REFERENCES groups
            GENERATED ALWAYS AS (CASE WHEN (actor & 255) = 2 THEN actor END) STORED,
        CONSTRAINT actors_exactly_one_key
            CHECK (1 = (user_id IS NOT NULL)::int + (group_id IS NOT NULL)::int)
    );

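To see what the generated columns do, the sketch below inserts one user-tagged and one group-tagged actor and returns the stored rows. It assumes the sample users and groups above exist and that `actors` is still empty, and it rolls back so the table is left unchanged.

```sql
BEGIN;

INSERT INTO actors(actor) VALUES (1), (258)
RETURNING actor, user_id, group_id;
--  actor | user_id | group_id
--      1 |       1 |     NULL
--    258 |    NULL |      258

ROLLBACK;

-- Supplying a value for a generated column is not allowed; Postgres
-- rejects anything other than DEFAULT for it.
-- INSERT INTO actors(actor, user_id) VALUES (1, 1);  -- error
```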
Now, `invoices.actor` can refer to `actors`:

    ALTER TABLE invoices ADD FOREIGN KEY (actor) REFERENCES actors;

The idea is that, before you add an invoice on behalf of an actor, you first run `INSERT INTO actors(actor) VALUES($1) ON CONFLICT DO NOTHING`. The generated columns take care of populating exactly one of `user_id` or `group_id`, the foreign key constraints on those columns ensure that the underlying entity actually exists, and the conflict handler makes the operation a no-op if the actor has been used before.
For example, with the above definitions, these inserts work:

    -- All users and groups can be populated as actors.
    INSERT INTO actors(actor)
        SELECT user_id FROM users
        UNION ALL
        SELECT group_id FROM groups
        ON CONFLICT DO NOTHING;

    -- Invoices can be created for either users or groups.
    INSERT INTO invoices(actor, create_time, amount_cents)
        VALUES (1, now(), 100), (258, now(), 200);

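The per-invoice flow described earlier might look like the following sketch for a single new invoice. It assumes the sample data and the foreign key from `invoices` to `actors` are in place. In application code the literal 257 would be the `$1` parameter, and 513 stands in for an ID with a valid user tag but no matching user.

```sql
-- Register the actor (a no-op if already present), then record the
-- invoice, both in one transaction.
BEGIN;
INSERT INTO actors(actor) VALUES (257) ON CONFLICT DO NOTHING;
INSERT INTO invoices(actor, create_time, amount_cents)
VALUES (257, now(), 300);
COMMIT;

-- An actor whose underlying user does not exist cannot be registered:
-- the generated user_id would be 513, which violates the foreign key to
-- users. ON CONFLICT only covers unique violations, so this still errors.
INSERT INTO actors(actor) VALUES (513) ON CONFLICT DO NOTHING;  -- error
```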
Note that the `actors` data never actually needs to be part of a `JOIN` in a read path. It exists only to coax the foreign key constraints into submission.
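For instance, a read path that needs the creator's details can join straight from `invoices` to the entity tables. This is only a sketch of one such query under the schema above; the `creator` alias and the choice to describe a group by its admin are illustrative, not part of the design.

```sql
-- Resolve each invoice's creator without touching the actors table.
-- At most one of the users/groups joins matches, because the ID
-- spaces are disjoint.
SELECT i.actor,
       i.amount_cents,
       COALESCE(u.name, 'group administered by ' || a.name) AS creator
FROM invoices i
LEFT JOIN users  u ON u.user_id  = i.actor
LEFT JOIN groups g ON g.group_id = i.actor
LEFT JOIN users  a ON a.user_id  = g.admin;
```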
Questions
It seems to me that this solution should properly ensure referential integrity: in particular, a user or group can't be deleted without cascading down to delete any invoices created by that user or group. But I have some questions:
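One way to probe this claim is the sketch below, which assumes the sample data and the bulk insert into `actors` above have been run. Since none of the foreign keys here declare `ON DELETE CASCADE`, the default behavior is to reject the delete while dependent rows exist, so the dependent rows have to be removed explicitly first.

```sql
-- bob (257) is registered in actors, so actors.user_id references him
-- and this delete fails with a foreign key violation.
DELETE FROM users WHERE user_id = 257;  -- error

-- Removing the dependents first, from the bottom up, lets it succeed.
BEGIN;
DELETE FROM invoices WHERE actor = 257;
DELETE FROM actors   WHERE actor = 257;
DELETE FROM users    WHERE user_id = 257;
ROLLBACK;  -- keep the sample data intact
```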
- Am I missing some edge case in which this solution does not actually ensure referential integrity?
- Suppose that invoices can now also be created by a third type of entity: say, `robots`. I think that I can alter the `actorid` domain to incorporate `robotid`s, then add a new `actors.robot_id` column like the others and update the `actors_exactly_one_key` constraint. Are there lurking issues that I should be wary of here?
- I haven't used Postgres `GENERATED ALWAYS AS ... STORED` columns before, and I'm a little nervous that the generation expression can't be changed at all after the fact. Does this seem like an appropriate use of generated columns, or would it be better to replace the generated columns with `CHECK` constraints that ensure the same values but require the user to provide them?
- Is the `INSERT INTO actors(actor) ... ON CONFLICT DO NOTHING` likely to introduce concurrency issues? (Or, are there any other glaring performance issues that I've missed?)
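For the robots question, the migration described there might be sketched as follows. Everything robot-related is hypothetical: the tag value 3, the `robotid` domain, and the `robots` table are assumptions for illustration. The name `actorid_check` assumes the domain constraint kept the name Postgres assigns by default, which is worth confirming in `pg_constraint` before running anything like this.

```sql
BEGIN;

-- Hypothetical new entity type, tagged with low bits = 3.
CREATE DOMAIN robotid AS int8 CHECK ((VALUE & 255) = 3);
CREATE TABLE robots(
    robot_id robotid PRIMARY KEY
);

-- Widen the actorid domain to admit the new tag.
ALTER DOMAIN actorid DROP CONSTRAINT actorid_check;
ALTER DOMAIN actorid ADD CONSTRAINT actorid_check
    CHECK ((VALUE & 255) IN (1, 2, 3));

-- Add the forwarding column; for existing rows the CASE yields NULL.
ALTER TABLE actors
    ADD COLUMN robot_id robotid REFERENCES robots
        GENERATED ALWAYS AS (CASE WHEN (actor & 255) = 3 THEN actor END) STORED;

-- Replace the exactly-one-key check with a three-way version.
ALTER TABLE actors
    DROP CONSTRAINT actors_exactly_one_key,
    ADD CONSTRAINT actors_exactly_one_key CHECK (
        1 = (user_id IS NOT NULL)::int
          + (group_id IS NOT NULL)::int
          + (robot_id IS NOT NULL)::int
    );

COMMIT;
```

Note that this adds a column rather than changing an existing generation expression, so the existing `user_id` and `group_id` columns are left alone.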
Any other feedback or reviews also warmly appreciated.
I'm using Postgres 12, but if the best solution here requires upgrading to Postgres 14, I'm open to it.