
Foreign key to multiple tables via forwarding table

Problem statement

I have a table with a column whose values are foreign keys, but the target table of the foreign key differs from row to row. The relevant table can be determined from the key value alone, and there is a small, fixed set of such tables.

I'd like to add a foreign key constraint here so that my DBMS can ensure referential integrity. Of course, I can't do this directly, but I have a proposed solution that involves an intermediate "forwarding table" with incoming and outgoing foreign key constraints. I'm looking for review on:

  • whether this solution in fact solves the problem, or if I missed an edge case;
  • how this solution may fare in the face of changes to the data model (e.g., new referent tables);
  • whether this use of Postgres GENERATED ALWAYS AS ... STORED columns is reasonable or suspect;
  • whether this solution is likely to introduce concurrency issues.

Proposed solution

To illustrate the solution, consider a simple database that stores "users" and "groups". Users and groups are each keyed by integer IDs, and some bits of the ID are reserved to tell what kind of ID it is:

-- User and group IDs are both integers, but are in disjoint subsets of the key
-- space, distinguished by the low 8 bits.
CREATE DOMAIN userid AS int8 CHECK ((VALUE & 255) = 1);
CREATE DOMAIN groupid AS int8 CHECK ((VALUE & 255) = 2);

CREATE TABLE users(
    user_id userid PRIMARY KEY,
    name text NOT NULL
);
CREATE TABLE groups(
    group_id groupid PRIMARY KEY,
    admin userid NOT NULL REFERENCES users
);

INSERT INTO users(user_id, name) VALUES (1, 'alice'), (257, 'bob');
INSERT INTO groups(group_id, admin) VALUES (2, 1), (258, 1);
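As a quick sanity check of the ID scheme, the domains alone already reject an ID that carries the wrong tag. This is a minimal sketch that assumes the definitions above have been run; the literal 258 is simply the sample group ID reused as a would-be user ID.

```sql
-- 257 & 255 = 1, so this cast succeeds.
SELECT 257::userid;

-- 258 & 255 = 2, so casting it to userid violates the domain's CHECK.
-- The DO block catches the error so that a script can keep running.
DO $$
BEGIN
    PERFORM 258::userid;
    RAISE EXCEPTION 'unexpected: 258 was accepted as a userid';
EXCEPTION WHEN check_violation THEN
    RAISE NOTICE 'rejected: 258 is not a valid userid';
END
$$;
```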

Now, both users and groups can create invoices. Invoices have entirely the same data whether they're created by a user or a group, so we just use a single table that stores the ID of the "actor" (user or group) that created the invoice along with the extra data:

-- Invoices can be created by either users or groups: collectively, "actors".
CREATE DOMAIN actorid AS int8 CHECK ((VALUE & 255) IN (1, 2));

CREATE TABLE invoices(
    actor actorid NOT NULL,
    create_time timestamptz NOT NULL,
    amount_cents int NOT NULL
);

Now, semantically, invoices.actor is a foreign key onto either users or groups, depending on the value of actor & 255. There's no way to directly write a REFERENCES constraint for that. We can imagine defining a view of all the actor IDs:

CREATE VIEW all_actor_ids AS (
    SELECT user_id AS actor FROM users
    UNION ALL
    SELECT group_id AS actor FROM groups
);

With such a view we could, in principle, declare actor actorid REFERENCES all_actor_ids, but Postgres does not actually allow referring to views in foreign keys.

To work around this, we basically materialize all_actor_ids into a table that itself has foreign key constraints to ensure its own integrity:

CREATE TABLE actors(
    actor actorid PRIMARY KEY,
    user_id userid
        REFERENCES users
        GENERATED ALWAYS AS (CASE WHEN (actor & 255) = 1 THEN actor END) STORED,
    group_id groupid
        REFERENCES groups
        GENERATED ALWAYS AS (CASE WHEN (actor & 255) = 2 THEN actor END) STORED,
    CONSTRAINT actors_exactly_one_key
        CHECK (1 = (user_id IS NOT NULL)::int + (group_id IS NOT NULL)::int)
);

Now, invoices.actor can refer to actors:

ALTER TABLE invoices ADD FOREIGN KEY (actor) REFERENCES actors;

The idea is that, before you add an invoice on behalf of an actor, you first run INSERT INTO actors(actor) VALUES($1) ON CONFLICT DO NOTHING. The generated columns take care of populating either user_id xor group_id, the foreign key constraints on those columns ensure that the underlying entity actually exists, and the conflict handler makes the operation a no-op if the actor has been used before.
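Concretely, the write path for a single invoice might look like the sketch below. It assumes the actors table and the invoices foreign key from this post are in place, and it uses bob's ID (257) and an arbitrary amount purely for illustration.

```sql
BEGIN;

-- Register the actor if needed. The generated columns fill in user_id,
-- and the foreign key on user_id verifies that user 257 exists.
INSERT INTO actors(actor) VALUES (257) ON CONFLICT DO NOTHING;

-- The invoice can now satisfy its foreign key to actors.
INSERT INTO invoices(actor, create_time, amount_cents)
    VALUES (257, now(), 500);

COMMIT;
```

Running both statements in one transaction is a choice made for this sketch; the approach itself only requires that the actors row exist by the time the invoice is inserted.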

For example, with the above definitions, these inserts work:

-- All users and groups can be populated as actors.
INSERT INTO actors(actor)
    SELECT user_id FROM users UNION ALL SELECT group_id FROM groups
    ON CONFLICT DO NOTHING;

-- Invoices can be created for either users or groups.
INSERT INTO invoices(actor, create_time, amount_cents)
    VALUES (1, now(), 100), (258, now(), 200);
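Conversely, the constraints should reject writes that would break integrity. The sketch below assumes the schema and sample rows above; the ID 513 is a made-up value that carries the user tag but matches no row in users. Each attempt is wrapped in a PL/pgSQL block that catches the expected error, so the script runs to completion.

```sql
DO $$
BEGIN
    -- 513 & 255 = 1, but there is no user 513, so the generated
    -- user_id column fails its foreign key to users.
    BEGIN
        INSERT INTO actors(actor) VALUES (513);
        RAISE EXCEPTION 'unexpected: actor 513 was accepted';
    EXCEPTION WHEN foreign_key_violation THEN
        RAISE NOTICE 'actors rejects an ID with no matching user';
    END;

    -- 513 is not registered in actors, so invoices rejects it too.
    BEGIN
        INSERT INTO invoices(actor, create_time, amount_cents)
            VALUES (513, now(), 100);
        RAISE EXCEPTION 'unexpected: invoice for 513 was accepted';
    EXCEPTION WHEN foreign_key_violation THEN
        RAISE NOTICE 'invoices rejects an unregistered actor';
    END;

    -- Bob (257) is registered in actors, so his users row cannot be
    -- deleted while that actors row exists.
    BEGIN
        DELETE FROM users WHERE user_id = 257;
        RAISE EXCEPTION 'unexpected: user 257 was deleted';
    EXCEPTION WHEN foreign_key_violation THEN
        RAISE NOTICE 'users row is protected by actors.user_id';
    END;
END
$$;
```

Because the foreign keys here are declared without an ON DELETE clause, Postgres applies the default NO ACTION, so the delete is rejected rather than cascaded. Getting cascading deletes would require ON DELETE CASCADE on both hops (actors to users/groups, and invoices to actors).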

Note that the actors data never actually needs to be part of a JOIN in a read path. It exists only to coax the foreign key constraints into submission.
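For instance, a read query can join invoices straight to users and groups on the ID itself. This is only an illustrative sketch against the schema above; the output aliases user_name and group_admin are made up for the example.

```sql
-- Each invoice matches at most one of the two joins, because user IDs
-- and group IDs occupy disjoint parts of the key space.
SELECT i.actor,
       i.amount_cents,
       u.name  AS user_name,    -- NULL when the actor is a group
       g.admin AS group_admin   -- NULL when the actor is a user
FROM invoices AS i
LEFT JOIN users  AS u ON u.user_id  = i.actor
LEFT JOIN groups AS g ON g.group_id = i.actor
ORDER BY i.create_time;
```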

Questions

It seems to me that this solution should properly ensure referential integrity: in particular, a user or group can't be deleted without cascading down to delete any invoices created by that user or group. But I have some questions:

  • Am I missing some edge case in which this solution does not actually ensure referential integrity?

  • Suppose that invoices can now also be created by a third type of entity: say, robots. I think that I can alter the actorid domain to incorporate robotids, then add a new actors.robot_id column like the others and update the actors_exactly_one_key constraint. Are there lurking issues that I should be wary of here?

  • I haven't used Postgres GENERATED ALWAYS AS ... STORED columns before, and I'm a little nervous that the generation expression can't be changed at all after the fact. Does this seem like an appropriate use of generated columns, or would it be better to replace the generated columns with CHECK constraints that ensure the same values but require the user to provide them?

  • Is the INSERT INTO actors(actor) ... ON CONFLICT DO NOTHING likely to introduce concurrency issues? (Or, are there any other glaring performance issues that I've missed?)
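To make the robots question concrete, the migration it describes might look roughly like the sketch below. Everything robot-related here is an assumption for illustration: the robotid domain, the robots table, and the tag value 3. It also assumes the actorid check constraint still has its default name, actorid_check; if in doubt, look the name up in pg_constraint first.

```sql
BEGIN;

-- Hypothetical third entity type, tagged with low byte 3.
CREATE DOMAIN robotid AS int8 CHECK ((VALUE & 255) = 3);
CREATE TABLE robots(
    robot_id robotid PRIMARY KEY
);

-- Widen the actorid domain so that it also admits robot IDs.
ALTER DOMAIN actorid DROP CONSTRAINT actorid_check;
ALTER DOMAIN actorid ADD CONSTRAINT actorid_check
    CHECK ((VALUE & 255) IN (1, 2, 3));

-- Add a forwarding column for robots, mirroring user_id and group_id.
ALTER TABLE actors
    ADD COLUMN robot_id robotid
        REFERENCES robots
        GENERATED ALWAYS AS (CASE WHEN (actor & 255) = 3 THEN actor END) STORED;

-- Replace the "exactly one key" check with a three-way version.
ALTER TABLE actors
    DROP CONSTRAINT actors_exactly_one_key,
    ADD CONSTRAINT actors_exactly_one_key CHECK (
        1 = (user_id IS NOT NULL)::int
          + (group_id IS NOT NULL)::int
          + (robot_id IS NOT NULL)::int
    );

-- A robot can now be registered as an actor like any user or group.
INSERT INTO robots(robot_id) VALUES (3);
INSERT INTO actors(actor) VALUES (3) ON CONFLICT DO NOTHING;

COMMIT;
```

One thing to weigh with this shape of migration: ALTER TABLE ... ADD COLUMN takes an ACCESS EXCLUSIVE lock on actors, and adding a stored generated column to a populated table rewrites it, so the cost grows with the number of actors already registered.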

Any other feedback or reviews also warmly appreciated.

I'm using Postgres 12, but if the best solution here requires upgrading to Postgres 14, I'm open to it.

