---
{
  "title": "Beyond Patient/$merge: A Resource-Agnostic, Client-Driven Merge for FHIR",
  "description": "FHIR R5's Patient/$merge is a start, but production MDM needs more. We built a resource-agnostic $merge with client-driven plans, atomic audit trails, and a generic $referencing operation.",
  "date": "2026-04-09",
  "author": "Ivan Shukshin",
  "reading-time": "8 min read",
  "tags": ["FHIR Standard", "Aidbox", "System Design", "Integrations", "MDMbox"],
  "tldr": "We extended FHIR $merge to accept a client-supplied transaction Bundle as a plan parameter. The client decides what to update, create, or delete — the server guarantees atomicity and creates Task + Provenance audit resources in the same transaction. Combined with a new $referencing operation for generic reverse-include, this gives you a fully auditable, resource-agnostic merge that works for Patient, Organization, Practitioner, or anything else."
}
---

We've been running Master Data Management in [MDMbox](https://www.health-samurai.io/mdmbox) across production deployments for the past two years — deduplicating patients, merging organizations, reconciling practitioners across facilities. Along the way, we learned that every organization merges differently, and no single server-side algorithm can cover the variety of real-world merge policies.

FHIR R5 introduced `Patient/$merge` — a long-awaited operation that lets you merge duplicate patient records. It defines input parameters (`source-patient`, `target-patient`, `result-patient`), expects the server to handle reference updates, and returns the merged result. The operation is currently at maturity level 0 and hasn't seen further development since its introduction.

It's a good start. But after implementing merge in production, we found the spec doesn't go far enough.

## The problem with server-driven merge

The FHIR `$merge` spec assumes the server knows *how* to merge. The server deactivates the source, copies identifiers, updates references. But merge logic varies wildly between organizations:

- **Hospital A** deletes the source patient entirely and rewrites all references
- **Hospital B** keeps the source inactive and creates a Linkage resource for provenance
- **Hospital C** merges Encounters and Observations into the target, but preserves separate AllergyIntolerance records for manual review
- **Health information exchange D** needs to merge Organizations and Practitioners, not just Patients

The spec doesn't cover any of that — it's Patient-only and the "update all references" step is hand-waved.

![Merge flow](merge-flow.svg)

What we needed was a merge operation where:
- The **client** decides what happens (which resources to update, create, or delete)
- The **server** guarantees atomicity, creates an audit trail, and enforces safety checks
- It works for **any resource type**, not just Patient

## Our approach: the plan parameter

We built a FHIR-like `$merge` that accepts a standard transaction Bundle as an additional `plan` parameter:

```json
POST $merge

{
  "resourceType": "Parameters",
  "parameter": [
    {"name": "source", "valueReference": {"reference": "Patient/123"}},
    {"name": "target", "valueReference": {"reference": "Patient/456"}},
    {"name": "preview", "valueBoolean": false},
    {
      "name": "plan",
      "resource": {
        "resourceType": "Bundle",
        "type": "transaction",
        "entry": [
          {
            "resource": {"resourceType": "Patient", "id": "456", "...": "merged state"},
            "request": {"method": "PUT", "url": "Patient/456", "ifMatch": "W/\"3\""}
          },
          {
            "resource": {"resourceType": "Encounter", "id": "789",
                         "subject": {"reference": "Patient/456"}},
            "request": {"method": "PUT", "url": "Encounter/789", "ifMatch": "W/\"1\""}
          },
          {
            "request": {"method": "DELETE", "url": "Patient/123", "ifMatch": "W/\"2\""}
          }
        ]
      }
    }
  ]
}
```

The client sends every PUT, POST, and DELETE explicitly. The server wraps the plan with audit resources (Task + Provenance) and executes the whole thing as a single FHIR transaction. Either everything succeeds — including the audit trail — or nothing does.
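
To make the client's side of this concrete, here is a minimal sketch of how a plan like the one above could be assembled. It implements just one possible policy (rewrite every reference, then delete the source). The helper names `rewrite_references` and `build_plan` are hypothetical, and the sketch assumes each input resource carries `meta.versionId`.

```python
import copy


def rewrite_references(node, source_ref, target_ref):
    """Recursively repoint every {"reference": source_ref} to target_ref, in place."""
    if isinstance(node, dict):
        if node.get("reference") == source_ref:
            node["reference"] = target_ref
        for value in node.values():
            rewrite_references(value, source_ref, target_ref)
    elif isinstance(node, list):
        for item in node:
            rewrite_references(item, source_ref, target_ref)


def build_plan(source, merged_target, referencing):
    """Build a transaction Bundle: PUT the merged target, PUT repointed
    resources, DELETE the source. Assumes meta.versionId is present on
    every input resource (an assumption of this sketch)."""
    source_ref = f"{source['resourceType']}/{source['id']}"
    target_ref = f"{merged_target['resourceType']}/{merged_target['id']}"

    def if_match(resource):
        # Weak ETag built from the version the client last saw.
        return f'W/"{resource["meta"]["versionId"]}"'

    def put(resource):
        url = f"{resource['resourceType']}/{resource['id']}"
        return {"resource": resource,
                "request": {"method": "PUT", "url": url, "ifMatch": if_match(resource)}}

    entries = [put(merged_target)]
    for resource in referencing:
        updated = copy.deepcopy(resource)  # leave the caller's copy untouched
        rewrite_references(updated, source_ref, target_ref)
        entries.append(put(updated))
    entries.append({"request": {"method": "DELETE", "url": source_ref,
                                "ifMatch": if_match(source)}})
    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}
```

A different policy, such as keeping the source inactive or leaving AllergyIntolerance records untouched, changes only which entries this function emits. The server side stays the same.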

Notice the `ifMatch` field on every entry's `request`, the Bundle counterpart of the HTTP `If-Match` header: this is optimistic locking. If any resource was modified between the time the client built the plan and the time the server executes it, the whole transaction fails. No silent overwrites, no lost updates.
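
The semantics of that version check can be sketched as follows. This illustrates the rule itself, not MDMbox code: `find_stale_entries` is a hypothetical helper, and `current_versions` is a plain dict standing in for whatever the server's storage reports as the current `versionId` of each resource.

```python
def find_stale_entries(plan, current_versions):
    """Return the URLs of plan entries whose ifMatch no longer matches.

    current_versions maps "Type/id" to the current versionId string
    (a stand-in for server storage in this sketch).
    """
    stale = []
    for entry in plan.get("entry", []):
        request = entry["request"]
        expected = request.get("ifMatch")
        if expected is None:
            continue  # this entry did not ask for a version check
        current = current_versions.get(request["url"])
        if current is None or expected != f'W/"{current}"':
            stale.append(request["url"])
    return stale
```

A non-empty result means the plan was built against outdated data. The client should then re-read the affected resources and rebuild the plan rather than retry it unchanged.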

## It's a superset, not a fork

You'll notice we use `source`, `target`, and `result` instead of the spec's `source-patient`, `target-patient`, and `result-patient`. This is intentional — the operation is resource-agnostic, so Patient-specific naming would be misleading. Merging duplicate Organizations? Practitioners? Same operation, same audit trail.

Drop the `plan` parameter and pass `result` instead — you get standard FHIR `$merge` behavior. Identifier-based lookup for source and target is supported. The `plan` is what makes it strictly more powerful: when present, the client takes full control of the merge logic.
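
The two modes differ only in which parameter the request carries. The sketch below builds the `Parameters` body for either one. The helper name `merge_parameters` is hypothetical, passing `result` as a `resource` parameter is an assumption modeled on the spec's `result-patient`, and rejecting requests that supply both is this sketch's own choice rather than documented server behavior.

```python
def merge_parameters(source_ref, target_ref, *, plan=None, result=None, preview=False):
    """Build the Parameters body for $merge in plan mode or result mode."""
    if plan is not None and result is not None:
        # Sketch-level guard; the server's actual behavior is not described here.
        raise ValueError("pass either plan or result, not both")
    parameters = [
        {"name": "source", "valueReference": {"reference": source_ref}},
        {"name": "target", "valueReference": {"reference": target_ref}},
        {"name": "preview", "valueBoolean": preview},
    ]
    if plan is not None:
        # Client-driven: the transaction Bundle spells out every change.
        parameters.append({"name": "plan", "resource": plan})
    elif result is not None:
        # Server-driven: only the desired final state of the target is supplied.
        parameters.append({"name": "result", "resource": result})
    return {"resourceType": "Parameters", "parameter": parameters}
```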

## The missing piece: $referencing

To build a good merge plan, the client needs to answer one question: *"What resources reference Patient/123?"*

FHIR has `_revinclude` for search results and `$everything` for Patient, but no generic operation that says "give me every resource in the database that points at this reference." We built one.

![Before and after merge](before-after.svg)

Under the hood, it's a PostgreSQL query using JSONPath to search across resource tables:

```sql
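-- Run once per resource table; <resource_table> is a placeholder.
-- @? is the operator form of jsonb_path_exists; unlike a function call,
-- it can be served by a GIN index on the JSONB column.
SELECT id, resource_type, resource
FROM <resource_table>
WHERE resource @? '$.** ? (@.reference == "Patient/123")'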
```

The query above is simplified for clarity, but the core idea holds: `$.**` recursively traverses the entire resource structure, finding any nested object where `reference` equals the target — regardless of where in the resource tree it appears. No hardcoded paths, no per-resource-type configuration. One query finds references in `Encounter.subject`, `Observation.performer`, `Claim.provider`, or any other reference field.
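
For readers less familiar with JSONPath, the same recursive descent can be written in a few lines of Python. This is an illustrative equivalent of what `$.**` with the `reference` filter matches, not the code MDMbox runs. The function name `find_reference_paths` is hypothetical.

```python
def find_reference_paths(node, target_ref, path=""):
    """Return the path of every object in `node` whose "reference" equals target_ref."""
    paths = []
    if isinstance(node, dict):
        if node.get("reference") == target_ref:
            paths.append(path or "$")
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            paths.extend(find_reference_paths(value, target_ref, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            paths.extend(find_reference_paths(item, target_ref, f"{path}[{index}]"))
    return paths


observation = {
    "resourceType": "Observation",
    "id": "o1",
    "subject": {"reference": "Patient/123"},
    "performer": [{"reference": "Practitioner/9"}, {"reference": "Patient/123"}],
}
print(find_reference_paths(observation, "Patient/123"))
# prints ['subject', 'performer[1]']
```

Nothing in the function knows about `subject` or `performer`. It finds both matches purely by walking the structure, which is the property that makes the approach resource-agnostic.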

**On performance:** this query uses a GIN index on the JSONB column, so the recursive traversal happens against the index — not a full table scan. For a typical merge, the operation completes in milliseconds. For resource types with millions of rows, we parallelize the search across tables and stream results back to the client.

We've exposed this as a standalone `$referencing` operation in [MDMbox](https://www.health-samurai.io/mdmbox) — a building block that any merge (or impact analysis) workflow can use.

## Task + Provenance: not just audit

Every merge creates a Task and a Provenance inside the same transaction. This is more than compliance:

![Audit trail](audit-trail.svg)

**Subscriptions.** External systems can subscribe to merge Task creation and react immediately — update their caches, trigger downstream workflows, notify users. Standard FHIR Subscriptions, no custom plumbing.

**Versioned snapshots.** The Provenance records a versioned reference (e.g. `Patient/456/_history/3`) for every resource modified or deleted by the merge. Combined with FHIR's History API, this means the complete pre-merge state is always recoverable.
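
Because every plan entry already carries the version it expects in `ifMatch`, the versioned references can be derived directly from the plan. The sketch below shows one way to do that. The helper name `pre_merge_versions` is hypothetical, and since this post does not specify which Provenance element holds the references, the sketch stops at producing the list.

```python
import re

# Matches a weak ETag such as W/"3" and captures the version id.
_WEAK_ETAG = re.compile(r'^W/"(.+)"$')


def pre_merge_versions(plan):
    """Return versioned references for every PUT or DELETE entry with an ifMatch."""
    refs = []
    for entry in plan.get("entry", []):
        request = entry["request"]
        if request["method"] not in ("PUT", "DELETE"):
            continue  # newly created resources have no pre-merge version
        match = _WEAK_ETAG.match(request.get("ifMatch", ""))
        if match:
            refs.append(f'{request["url"]}/_history/{match.group(1)}')
    return refs
```

Applied to the example plan earlier in this post, this yields `Patient/456/_history/3`, `Encounter/789/_history/1`, and `Patient/123/_history/2`.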

**Lifecycle tracking.** The Task records the merge status (`requested` → `in-progress` → `completed` or `failed`), who initiated it, when it happened, and links to both the source and target. Querying "show me all merges that affected this patient" is a single FHIR search on Task.

## What's next: $unmerge

FHIR doesn't define an unmerge operation yet. But consider this scenario: a merge is executed on Monday, and on Wednesday someone realizes the two patients were actually different people. By then, 50 resources have been modified — Encounters repointed, Observations reassigned, the source Patient deleted.

With Task tracking the merge lifecycle, Provenance capturing every affected resource with its pre-merge version, and the History API preserving all prior states — we have everything needed to reverse a merge reliably. The `$unmerge` operation reads the Provenance, retrieves the pre-merge versions from history, and constructs a reverse transaction Bundle that restores every resource to its prior state.
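
As a rough illustration of that last step, and not a preview of the actual implementation, a reverse Bundle could be assembled like this. `build_reverse_plan` is a hypothetical name, and `fetch_version` is a caller-supplied function standing in for a FHIR version read (`GET [type]/[id]/_history/[vid]`). The sketch only restores modified and deleted resources. It does not remove resources the merge created, and it assumes the server accepts a PUT to the id of a deleted resource.

```python
def build_reverse_plan(versioned_refs, fetch_version):
    """Build a transaction Bundle that PUTs each resource back to its pre-merge state.

    versioned_refs: strings such as "Patient/123/_history/2".
    fetch_version:  callable (resource_type, resource_id, version_id) -> resource dict.
    """
    entries = []
    for ref in versioned_refs:
        parts = ref.split("/")
        if len(parts) != 4 or parts[2] != "_history":
            raise ValueError(f"not a versioned reference: {ref}")
        resource_type, resource_id, _, version_id = parts
        snapshot = fetch_version(resource_type, resource_id, version_id)
        entries.append({
            "resource": snapshot,
            "request": {"method": "PUT", "url": f"{resource_type}/{resource_id}"},
        })
    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}
```

A production version would also need its own `ifMatch` guards, so that changes made after the merge are not silently overwritten by the restore.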

We'll cover `$unmerge` in the next post.

---

*Want to try client-driven merge in your project? [MDMbox](https://www.health-samurai.io/mdmbox) is available today.*
