MDMbox Docs

Find duplicates: $match

The $match operation performs a probabilistic search using a matching model and returns potential duplicates ranked by match score.

A MatchingModel must be created before using $match. See Matching models.

Match a resource

Send a FHIR Parameters resource containing the record to match:

POST /api/fhir/Patient/$match
Content-Type: application/json

{
  "resourceType": "Parameters",
  "parameter": [
    {"name": "modelId", "valueString": "patient-model"},
    {
      "name": "resource",
      "resource": {
        "resourceType": "Patient",
        "name": [{"given": ["Freya"], "family": "Shah"}],
        "birthDate": "1990-01-15",
        "gender": "female"
      }
    }
  ]
}
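
As a minimal sketch, the request above could be built and sent from Python with only the standard library. The base URL http://localhost:8080 and the helper names build_match_request and send_match are assumptions made for illustration. Authentication is omitted and depends on your deployment.

```python
import json
import urllib.request

# Assumed base URL for illustration; replace with your MDMbox host.
BASE_URL = "http://localhost:8080"


def build_match_request(model_id, resource, base_url=BASE_URL):
    """Build (but do not send) a POST request for the type-level $match."""
    body = {
        "resourceType": "Parameters",
        "parameter": [
            {"name": "modelId", "valueString": model_id},
            {"name": "resource", "resource": resource},
        ],
    }
    # The resource type in the path is taken from the resource itself.
    url = f"{base_url}/api/fhir/{resource['resourceType']}/$match"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_match(request):
    """Send the request and return the decoded JSON response (a Bundle)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


patient = {
    "resourceType": "Patient",
    "name": [{"given": ["Freya"], "family": "Shah"}],
    "birthDate": "1990-01-15",
    "gender": "female",
}
request = build_match_request("patient-model", patient)
# bundle = send_match(request)  # requires a running MDMbox instance
```

Keeping request construction separate from sending makes the body easy to inspect or log before any network call is made.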

Match an existing resource by ID

To match an existing resource against all others:

POST /api/fhir/Patient/123/$match?model-id=patient-model

No request body is needed. MDMbox retrieves the resource by ID and runs the match using the model named in model-id.

Parameters

Request body parameters (FHIR Parameters)

Name                Type          Required  Description
modelId             valueString   Yes       ID of the MatchingModel to use
resource            resource      Yes       The FHIR resource to find matches for
threshold           valueDecimal  No        Override the model's probable threshold
onlyCertainMatches  valueBoolean  No        Only return matches above the certain threshold
onlySingleMatch     valueBoolean  No        Return at most one result (empty if ambiguous)
count               valueInteger  No        Maximum number of results (default: 10)
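
The mapping from each parameter name to its FHIR value type can be sketched as a small builder. The function name match_parameters and its Python argument names are hypothetical. Only the parameter names and value types come from the table above. Optional arguments left as None are omitted from the body, so the server defaults apply.

```python
def match_parameters(model_id, resource, threshold=None,
                     only_certain_matches=None, only_single_match=None,
                     count=None):
    """Return a FHIR Parameters dict for $match (hypothetical helper)."""
    parameter = [
        {"name": "modelId", "valueString": model_id},
        {"name": "resource", "resource": resource},
    ]
    # (parameter name, FHIR value key, supplied value) for each optional field.
    optional = [
        ("threshold", "valueDecimal", threshold),
        ("onlyCertainMatches", "valueBoolean", only_certain_matches),
        ("onlySingleMatch", "valueBoolean", only_single_match),
        ("count", "valueInteger", count),
    ]
    for name, value_key, value in optional:
        if value is not None:  # False and 0 are still sent; only None is skipped
            parameter.append({"name": name, value_key: value})
    return {"resourceType": "Parameters", "parameter": parameter}


params = match_parameters(
    "patient-model",
    {"resourceType": "Patient", "birthDate": "1990-01-15"},
    only_certain_matches=True,
    count=5,
)
```

Checking against None rather than truthiness means an explicit onlySingleMatch of False is still transmitted.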

Query parameters (for by-ID match)

Name       Type     Default           Description
model-id   string   (none)            MatchingModel ID (required)
threshold  number   model's probable  Minimum score threshold
page       integer  1                 Page number
size       integer  20                Results per page
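
A by-ID match URL with these query parameters might be assembled as follows. This is a sketch only: the base URL and the function name match_by_id_url are assumptions, and parameters left as None are dropped so the documented defaults apply.

```python
from urllib.parse import quote, urlencode

# Assumed base URL for illustration; replace with your MDMbox host.
BASE_URL = "http://localhost:8080"


def match_by_id_url(resource_type, resource_id, model_id,
                    threshold=None, page=None, size=None, base_url=BASE_URL):
    """Build the URL for POST /api/fhir/{type}/{id}/$match (hypothetical helper)."""
    query = {"model-id": model_id}  # the only required query parameter
    for key, value in (("threshold", threshold), ("page", page), ("size", size)):
        if value is not None:
            query[key] = value
    path = f"/api/fhir/{quote(resource_type)}/{quote(str(resource_id))}/$match"
    return f"{base_url}{path}?{urlencode(query)}"


url = match_by_id_url("Patient", "123", "patient-model", page=2, size=50)
# http://localhost:8080/api/fhir/Patient/123/$match?model-id=patient-model&page=2&size=50
```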

Response

The response is a FHIR Bundle of type searchset. Each entry includes:

  • resource: the matched FHIR resource
  • search.score: probability (0 to 1) derived from the match weight
  • search.extension: match grade (certain, probable, or possible)

Example response:

{
  "resourceType": "Bundle",
  "type": "searchset",
  "total": 3,
  "entry": [
    {
      "resource": {
        "resourceType": "Patient",
        "id": "456",
        "name": [{"given": ["Freya"], "family": "Shah"}],
        "birthDate": "1990-01-15"
      },
      "search": {
        "mode": "match",
        "score": 0.99,
        "extension": [
          {
            "url": "http://hl7.org/fhir/StructureDefinition/match-grade",
            "valueCode": "certain"
          }
        ]
      }
    }
  ]
}
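
A client will usually want the ID, score, and grade of each match. The sketch below reads them from a Bundle shaped like the example above. The function name summarize_matches is hypothetical, and the code assumes the grade is carried in the match-grade extension URL shown in the example.

```python
MATCH_GRADE_URL = "http://hl7.org/fhir/StructureDefinition/match-grade"


def summarize_matches(bundle):
    """Return a list of (resource id, score, grade) tuples from a searchset Bundle."""
    results = []
    for entry in bundle.get("entry", []):
        search = entry.get("search", {})
        # Pick the first extension with the match-grade URL, if any.
        grade = next(
            (ext.get("valueCode") for ext in search.get("extension", [])
             if ext.get("url") == MATCH_GRADE_URL),
            None,
        )
        results.append((entry["resource"].get("id"), search.get("score"), grade))
    return results


example_bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {
            "resource": {"resourceType": "Patient", "id": "456"},
            "search": {
                "mode": "match",
                "score": 0.99,
                "extension": [{"url": MATCH_GRADE_URL, "valueCode": "certain"}],
            },
        }
    ],
}
matches = summarize_matches(example_bundle)
# [('456', 0.99, 'certain')]
```

A Bundle with no entry key yields an empty list, which covers the case where no candidates are returned.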

For large datasets, create database indexes on columns used in matching model blocks. Without indexes, $match performs a full table scan for each block, which can be very slow.

Score calculation

The match weight is a sum of log2 Bayes factors. It is converted to the probability reported in search.score using a base-2 sigmoid function:

probability = 1 / (1 + 2^(-weight))

A weight of 25 corresponds to a probability of ~0.99999997. See Mathematical details for the full derivation.
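
The conversion can be checked numerically with a short sketch. The function names are hypothetical. The inverse is not taken from the MDMbox documentation; it is obtained by rearranging the formula above to weight = log2(p / (1 - p)).

```python
import math


def weight_to_probability(weight):
    """Base-2 sigmoid: probability = 1 / (1 + 2^(-weight))."""
    # Note: extremely negative weights (below about -1023) would overflow a float.
    return 1.0 / (1.0 + 2.0 ** (-weight))


def probability_to_weight(probability):
    """Inverse of the sigmoid, valid for 0 < probability < 1."""
    return math.log2(probability / (1.0 - probability))


even_odds = weight_to_probability(0)   # 0.5: a weight of 0 carries no evidence either way
strong = weight_to_probability(25)     # approximately 0.99999997
```

A weight of 0 maps to 0.5, and each additional unit of weight doubles the odds in favor of a match.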
