Indexes
Database indexes are essential for performance. In particular, you will need indexes to speed up search requests.
Aidbox provides mechanisms to
- manage indexes
- suggest indexes
- generate indexes automatically
- collect per-SearchParameter usage statistics — rank "hot" parameters and confirm a created index is actually used
Background
Aidbox uses PostgreSQL for storage. Most resource data is stored in the resource column, which has the jsonb type. See Database overview for the full picture of how resources map onto SQL.
Consider a simple example: the active search parameter for the Patient resource.
Let's try the search query
GET /fhir/Patient?active=true
Use _explain to see the SQL query generated by this request
GET /fhir/Patient?active=true&_explain=analyze
A possible response is
{
"query": [
"SELECT \"patient\".* FROM \"patient\" WHERE \"patient\".resource @> ? LIMIT ? OFFSET ?",
"{\"active\":true}",
100,
0
],
"query-inline": [
"SELECT \"patient\".* FROM \"patient\" WHERE \"patient\".resource @> '{\"active\":true}' LIMIT 100 OFFSET 0"
],
"plan": "Limit (cost=0.00..1.01 rows=1 width=124) (actual time=0.015..0.015 rows=0 loops=1)\n -> Seq Scan on patient (cost=0.00..1.01 rows=1 width=124) (actual time=0.014..0.014 rows=0 loops=1)\n Filter: (resource @> '{\"active\": true}'::jsonb)\n Rows Removed by Filter: 1\n Planning Time: 0.729 ms\n Execution Time: 0.050 ms"
}
The corresponding SQL is
SELECT "patient".*
FROM "patient"
WHERE "patient".resource @> '{"active": true}'::jsonb
LIMIT 100
OFFSET 0
Here @> is the containment operator. It tests whether the jsonb value on the right-hand side is contained in the jsonb value on the left-hand side.
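To make the containment semantics concrete, here is a minimal Python sketch of how @> compares two parsed JSON values. It is a toy model, not PostgreSQL's implementation: the function name jsonb_contains and the sample patient are made up for illustration, and the sketch skips the special case in which an array on the left may contain a bare scalar on the right.

```python
def jsonb_contains(left, right):
    """Toy model of `left @> right` for parsed JSON values (hypothetical helper)."""
    if isinstance(right, dict):
        # Every key on the right must exist on the left,
        # and the left value must contain the right value.
        return isinstance(left, dict) and all(
            key in left and jsonb_contains(left[key], value)
            for key, value in right.items()
        )
    if isinstance(right, list):
        # Every right-hand element must be contained in some left-hand element;
        # order and duplicates do not matter.
        return isinstance(left, list) and all(
            any(jsonb_contains(item, wanted) for item in left)
            for wanted in right
        )
    # Scalars must be equal; the bool check keeps True distinct from 1.
    return isinstance(left, bool) == isinstance(right, bool) and left == right


# Illustrative resource, not taken from a real database.
patient = {
    "resourceType": "Patient",
    "active": True,
    "name": [{"family": "Smith", "given": ["Anna"]}],
}

matches_boolean = jsonb_contains(patient, {"active": True})
matches_string = jsonb_contains(patient, {"active": "true"})
matches_nested = jsonb_contains(patient, {"name": [{"family": "Smith"}]})
```

Note that the boolean true and the string "true" are different JSON values, so only the first of the two active checks matches.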
Without an index, Postgres has to check this condition for every Patient resource stored in the database. This is the Seq Scan node in the plan above.
However, GIN indexes can speed up this kind of query. A GIN index inverts the jsonb structure into a lookup table of the keys and values it contains, so a containment test (@>) can jump straight to matching rows instead of scanning the whole table.
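The inversion idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how PostgreSQL stores a GIN index: it indexes (path, value) pairs, which is closer in spirit to the jsonb_path_ops operator class than to the default one, and a real index rechecks candidate rows against the original condition. All function names and sample rows are hypothetical.

```python
from collections import defaultdict


def index_entries(value, path=()):
    """Yield (path, scalar) pairs for a parsed JSON value, ignoring array positions."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from index_entries(child, path + (key,))
    elif isinstance(value, list):
        for child in value:
            yield from index_entries(child, path)
    else:
        # repr() keeps True, 1 and "true" apart as dictionary keys.
        yield (path, repr(value))


def build_index(rows):
    """Map each (path, scalar) entry to the set of row ids that contain it."""
    index = defaultdict(set)
    for row_id, resource in rows.items():
        for entry in index_entries(resource):
            index[entry].add(row_id)
    return index


def candidates(index, query):
    """Row ids that contain every entry of the query document."""
    entries = list(index_entries(query))
    if not entries:
        raise ValueError("toy model needs at least one scalar in the query")
    return set.intersection(*(index.get(entry, set()) for entry in entries))


# Illustrative rows keyed by a made-up row id.
rows = {
    1: {"active": True, "gender": "female"},
    2: {"active": False, "gender": "female"},
    3: {"active": True, "gender": "male"},
}
index = build_index(rows)
active_rows = candidates(index, {"active": True})
active_female_rows = candidates(index, {"active": True, "gender": "female"})
```

The lookup touches only the posting sets for the requested entries, which is why the cost no longer grows with the number of rows that do not match.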
We can create a GIN index on the resource column
CREATE INDEX patient_resource_gin_idx
ON Patient
USING GIN (resource)
Now Postgres can use this index to make search much faster.
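Whether the planner actually picks the index shows up in the plan text returned by _explain. The sketch below is one way to check that from a script; it assumes the text plan format shown earlier, the helper names are hypothetical, and the second plan is an illustrative shape rather than output captured from Aidbox. On very small tables Postgres may keep using a sequential scan even when an index exists, because scanning a handful of rows is cheaper.

```python
import re

# Scan node names as they appear in text-format EXPLAIN output.
_SCAN_NODE = re.compile(r"(Seq|Index Only|Index|Bitmap Heap|Bitmap Index) Scan")


def scan_nodes(plan):
    """Return scan node names in the order they appear in the plan text."""
    return [match.group(0) for match in _SCAN_NODE.finditer(plan)]


def uses_seq_scan(plan):
    """True if any node in the plan reads the table sequentially."""
    return "Seq Scan" in scan_nodes(plan)


# Abbreviated from the _explain response shown earlier (costs omitted).
plan_without_index = """Limit
  -> Seq Scan on patient
       Filter: (resource @> '{"active": true}'::jsonb)"""

# Hypothetical plan shape after the GIN index is created (costs omitted).
plan_with_index = """Limit
  -> Bitmap Heap Scan on patient
       Recheck Cond: (resource @> '{"active": true}'::jsonb)
       -> Bitmap Index Scan on patient_resource_gin_idx
            Index Cond: (resource @> '{"active": true}'::jsonb)"""
```

Passing the "plan" field of an _explain response to uses_seq_scan gives a quick yes or no before and after creating the index.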
Functional indexes
Consider a more complex example: the name search parameter for the Patient resource.
Request
GET /fhir/Patient?name=abc
Generates SQL like
SELECT *
FROM Patient
WHERE
aidbox_text_search(
knife_extract_text(
resource,
'[["name","family"],["name","given"],["name","middle"],["name","text"],["name","prefix"],["name","suffix"]]'
)
) ILIKE unaccent('% abc%')
LIMIT 100
OFFSET 0
PostgreSQL's pg_trgm module supports index searches for ILIKE queries.
You can create a functional index to speed up this query:
CREATE INDEX patient_name_trgm_idx
ON Patient
USING GIN (
aidbox_text_search(
knife_extract_text(
resource,
'[["name","family"],["name","given"],["name","middle"],["name","text"],["name","prefix"],["name","suffix"]]'
)
) gin_trgm_ops
)
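A trigram index works because pg_trgm breaks every indexed string into three-character pieces and can then look up rows that share the pieces found in the search pattern. The Python sketch below approximates that splitting for illustration only; the trigrams function is a hypothetical helper, not part of pg_trgm, and it ignores details such as how the module handles multibyte characters internally.

```python
import re


def trigrams(text):
    """Approximate the trigram set pg_trgm derives from a string."""
    result = set()
    # Lowercase the input and treat every non-alphanumeric character as a word break.
    for word in re.findall(r"[^\W_]+", text.lower()):
        # Each word is padded with two spaces in front and one behind.
        padded = "  " + word + " "
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


cat_trigrams = trigrams("cat")
shared = trigrams("abc") & trigrams("abcd")
```

Strings that share many trigrams are likely matches, so the index narrows the search to a small set of rows that Postgres then rechecks against the actual ILIKE condition.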
Which indexes does Aidbox need?
It depends — and that's the point. A short tour of what can vary:
- Index method. GIN for @> over jsonb, GIN with gin_trgm_ops for fuzzy text (name, _text), btree for ordered access (id, _lastUpdated, date), GiST for spatial (near on Location). The right choice depends on the SearchParameter's type, not its name.
- Modifiers. :contains and :exact on the same name parameter need different functional indexes; :in/:not-in/:above/:below on token parameters expand into ValueSet lookups; :identifier/:of-type pull from different jsonb paths.
- Path expressions. Aidbox stores resources as jsonb, so the suggester emits functional indexes over knife_extract_text(...) or jsonb_path_query(...), one per SP path, rather than indexes on plain columns.
- Joins. Chained queries (Observation?subject:Patient.name=John) and reverse-chain _has queries translate into SQL joins or subselects; both sides need their own indexes.
- Full-resource fallback. Token and reference parameters without a dedicated path fall back to a GIN over the whole jsonb. It rescues queries that no functional index covers, but it's larger on disk.
Hand-picking the right combination per parameter is impractical. The next sections cover Aidbox's suggest-index RPCs, which compute the candidates for you, and the usage-statistics RPCs, which tell you which suggestions actually deserve the disk space.
Index suggestion
Aidbox provides two RPCs that suggest indexes
Suggest indexes for parameter
Use the aidbox.index/suggest-index RPC to get index suggestions for a specific search parameter
POST /rpc
Content-Type: text/yaml
Accept: text/yaml
method: aidbox.index/suggest-index
params:
resource-type: <resourceType>
search-param: <searchParameter>
Suggest indexes for query
Use the aidbox.index/suggest-index-query RPC to get index suggestions based on a query
POST /rpc
Content-Type: text/yaml
Accept: text/yaml
method: aidbox.index/suggest-index-query
params:
resource-type: Observation
query: date=gt2022-01-01&_id=myid
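When these calls are scripted, the request bodies can be assembled programmatically. The sketch below only builds the payloads for the two RPCs described above and serializes one of them; the helper names are hypothetical, and sending the body as JSON assumes the /rpc endpoint accepts application/json in addition to the text/yaml shown here, which this page does not state. No request is sent.

```python
import json


def suggest_index_request(resource_type, search_param):
    """Payload for aidbox.index/suggest-index (hypothetical helper)."""
    return {
        "method": "aidbox.index/suggest-index",
        "params": {"resource-type": resource_type, "search-param": search_param},
    }


def suggest_index_query_request(resource_type, query):
    """Payload for aidbox.index/suggest-index-query (hypothetical helper)."""
    return {
        "method": "aidbox.index/suggest-index-query",
        "params": {"resource-type": resource_type, "query": query},
    }


# Same parameters as the example request above.
payload = suggest_index_query_request("Observation", "date=gt2022-01-01&_id=myid")
body = json.dumps(payload)  # assumption: to be sent as application/json
```

Keeping the payload construction separate from the HTTP call makes it easy to loop over many search parameters and collect the suggestions in one pass.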
Usage statistics
Aidbox tracks how often each SearchParameter is queried and exposes the numbers via RPCs. Use them to rank "hot" parameters, decide which suggested indexes are worth creating, and confirm a created index is actually being used. Available since Aidbox 2605.