$viewdefinition-export operation
Available in Aidbox versions 2605 and later. Requires fhir-schema mode. Implements SQL-on-FHIR v2 $viewdefinition-export.
Exports a ViewDefinition's flattened rows into a sink (e.g. a Databricks Delta table). The backend is contributed by an Aidbox module; you pick it with the kind parameter.
Use this for one-shot snapshots, backfills, or ad-hoc dumps when standing up a streaming AidboxTopicDestination is overkill.
Registered backends
kind | Sink | Module |
|---|---|---|
data-lakehouse | Databricks Unity Catalog managed Delta table | Data Lakehouse module |
Kick-off
POST /fhir/ViewDefinition/$viewdefinition-export
Content-Type: application/fhir+json
Prefer: respond-async
{
"resourceType": "Parameters",
"parameter": [
{"name": "view",
"part": [{"name": "name", "valueString": "patient_flat"},
{"name": "viewReference", "valueReference": {"reference": "ViewDefinition/patient_flat"}}]},
{"name": "kind", "valueString": "data-lakehouse"},
{"name": "writeMode", "valueString": "managed-zerobus"},
{"name": "databricksWorkspaceUrl", "valueString": "https://xxx.cloud.databricks.com"},
{"name": "databricksWorkspaceId", "valueString": "xxx"},
{"name": "databricksRegion", "valueString": "xxx"},
{"name": "tableName", "valueString": "xxx.xxx.patient_flat"},
{"name": "databricksWarehouseId", "valueString": "xxx"},
{"name": "awsRegion", "valueString": "xxx"},
{"name": "stagingTablePath", "valueString": "s3://xxx/staging/patient_flat/"}
]
}
Response:
202 Accepted
Content-Location: /fhir/ViewDefinition/$viewdefinition-export/status/<export-id>
{
"resourceType": "Parameters",
"parameter": [
{"name": "exportId", "valueString": "<uuid>"},
{"name": "status", "valueCode": "in-progress"},
{"name": "location", "valueUri": "/fhir/ViewDefinition/$viewdefinition-export/status/<uuid>"}
]
}
Parameters
Spec parameters
| Parameter | Required | Notes |
|---|---|---|
view | yes | Exactly one entry. viewReference must point at a server-stored ViewDefinition. Inline viewResource is not yet supported. |
kind | yes | Selects the backend (e.g. data-lakehouse). |
clientTrackingId | no | Echoed back in the status response. |
_format | no | ndjson, parquet, json, or omitted. Functionally ignored — the sink format is determined by the backend (Delta for kind=data-lakehouse). |
header | no | Echoed; not meaningful for non-CSV sinks. |
_since | no | ISO-8601 instant. Filters rows by the view's timestamp column (ts or last_updated). 400 if _since is set but the view exposes neither column. |
patient (0..*) | no | List of Patient references. Currently accepted but not yet applied — the full view is exported. |
group (0..*) | no | List of Group references. Same status as patient. |
source | no | External data source URI. Not supported — rejected. |
Data Lakehouse backend parameters (kind=data-lakehouse)
| Parameter | Required | Notes |
|---|---|---|
writeMode | yes | managed-zerobus (default — REST row-insert) or managed-sql (SQL warehouse INSERT). |
tableName | yes | Managed UC table full name catalog.schema.table. |
databricksWorkspaceUrl | yes | https://<workspace>.cloud.databricks.com. |
databricksWorkspaceId | yes | Numeric workspace id (for Zerobus URL). |
databricksRegion | yes | Workspace AWS region. |
databricksWarehouseId | yes | SQL warehouse id (used at setup for schema sync + at finalize for the MERGE). |
awsRegion | yes | AWS region of the S3 staging bucket. |
stagingTablePath | yes | s3://bucket/staging/<table>/ — must start with s3:// or s3a://. |
chunkCount | no | Positive integer (default 1). Splits the export into N per-chunk staging tables and N concurrent writers. Capped by Aidbox's HikariCP pool size; 400 with parallelism-exceeds-pool if exceeded. |
OAuth M2M credentials are sourced from Aidbox-wide settings — BOX_DATABRICKS_DATA_LAKEHOUSE_CLIENT_ID / _CLIENT_SECRET env vars or the corresponding settings registry entries. They are NOT accepted as per-request parameters. See the Data Lakehouse tutorial for the full Databricks-side setup.
Status polling
GET /fhir/ViewDefinition/$viewdefinition-export/status/<export-id>
Response codes:
202 Accepted— still in progress. The sameContent-Locationis returned so the client can keep polling.200 OK— terminal. Body is aParametersresource with the final shape (status=completed,status=failed, orstatus=cancelled, plusoutput[].locationon success).404 Not Found— unknownexport-id.
Completed output for kind=data-lakehouse:
{
"resourceType": "Parameters",
"parameter": [
{"name": "exportId", "valueString": "<uuid>"},
{"name": "status", "valueCode": "completed"},
{"name": "clientTrackingId", "valueString": "..."},
{"name": "exportStartTime", "valueInstant": "2026-05-22T00:00:00Z"},
{"name": "exportEndTime", "valueInstant": "2026-05-22T00:01:30Z"},
{"name": "output",
"part": [{"name": "name", "valueString": "patient_flat"},
{"name": "location", "valueUri": "databricks-uc:catalog.schema.patient_flat"}]}
]
}
The output[].location URI scheme is backend-specific (databricks-uc: for the data-lakehouse backend).
Cancellation
DELETE /fhir/ViewDefinition/$viewdefinition-export/status/<export-id>
Stops an in-flight export and triggers backend cleanup (the data-lakehouse backend drops per-chunk staging tables it created). Response codes:
202 Accepted— cancel acknowledged, async cleanup running. Body reportsstatus=canceling. Subsequent status polls return200 OKwithstatus=cancelledonce cleanup is done.200 OK— operation already terminal (completed/failed/cancelled). DELETE is idempotent.404 Not Found— unknownexport-id.
What cancellation does not do:
- It does not roll back rows already merged into the target managed table. The merge is the last step of finalize-export — if cancel arrives before finalize, no rows have landed; if after finalize, the operation is already terminal.
- It does not delete history rows of chunks that already completed.
Failure model
- Input validation failures (missing
view, missingkind, multiple views,sourceset, etc.) — synchronous400 OperationOutcomefrom the kick-offPOST. Noexport-idis allocated. - Backend-side failures — async. Kick-off returns
202with anexport-id; status polling later reportsstatus=failedwith the error in theerrorparameter. Common causes:- No backend registered for
kind(typo, module not deployed). - Databricks auth (bad
client-id/client-secret). - Missing target table / missing required parameter (e.g., no
tableName). - Schema mismatch the module can't auto-
ALTER.
- No backend registered for
How it works (kind=data-lakehouse)
With chunkCount = N (default N=1), the backend hash-partitions sof.<view> into N chunks, writes each chunk into its own external Delta table under <stagingTablePath>/chunk-K/, then materializes the union into the managed target with one MERGE INTO target USING (SELECT * FROM staging_0 UNION ALL …) against the SQL warehouse and drops every staging. The N=1 case is the degenerate path of the same flow.
Steps:
plan-exportdecides the chunk count fromchunkCount.setup-exportsyncs the target schema againstsof.<view>(auto-ALTER ADD COLUMNSif Aidbox added columns) and pre-computes the staging-column spec.- Each chunk task creates its own external Delta table at
<stagingTablePath>/chunk-K/, vends short-lived STS credentials from Unity Catalog for that prefix, and streams its hash-partition ofsof.<view>as one Delta commit. - Once all chunks complete, the coordinator task invokes
finalize-export:MERGE INTO target USING (SELECT * FROM staging_0 UNION ALL …) ON t.id = s.id WHEN NOT MATCHED THEN INSERT *. - The module drops every staging table.
On failure the per-chunk stagings are best-effort dropped via cancel-export. Chunks retry up to 2 times with a 30-second backoff; if they exhaust retries the export is reported as failed.
The MERGE is idempotent on id — a retried export after a lost response inserts nothing instead of duplicating. Your ViewDefinition must select an id column.
The Databricks-side setup (catalog, schema, target table, staging schema, service principal, grants, warehouse) is documented in the Data Lakehouse tutorial — the same setup is reused here.
Multi-pod execution
Chunks run on Aidbox's standard async-task engine, so they distribute across every pod sharing the metastore. Status polling and cancellation answer from any pod — no kick-off-pod affinity, no client-side load-balancer pinning.
A pod failure mid-chunk is recovered automatically: the chunk task's heartbeat lapses and another pod re-leases it.
Cloud support
The Data Lakehouse backend's staging path currently supports AWS S3 only (s3://... / s3a://...). The Unity Catalog managed target table is unaffected — UC manages its own storage.
Need GCS (gs://...) or Azure ADLS Gen2 (abfss://...) staging? Contact us — they're not wired through the backend yet, but the Aidbox-side wiring is cloud-agnostic and we can prioritize.
Limitations (current)
- One
viewper request (spec allows1..*). patient/groupfilters extracted but not yet applied to the SQL._sinceis applied.cancelUrl(the spec's pointer to a cancel endpoint exposed in the kick-off response) is not yet returned. Cancellation itself works —DELETEon the status URL is supported (see Cancellation above).estimatedTimeRemainingis not computed.