Aidbox Docs

$export

The $export operation implements the FHIR Bulk Data Export specification, allowing you to export large volumes of FHIR resources in ndjson format. This operation is designed for scenarios where you need to extract data for analytics, migration, or backup purposes.

Aidbox supports three export levels: patient-level, group-level, and system-level. When you submit an export request, the server processes it asynchronously and returns a status URL. You can poll this URL to check when the export completes. Once finished, the status endpoint provides signed URLs to download the exported files from your configured cloud storage.
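Each exported file holds one JSON resource per line. The following is a minimal sketch of reading such a file in Python, assuming its bytes were already downloaded from one of the signed URLs. The helper name `read_ndjson` is illustrative and not part of Aidbox; gzip-compressed payloads are detected by their magic bytes.

```python
import gzip
import json


def read_ndjson(payload: bytes) -> list:
    """Parse an ndjson export file into a list of resource dicts."""
    # Gzip streams start with the two magic bytes 0x1f 0x8b.
    if payload[:2] == b"\x1f\x8b":
        payload = gzip.decompress(payload)
    resources = []
    # Split on the newline byte only, so separators inside JSON strings are untouched.
    for line in payload.split(b"\n"):
        line = line.strip()
        if line:
            resources.append(json.loads(line))
    return resources
```

For very large files you would stream line by line instead of loading the whole payload into memory; the sketch favors brevity.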

Export operations run one at a time to prevent resource exhaustion. If you attempt to start a new export while another is in progress, the server returns a 429 Too Many Requests error.
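A client can treat 429 as "try again later". The sketch below is one way to do that, under stated assumptions: `kick_off` is a hypothetical callable that sends the export request and returns the HTTP status together with a dict of response headers, and the retry count and delay are arbitrary illustrative defaults.

```python
import time
from typing import Callable


def start_export(
    kick_off: Callable[[], tuple],
    attempts: int = 5,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry the kick-off request while the server answers 429.

    Returns the Content-Location header of the accepted request.
    """
    for attempt in range(attempts):
        status, headers = kick_off()
        if status == 202:
            return headers["Content-Location"]
        if status != 429:
            raise RuntimeError(f"export request failed with HTTP {status}")
        # Another export is running: wait before the next attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError("another export is still in progress")
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.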

Cloud storage setup

$export requires cloud storage to be configured. If no storage provider is set, export requests (system-, group-, or patient-level) return 500 with an error such as "storage-type not specified". Storage validation is performed before other checks (e.g. group existence or output format).

Aidbox exports data to cloud storage backends including GCP, Azure, and AWS. Export files are organized in timestamped folders with the pattern <datetime>_<uuid> to ensure unique paths for each export operation.

Each cloud provider supports two authentication modes: credential-based (using stored keys or tokens) and workload identity (using cloud-native pod identity). Workload identity is recommended for managed Kubernetes deployments (GKE, AKS) as it eliminates the need to manage credentials in Aidbox resources and uses the pod's identity to authenticate with cloud storage.

GCP

Start by creating a Cloud Storage bucket where Aidbox will write export files. The bucket should have appropriate lifecycle policies if you want to automatically delete old exports.

Using workload identity

This is the recommended approach for GKE deployments. When running Aidbox in Google Kubernetes Engine with Workload Identity, your pods automatically authenticate using their Kubernetes service account. No credentials need to be stored in Aidbox.

Before configuring bulk export, set up workload identity following the GCP Cloud Storage: Workload Identity guide. This includes enabling Workload Identity on your GKE cluster, creating a GCP service account with Cloud Storage permissions, granting URL signing permissions, and binding your Kubernetes service account to the GCP service account.

Once workload identity is configured, set the environment variables:

BOX_FHIR_BULK_STORAGE_PROVIDER=gcp
BOX_FHIR_BULK_STORAGE_GCP_BUCKET=your-bucket-name

With workload identity configured, Aidbox uses the pod's identity to generate signed URLs for export files.

Using service account credentials

For non-GKE deployments or when workload identity isn't available, create a service account with Storage Object Admin role on your bucket.

Create a GcpServiceAccount resource in Aidbox:

resourceType: GcpServiceAccount
id: gcp-service-account
service-account-email: export@your-project.iam.gserviceaccount.com
private-key: |
  -----BEGIN PRIVATE KEY-----
  your-private-key-here
  -----END PRIVATE KEY-----

Configure environment variables:

BOX_FHIR_BULK_STORAGE_PROVIDER=gcp
BOX_FHIR_BULK_STORAGE_GCP_SERVICE_ACCOUNT=gcp-service-account
BOX_FHIR_BULK_STORAGE_GCP_BUCKET=your-bucket-name


Azure

Start by creating an Azure storage account and a blob container where Aidbox will write export files.

Using workload identity

This is the recommended approach for AKS deployments. When running Aidbox in Azure Kubernetes Service with Workload Identity, your pods automatically authenticate using Azure managed identities. No credentials need to be stored in Aidbox.

Before configuring bulk export, set up workload identity following the Azure Blob Storage: Workload identity guide. This includes creating a managed identity, configuring federated credentials, assigning storage roles, and setting up your Kubernetes ServiceAccount.

Once workload identity is configured, create an AzureContainer resource in Aidbox that references your storage account and container:

resourceType: AzureContainer
id: export-container
storage: mystorageaccount
container: exports

Note that the AzureContainer resource does not include an account field for workload identity mode.

Configure environment variables:

BOX_FHIR_BULK_STORAGE_PROVIDER=azure
BOX_FHIR_BULK_STORAGE_AZURE_CONTAINER=export-container

With workload identity configured, Aidbox uses the pod's managed identity to generate user delegation SAS tokens for export files.

Using SAS tokens

For non-AKS deployments or when workload identity isn't available, you can use Shared Access Signature (SAS) tokens for authentication.

Create an AzureAccount resource with your storage account key:

resourceType: AzureAccount
id: azure-account
key: your-storage-account-key-here

Create an AzureContainer resource that references both the storage account and the Azure account:

resourceType: AzureContainer
id: export-container
storage: mystorageaccount
container: exports
account:
  resourceType: AzureAccount
  id: azure-account

Configure environment variables:

BOX_FHIR_BULK_STORAGE_PROVIDER=azure
BOX_FHIR_BULK_STORAGE_AZURE_CONTAINER=export-container

When the AzureContainer has an account field, Aidbox uses the account key to generate SAS tokens.

AWS

Start by creating an S3 bucket where Aidbox will write export files. Configure appropriate bucket policies and lifecycle rules for your use case.

IAM permissions

The IAM role or user needs the following permissions on the bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}

Required actions:

  • s3:PutObject - Write export files to the bucket
  • s3:GetObject - Generate signed URLs for downloading export results
  • s3:ListBucket - List bucket contents for export operations
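If you manage several buckets, the policy above can be generated rather than copied by hand. This is only a convenience sketch; the function name `export_bucket_policy` is hypothetical, and the statement mirrors the policy shown above.

```python
import json


def export_bucket_policy(bucket: str) -> dict:
    """Build the IAM policy document for one export bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
                # The bucket ARN covers ListBucket; the /* ARN covers object actions.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(export_bucket_policy("your-bucket-name"), indent=2))
```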

Using default credentials

Available since version 2601.

This is the recommended approach for AWS deployments. When running Aidbox on AWS compute (EKS, ECS, EC2, Lambda), you can use the default credentials provider chain instead of storing credentials in Aidbox.

Before configuring bulk export, set up credentials for your compute environment.

See File storage: AWS S3 — EKS Pod Identity for detailed setup instructions.

Create an AwsAccount resource without credentials:

resourceType: AwsAccount
id: aws-account
region: us-east-1

Configure environment variables:

BOX_FHIR_BULK_STORAGE_PROVIDER=aws
BOX_FHIR_BULK_STORAGE_AWS_ACCOUNT=aws-account
BOX_FHIR_BULK_STORAGE_AWS_BUCKET=your-bucket-name

When access-key-id is omitted, Aidbox uses the default credentials provider to authenticate with S3.

Using access keys

For S3-compatible services (MinIO, Garage) or legacy setups, use explicit credentials.

Create an IAM user with the permissions listed under IAM permissions, then create an AwsAccount resource that includes the credentials:

resourceType: AwsAccount
id: aws-account
region: us-east-1
access-key-id: your-access-key-id
secret-access-key: your-secret-access-key

Configure environment variables:

BOX_FHIR_BULK_STORAGE_PROVIDER=aws
BOX_FHIR_BULK_STORAGE_AWS_ACCOUNT=aws-account
BOX_FHIR_BULK_STORAGE_AWS_BUCKET=your-bucket-name
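Because a missing storage setting only surfaces when an export is requested, it can help to check the environment up front. The sketch below collects the variables named in the GCP, Azure, and AWS sections; the helper `missing_storage_settings` is hypothetical and does not reproduce Aidbox's own validation.

```python
# Variables each provider needs in every authentication mode described above.
# BOX_FHIR_BULK_STORAGE_GCP_SERVICE_ACCOUNT is left out because it is only
# used in the service account credentials mode.
REQUIRED_ENV = {
    "gcp": ["BOX_FHIR_BULK_STORAGE_GCP_BUCKET"],
    "azure": ["BOX_FHIR_BULK_STORAGE_AZURE_CONTAINER"],
    "aws": [
        "BOX_FHIR_BULK_STORAGE_AWS_ACCOUNT",
        "BOX_FHIR_BULK_STORAGE_AWS_BUCKET",
    ],
}


def missing_storage_settings(env: dict) -> list:
    """Return the names of bulk storage variables that are unset or empty."""
    provider = env.get("BOX_FHIR_BULK_STORAGE_PROVIDER", "")
    if provider not in REQUIRED_ENV:
        return ["BOX_FHIR_BULK_STORAGE_PROVIDER"]
    return [name for name in REQUIRED_ENV[provider] if not env.get(name)]
```

Call it with `dict(os.environ)` in a deployment check; an empty list means every variable the chosen provider needs is present.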


Parameters

The $export operation accepts several parameters to customize the export. Pass them as query parameters on GET, or inside a FHIR Parameters resource on POST for patient- and group-level exports (see POST with a Parameters body). Supported parameter names are _outputFormat, _since, _until, _type, _typeFilter, and patient.

| Parameter | Description |
| --- | --- |
| _outputFormat | Specifies the format in which the server generates files. Default: application/fhir+ndjson. Canonical values are application/fhir+ndjson (generates .ndjson files) and application/fhir+ndjson+gzip (generates compressed .ndjson.gz files). |
| _type | Comma-separated list of resource types to include in the export. Only the specified types will be exported. |
| _since | Includes only resources whose last modification time is after the given instant (ts > _since). ISO 8601 format. |
| _until | Includes only resources whose last modification time is before the given instant (ts < _until). ISO 8601 format. Together with _since, this defines an open window on the modification timestamp. |
| _typeFilter | Restricts exported rows using FHIR search criteria per resource type, in the form ResourceType?searchParams (same idea as the FHIR Bulk Data _typeFilter parameter). You may repeat _typeFilter multiple times. Multiple filters for the same resource type are combined with OR. If _type is present, every type used in _typeFilter must also appear in _type; types listed in _type but without a filter are exported in full. Standard FHIR search parameters such as _id are allowed. Not allowed inside the search part: _sort, _count, _page, _total, _summary, _elements, _include, _revinclude, _has, _assoc, _with. |
| patient | Restricts the export to specific patients. GET: comma-separated patient ids (not full references), e.g. patient=pt-1,pt-2. Supported only on patient-level GET. POST: repeat a parameter named patient, each with valueReference.reference set to Patient/{id}. Supported on patient-level and group-level POST; for group export, every listed patient must exist and be a member of that group. |
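The _since and _until comparisons are both strict, so a resource modified exactly at either boundary falls outside the window. The Python sketch below illustrates that rule; `in_export_window` is a hypothetical helper for reasoning about the semantics, not Aidbox code.

```python
from datetime import datetime, timezone
from typing import Optional


def in_export_window(
    last_modified: datetime,
    since: Optional[datetime] = None,
    until: Optional[datetime] = None,
) -> bool:
    """Apply the open-window rule: ts > _since and ts < _until."""
    if since is not None and not last_modified > since:
        return False
    if until is not None and not last_modified < until:
        return False
    return True


if __name__ == "__main__":
    since = datetime(2024, 1, 1, tzinfo=timezone.utc)
    until = datetime(2025, 1, 1, tzinfo=timezone.utc)
    # A timestamp strictly inside the window is included.
    print(in_export_window(datetime(2024, 6, 1, tzinfo=timezone.utc), since, until))
    # A timestamp equal to a boundary is excluded.
    print(in_export_window(since, since, until))
```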

POST with a Parameters body (patient and group)

For patient-level and group-level exports you can use POST with a JSON body of resourceType: Parameters instead of query string parameters. System-level export (GET /fhir/$export) does not support POST with a Parameters body in this implementation.

Use the same async headers as for GET and include a body content type:

  • Accept: application/fhir+json
  • Content-Type: application/fhir+json
  • Prefer: respond-async

Each parameter uses the same name as its query-string counterpart. The value type for each:

| Parameter | Type |
| --- | --- |
| _outputFormat | valueString |
| _since | valueInstant |
| _until | valueInstant |
| _type | valueString |
| _typeFilter | valueString |
| patient | valueReference |

For patient, set valueReference.reference to Patient/{id}. For _type, use one part with comma-separated resource types or repeat _type several times; Aidbox joins all _type values into one comma-separated list. Repeat _typeFilter or patient parts when you need multiple values.

Any other parameter name in the body returns 400 with an operation outcome. Unknown or invalid _typeFilter syntax, search errors, or patient ids that do not exist (or are not in the group, for group export) also return 400.
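Some of these 400 responses can be avoided by checking _typeFilter values on the client first. The sketch below applies the rules from the parameter table: the ResourceType?searchParams shape, the list of search parameters that are not allowed, and the requirement that filtered types appear in _type. The function `check_type_filter` is hypothetical, and the server may apply further validation that this sketch does not cover.

```python
from typing import Optional
from urllib.parse import parse_qsl

# Search parameters the documentation lists as not allowed inside _typeFilter.
FORBIDDEN = {
    "_sort", "_count", "_page", "_total", "_summary", "_elements",
    "_include", "_revinclude", "_has", "_assoc", "_with",
}


def check_type_filter(type_filter: str, types: Optional[list] = None) -> list:
    """Return a list of problems found in one _typeFilter value."""
    resource_type, sep, query = type_filter.partition("?")
    if not sep or not resource_type or not query:
        return [f"expected ResourceType?searchParams, got {type_filter!r}"]
    problems = []
    if types is not None and resource_type not in types:
        problems.append(f"{resource_type} is not listed in _type")
    for name, _value in parse_qsl(query, keep_blank_values=True):
        # Drop modifiers, so that "_has:Observation:patient:code" matches "_has".
        base = name.split(":", 1)[0]
        if base in FORBIDDEN:
            problems.append(f"{base} is not allowed in _typeFilter")
    return problems
```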

Example: POST /fhir/Patient/$export

POST /fhir/Patient/$export
Accept: application/fhir+json
Content-Type: application/fhir+json
Prefer: respond-async

{
  "resourceType": "Parameters",
  "parameter": [
    { "name": "_type", "valueString": "Patient,Observation" },
    { "name": "_since", "valueInstant": "2024-01-01T00:00:00Z" },
    { "name": "_until", "valueInstant": "2025-01-01T00:00:00Z" },
    { "name": "_typeFilter", "valueString": "Observation?status=final" },
    { "name": "patient", "valueReference": { "reference": "Patient/pt-1" } }
  ]
}
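The same body can be assembled programmatically. The following sketch maps each parameter to the value type listed above; the function name `build_export_parameters` and its argument names are illustrative only.

```python
def build_export_parameters(
    types=(),
    since=None,
    until=None,
    type_filters=(),
    patient_ids=(),
    output_format=None,
) -> dict:
    """Build a FHIR Parameters body for a patient- or group-level $export POST."""
    parameter = []
    if output_format:
        parameter.append({"name": "_outputFormat", "valueString": output_format})
    if types:
        # One _type part carrying a comma-separated list of resource types.
        parameter.append({"name": "_type", "valueString": ",".join(types)})
    if since:
        parameter.append({"name": "_since", "valueInstant": since})
    if until:
        parameter.append({"name": "_until", "valueInstant": until})
    # _typeFilter and patient are repeated, one part per value.
    for type_filter in type_filters:
        parameter.append({"name": "_typeFilter", "valueString": type_filter})
    for patient_id in patient_ids:
        parameter.append(
            {"name": "patient", "valueReference": {"reference": f"Patient/{patient_id}"}}
        )
    return {"resourceType": "Parameters", "parameter": parameter}
```

Calling it with `types=["Patient", "Observation"]`, the two instants, one filter, and `patient_ids=["pt-1"]` yields the body shown in the example request.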

Patient-level export

Patient-level export extracts all Patient resources and the resources associated with them. The association is defined by the FHIR Patient Compartment, which specifies which resource types reference patients and through which fields.

To start a patient-level export, send GET to /fhir/Patient/$export with optional query parameters, or POST to the same URL with a Parameters JSON body (for example when using _typeFilter or repeated patient references).
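For the GET form, the query string has to be URL-encoded, which matters for _typeFilter values because they contain ? and =. A minimal sketch follows, assuming a hypothetical base URL and the illustrative helper name `patient_export_url`.

```python
from urllib.parse import urlencode


def patient_export_url(
    base: str,
    types=(),
    since=None,
    until=None,
    type_filters=(),
    patient_ids=(),
    output_format=None,
) -> str:
    """Build the URL for a patient-level export GET request."""
    query = []
    if output_format:
        query.append(("_outputFormat", output_format))
    if types:
        query.append(("_type", ",".join(types)))
    if since:
        query.append(("_since", since))
    if until:
        query.append(("_until", until))
    # _typeFilter may repeat; each value becomes its own query parameter.
    for type_filter in type_filters:
        query.append(("_typeFilter", type_filter))
    if patient_ids:
        # GET takes plain ids, comma-separated, not full references.
        query.append(("patient", ",".join(patient_ids)))
    url = base.rstrip("/") + "/fhir/Patient/$export"
    return url + "?" + urlencode(query) if query else url
```

Send the resulting URL with the Accept and Prefer headers shown in the request below.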

Rest console

GET /fhir/Patient/$export
Accept: application/fhir+json
Prefer: respond-async

Status

202 Accepted

Headers

  • Content-Location — URL to check export status (e.g. /fhir/$export-status/<id>)

Poll the status endpoint to check when the export completes:

Rest console

GET /fhir/$export-status/<id>

Status

200 OK

Body

{
  "status": "completed",
  "transactionTime": "2021-12-08T08:28:06.489Z",
  "requiresAccessToken": false,
  "request": "[base]/fhir/Patient/$export",
  "output": [
    {
      "type": "Patient",
      "url": "https://storage/some-url",
      "count": 2
    },
    {
      "type": "Observation",
      "url": "https://storage/some-other-url",
      "count": 15
    }
  ],
  "error": []
}
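Polling can be wrapped in a small loop. In the sketch below, `get_status` is a hypothetical callable that performs the GET and returns the HTTP status with the parsed JSON body. Treating 202 as "still running" follows the FHIR Bulk Data specification and is an assumption about this endpoint, as are the polling interval and limit.

```python
import time
from typing import Callable


def wait_for_export(
    get_status: Callable[[], tuple],
    interval_seconds: float = 10.0,
    max_polls: int = 360,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the export completes; return download URLs grouped by type."""
    for _ in range(max_polls):
        http_status, body = get_status()
        if http_status == 200 and body.get("status") == "completed":
            urls = {}
            for item in body.get("output", []):
                urls.setdefault(item["type"], []).append(item["url"])
            return urls
        if http_status not in (200, 202):
            raise RuntimeError(f"status request failed with HTTP {http_status}")
        # Not finished yet: wait before asking again.
        sleep(interval_seconds)
    raise TimeoutError("export did not complete within the polling limit")
```

The result maps each resource type to a list of URLs, since one type may be split across several files.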

To cancel an active export, send a DELETE request to the status endpoint:

Rest console

DELETE /fhir/$export-status/<id>

Status

202 Accepted

Group-level export

Group-level export extracts all Patient resources that belong to a specified Group resource, plus all resources associated with those patients. The group characteristics themselves are not exported. Association is defined by the FHIR Patient Compartment.

To start a group-level export, send GET to /fhir/Group/<group-id>/$export with optional query parameters, or POST to the same path with a Parameters body to pass patient restrictions or long _typeFilter values.

Rest console

GET /fhir/Group/<group-id>/$export
Accept: application/fhir+json
Prefer: respond-async

Status

202 Accepted

Headers

  • Content-Location — URL to check export status (e.g. /fhir/$export-status/<id>)

The status endpoint works the same way as patient-level export. Poll /fhir/$export-status/<id> to check progress, and send a DELETE request to cancel.

System-level export

System-level export extracts data from the entire FHIR server, whether or not it's associated with a patient. Use the _type parameter to restrict which resource types are exported.

System-level export works only for standard FHIR resources, not for custom resources defined in your Aidbox configuration.

GET /fhir/$export
Accept: application/fhir+json
Prefer: respond-async

Status

200 OK

Body

{
  "status": "completed",
  "transactionTime": "2021-12-08T08:28:06.489Z",
  "requiresAccessToken": false,
  "output": [
    {
      "type": "Patient",
      "url": "https://storage/some-url",
      "count": 2
    },
    {
      "type": "Practitioner",
      "url": "https://storage/some-other-url",
      "count": 5
    }
  ],
  "error": []
}

The status endpoint works the same way as other export levels. Poll /fhir/$export-status/<id> to check progress, and send a DELETE request to cancel.
