---
{
  "title": "@atomic-ehr/codegen: US Core Profiles in Python",
  "description": "Generate typed US Core profile classes in Python from the FHIR IG with @atomic-ehr/codegen — typed factories, extensions, slices, and profile-aware validation.",
  "date": "2026-07-03",
  "author": "Mikhail Artemyev, Aleksandr Penskoi",
  "reading-time": "12 minutes",
  "tags": [
    "FHIR Tools",
    "FHIR Standard",
    "Code Generation",
    "Python",
    "Pydantic",
    "Aidbox"
  ]
}
---
Building US Core resources by hand is tedious. You stamp `meta.profile`, look up LOINC codes, hand-roll the `us-core-race` nested extension — every field is a typo waiting to happen, every profile is its own version of the same ceremony.

[`@atomic-ehr/codegen`](https://github.com/atomic-ehr/codegen) makes that boilerplate disappear. Point it at the [US Core IG](https://www.hl7.org/fhir/us/core/) and you get one Pydantic model per base type plus a plain-Python wrapper class per profile, with typed accessors for fixed values, extensions, and slices, and a `validate()` that knows what the profile requires.

This tutorial walks through that end-to-end on two US Core profiles: [US Core Patient](https://www.hl7.org/fhir/us/core/StructureDefinition-us-core-patient.html) and [US Core Blood Pressure](https://www.hl7.org/fhir/us/core/StructureDefinition-us-core-blood-pressure.html).

## What You'll Build

A CSV-to-FHIR converter, built step by step:

1. generate profile classes for [US Core Patient](https://www.hl7.org/fhir/us/core/StructureDefinition-us-core-patient.html) and [US Core Blood Pressure](https://www.hl7.org/fhir/us/core/StructureDefinition-us-core-blood-pressure.html) from `hl7.fhir.us.core@8.0.1`,
2. turn each row into a US Core Patient — typed extension setters and `apply()`,
3. turn each row into a US Core Blood Pressure — typed slices, fixed LOINC, and `validate()`,
4. package them as a Bundle,
5. read the bundle back with typed getters to compute an average BP,
6. post the bundle to a local Aidbox server via [fhirpy client](https://pypi.org/project/fhirpy/).

## Prerequisites

- **Node.js 20+** (or Bun) — the generator *itself* is the `@atomic-ehr/codegen` Node package. You run it once to emit Python; after that you don't need Node again. The generation script is a few lines of TypeScript (shown below).
- **Python 3.12+** — the generated code targets modern Python ([PEP 604](https://peps.python.org/pep-0604/) `X | None` unions, generic models via `typing_extensions`).
- **Pydantic v2** (`pydantic>=2.11`) and **fhirpy** — generated models are Pydantic v2; with the default fhirpy client they also drop into fhirpy's async client (Step 6). Both are pinned in the generated `requirements.txt` (see Step 1); pass `client: "none"` for plain Pydantic with no client code.
- Basic familiarity with FHIR and US Core (knowing what "profile" and "slice" mean is enough).

## Step 1 — Generate Profile Classes

Code generation runs through the Node tool, so set up a small generator project alongside your Python app:

```bash
mkdir py-us-core-tutorial && cd py-us-core-tutorial
npm init -y
npm install --save-dev @atomic-ehr/codegen tsx typescript
```

Create `generate.ts`:

```typescript
import { APIBuilder, mkCodegenLogger, prettyReport } from "@atomic-ehr/codegen";

const main = async () => {
  const logger = mkCodegenLogger({
    suppressTags: ["#fieldTypeNotFound", "#duplicateSchema", "#duplicateCanonical", "#largeValueSet"],
  });

  const builder = new APIBuilder({ logger })
    .fromPackage("hl7.fhir.us.core", "8.0.1")
    .typeSchema({
      treeShake: {
        "hl7.fhir.us.core": {
          "http://hl7.org/fhir/us/core/StructureDefinition/us-core-patient": {},
          "http://hl7.org/fhir/us/core/StructureDefinition/us-core-blood-pressure": {},
        },
        "hl7.fhir.r4.core": {
          "http://hl7.org/fhir/StructureDefinition/Bundle": {},
        },
      },
    })
    .python({
      generateProfile: true,
      allowExtraFields: false,
      primitiveTypeExtension: true,
    })
    .outputTo("./fhir_types")
    .cleanOutput(true);

  const report = await builder.generate();
  console.log(prettyReport(report));
  if (!report.success) process.exit(1);
};

main();
```

The knobs that matter here:

- **`generateProfile: true`** — emit a wrapper class per profile with typed accessors for extensions, slices, and fixed values. Without it you get only the base R4 Pydantic models.
- **`allowExtraFields: false`** — generated models use Pydantic's `extra="forbid"`, so an unknown field raises at parse time instead of being silently dropped.
- **`primitiveTypeExtension: true`** — also generate the FHIR primitive-extension siblings (the `_field` companions, e.g. `birthDateExtension`) so you can attach extensions and `id`s to primitive values.
- **`treeShake: { ... }`** — only the listed canonicals and their transitive deps are generated (~20 files instead of hundreds).

Run it. `prettyReport(report)` prints a grouped summary so you see what got emitted without crawling the output dir:

```bash
$ npx tsx generate.ts
# generation logs omitted; this is the prettyReport summary
Generated files (24 files, 12 kloc):
  python (23 files, 3.6 kloc):
    - fhir_types/ (4 files, 622 loc)
    - fhir_types/hl7_fhir_r4_core/ (8 files, 1.1 kloc)
    - fhir_types/hl7_fhir_r4_core/profiles/ (2 files, 164 loc)
    - fhir_types/hl7_fhir_us_core/profiles/ (9 files, 1.8 kloc)
  ir-report (1 files, 8.2 kloc):
    - fhir_types/README.md (8223 loc)
Duration: 8097ms
Status: 🟩 Success
```

The on-disk layout looks like this:

```
fhir_types/
├── hl7_fhir_r4_core/                  # Base R4 Pydantic models
│   ├── base.py                        # Element, Coding, CodeableConcept, Quantity, ...
│   ├── resource.py                    # Resource, DomainResource, Meta
│   ├── patient.py
│   ├── observation.py
│   ├── bundle.py
│   ├── profiles/                      # base R4 profiles US Core builds on
│   │   └── observation_observation_vitalsigns.py   # vital-signs base (BP derives from it)
│   └── ...
├── hl7_fhir_us_core/
│   └── profiles/
│       ├── __init__.py                # re-exports the profile classes
│       ├── patient_uscore_patient_profile.py
│       ├── observation_uscore_blood_pressure_profile.py
│       ├── extension_uscore_race_extension.py
│       └── ...
├── fhirpy_base_model.py               # fhirpy client base model (default fhirpy client)
├── profile_helpers.py                 # Runtime helpers shared by all profile classes
├── README.md                          # IR report — human-readable dump of the generated types
└── requirements.txt                   # pydantic, fhirpy (+ pytest, requests for tests/Step 6)
```

<details><summary>Set up a Python virtual environment</summary>

```bash
python3 -m venv venv   # any Python 3.12+ interpreter works
source venv/bin/activate
```

</details>

Point your Python app at the emitted `fhir_types/` and install the dependencies:

```bash
pip install -r fhir_types/requirements.txt
```

The generated `requirements.txt` pins Pydantic and fhirpy plus pytest and requests for the tests and examples.

The full tutorial code lives in [`Aidbox/examples`](https://github.com/Aidbox/examples/tree/main/developer-experience/atomic-ehr-codegen-python-us-core-profiles) — `generate.ts`, `load.py`, `avg.py`, `post.py`, the CSV, and the committed `fhir_types/` so you can browse the generated code without running the generator. For broader profile-API exploration, the codegen repo also has a [`python-r4-us-core` test example](https://github.com/atomic-ehr/codegen/tree/main/examples/python-r4-us-core). Both use the default `camelCase` attribute names, just like the snippets here. (Pass `fieldFormat: "snake_case"` if you'd rather spell attributes `birth_date`, `effective_date_time`; serialization always emits FHIR-correct camelCase JSON either way.)

## Step 2 — Row to a US Core Patient

The input is `patients.csv` — basic demographics plus one BP reading per patient. Race uses the [OMB-category codes](https://www.hl7.org/fhir/us/core/ValueSet-omb-race-category.html) US Core expects:

```csv
mrn,family,given,birthDate,gender,raceCode,raceDisplay,effectiveDateTime,systolic,diastolic
MRN-001,Lovelace,Ada,1815-12-10,female,2106-3,White,2026-04-15,120,80
MRN-002,Turing,Alan,1912-06-23,male,2106-3,White,2026-04-15,118,76
MRN-003,Curie,Marie,1867-11-07,female,2106-3,White,2026-04-16,125,82
MRN-004,Carver,George,1864-01-01,male,2054-5,Black or African American,2026-04-16,135,88
MRN-005,Ochoa,Ellen,1958-05-10,female,2054-5,Black or African American,2026-04-17,128,84
```

`csv.DictReader` hands each row over as a plain `dict[str, str]`; numeric parsing happens later, where we pass values to typed profile setters.
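
To make the string typing concrete, here is a minimal sketch that parses two of the rows above from an in-memory string rather than from `patients.csv`. The `SAMPLE` constant and the `parse_reading` helper are illustrative names, not part of the tutorial code.

```python
import csv
import io

# Header plus two rows copied from patients.csv above.
SAMPLE = """\
mrn,family,given,birthDate,gender,raceCode,raceDisplay,effectiveDateTime,systolic,diastolic
MRN-001,Lovelace,Ada,1815-12-10,female,2106-3,White,2026-04-15,120,80
MRN-004,Carver,George,1864-01-01,male,2054-5,Black or African American,2026-04-16,135,88
"""


def parse_reading(row: dict[str, str]) -> tuple[float, float]:
    """Convert the string columns to numbers at the point of use."""
    return float(row["systolic"]), float(row["diastolic"])


rows = list(csv.DictReader(io.StringIO(SAMPLE)))

print(type(rows[0]["systolic"]).__name__)  # str: DictReader never converts values
print(parse_reading(rows[0]))              # (120.0, 80.0)
```

A non-numeric cell raises `ValueError` from `float()`, so a bad row surfaces at conversion time rather than later in the pipeline.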

The US Core Patient profile adds a few extensions and makes `identifier` and `name` required. The generated class has a typed setter for each:

```python
from fhir_types.hl7_fhir_r4_core.base import Identifier, HumanName, Coding
from fhir_types.hl7_fhir_r4_core.patient import Patient
from fhir_types.hl7_fhir_us_core.profiles import UscorePatientProfile

def row_to_patient(row: dict[str, str]) -> UscorePatientProfile:
    base_patient = Patient(
        resourceType="Patient",
        identifier=[Identifier(system="http://hospital.example.org/mrn", value=row["mrn"])],
        name=[HumanName(family=row["family"], given=[row["given"]])],
        gender=row["gender"],          # gender is a Literal type — Pydantic validates the value
        birthDate=row["birthDate"],    # default camelCase attrs match the FHIR wire names
    )

    patient = UscorePatientProfile.apply(base_patient)

    patient.set_race({
        "ombCategory": {"system": "urn:oid:2.16.840.1.113883.6.238", "code": row["raceCode"], "display": row["raceDisplay"]},
        "text": row["raceDisplay"],
    })

    return patient
```

Two phases:

1. **Build the plain `Patient`** — profile-required (`identifier`, `name`) and must-support (`gender`, `birthDate`) fields as a typed R4 Pydantic model. Construct with the default camelCase attribute names (`birthDate`, the FHIR wire names); values are validated immediately (e.g. `gender` is a `Literal["male", "female", "other", "unknown"]`).
2. **Then `UscorePatientProfile.apply(base_patient)`** stamps `meta.profile` and returns a profile instance with typed accessors for the US Core extensions. `apply()` wraps the resource in place — the profile mutates the same `Patient` object.
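
"Wraps in place" is easiest to see with a plain-dict analogy. The sketch below is not the generated `apply()` (which works on Pydantic models and may differ internally); `stamp_profile` is a hypothetical helper that only shows the two observable effects described above: `meta.profile` gains the canonical URL, and the caller's object is the one that changes.

```python
from typing import Any

US_CORE_PATIENT = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-patient"


def stamp_profile(resource: dict[str, Any], canonical: str) -> dict[str, Any]:
    """Add `canonical` to meta.profile (once) and return the same object."""
    profiles = resource.setdefault("meta", {}).setdefault("profile", [])
    if canonical not in profiles:
        profiles.append(canonical)
    return resource


patient = {"resourceType": "Patient", "gender": "female"}
wrapped = stamp_profile(patient, US_CORE_PATIENT)

print(wrapped is patient)          # True: same object, mutated in place
print(patient["meta"]["profile"])  # the canonical URL, listed once
```

The practical consequence is the same as in the tutorial code: anything still holding a reference to `base_patient` sees the profile stamp and any extensions set afterwards.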

Three notes on what the profile API does for you:

- **Three extension setter forms.** `set_race({ "ombCategory": ..., "text": ... })` takes flat input — note the sub-extension keys (`ombCategory`, `detailed`, `text`) are the camelCase slice names — and generates the nested `extension[]` plumbing. The same setter also accepts a typed extension-profile instance (`UscoreRaceExtension`) or a raw `Extension`, and raises if a raw extension's `url` doesn't match.
- **Single-value extensions take the value directly.** `us-core-individual-sex` carries one `valueCoding`, so `set_sex(Coding(code="female"))` takes a `Coding` (or a raw `Extension`).
- **No setters for must-support base fields.** `gender`, `birthDate`, and `address` aren't profiled further by US Core, so the profile class emits no `.set_gender()`-style wrappers — populate them as normal `Patient` fields. `validate()` still warns if a must-support field is missing.
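
For comparison, this is roughly the nested structure that the flat `set_race(...)` input stands for, written by hand as plain dicts. It is a sketch of the US Core race extension's JSON shape, not the generator's code; `race_extension` is a hypothetical helper, and the exact output of the generated setter may differ in detail.

```python
from typing import Any

RACE_URL = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-race"
CDC_RACE_SYSTEM = "urn:oid:2.16.840.1.113883.6.238"


def race_extension(code: str, display: str) -> dict[str, Any]:
    """Hand-rolled equivalent of the flat {"ombCategory": ..., "text": ...} input."""
    return {
        "url": RACE_URL,
        "extension": [
            {
                "url": "ombCategory",
                "valueCoding": {"system": CDC_RACE_SYSTEM, "code": code, "display": display},
            },
            {"url": "text", "valueString": display},
        ],
    }


ext = race_extension("2106-3", "White")
print([sub["url"] for sub in ext["extension"]])  # ['ombCategory', 'text']
```

Every string in that structure (the outer URL, the sub-extension URLs, the `valueCoding` vs. `valueString` choice) is something the typed setter spares you from spelling correctly.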

> Pydantic emits a `UserWarning` when an `extension[]` list holds plain dicts rather than `Extension` instances — expected with the current flat-dict plumbing. Silence it with `warnings.filterwarnings("ignore", category=UserWarning, module="pydantic")`.

## Step 3 — Row to a US Core Blood Pressure

The BP profile is where codegen really earns its keep. The US Core Blood Pressure profile:

- fixes `code` to LOINC 85354-9 ("Blood pressure panel"),
- fixes a `vital-signs` category slice,
- defines `component[systolic]` and `component[diastolic]` slices with specific LOINC discriminators (8480-6 and 8462-4),
- requires an `effectiveDateTime` or `effectivePeriod`,
- requires `valueQuantity` inside each slice.

Hand-rolling that per row is the kind of thing codegen eliminates. The generated class collapses it to three setters:

```python
from fhir_types.hl7_fhir_r4_core.base import Reference
from fhir_types.hl7_fhir_us_core.profiles import UscoreBloodPressureProfile

def row_to_bp(row: dict[str, str], patient_urn: str) -> UscoreBloodPressureProfile:
    bp = UscoreBloodPressureProfile.create(
        status="final",
        subject=Reference(reference=patient_urn),
    )

    (
        bp.set_effective_date_time(row["effectiveDateTime"])
          .set_systolic({"value": float(row["systolic"]), "unit": "mmHg", "system": "http://unitsofmeasure.org", "code": "mm[Hg]"})
          .set_diastolic({"value": float(row["diastolic"]), "unit": "mmHg", "system": "http://unitsofmeasure.org", "code": "mm[Hg]"})
    )

    errors = bp.validate()["errors"]
    if errors:
        raise ValueError(f"{row['mrn']}: {'; '.join(errors)}")

    return bp
```

What happens behind the scenes:

- **`create()` does the ceremony.** It stamps `meta.profile`, fills the fixed `code` (LOINC 85354-9), appends the vital-signs category slice, and adds empty `component[systolic]` / `component[diastolic]` stubs with discriminator codes already set. `create()` takes keyword-only args; `create_resource()` is the same but returns a plain `Observation` instead of a profile wrapper.
- **`set_systolic({ "value": ..., "unit": ... })` fills the `valueQuantity`** inside the systolic slice. The discriminator `code` on that component is already there from `create()` — you only supply the reading.
- **`validate()` returns `{"errors": [...], "warnings": [...]}`.** Errors block (required fields, excluded fields, disallowed choice variants, slice cardinality). Warnings surface must-support concerns. A malformed row fails fast with the MRN — you don't discover it at POST time.

You didn't type the discriminator codes. You didn't remember `85354-9`. The setters chain fluently (each returns the profile), just like the TypeScript API.
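
To see what that saves, here is a hand-rolled version of the same Observation as plain dicts. Treat it as an approximation of the JSON shape under the constraints listed at the top of this step, not as the generator's output, which may carry extra elements such as display strings. The helper names (`mmhg`, `component`, `hand_rolled_bp`) are hypothetical.

```python
from typing import Any

LOINC = "http://loinc.org"
UCUM = "http://unitsofmeasure.org"
BP_PROFILE = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-blood-pressure"


def mmhg(value: float) -> dict[str, Any]:
    """Quantity in millimetres of mercury, coded with UCUM."""
    return {"value": value, "unit": "mmHg", "system": UCUM, "code": "mm[Hg]"}


def component(loinc_code: str, value: float) -> dict[str, Any]:
    """One component slice: discriminator code plus the reading."""
    return {
        "code": {"coding": [{"system": LOINC, "code": loinc_code}]},
        "valueQuantity": mmhg(value),
    }


def hand_rolled_bp(patient_ref: str, when: str, systolic: float, diastolic: float) -> dict[str, Any]:
    return {
        "resourceType": "Observation",
        "meta": {"profile": [BP_PROFILE]},
        "status": "final",
        "category": [{"coding": [{
            "system": "http://terminology.hl7.org/CodeSystem/observation-category",
            "code": "vital-signs",
        }]}],
        "code": {"coding": [{"system": LOINC, "code": "85354-9"}]},  # BP panel
        "subject": {"reference": patient_ref},
        "effectiveDateTime": when,
        "component": [component("8480-6", systolic), component("8462-4", diastolic)],
    }


bp = hand_rolled_bp("urn:uuid:example", "2026-04-15", 120.0, 80.0)
print([c["code"]["coding"][0]["code"] for c in bp["component"]])  # ['8480-6', '8462-4']
```

Three LOINC codes, two system URLs, and a category coding all have to be typed correctly here; in the generated API they come from the profile definition.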

## Step 4 — Assemble the Bundle

Each row produces a Patient and a BP Observation linked by the Patient's `urn:uuid` placeholder. Package them as transaction entries. The generated `Bundle` and `BundleEntry` are generic over the contained resource, so a `Bundle[Patient | Observation]` keeps `entry[].resource` typed to that union. (`row_to_patient` and `row_to_bp` are the Step 2–3 functions; in the example they all live in one `load.py`, so they're already in scope here.)

```python
import json
import csv
import uuid

from fhir_types.hl7_fhir_r4_core.patient import Patient
from fhir_types.hl7_fhir_r4_core.observation import Observation
from fhir_types.hl7_fhir_r4_core.bundle import Bundle, BundleEntry, BundleEntryRequest

def row_to_entries(row: dict[str, str]) -> list[BundleEntry[Patient | Observation]]:
    patient_urn = f"urn:uuid:{uuid.uuid4()}"
    patient = row_to_patient(row)
    bp = row_to_bp(row, patient_urn)

    return [
        BundleEntry(fullUrl=patient_urn, resource=patient.to_resource(),
                    request=BundleEntryRequest(method="POST", url="Patient")),
        BundleEntry(fullUrl=f"urn:uuid:{uuid.uuid4()}", resource=bp.to_resource(),
                    request=BundleEntryRequest(method="POST", url="Observation")),
    ]

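# newline="" is what the csv module expects; the with-block closes the file.
with open("patients.csv", newline="") as f:
    rows = list(csv.DictReader(f))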
print(f"Loaded {len(rows)} rows")

entries = [entry for row in rows for entry in row_to_entries(row)]

bundle = Bundle[Patient | Observation](
    resourceType="Bundle",
    type="transaction",
    entry=entries,
)

with open("bundle.json", "w") as f:
    json.dump(bundle.model_dump(by_alias=True, exclude_none=True), f, indent=2)
print(f"Wrote bundle with {len(entries)} entries")
```

```bash
$ python load.py
Loaded 5 rows
Wrote bundle with 10 entries
```

Worth noticing:

- **`to_resource()` gives you the plain model** — the underlying Pydantic resource, no wrapper, ready to drop into a `BundleEntry`.
- **`model_dump(by_alias=True, exclude_none=True)` produces FHIR JSON** — `by_alias` serializes through the FHIR-wire aliases (so a snake_case build still emits `effectiveDateTime`) and `exclude_none` drops `None`-valued fields. The one serialization call you'll use everywhere.
- **`urn:uuid` references.** The patient's `fullUrl` and the observation's `subject.reference` share one UUID; the server resolves it to a real id on commit.
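
Because the link between the two entries is just a shared string, a quick sanity check before posting can catch a mismatched placeholder. The sketch below works on the plain JSON form of a bundle; `unresolved_references` is a hypothetical helper, and it only inspects `subject.reference`, which is the one reference this tutorial creates.

```python
import uuid
from typing import Any


def unresolved_references(bundle: dict[str, Any]) -> list[str]:
    """Return urn:uuid subject references that match no entry.fullUrl in the bundle."""
    full_urls = {entry.get("fullUrl") for entry in bundle.get("entry", [])}
    missing = []
    for entry in bundle.get("entry", []):
        ref = entry.get("resource", {}).get("subject", {}).get("reference")
        if ref is not None and ref.startswith("urn:uuid:") and ref not in full_urls:
            missing.append(ref)
    return missing


patient_urn = f"urn:uuid:{uuid.uuid4()}"
bundle = {
    "resourceType": "Bundle",
    "type": "transaction",
    "entry": [
        {"fullUrl": patient_urn, "resource": {"resourceType": "Patient"}},
        {
            "fullUrl": f"urn:uuid:{uuid.uuid4()}",
            "resource": {"resourceType": "Observation", "subject": {"reference": patient_urn}},
        },
    ],
}

print(unresolved_references(bundle))  # [] because the Observation points at the Patient's fullUrl
```

Running the same function over the loaded `bundle.json` should also return an empty list.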

## Step 5 — Read Back: Average BP from the Bundle

Now read it back. Parse `bundle.json` and compute the average systolic/diastolic to exercise the read-side API:

```python
import json
from typing import Any

from fhir_types.hl7_fhir_r4_core.observation import Observation
from fhir_types.hl7_fhir_us_core.profiles import UscoreBloodPressureProfile

with open("bundle.json") as f:
    bundle = json.load(f)

def is_us_core_bp(resource: dict[str, Any]) -> bool:
    return (
        resource.get("resourceType") == "Observation"
        and UscoreBloodPressureProfile.canonical_url in (resource.get("meta", {}).get("profile") or [])
    )

bps = [
    UscoreBloodPressureProfile.from_resource(Observation.model_validate(entry["resource"]))
    for entry in bundle.get("entry", [])
    if is_us_core_bp(entry["resource"])
]

def avg(xs: list[float]) -> float:
    return sum(xs) / len(xs)

# get_systolic()/get_diastolic() are Optional, so guard with a walrus before indexing.
systolic = [s["value"] for bp in bps if (s := bp.get_systolic()) is not None]
diastolic = [d["value"] for bp in bps if (d := bp.get_diastolic()) is not None]

print(f"Avg BP: {avg(systolic):.1f}/{avg(diastolic):.1f} mmHg (n={len(bps)})")
```

```bash
$ python avg.py
Avg BP: 125.2/82.0 mmHg (n=5)
```

Three things to note about the read side:

- **`from_resource(obs)` validates as it wraps.** It checks that `meta.profile` includes the canonical URL and returns a profile instance, raising if a resource that *claims* the profile is malformed — so a broken bundle fails at read time, not on the next field access.
- **No built-in type guard.** Unlike the TypeScript API's `is()` predicate, the Python classes don't ship a `.filter()`-style guard. You select candidates yourself — check `resourceType` and `canonical_url in meta.profile` (above), or wrap `from_resource()` in `try/except ValueError`. Either way `canonical_url` is exposed as a class attribute for exactly this.
- **`get_systolic()` / `get_diastolic()` return the flat slice value.** No walking `component[].code.coding[].code` to match LOINC codes — the profile already knows which slice is which, and hands you back the `Quantity` data as a plain dict.

That's the round-trip: CSV → typed profiles → validated Bundle → typed read-back with profile-aware getters. The same handful of lines would process BPs fetched from a FHIR server, loaded from a file, or received on a Subscription — the typed profile is the common shape, no matter the source.

## Step 6 — Land Your Bundle on a FHIR Server

The typed pipeline is only half the story. To actually see the transaction commit — patient IDs assigned, `urn:uuid` references rewritten, resources stored and searchable — you need a FHIR server. Spin up and run [Aidbox](https://www.health-samurai.io/aidbox):

```bash
curl -JO https://aidbox.app/runme && docker compose up -d
```

Open <http://localhost:8080> in your browser to grab a free developer license, then pull the root client secret out of `docker-compose.yaml` into an env var — reused by the curls and the Python script below:

```bash
export BOX_ROOT_CLIENT_SECRET=$(awk '/BOX_ROOT_CLIENT_SECRET:/{print $2}' docker-compose.yaml)
```

Verify the FHIR endpoint is up:

```bash
curl -u "root:$BOX_ROOT_CLIENT_SECRET" http://localhost:8080/fhir/metadata
```

You should see a JSON `CapabilityStatement`.

Send the `bundle.json` you just wrote with fhirpy's async client:

```python
import asyncio
import base64
import json
import os

from fhirpy import AsyncFHIRClient
from fhir_types.hl7_fhir_r4_core.bundle import Bundle  # same module path as in Step 4

secret = os.environ["BOX_ROOT_CLIENT_SECRET"]  # exported above
auth = base64.b64encode(f"root:{secret}".encode()).decode()


async def main() -> None:
    client = AsyncFHIRClient("http://localhost:8080/fhir", authorization=f"Basic {auth}")
    with open("bundle.json") as f:
        bundle = json.load(f)
    resp: Bundle = await client.execute("/", method="post", data=bundle)
    if resp.entry is None:
        return

    for entry in resp.entry:
        if entry.response is None:
            continue
        print(entry.response.status, entry.response.location)


asyncio.run(main())
```

Aidbox returns a `transaction-response` bundle — one entry per input, each with a `201 Created` and a `location` pointing at the stored resource:

```bash
$ python post.py
201 Created Patient/<id>/_history/1
201 Created Observation/<id>/_history/1
...
```
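
If you later switch the bundle `type` to `batch`, entries succeed or fail independently, so it can be worth scanning the response rather than just printing it. (For a `transaction`, a failing entry rejects the whole bundle instead.) The sketch below assumes the response is available as plain JSON and relies only on `entry.response.status` starting with the HTTP status code; `failed_entries` is a hypothetical helper.

```python
from typing import Any


def failed_entries(response: dict[str, Any]) -> list[str]:
    """Return the status strings of entries whose HTTP status code is not 2xx."""
    failures = []
    for entry in response.get("entry", []):
        status = entry.get("response", {}).get("status", "")
        # status looks like "201 Created"; the leading token is the HTTP code
        code = status.split(" ", 1)[0]
        if not (code.isdigit() and 200 <= int(code) < 300):
            failures.append(status or "<missing status>")
    return failures


sample = {
    "resourceType": "Bundle",
    "type": "transaction-response",
    "entry": [
        {"response": {"status": "201 Created", "location": "Patient/example/_history/1"}},
        {"response": {"status": "201 Created", "location": "Observation/example/_history/1"}},
    ],
}

print(failed_entries(sample))  # []
```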

Query an observation back and look at its `subject`:

```bash
$ curl -u "root:$BOX_ROOT_CLIENT_SECRET" \
  "http://localhost:8080/fhir/Observation?code=http://loinc.org|85354-9" \
  | jq '.entry[].resource.subject.reference'
"Patient/01J..."
"Patient/01J..."
```

No `urn:uuid` — Aidbox rewrote the placeholders atomically on commit.

## Type-Check the Pipeline

The generated models are Pydantic v2, so the converter type-checks with [mypy](https://mypy-lang.org/) — already pinned in the generated `requirements.txt`. One requirement: enable the **Pydantic mypy plugin** (it ships with Pydantic, no extra install), or mypy can't tell that a `Field(None, ...)` default makes a field optional and floods you with false "missing argument" errors on every model you construct.

Drop a `mypy.ini` next to your code:

```ini
[mypy]
strict = True
plugins = pydantic.mypy
```

Then run it:

```bash
$ mypy .
Success: no issues found in 35 source files
```

Your converter modules (`load.py`, `avg.py`, `post.py`) and the generated `fhir_types/` package both come back clean. The typed factories, `to_resource()`, and the `Bundle[Patient | Observation]` generic all check out, so a wrong field type or a missing required argument is caught before you ever reach the server. The generated profile layer type-checks under full `--strict` too, which is why `mypy.ini` needs only those two settings: no `disable_error_code`, no `strict_optional = False`, nothing hand-edited inside `fhir_types/`. The only thing your own code supplies is an ordinary None-guard when you read an optional field: the walrus in `avg.py` above, or an `if entries is None: ...` check before indexing a list. That's plain strict-mode Python, not a generator wart.

## Where To Go Next

- **More of the profile API.** Other factories, getters, and slice/extension forms are exercised in the codegen example tests: [`test_profile_patient.py`](https://github.com/atomic-ehr/codegen/blob/main/examples/python-r4-us-core/test_profile_patient.py), [`test_profile_bp.py`](https://github.com/atomic-ehr/codegen/blob/main/examples/python-r4-us-core/test_profile_bp.py), [`test_profile_bodyweight.py`](https://github.com/atomic-ehr/codegen/blob/main/examples/python-r4-us-core/test_profile_bodyweight.py), and [`test_profile_typed_bundle.py`](https://github.com/atomic-ehr/codegen/blob/main/examples/python-r4-us-core/test_profile_typed_bundle.py).
- **Typed bundles with named entry slices.** A profiled Bundle generates per-slice setters/getters (`set_patient_entry`, `get_organization_entry`) with single vs. unbounded (`max: *`) cardinality handled for you — see the typed-bundle test above.
- **Tune the output for your codebase.** `fieldFormat` (`snake_case`/`camelCase`), `client` (`"fhirpy"`/`"none"`), `allowExtraFields`, and `primitiveTypeExtension` are all toggles on `.python({ ... })`.
- **Mix profiles from multiple packages.** `APIBuilder.fromPackage()` chains — US Core alongside your custom IG or a regional base. `localStructureDefinitions()` pulls in profiles straight from a folder of `StructureDefinition` JSON.

## Wrap Up

The generator emits both the base R4 Pydantic models and a thin profile-class layer on top — no runtime DSL, no ORM, no framework. `to_resource()` always gives you a plain Pydantic resource, and `model_dump(by_alias=True, exclude_none=True)` always gives you plain FHIR JSON you can send to any server.

`@atomic-ehr/codegen` is MIT-licensed; issues and PRs welcome.

[GitHub](https://github.com/atomic-ehr/codegen) | [NPM](https://www.npmjs.com/package/@atomic-ehr/codegen) | [US Core IG](https://www.hl7.org/fhir/us/core/)
