Building US Core resources by hand is tedious. You stamp meta.profile, look up LOINC codes, hand-roll the us-core-race nested extension — every field is a typo waiting to happen, every profile is its own version of the same ceremony.
@atomic-ehr/codegen makes that boilerplate disappear. Point it at the US Core IG and you get one Pydantic model per base type plus a plain-Python wrapper class per profile, with typed accessors for fixed values, extensions, and slices, and a validate() that knows what the profile requires.
This tutorial walks through that end-to-end on two US Core profiles: US Core Patient and US Core Blood Pressure.
What You'll Build
A CSV-to-FHIR converter, built step by step:
- generate profile classes for US Core Patient and US Core Blood Pressure from hl7.fhir.us.core@8.0.1,
- turn each row into a US Core Patient, using typed extension setters and apply(),
- turn each row into a US Core Blood Pressure, using typed slices, fixed LOINC, and validate(),
- package them as a Bundle,
- read the bundle back with typed getters to compute an average BP,
- post the bundle to a local Aidbox server via fhirpy client.
Prerequisites
- Node.js 20+ (or Bun): the generator itself is the @atomic-ehr/codegen Node package. You run it once to emit Python; after that you don't need Node again. The generation script is a few lines of TypeScript (shown below).
- Python 3.12+: the generated code targets modern Python (PEP 604 X | None unions, generic models via typing_extensions).
- Pydantic v2 (pydantic>=2.11) and fhirpy: generated models are Pydantic v2; with the default fhirpy client they also drop into fhirpy's async client (Step 6). Both are pinned in the generated requirements.txt (see Step 1); pass client: "none" for plain Pydantic with no client code.
- Basic familiarity with FHIR and US Core (knowing what "profile" and "slice" mean is enough).
Step 1 — Generate Profile Classes
Code generation runs through the Node tool, so set up a small generator project alongside your Python app:
mkdir py-us-core-tutorial && cd py-us-core-tutorial
npm init -y
npm install --save-dev @atomic-ehr/codegen tsx typescript
Create generate.ts:
import { APIBuilder, mkCodegenLogger, prettyReport } from "@atomic-ehr/codegen";

const main = async () => {
    // Silence noisy but harmless diagnostics from the IG packages.
    const logger = mkCodegenLogger({
        suppressTags: ["#fieldTypeNotFound", "#duplicateSchema", "#duplicateCanonical", "#largeValueSet"],
    });

    const builder = new APIBuilder({ logger })
        .fromPackage("hl7.fhir.us.core", "8.0.1")
        .typeSchema({
            // Only these canonicals (and what they depend on) are generated.
            treeShake: {
                "hl7.fhir.us.core": {
                    "http://hl7.org/fhir/us/core/StructureDefinition/us-core-patient": {},
                    "http://hl7.org/fhir/us/core/StructureDefinition/us-core-blood-pressure": {},
                },
                "hl7.fhir.r4.core": {
                    "http://hl7.org/fhir/StructureDefinition/Bundle": {},
                },
            },
        })
        .python({
            generateProfile: true,
            allowExtraFields: false,
            primitiveTypeExtension: true,
        })
        .outputTo("./fhir_types")
        .cleanOutput(true);

    const report = await builder.generate();
    console.log(prettyReport(report));
    if (!report.success) process.exit(1);
};

main();
The knobs that matter here:
- generateProfile: true emits a wrapper class per profile with typed accessors for extensions, slices, and fixed values. Without it you get only the base R4 Pydantic models.
- allowExtraFields: false makes the generated models use Pydantic's extra="forbid", so an unknown field raises at parse time instead of being silently dropped.
- primitiveTypeExtension: true also generates the FHIR primitive-extension siblings (the _field companions, e.g. birthDateExtension) so you can attach extensions and ids to primitive values.
- treeShake: { ... } limits generation to the listed canonicals and their transitive deps (~20 files instead of hundreds).
Run it. prettyReport(report) prints a grouped summary so you see what got emitted without crawling the output dir:
$ npx tsx generate.ts
# generation logs omitted; this is the prettyReport summary
Generated files (24 files, 12 kloc):
python (23 files, 3.6 kloc):
- fhir_types/ (4 files, 622 loc)
- fhir_types/hl7_fhir_r4_core/ (8 files, 1.1 kloc)
- fhir_types/hl7_fhir_r4_core/profiles/ (2 files, 164 loc)
- fhir_types/hl7_fhir_us_core/profiles/ (9 files, 1.8 kloc)
ir-report (1 files, 8.2 kloc):
- fhir_types/README.md (8223 loc)
Duration: 8097ms
Status: 🟩 Success
The on-disk layout looks like this:
fhir_types/
├── hl7_fhir_r4_core/                 # Base R4 Pydantic models
│   ├── base.py                       # Element, Coding, CodeableConcept, Quantity, ...
│   ├── resource.py                   # Resource, DomainResource, Meta
│   ├── patient.py
│   ├── observation.py
│   ├── bundle.py
│   ├── profiles/                     # base R4 profiles US Core builds on
│   │   └── observation_observation_vitalsigns.py  # vital-signs base (BP derives from it)
│   └── ...
├── hl7_fhir_us_core/
│   └── profiles/
│       ├── __init__.py               # re-exports the profile classes
│       ├── patient_uscore_patient_profile.py
│       ├── observation_uscore_blood_pressure_profile.py
│       ├── extension_uscore_race_extension.py
│       └── ...
├── fhirpy_base_model.py              # fhirpy client base model (default fhirpy client)
├── profile_helpers.py                # Runtime helpers shared by all profile classes
├── README.md                         # IR report: human-readable dump of the generated types
└── requirements.txt                  # pydantic, fhirpy (+ pytest, requests for tests/Step 6)
Set up a Python virtual environment (any interpreter that meets the 3.12+ prerequisite works):
python3.14 -m venv venv
source venv/bin/activate
Point your Python app at the emitted fhir_types/ and install the dependencies:
pip install -r fhir_types/requirements.txt
The generated requirements.txt pins Pydantic and fhirpy plus pytest and requests for the tests and examples.
The full tutorial code lives in Aidbox/examples — generate.ts, load.py, avg.py, post.py, the CSV, and the committed fhir_types/ so you can browse the generated code without running the generator. For broader profile-API exploration, the codegen repo also has a python-r4-us-core test example. Both use the default camelCase attribute names, just like the snippets here. (Pass fieldFormat: "snake_case" if you'd rather spell attributes birth_date, effective_date_time; serialization always emits FHIR-correct camelCase JSON either way.)
Step 2 — Row to a US Core Patient
The input is patients.csv — basic demographics plus one BP reading per patient. Race uses the OMB-category codes US Core expects:
mrn,family,given,birthDate,gender,raceCode,raceDisplay,effectiveDateTime,systolic,diastolic
MRN-001,Lovelace,Ada,1815-12-10,female,2106-3,White,2026-04-15,120,80
MRN-002,Turing,Alan,1912-06-23,male,2106-3,White,2026-04-15,118,76
MRN-003,Curie,Marie,1867-11-07,female,2106-3,White,2026-04-16,125,82
MRN-004,Carver,George,1864-01-01,male,2054-5,Black or African American,2026-04-16,135,88
MRN-005,Ochoa,Ellen,1958-05-10,female,2054-5,Black or African American,2026-04-17,128,84
csv.DictReader hands each row over as a plain dict[str, str]; numeric parsing happens later, where we pass values to typed profile setters.
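As a quick illustration of that shape, here is a minimal sketch that parses two rows of the CSV above from an in-memory string. The SAMPLE constant and the parse_reading helper are hypothetical names used only for this example; the point is that every value arrives as a string and numeric conversion is an explicit step.

```python
import csv
import io

# Two rows copied from patients.csv above, held in memory so the sketch runs anywhere.
SAMPLE = """mrn,family,given,birthDate,gender,raceCode,raceDisplay,effectiveDateTime,systolic,diastolic
MRN-001,Lovelace,Ada,1815-12-10,female,2106-3,White,2026-04-15,120,80
MRN-002,Turing,Alan,1912-06-23,male,2106-3,White,2026-04-15,118,76
"""


def parse_reading(row: dict[str, str]) -> tuple[float, float]:
    """Convert the string BP columns to floats (hypothetical helper)."""
    return float(row["systolic"]), float(row["diastolic"])


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(type(rows[0]["systolic"]).__name__)  # values stay strings until converted
print(parse_reading(rows[0]))
```

A non-numeric cell would raise ValueError inside parse_reading, which is one reason to convert at the point where the value is handed to a typed setter.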
The US Core Patient profile adds a few extensions and makes identifier and name required. The generated class has a typed setter for each:
from fhir_types.hl7_fhir_r4_core.base import Identifier, HumanName, Coding
from fhir_types.hl7_fhir_r4_core.patient import Patient
from fhir_types.hl7_fhir_us_core.profiles import UscorePatientProfile


def row_to_patient(row: dict[str, str]) -> UscorePatientProfile:
    # Phase 1: a plain R4 Patient with the required and must-support fields.
    base_patient = Patient(
        resourceType="Patient",
        identifier=[Identifier(system="http://hospital.example.org/mrn", value=row["mrn"])],
        name=[HumanName(family=row["family"], given=[row["given"]])],
        gender=row["gender"],        # gender is a Literal type, so Pydantic validates the value
        birthDate=row["birthDate"],  # default camelCase attrs match the FHIR wire names
    )
    # Phase 2: wrap it in the profile and add the US Core race extension.
    patient = UscorePatientProfile.apply(base_patient)
    patient.set_race({
        "ombCategory": {"system": "urn:oid:2.16.840.1.113883.6.238", "code": row["raceCode"], "display": row["raceDisplay"]},
        "text": row["raceDisplay"],
    })
    return patient
Two phases:
- Build the plain Patient: profile-required (identifier, name) and must-support (gender, birthDate) fields as a typed R4 Pydantic model. Construct with the default camelCase attribute names (birthDate, the FHIR wire names); values are validated immediately (e.g. gender is a Literal["male", "female", "other", "unknown"]).
- Then UscorePatientProfile.apply(base_patient) stamps meta.profile and returns a profile instance with typed accessors for the US Core extensions. apply() wraps the resource in place: the profile mutates the same Patient object.
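To make "stamps meta.profile" concrete, the sketch below applies the same effect to a plain dict. It illustrates the outcome only and is not the generated apply() implementation; stamp_profile is a hypothetical helper, and skipping a duplicate URL is a choice made for this sketch.

```python
from typing import Any

# Canonical URL of the US Core Patient profile, as listed in generate.ts.
US_CORE_PATIENT = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-patient"


def stamp_profile(resource: dict[str, Any], canonical: str) -> dict[str, Any]:
    """Add `canonical` to meta.profile in place, without duplicating it."""
    profiles = resource.setdefault("meta", {}).setdefault("profile", [])
    if canonical not in profiles:
        profiles.append(canonical)
    return resource  # the same object, mirroring the in-place behaviour described above


patient = {"resourceType": "Patient", "gender": "female"}
stamped = stamp_profile(patient, US_CORE_PATIENT)
stamp_profile(patient, US_CORE_PATIENT)  # second call changes nothing
print(stamped is patient, patient["meta"]["profile"])
```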
Three notes on what the profile API does for you:
- Three extension setter forms. set_race({ "ombCategory": ..., "text": ... }) takes flat input (note that the sub-extension keys ombCategory, detailed, and text are the camelCase slice names) and generates the nested extension[] plumbing. The same setter also accepts a typed extension-profile instance (UscoreRaceExtension) or a raw Extension, and raises if a raw extension's url doesn't match.
- Single-value extensions take the value directly. us-core-individual-sex carries one valueCoding, so set_sex(Coding(code="female")) takes a Coding (or a raw Extension).
- No setters for must-support base fields. gender, birthDate, and address aren't profiled further by US Core, so the profile class emits no .set_gender()-style wrappers; populate them as normal Patient fields. validate() still warns if a must-support field is missing.
Pydantic emits a UserWarning when an extension[] list holds plain dicts rather than Extension instances, which is expected with the current flat-dict plumbing. Silence it with warnings.filterwarnings("ignore", category=UserWarning, module="pydantic").
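For a sense of what the flat set_race() input expands to, here is a hand-written sketch of the nested us-core-race extension as plain JSON-style dicts. build_race_extension is a hypothetical helper written for this illustration, and the exact ordering the generated setter emits may differ.

```python
from typing import Any

RACE_URL = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-race"
OMB_SYSTEM = "urn:oid:2.16.840.1.113883.6.238"  # same system used in row_to_patient


def build_race_extension(code: str, display: str) -> dict[str, Any]:
    """Hand-roll the nested extension: one outer url, one sub-extension per slice."""
    return {
        "url": RACE_URL,
        "extension": [
            {
                "url": "ombCategory",
                "valueCoding": {"system": OMB_SYSTEM, "code": code, "display": display},
            },
            {"url": "text", "valueString": display},
        ],
    }


ext = build_race_extension("2106-3", "White")
print([sub["url"] for sub in ext["extension"]])
```

Each sub-extension carries its own url and value[x] key, which is the plumbing the typed setter hides.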
Step 3 — Row to a US Core Blood Pressure
The BP profile is where codegen really earns its keep. The US Core Blood Pressure profile:
- fixes code to LOINC 85354-9 ("Blood pressure panel"),
- fixes a vital-signs category slice,
- defines component[systolic] and component[diastolic] slices with specific LOINC discriminators (8480-6 and 8462-4),
- requires an effectiveDateTime or effectivePeriod,
- requires valueQuantity inside each slice.
Hand-rolling that per row is the kind of thing codegen eliminates. The generated class collapses it to three setters:
from fhir_types.hl7_fhir_r4_core.base import Reference
from fhir_types.hl7_fhir_us_core.profiles import UscoreBloodPressureProfile


def row_to_bp(row: dict[str, str], patient_urn: str) -> UscoreBloodPressureProfile:
    # create() fills the fixed code, category slice, and component stubs.
    bp = UscoreBloodPressureProfile.create(
        status="final",
        subject=Reference(reference=patient_urn),
    )
    (
        bp.set_effective_date_time(row["effectiveDateTime"])
        .set_systolic({"value": float(row["systolic"]), "unit": "mmHg", "system": "http://unitsofmeasure.org", "code": "mm[Hg]"})
        .set_diastolic({"value": float(row["diastolic"]), "unit": "mmHg", "system": "http://unitsofmeasure.org", "code": "mm[Hg]"})
    )
    # Fail fast, naming the offending row by MRN.
    errors = bp.validate()["errors"]
    if errors:
        raise ValueError(f"{row['mrn']}: {'; '.join(errors)}")
    return bp
What happens behind the scenes:
- create() does the ceremony. It stamps meta.profile, fills the fixed code (LOINC 85354-9), appends the vital-signs category slice, and adds empty component[systolic] / component[diastolic] stubs with discriminator codes already set. create() takes keyword-only args; create_resource() is the same but returns a plain Observation instead of a profile wrapper.
- set_systolic({ "value": ..., "unit": ... }) fills the valueQuantity inside the systolic slice. The discriminator code on that component is already there from create(), so you only supply the reading.
- validate() returns {"errors": [...], "warnings": [...]}. Errors block (required fields, excluded fields, disallowed choice variants, slice cardinality). Warnings surface must-support concerns. A malformed row fails fast with the MRN, so you don't discover it at POST time.
You didn't type the discriminator codes. You didn't remember 85354-9. The setters chain fluently (each returns the profile), just like the TypeScript API.
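For comparison, this is roughly what the hand-rolled equivalent looks like when written against plain dicts. It is a sketch under stated assumptions: hand_rolled_bp, component, and mmhg are hypothetical helpers, the subject reference is a placeholder UUID, and the output is only meant to show the pieces the generated class fills in, not to match its JSON byte for byte.

```python
from typing import Any

BP_PROFILE = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-blood-pressure"
LOINC = "http://loinc.org"
UCUM = "http://unitsofmeasure.org"
CATEGORY_SYSTEM = "http://terminology.hl7.org/CodeSystem/observation-category"


def mmhg(value: float) -> dict[str, Any]:
    """Quantity in millimetres of mercury, matching the tutorial's setter input."""
    return {"value": value, "unit": "mmHg", "system": UCUM, "code": "mm[Hg]"}


def component(loinc_code: str, value: float) -> dict[str, Any]:
    """One component slice: discriminator code plus the reading."""
    return {
        "code": {"coding": [{"system": LOINC, "code": loinc_code}]},
        "valueQuantity": mmhg(value),
    }


def hand_rolled_bp(patient_ref: str, when: str, systolic: float, diastolic: float) -> dict[str, Any]:
    return {
        "resourceType": "Observation",
        "meta": {"profile": [BP_PROFILE]},
        "status": "final",
        "category": [{"coding": [{"system": CATEGORY_SYSTEM, "code": "vital-signs"}]}],
        "code": {"coding": [{"system": LOINC, "code": "85354-9"}]},  # BP panel
        "subject": {"reference": patient_ref},
        "effectiveDateTime": when,
        "component": [
            component("8480-6", systolic),   # systolic slice
            component("8462-4", diastolic),  # diastolic slice
        ],
    }


obs = hand_rolled_bp("urn:uuid:00000000-0000-0000-0000-000000000001", "2026-04-15", 120.0, 80.0)
print([c["code"]["coding"][0]["code"] for c in obs["component"]])
```

Every literal in that function is a place where a typo would produce a resource that parses fine but fails profile validation, which is the ceremony create() and the slice setters absorb.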
Step 4 — Assemble the Bundle
Each row produces a Patient and a BP Observation linked by the Patient's urn:uuid placeholder. Package them as transaction entries. The generated Bundle and BundleEntry are generic over the contained resource, so a Bundle[Patient | Observation] keeps entry[].resource typed to that union. (row_to_patient and row_to_bp are the Step 2–3 functions; in the example they all live in one load.py, so they're already in scope here.)
import csv
import json
import uuid

from fhir_types.hl7_fhir_r4_core.patient import Patient
from fhir_types.hl7_fhir_r4_core.observation import Observation
from fhir_types.hl7_fhir_r4_core.bundle import Bundle, BundleEntry, BundleEntryRequest


def row_to_entries(row: dict[str, str]) -> list[BundleEntry[Patient | Observation]]:
    # One placeholder UUID links the Observation's subject to the Patient entry.
    patient_urn = f"urn:uuid:{uuid.uuid4()}"
    patient = row_to_patient(row)
    bp = row_to_bp(row, patient_urn)
    return [
        BundleEntry(fullUrl=patient_urn, resource=patient.to_resource(),
                    request=BundleEntryRequest(method="POST", url="Patient")),
        BundleEntry(fullUrl=f"urn:uuid:{uuid.uuid4()}", resource=bp.to_resource(),
                    request=BundleEntryRequest(method="POST", url="Observation")),
    ]


# newline="" is what the csv module recommends for files it reads.
with open("patients.csv", newline="") as f:
    rows = list(csv.DictReader(f))
print(f"Loaded {len(rows)} rows")

entries = [entry for row in rows for entry in row_to_entries(row)]
bundle = Bundle[Patient | Observation](
    resourceType="Bundle",
    type="transaction",
    entry=entries,
)
with open("bundle.json", "w") as f:
    json.dump(bundle.model_dump(by_alias=True, exclude_none=True), f, indent=2)
print(f"Wrote bundle with {len(entries)} entries")
$ python load.py
Loaded 5 rows
Wrote bundle with 10 entries
Worth noticing:
- to_resource() gives you the plain model: the underlying Pydantic resource, no wrapper, ready to drop into a BundleEntry.
- model_dump(by_alias=True, exclude_none=True) produces FHIR JSON. by_alias serializes through the FHIR-wire aliases (so a snake_case build still emits effectiveDateTime) and exclude_none drops None-valued fields. It is the one serialization call you'll use everywhere.
- urn:uuid references. The patient's fullUrl and the observation's subject.reference share one UUID; the server resolves it to a real id on commit.
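The urn:uuid resolution can be pictured with a small sketch. This is a simplified illustration of the idea, not how Aidbox or any particular server implements it; rewrite_references, the sample UUIDs, and the "Patient/example-id" value are all made up for the example.

```python
from typing import Any


def rewrite_references(bundle: dict[str, Any], assigned: dict[str, str]) -> dict[str, Any]:
    """Return a copy of `bundle` with placeholder references swapped for real ones.

    `assigned` maps an entry's fullUrl (urn:uuid:...) to the "Type/id" chosen for it.
    """

    def walk(node: Any) -> Any:
        if isinstance(node, dict):
            return {
                key: assigned.get(value, value)
                if key == "reference" and isinstance(value, str)
                else walk(value)
                for key, value in node.items()
            }
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node  # strings, numbers, booleans, None

    return walk(bundle)


patient_urn = "urn:uuid:11111111-1111-1111-1111-111111111111"
bundle = {
    "resourceType": "Bundle",
    "type": "transaction",
    "entry": [
        {"fullUrl": patient_urn, "resource": {"resourceType": "Patient"}},
        {
            "fullUrl": "urn:uuid:22222222-2222-2222-2222-222222222222",
            "resource": {"resourceType": "Observation", "subject": {"reference": patient_urn}},
        },
    ],
}
resolved = rewrite_references(bundle, {patient_urn: "Patient/example-id"})
print(resolved["entry"][1]["resource"]["subject"]["reference"])
```

Because the link is expressed through the shared placeholder, the converter never needs to know the ids the server will assign.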
Step 5 — Read Back: Average BP from the Bundle
Now read it back. Parse bundle.json and compute the average systolic/diastolic to exercise the read-side API:
import json
from typing import Any

from fhir_types.hl7_fhir_r4_core.observation import Observation
from fhir_types.hl7_fhir_us_core.profiles import UscoreBloodPressureProfile

with open("bundle.json") as f:
    bundle = json.load(f)


def is_us_core_bp(resource: dict[str, Any]) -> bool:
    # Select Observations that claim the US Core Blood Pressure profile.
    return (
        resource.get("resourceType") == "Observation"
        and UscoreBloodPressureProfile.canonical_url in (resource.get("meta", {}).get("profile") or [])
    )


bps = [
    UscoreBloodPressureProfile.from_resource(Observation.model_validate(entry["resource"]))
    for entry in bundle.get("entry", [])
    if is_us_core_bp(entry["resource"])
]


def avg(xs: list[float]) -> float:
    return sum(xs) / len(xs)


# get_systolic()/get_diastolic() are Optional, so guard with a walrus before indexing.
systolic = [s["value"] for bp in bps if (s := bp.get_systolic()) is not None]
diastolic = [d["value"] for bp in bps if (d := bp.get_diastolic()) is not None]
print(f"Avg BP: {avg(systolic):.1f}/{avg(diastolic):.1f} mmHg (n={len(bps)})")
$ python avg.py
Avg BP: 125.2/82.0 mmHg (n=5)
Three things the profile does here:
- from_resource(obs) validates as it wraps. It checks that meta.profile includes the canonical URL and returns a profile instance, raising if a resource that claims the profile is malformed, so a broken bundle fails at read time, not on the next field access.
- No built-in type guard. Unlike the TypeScript API's is() predicate, the Python classes don't ship a .filter()-style guard. You select candidates yourself: check resourceType and canonical_url in meta.profile (above), or wrap from_resource() in try/except ValueError. Either way canonical_url is exposed as a class attribute for exactly this.
- get_systolic() / get_diastolic() return the flat slice value. No walking component[].code.coding[].code to match LOINC codes: the profile already knows which slice is which, and hands you back the Quantity data as a plain dict.
That's the round-trip: CSV → typed profiles → validated Bundle → typed read-back with profile-aware getters. The same handful of lines would process BPs fetched from a FHIR server, loaded from a file, or received on a Subscription — the typed profile is the common shape, no matter the source.
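To see what the typed getters save you, here is a sketch of the manual walk over component[].code.coding[].code on a plain dict. find_component_quantity and the sample observation are hypothetical and exist only for this illustration.

```python
from typing import Any, Optional

LOINC = "http://loinc.org"


def find_component_quantity(observation: dict[str, Any], loinc_code: str) -> Optional[dict[str, Any]]:
    """Return the valueQuantity of the first component coded with `loinc_code`."""
    for comp in observation.get("component", []):
        for coding in comp.get("code", {}).get("coding", []):
            if coding.get("system") == LOINC and coding.get("code") == loinc_code:
                return comp.get("valueQuantity")
    return None  # no matching slice


observation = {
    "resourceType": "Observation",
    "component": [
        {"code": {"coding": [{"system": LOINC, "code": "8480-6"}]}, "valueQuantity": {"value": 120.0, "unit": "mmHg"}},
        {"code": {"coding": [{"system": LOINC, "code": "8462-4"}]}, "valueQuantity": {"value": 80.0, "unit": "mmHg"}},
    ],
}

systolic = find_component_quantity(observation, "8480-6")
diastolic = find_component_quantity(observation, "8462-4")
print(systolic, diastolic)
```

The caller has to remember both LOINC codes and handle the missing-slice case, which is the knowledge the profile class carries for you.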
Step 6 — Land Your Bundle on a FHIR Server
The typed pipeline is only half the story. To actually see the transaction commit — patient IDs assigned, urn:uuid references rewritten, resources stored and searchable — you need a FHIR server. Spin up and run Aidbox:
curl -JO https://aidbox.app/runme && docker compose up -d
Open http://localhost:8080 in your browser to grab a free developer license, then pull the root client secret out of docker-compose.yaml into an env var — reused by the curls and the Python script below:
export BOX_ROOT_CLIENT_SECRET=$(awk '/BOX_ROOT_CLIENT_SECRET:/{print $2}' docker-compose.yaml)
Verify the FHIR endpoint is up:
curl -u "root:$BOX_ROOT_CLIENT_SECRET" http://localhost:8080/fhir/metadata
You should see a JSON CapabilityStatement.
Send the bundle.json you just wrote with fhirpy's async client:
import asyncio
import base64
import json
import os

from fhirpy import AsyncFHIRClient
from fhir_types.hl7_fhir_r4_core import Bundle

secret = os.environ["BOX_ROOT_CLIENT_SECRET"]  # exported above
auth = base64.b64encode(f"root:{secret}".encode()).decode()


async def main() -> None:
    client = AsyncFHIRClient("http://localhost:8080/fhir", authorization=f"Basic {auth}")
    with open("bundle.json") as f:
        bundle = json.load(f)
    # POST the transaction bundle to the server root.
    resp: Bundle = await client.execute("/", method="post", data=bundle)
    if resp.entry is None:
        return
    for entry in resp.entry:
        if entry.response is None:
            continue
        print(entry.response.status, entry.response.location)


asyncio.run(main())
Aidbox returns a transaction-response bundle — one entry per input, each with a 201 Created and a location pointing at the stored resource:
$ python post.py
201 Created Patient/<id>/_history/1
201 Created Observation/<id>/_history/1
...
Query an observation back and look at its subject:
$ curl -u "root:$BOX_ROOT_CLIENT_SECRET" \
"http://localhost:8080/fhir/Observation?code=http://loinc.org|85354-9" \
| jq '.entry[].resource.subject.reference'
"Patient/01J..."
"Patient/01J..."
No urn:uuid — Aidbox rewrote the placeholders atomically on commit.
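If you want the assigned ids rather than the raw location strings that post.py prints, a small parser is enough. This is a sketch that assumes locations of the form Type/id/_history/version, as in the output above, optionally prefixed by a base URL; StoredRef and parse_location are hypothetical names.

```python
from typing import NamedTuple, Optional


class StoredRef(NamedTuple):
    resource_type: str
    resource_id: str
    version: Optional[str]


def parse_location(location: str) -> StoredRef:
    """Split 'Type/id/_history/version' (version part optional) into its pieces."""
    parts = location.strip("/").split("/")
    if len(parts) >= 4 and parts[-2] == "_history":
        return StoredRef(parts[-4], parts[-3], parts[-1])
    if len(parts) >= 2:
        return StoredRef(parts[-2], parts[-1], None)
    raise ValueError(f"unrecognised location: {location!r}")


ref = parse_location("Patient/abc123/_history/1")
print(ref.resource_type, ref.resource_id, ref.version)
```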
Type-Check the Pipeline
The generated models are Pydantic v2, so the converter type-checks with mypy — already pinned in the generated requirements.txt. One requirement: enable the Pydantic mypy plugin (it ships with Pydantic, no extra install), or mypy can't tell that a Field(None, ...) default makes a field optional and floods you with false "missing argument" errors on every model you construct.
Drop a mypy.ini next to your code:
[mypy]
strict = True
plugins = pydantic.mypy
Then run it:
$ mypy .
Success: no issues found in 35 source files
Both your converter modules — load.py, avg.py, post.py — and the generated fhir_types/ package come back clean. The typed factories, to_resource(), and the Bundle[Patient | Observation] generic all check out, so a wrong field type or a missing required argument is caught before you ever reach the server. The generated profile layer type-checks under full --strict too — the whole point of mypy.ini being just those two lines: no disable_error_code, no strict_optional = False, nothing hand-edited inside fhir_types/. The only thing your own code supplies is an ordinary None-guard when you read an optional field — the walrus in avg.py above, or if entries is None: ... before indexing a list. That's plain strict-mode Python, not a generator wart.
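The None-guard pattern is easy to try in isolation. In this sketch get_reading is a hypothetical stand-in for any getter that returns an optional value; it is not part of the generated code.

```python
from typing import Optional


def get_reading(readings: dict[str, dict[str, float]], key: str) -> Optional[dict[str, float]]:
    """Stand-in for an Optional-returning getter (hypothetical)."""
    return readings.get(key)


readings = {"a": {"value": 120.0}, "c": {"value": 125.0}}
keys = ["a", "b", "c"]

# Indexing get_reading(...)["value"] directly would be flagged by a strict type
# checker because the result may be None. The walrus binds the result and the
# `is not None` test narrows it in one step.
values = [r["value"] for k in keys if (r := get_reading(readings, k)) is not None]
print(values)
```

The missing key "b" is skipped instead of raising, which is the same behaviour the comprehension in avg.py relies on.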
Where To Go Next
- More of the profile API. Other factories, getters, and slice/extension forms are exercised in the codegen example tests:
test_profile_patient.py,test_profile_bp.py,test_profile_bodyweight.py, andtest_profile_typed_bundle.py. - Typed bundles with named entry slices. A profiled Bundle generates per-slice setters/getters (
set_patient_entry,get_organization_entry) with single vs. unbounded (max: *) cardinality handled for you — see the typed-bundle test above. - Tune the output for your codebase.
fieldFormat(snake_case/camelCase),client("fhirpy"/"none"),allowExtraFields, andprimitiveTypeExtensionare all toggles on.python({ ... }). - Mix profiles from multiple packages.
APIBuilder.fromPackage()chains — US Core alongside your custom IG or a regional base.localStructureDefinitions()pulls in profiles straight from a folder ofStructureDefinitionJSON.
Wrap Up
The generator emits both the base R4 Pydantic models and a thin profile-class layer on top — no runtime DSL, no ORM, no framework. to_resource() always gives you a plain Pydantic resource, and model_dump(by_alias=True, exclude_none=True) always gives you plain FHIR JSON you can send to any server.
@atomic-ehr/codegen is MIT-licensed; issues and PRs welcome.
GitHub | NPM | US Core IG




