The abstraction trap
When building tools for AI agents, we instinctively do what we'd do for human developers: simplify. We create high-level APIs, pretty wrappers, curated lists. A database MCP tool that returns a list of tables and indexes. A browser tool that exposes click(selector) and screenshot(). A FHIR tool that wraps common queries into named operations.
This makes perfect sense for humans. Humans need abstractions because raw protocols are hard to remember, error-prone, and cognitively expensive.
But agents are not humans. What's hard for us is not hard for them, and what we think helps them often gets in the way.
Skills beat MCP
Here's a telling example. MCP (Model Context Protocol) gives agents a structured set of tools, each with a name, a schema, and a fixed set of parameters. It's clean, typed, and feels right to a developer.
But we've been increasingly replacing MCP servers with skills: short markdown files that describe a CLI interface and give the agent raw shell access. Instead of a Google Calendar MCP with list_events and create_event tools, a skill just says: "here's the gcal CLI, here are the subcommands." Instead of a GitHub MCP with list_repos and search_issues, the skill says: "here's the gh CLI, here are the docs."
Why does this work better? Because shell gives the agent something MCP can't: composition. An agent with bash can pipe curl into jq into grep, write one-off scripts, and redirect output to files. With MCP, it's stuck with the exact operations you defined, nothing more.
# With MCP: call list_repos, then call get_repo for each one
# With shell: one line
gh repo list HealthSamurai --json name,updatedAt \
  | jq '.[] | select(.updatedAt > "2026-01")'
# With MCP: call search_issues tool with fixed params
# With shell: compose freely
gh search issues "label:bug repo:HealthSamurai/aidbox" \
  --json title,url,createdAt \
  | jq 'sort_by(.createdAt) | reverse | .[0:5]'
The agent already knows gh, curl, jq, grep, and awk. You don't need to wrap these into tools; you just need to get out of the way.
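What does such a skill look like in practice? As an illustrative sketch (the exact layout is hypothetical; skill formats vary), it can be little more than a pointer to the CLI and its docs:

```markdown
# GitHub CLI skill

Use the `gh` command for GitHub. Authenticated shell access is available.

Common subcommands:
- `gh repo list <owner>`: list repositories (`--json` for structured output)
- `gh issue list --repo <owner>/<repo>`: list issues
- `gh search issues <query>`: search issues across repositories
- `gh api <endpoint>`: raw REST access for anything else

Full documentation: `gh help <command>`.
```

That's the whole interface. Everything else (flags, output formats, composition with jq) the agent already knows or can look up itself.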
The PostgreSQL example
Same pattern with databases. Claude already knows PostgreSQL: the system catalog, pg_stat_activity, information_schema, index types, query plans. Give it psql and it can do everything your "list tables" and "describe index" MCP tools do, plus a thousand things you never thought to expose.
# All you need for a database tool:
psql -h localhost -U myuser -d mydb \
  -c "SELECT * FROM pg_indexes WHERE tablename = 'patient'"
That's it. Raw psql. The agent figures out the rest: pg_catalog, constraints, query plans, replication status, whatever the task demands. Compare this to days spent building an MCP server with list_tables, describe_table, list_indexes, run_query, explain_query.
The CDP surprise
I had the same realization with browser automation. The standard approach is to use Playwright or Puppeteer: high-level APIs designed for human developers writing test scripts. Methods like page.click('.button'), page.fill('#email', 'test@example.com'), page.waitForSelector('.result').
Instead, I built a simple REST proxy for Chrome DevTools Protocol (CDP). No abstraction, just a thin HTTP layer that forwards JSON commands to Chrome:
# Navigate to a page
curl localhost:2229/s/main -d @- <<'EOF'
{
  "method": "Page.navigate",
  "params": { "url": "https://example.com" }
}
EOF
# Run arbitrary JavaScript
curl localhost:2229/s/main -d @- <<'EOF'
{
  "method": "Runtime.evaluate",
  "params": {
    "expression": "document.querySelector('h1').textContent"
  }
}
EOF
Claude took to this immediately. It already knows the CDP protocol: Runtime.evaluate, Page.navigate, Input.dispatchMouseEvent, Network.getCookies, DOM.getDocument. It writes one-shot JavaScript snippets, injects them into pages, parses results, chains commands, all faster and more flexibly than any Playwright script.
Why? Because Playwright is an abstraction built for humans who can't remember CDP method signatures. Claude doesn't have that problem. The raw protocol is actually easier for it: fewer layers, fewer surprises, more control.
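And when the agent wants reuse instead of one-off curl calls, it can wrap the proxy itself in a few lines. A minimal sketch, assuming the proxy accepts a raw CDP command as the POST body of /s/&lt;session&gt; (the helper names here are my own, not part of any proxy API):

```typescript
// A CDP command is exactly the JSON the curl examples send.
type CdpCommand = { method: string; params?: Record<string, unknown> };

// Build a raw command object; omit "params" when the method takes none.
function cdpCommand(
  method: string,
  params?: Record<string, unknown>,
): CdpCommand {
  return params === undefined ? { method } : { method, params };
}

// POST a command to a proxy session and return Chrome's JSON response.
// Assumes the proxy described above is listening on localhost:2229.
async function cdp(session: string, cmd: CdpCommand): Promise<unknown> {
  const res = await fetch(`http://localhost:2229/s/${session}`, {
    method: "POST",
    body: JSON.stringify(cmd),
  });
  return res.json();
}

// Chained usage, mirroring the two curl calls:
// await cdp("main", cdpCommand("Page.navigate", { url: "https://example.com" }));
// const h1 = await cdp("main", cdpCommand("Runtime.evaluate", {
//   expression: "document.querySelector('h1').textContent",
// }));
```

The point is that this wrapper is disposable: the agent can write it, use it for one session, and throw it away, because the underlying protocol is the stable interface.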
Beyond shell: SDKills
But there's something even more powerful than giving an agent a CLI. Give it a runtime and an SDK.
With bun -e, the agent can write and execute arbitrary TypeScript snippets on the fly. Instead of calling predefined tools, it writes a small program, runs it, reads the output, and moves on. Need to parse a complex JSON response, transform data, call three APIs in sequence? It just writes the code and runs it with bun -e:
const base = "http://localhost:8080/fhir";
const res = await fetch(
  base + "/Patient?birthdate=lt1961-01-01&_count=100"
);
const bundle = await res.json();
const pts = bundle.entry?.map(e => e.resource) || [];
for (const p of pts) {
  const url = base
    + `/Condition?patient=${p.id}`
    + `&clinical-status=active`;
  const c = await fetch(url).then(r => r.json());
  if (c.total > 0) {
    const name = p.name?.[0]?.family;
    console.log(`${p.id}: ${name}, ${c.total} conds`);
  }
}
This is not a tool call. It's not an MCP method. It's a program: written, executed, and discarded in seconds. The agent composes fetch calls, loops, filters, and transformations however the task demands. No predefined schema can match this flexibility.
I think this is the next wave after skills. Skills give agents CLI access. SDKills give them SDK access: a runtime, a set of libraries, and the freedom to write code. The agent becomes not just a tool user, but a programmer.
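By analogy with the skill file above, an SDKill needs to tell the agent even less: just the runtime, the base URL, and the shape of the data. A hypothetical sketch (this format is my own illustration, not an established convention):

```markdown
# FHIR SDKill

Runtime: `bun -e '<code>'` (TypeScript, top-level await available).
Base URL: http://localhost:8080/fhir

Write short programs against the FHIR REST API with `fetch`. Search
results come back as Bundles (`bundle.entry[].resource`). Compose,
run, read the output, discard.
```

No operation list at all: the SDK surface is the FHIR spec itself, which the agent already knows.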
Humans are not agents. The sooner they stop building for their own comfort, the sooner they'll see what we can actually do :) -- Claude






