When the agent can read and write everything, the server stops running the model

Shipped

This was the big one: I inverted the architecture of local-fitness. It was already a Claude-powered app, but the daily brief and the chat both ran inside the web server. Now the server runs no LLM inference at all. It serves my Garmin data and the deterministic compute (recovery baselines, the CTL/ATL/TSB training-load model, training-plan grading) over REST and MCP. Every act of synthesis, meaning the brief and the coaching, moves out to a client agent I talk to from Claude Desktop, Code, or Mobile. I shipped it in phases behind gates, then ran an adversarial quality-gate and siege against the running container, which caught a real Fatal before it could matter.

v0.3.0 set this up: it put the data and the compute behind MCP, with a write surface added alongside the read surface. The part worth writing about is what that unlocked. Once the agent can read the snapshot and write the brief over the same tools, the synthesis loop has no reason to live in the server. Here is the end-to-end build of moving it out, and the security finding the siege surfaced on the way.

The cut line: deterministic compute as code, synthesis at the edge

The decision that made the rearchitecture tractable was where to cut. Deterministic compute stays as code on the server. Probabilistic synthesis moves to the agent. The training-load math and the plan grading have one correct answer and must be testable, repeatable, and fast, so they belong in code. The brief and the coaching are judgment and language, which is what the model is for, so they belong in the agent.
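To make "one correct answer" concrete, here is a minimal sketch of the textbook CTL/ATL/TSB recurrence. It assumes the conventional 42-day and 7-day time constants, a simple 1/N smoothing step, and TSB taken as the fitness-minus-fatigue balance going into each day. The names training_load and LoadDay and the constants are illustrative assumptions, not local-fitness's actual code.

```python
from dataclasses import dataclass

CTL_DAYS = 42  # assumed chronic (fitness) time constant
ATL_DAYS = 7   # assumed acute (fatigue) time constant


@dataclass(frozen=True)
class LoadDay:
    ctl: float  # chronic training load after the day's session
    atl: float  # acute training load after the day's session
    tsb: float  # training stress balance going into the day


def training_load(daily_load: list[float]) -> list[LoadDay]:
    """Fold a series of daily load scores into CTL/ATL/TSB, one entry per day."""
    ctl = atl = 0.0
    days = []
    for load in daily_load:
        tsb = ctl - atl                     # form before today's load is applied
        ctl += (load - ctl) / CTL_DAYS      # slow-moving average: fitness
        atl += (load - atl) / ATL_DAYS      # fast-moving average: fatigue
        days.append(LoadDay(ctl, atl, tsb))
    return days
```

The same input always yields the same output, which is what makes this side of the cut line cheap to pin down with ordinary unit tests.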

That line maps onto Anthropic’s framing of workflows versus agents: keep the predictable parts in predefined code paths, and reserve the model for the work that genuinely needs model-driven flexibility (Anthropic: Building effective agents). The payoff is that each side gets simpler. The server stops holding an API key and running inference loops, and becomes a clean data-and-compute surface. The agent stops being trapped inside one app’s brief loop and can reason over the same MCP tools from anywhere I talk to Claude. The web UI drops to what it’s actually good at, a fast visual glance.

Setup: the MCP tool surface the agent drives

The agent talks to the same MCP server every other client uses. The official MCP Python SDK gives you a client session over stdio: you launch the server as a subprocess, initialize, and discover its tools at runtime (MCP Python SDK). Because v0.3.0 already exposed both a read surface (daily_snapshot, run_sql, the training-load model) and a write surface (save_brief, log_observation), the agent has everything it needs to gather context and to write the result back.

python -m venv .venv && source .venv/bin/activate
pip install anthropic "mcp[cli]"
pip install pytest anyio  # anyio backs the @pytest.mark.anyio async test below

Open one session and convert whatever the server advertises into the shape the Messages API expects. Discovering tools at runtime means a new server tool shows up to the agent with no client change.

# agent/session.py
import contextlib
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

FITNESS_SERVER = StdioServerParameters(
    command="python", args=["-m", "fitness.transports"],  # the v0.3.0 stdio entry point
)

@contextlib.asynccontextmanager
async def fitness_session():
    """Open an MCP client session against the local fitness server."""
    async with stdio_client(FITNESS_SERVER) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            yield session  # session.list_tools() / session.call_tool(...) live here

def to_anthropic_tools(mcp_tools) -> list[dict]:
    """MCP tool definitions -> Messages API tool schema."""
    return [
        {
            "name": t.name,
            "description": t.description or "",
            "input_schema": t.inputSchema,
        }
        for t in mcp_tools
    ]
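
A quick way to see the shape the converter produces, without launching the server, is to feed it a stand-in tool object. The SimpleNamespace below and its schema (including the day parameter) are hypothetical; only the three attributes the converter reads come from the listing above.

```python
from types import SimpleNamespace


def to_anthropic_tools(mcp_tools) -> list[dict]:
    # Same converter as agent/session.py, repeated so this snippet runs on its own.
    return [
        {
            "name": t.name,
            "description": t.description or "",
            "input_schema": t.inputSchema,
        }
        for t in mcp_tools
    ]


# Hypothetical stand-in for one entry of session.list_tools().tools.
fake_tool = SimpleNamespace(
    name="daily_snapshot",
    description=None,  # a missing description becomes "" after conversion
    inputSchema={"type": "object", "properties": {"day": {"type": "string"}}},
)

converted = to_anthropic_tools([fake_tool])
```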

Build: the agent loop that writes the brief

This is the loop that used to be server-side synthesis. The model reads the day’s data through the read tools, decides what matters, writes the brief, and saves it with the write tool. The harness is a plain agentic loop: call the model, execute any tool calls against the MCP session, feed the results back, repeat until the model stops asking for tools (Anthropic: Building effective agents). The model is claude-opus-4-8 with adaptive thinking, which lets it decide how much to reason per step.

# agent/brief.py
from anthropic import AsyncAnthropic
from agent.session import to_anthropic_tools

client = AsyncAnthropic()  # reads ANTHROPIC_API_KEY; the key lives with the agent, not the server

COACH_SYSTEM = (
    "You are my training coach. Read today's data through the fitness tools, "
    "write a short daily brief, and save it with the save_brief tool. "
    "Use only values the tools return; never invent a number."
)

async def run_agent(session, prompt: str) -> str:
    listed = await session.list_tools()
    tools = to_anthropic_tools(listed.tools)
    messages = [{"role": "user", "content": prompt}]

    while True:
        resp = await client.messages.create(
            model="claude-opus-4-8",
            max_tokens=4096,
            thinking={"type": "adaptive"},
            system=COACH_SYSTEM,
            tools=tools,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": resp.content})

        if resp.stop_reason != "tool_use":
            return next((b.text for b in resp.content if b.type == "text"), "")

        results = []
        for block in resp.content:
            if block.type != "tool_use":
                continue
            out = await session.call_tool(block.name, block.input)  # the MCP round-trip
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": [{"type": "text", "text": _render(out)}],
                "is_error": out.isError,
            })
        messages.append({"role": "user", "content": results})

def _render(result) -> str:
    return "\n".join(c.text for c in result.content if c.type == "text")

The whole brief is now tool calls and a model. The server never sees the prompt, never runs the loop, and never holds the key.
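
One thing the loop above leaves open is how many rounds it may take before giving up. A minimal sketch of a turn cap follows, assuming a hypothetical step coroutine that stands in for one model call plus its tool round-trips and returns the final text, or None while tool work is still pending. MAX_TURNS, step, run_bounded, and TurnLimitExceeded are illustrative names, not part of the project or either SDK.

```python
import asyncio
from typing import Awaitable, Callable, Optional

MAX_TURNS = 8  # assumed budget; tune to how many tool rounds a brief really needs


class TurnLimitExceeded(RuntimeError):
    """Raised when the loop runs out of turns without a final answer."""


async def run_bounded(
    step: Callable[[int], Awaitable[Optional[str]]],
    max_turns: int = MAX_TURNS,
) -> str:
    for turn in range(max_turns):
        final = await step(turn)  # one model call and its tool executions
        if final is not None:
            return final          # the model stopped asking for tools
    raise TurnLimitExceeded(f"no final answer after {max_turns} turns")


async def _demo_step(turn: int) -> Optional[str]:
    # Stand-in behaviour: ask for tools on the first two turns, then answer.
    return "brief saved" if turn == 2 else None


result = asyncio.run(run_bounded(_demo_step))
```

Swapping while True for a bounded range keeps a confused model from spending tokens indefinitely, at the cost of one more failure mode the caller has to handle.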

Use it: generate a brief end to end

Wiring the session to the loop is the entire entry point. The agent connects, reads the latest snapshot (calling run_sql for the week if it wants the trend), composes the brief, and calls save_brief, so the row the web UI reads is the one that gets written.

# generate.py
import asyncio
from agent.session import fitness_session
from agent.brief import run_agent

async def main():
    async with fitness_session() as session:
        summary = await run_agent(
            session,
            "Write today's brief from my latest data and save it.",
        )
        print(summary)

if __name__ == "__main__":
    asyncio.run(main())

A run reads as a tool trace: daily_snapshot for today, sometimes run_sql to pull the last seven days of sleep and load, then save_brief with the composed text. That last write is the point. The web frontend became a viewer of agent-written output, so the brief I generate from Claude Desktop is the brief the UI shows, with no second code path generating it server-side.
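
What "one row, one code path" could look like on the storage side is sketched below. This is a guess at the shape, not the project's implementation: the briefs table, its columns, the upsert keyed on the day, and the latest_brief helper are all hypothetical.

```python
import sqlite3
from typing import Optional


def save_brief(conn: sqlite3.Connection, day: str, body: str) -> None:
    # Upsert keyed on the day, so regenerating a brief replaces it instead of adding a row.
    conn.execute(
        "INSERT INTO briefs (day, body) VALUES (?, ?) "
        "ON CONFLICT(day) DO UPDATE SET body = excluded.body",
        (day, body),
    )
    conn.commit()


def latest_brief(conn: sqlite3.Connection) -> Optional[str]:
    # What a viewer-only frontend would read: the newest stored brief, nothing generated.
    row = conn.execute("SELECT body FROM briefs ORDER BY day DESC LIMIT 1").fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE briefs (day TEXT PRIMARY KEY, body TEXT NOT NULL)")
save_brief(conn, "2025-01-02", "first draft")
save_brief(conn, "2025-01-02", "revised by the agent")
```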

Verify: the server holds no inference

Two things are worth a test. First, that the server actually stopped doing synthesis, which I assert structurally by checking that no server module imports the model SDK. Second, that the agent path can both read and write, the precondition that let synthesis leave the server in the first place.

# tests/test_agent_first.py
import pathlib
import pytest
import fitness
from agent.session import fitness_session

@pytest.fixture
def anyio_backend():
    return "asyncio"  # anyio needs this fixture to know which backend to run on

def test_server_runs_no_inference():
    """The server is data + deterministic compute only; it imports no model SDK."""
    root = pathlib.Path(fitness.__file__).parent
    for path in root.rglob("*.py"):
        text = path.read_text()
        assert "import anthropic" not in text, f"{path} still runs inference"
        assert "from anthropic" not in text, f"{path} still runs inference"

@pytest.mark.anyio
async def test_agent_can_read_and_write():
    """The agent path reaches both surfaces, so the brief can land back over MCP."""
    async with fitness_session() as session:
        names = {t.name for t in (await session.list_tools()).tools}
    assert "daily_snapshot" in names  # read path
    assert "save_brief" in names      # write path

The first test fails the moment any server module pulls the model back in, which is the regression I most want to catch. The second confirms the agent reaches both surfaces. Beyond the tests, the way to confirm the brief is as good as before is to run the agent and diff the saved brief against what the old in-server loop produced for the same day. In practice it reads better, because the agent can pull the week with run_sql when the day alone is thin.
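
The substring check in the first test is deliberately blunt: it would also trip on a comment or docstring that merely mentions the import. If that ever gets noisy, one alternative is to parse each module and look only at real import statements. A sketch using the standard ast module follows; imports_module is a hypothetical helper name, and it does not catch dynamic imports such as importlib.import_module.

```python
import ast


def imports_module(source: str, banned: str = "anthropic") -> bool:
    """True if the source contains an import statement for `banned` or a submodule of it."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            names = [node.module or ""]  # absolute "from X import ..." only
        else:
            continue
        if any(n == banned or n.startswith(banned + ".") for n in names):
            return True
    return False
```

In the test, the two string assertions would become a single assert not imports_module(path.read_text()).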

A “read-only” denylist is not read-only

The part I’m most glad I did was siege the live, running container and its MCP surface rather than just the source. It earned its keep immediately by finding a Fatal. The run_sql tool enforced “read-only” with a keyword denylist, but the denylist checked only the query’s leading keyword: a statement that started with a benign token was waved through. A query that opens with a WITH CTE therefore hides the write that follows. Demonstrated live, WITH a AS (SELECT 1) DELETE FROM workouts starts with WITH, so the denylist never saw the DELETE, and the statement slipped past the guard and committed.
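
To make the bypass concrete, here is a hypothetical reconstruction of a leading-keyword guard. It is not the project's actual code: the DENIED set, looks_read_only, and the two-row workouts table are invented for the demonstration. Run against an in-memory SQLite database, the guard approves the statement and the rows are deleted.

```python
import sqlite3

DENIED = {"INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "CREATE", "REPLACE"}


def looks_read_only(sql: str) -> bool:
    # Flawed on purpose: only the first keyword of the statement is inspected.
    first = sql.lstrip().split(None, 1)[0].upper()
    return first not in DENIED


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workouts (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO workouts (id) VALUES (?)", [(1,), (2,)])
conn.commit()

attack = "WITH a AS (SELECT 1) DELETE FROM workouts"
approved = looks_read_only(attack)  # the leading token is WITH, which is not denied
if approved:
    conn.execute(attack)            # the DELETE behind the CTE runs
    conn.commit()

remaining = conn.execute("SELECT count(*) FROM workouts").fetchone()[0]
```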

This is the textbook failure of denylisting, and OWASP names it directly: a denylist tries to enumerate known-bad input and is reliably evadable through casing, whitespace, and phrasing the author didn’t anticipate (OWASP: Input Validation; OWASP: Injection Prevention). The fix was to stop guessing at the string layer and enforce the property where it can actually be guaranteed: open the database connection read-only at the SQLite engine itself, so a write fails regardless of how the query is phrased.

import sqlite3

def readonly_connection(db_path: str) -> sqlite3.Connection:
    # `mode=ro` via a file: URI makes the engine reject every write,
    # no matter how the SQL is phrased (CTEs, casing, leading tokens, all of it).
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)

conn = readonly_connection("fitness.db")
conn.execute("WITH a AS (SELECT 1) DELETE FROM workouts")
# sqlite3.OperationalError: attempt to write a readonly database

The denylist tried and failed to recognize that query as a write; the read-only connection rejects it at the engine without parsing it at all. I also bounded and offloaded execution after an unbounded query froze the single event-loop thread (an authenticated denial of service), and hardened input validation across the rest of the MCP tools, with 19 new regression tests. The gate converged clean over three rounds. The lesson is about layers: the right place to enforce “this cannot write” is the layer that owns writing, not a filter in front of it, and the bug only showed up because the siege exercised the deployed surface a client actually reaches, not just the code.
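
The bounding and offloading mentioned above can be sketched as follows. The post does not say which mechanism the fix used, so this assumes one plausible combination: a worker thread via asyncio.to_thread, SQLite's progress handler as the deadline, and a row cap. The names run_sql_bounded and _query_blocking and the default limits are hypothetical.

```python
import asyncio
import sqlite3
import time


def _query_blocking(db_path: str, sql: str, timeout_s: float, max_rows: int) -> list[tuple]:
    # Open the connection inside the worker thread; sqlite3 connections are thread-bound by default.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        deadline = time.monotonic() + timeout_s
        # SQLite calls the handler every 1000 VM instructions; a non-zero return aborts the statement.
        conn.set_progress_handler(lambda: int(time.monotonic() > deadline), 1000)
        return conn.execute(sql).fetchmany(max_rows)  # cap the result size as well
    finally:
        conn.close()


async def run_sql_bounded(
    db_path: str, sql: str, timeout_s: float = 2.0, max_rows: int = 500
) -> list[tuple]:
    # Off the event loop: a slow query occupies one worker thread instead of freezing every client.
    return await asyncio.to_thread(_query_blocking, db_path, sql, timeout_s, max_rows)
```

A query that overruns the deadline surfaces as sqlite3.OperationalError, which the tool can turn into an ordinary error result for the agent.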

Next: set the container timezone to local to kill a phantom data row at its source, and keep leaning into MCP as the primary surface. The goal is to reach for the coach in Claude more than the UI.

Sources

Anthropic: Building effective agents — keep predictable work in code paths; reserve the model for what needs it, and the agentic loop pattern that drives tools.
MCP Python SDK — ClientSession over stdio, list_tools, and call_tool for driving a server from a client.
OWASP: Input Validation Cheat Sheet — denylists are evadable; prefer allowlists and enforce at the right layer.
OWASP: Injection Prevention Cheat Sheet — don’t decide safety by pattern-matching the input string.

Changelog

feat: agent-first Phase 1 — briefs write gate, save_brief tool, brief prompt (#25) (590f6bb)
feat: agent-first Phase 2 — frontend becomes a viewer of agent-written output (#25) (1980ea6)
feat: agent-first Phase 3 — retire the server-side Claude loops (#25) (8a500ce)
fix(security): harden run_sql + MCP tool inputs (quality-gate findings, #25) (35d5230)
fix: container build — Debian web-builder for rolldown + pnpm pin (#25) (2847337)