Grade the state, not the calendar: a date is a stand-in for the question you actually care about

Shipped

I asked the coach how my training plan was going and two things were wrong. Today’s run, already synced from my watch, showed as “pending.” And Saturday, when I took a long recovery walk instead of a run, showed as a missed day. v0.8.0 fixes both, makes walks count on easy days, and corrects the plan tab so its row colors match the grade. The walk-counting change is a rules tweak. The part worth writing about is the shape underneath both bugs: the grader was reading the calendar as a proxy for “did this day succeed,” and the calendar drifts from that question right at the boundary called “today.”

The fix is to grade the day by its outcome, not its date. A completed workout grades immediately, a rest day reads compliant, a half-finished run stays pending instead of booking half credit, and an un-synced day holds pending instead of booking a false miss. Here is the end-to-end build, from the state model to the tests that pin the boundaries down.

A date is a stand-in: model the state, not the clock

The “pending” bug came from grading against the clock. The grader held any day at or after the most recent sync as pending, so that it would not call a day missed before the data had arrived. That is reasonable, except that the most recent synced day is today, so today was always pending even with a finished run sitting in the database. This is the classic split between when something happened and when the system got around to looking: stream-processing systems call it event time versus processing time, where processing time tracks the machine’s wall clock and drifts from the data’s own timeline, especially near the present edge (Apache Flink: Timely Stream Processing). Grading on the wall clock means the answer changes with when you ask, not with what you did.
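To see that drift in miniature, here is a sketch of the old rule next to an outcome-based one. Both functions are hypothetical reconstructions for illustration: old_grade, outcome_grade, and their parameters are names I made up here, not the shipped code. The only point is that the date-based answer depends on when you ask.

```python
from datetime import date


def old_grade(day: date, last_sync: date, run_completed: bool) -> str:
    """Hypothetical reconstruction of the date-based rule (illustration only)."""
    if day >= last_sync:
        # At or after the most recent sync: hold, in case data is still coming.
        return "pending"
    return "done" if run_completed else "missed"


def outcome_grade(run_completed: bool, is_synced: bool) -> str:
    """Hypothetical outcome-based rule: the date is not an input at all."""
    if run_completed:
        return "done"
    return "missed" if is_synced else "pending"


today = date(2026, 6, 24)

# A finished, synced run, asked about on the day it happened.
asked_today = old_grade(today, last_sync=today, run_completed=True)
# The same run, asked about after a sync one day later.
asked_tomorrow = old_grade(today, last_sync=date(2026, 6, 25), run_completed=True)
# The outcome-based rule gives one answer regardless of when it is asked.
by_outcome = outcome_grade(run_completed=True, is_synced=True)

print(asked_today, asked_tomorrow, by_outcome)  # pending done done
```

The same finished run reads pending while its day is the latest synced day and done once a later sync moves the edge past it. The outcome-based rule says done both times.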

So model the thing you actually care about. A day has a plan (what was scheduled) and an actual (what happened), and the grade is a function of those two, not of the date. This is a pure mapping: a decision table that takes (plan, actual) to a verdict, with no transitions and no memory carried between days (Decision table). Start with the model.

# grading/model.py
from dataclasses import dataclass
from datetime import date
from enum import Enum

class Plan(Enum):
    RUN  = "run"     # a specific workout was scheduled
    EASY = "easy"    # an easy day; a recovery walk counts as done
    REST = "rest"    # nothing scheduled, on purpose

class Actual(Enum):
    COMPLETED = "completed"  # the planned work finished
    WALK      = "walk"       # a walk was recorded (counts as done on an easy day)
    PARTIAL   = "partial"    # started, not finished (a run in progress)
    NONE      = "none"       # nothing recorded for the day

@dataclass(frozen=True)
class Day:
    on: date
    plan: Plan
    actual: Actual
    is_synced: bool          # has this day's data arrived from the watch?

Nothing here references “today.” A day carries everything the grader needs: what was asked, what was done, and whether the data is in. The calendar becomes one ordinary field instead of the hidden axis the whole decision pivots on.
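One consequence of frozen=True is worth a quick sketch: a Day is a value, so a late sync is naturally modeled as a new Day rather than an edit to the old one. This standalone version uses plain strings for plan and actual to stay short (the real model uses the enums above), and treating a sync as dataclasses.replace is my assumption about usage, not something the shipped code is confirmed to do.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Day:
    on: date
    plan: str        # simplified to a string for this sketch
    actual: str      # simplified to a string for this sketch
    is_synced: bool


before = Day(date(2026, 6, 24), "run", "none", is_synced=False)

# Assigning to a field of a frozen dataclass raises FrozenInstanceError.
try:
    before.is_synced = True
    mutated = True
except FrozenInstanceError:
    mutated = False

# A sync arriving produces a new value; the old one is left untouched.
after = replace(before, actual="completed", is_synced=True)

print(mutated, before.is_synced, after.is_synced)  # False False True
```

Because the grader only ever sees immutable values, grading the same Day twice cannot give two different answers.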

grade_day: map state to verdict by outcome

Now the function. Grade the outcome first, then apply one narrow rule for uncertainty: hold pending only when the read is not already a settled success and the day genuinely is not finished, either because the data has not synced or because a run is still in progress. That keeps a finished day from ever sitting at pending, and keeps an un-synced or half-done day from booking a false miss or false half-credit.

# grading/grade.py
from enum import Enum
from grading.model import Day, Plan, Actual

class Verdict(Enum):
    DONE      = "done"       # planned work finished
    COMPLIANT = "compliant"  # did what the day asked (a rest day taken at rest)
    MISSED    = "missed"     # work was scheduled and none came
    PARTIAL   = "partial"    # internal to evaluate(); grade_day collapses it to PENDING, so the UI never sees it
    PENDING   = "pending"    # can't judge yet; data may still arrive

SETTLED_SUCCESS = {Verdict.DONE, Verdict.COMPLIANT}

def evaluate(day: Day) -> Verdict:
    """Map (plan, actual) -> verdict purely by outcome. No reference to the date."""
    if day.plan is Plan.REST:
        return Verdict.COMPLIANT          # rest grades compliant regardless of what's recorded
    if day.plan is Plan.EASY and day.actual in (Actual.COMPLETED, Actual.WALK):
        return Verdict.DONE               # the headline rule: a walk counts as done on an easy day
    if day.actual is Actual.COMPLETED:
        return Verdict.DONE               # planned work finished
    if day.actual is Actual.PARTIAL:
        return Verdict.PARTIAL            # half a run, not yet anything
    return Verdict.MISSED                 # scheduled work, nothing recorded (a walk on a run day lands here)

def grade_day(day: Day) -> Verdict:
    verdict = evaluate(day)
    if verdict in SETTLED_SUCCESS:
        return verdict                    # done / compliant grade immediately, today included
    # verdict is partial or missed: only final once the day is settled.
    settled = day.is_synced and day.actual is not Actual.PARTIAL
    return verdict if settled else Verdict.PENDING

Trace the four cases that broke before. A completed run returns DONE straight from evaluate, regardless of date, so today no longer hangs. A rest day returns COMPLIANT regardless of what was recorded, so a planned day off never reads as a miss, whether nothing was logged or you slipped in an easy jog. A partial run is never “settled,” so it returns PENDING rather than half credit that quietly heals later; a review round flagged that one before it shipped. PARTIAL is therefore an internal value that evaluate emits and grade_day always folds into PENDING, which means the four verdicts a caller can actually receive are DONE, COMPLIANT, MISSED, and PENDING. An un-synced day with nothing recorded is also not settled, so it holds PENDING instead of a false miss. Only a synced day that scheduled work and recorded none returns MISSED, which is the one case that is truly a miss.

The headline walk-counting rule is the one explicit branch on Plan.EASY: on an easy day a recorded walk maps to DONE exactly like a completed workout, which is why Saturday’s recovery walk now grades green. On a RUN day that same walk is not the scheduled run, so it falls through to MISSED; the plan type, not the activity alone, decides whether a walk counts.
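Because the grader is a pure mapping, the whole decision table can be enumerated. The sketch below restates evaluate and grade_day over plain strings so it runs on its own (a simplification for illustration; the shipped code uses the enums), then tabulates every (plan, actual, is_synced) combination.

```python
from itertools import product

PLANS = ("run", "easy", "rest")
ACTUALS = ("completed", "walk", "partial", "none")


def evaluate(plan: str, actual: str) -> str:
    """String-keyed restatement of the outcome rules, for illustration."""
    if plan == "rest":
        return "compliant"
    if plan == "easy" and actual in ("completed", "walk"):
        return "done"
    if actual == "completed":
        return "done"
    if actual == "partial":
        return "partial"
    return "missed"


def grade_day(plan: str, actual: str, is_synced: bool) -> str:
    verdict = evaluate(plan, actual)
    if verdict in ("done", "compliant"):
        return verdict
    # partial or missed is only final once the day is settled
    settled = is_synced and actual != "partial"
    return verdict if settled else "pending"


# Every input combination mapped to its public verdict: 3 * 4 * 2 = 24 rows.
table = {
    (plan, actual, synced): grade_day(plan, actual, synced)
    for plan, actual, synced in product(PLANS, ACTUALS, (True, False))
}

for (plan, actual, synced), verdict in table.items():
    print(f"{plan:<5} {actual:<10} synced={synced!s:<6} -> {verdict}")
```

All 24 rows land on one of the four public verdicts, and exactly three read missed: a synced run day with a walk, a synced run day with nothing, and a synced easy day with nothing.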

Grade a stretch of days

The UI grades a window of days and renders one verdict each. Because grade_day is pure over the Day model, scoring a week is a map, and the per-day verdicts fall straight out.

from datetime import date
from grading.model import Day, Plan, Actual
from grading.grade import grade_day

week = [
    Day(date(2026, 6, 20), Plan.RUN,  Actual.COMPLETED, is_synced=True),   # Sat target run
    Day(date(2026, 6, 21), Plan.REST, Actual.NONE,      is_synced=True),   # planned rest
    Day(date(2026, 6, 22), Plan.EASY, Actual.WALK,      is_synced=True),   # easy day, a walk counts as done
    Day(date(2026, 6, 23), Plan.RUN,  Actual.PARTIAL,   is_synced=True),   # run in progress
    Day(date(2026, 6, 24), Plan.RUN,  Actual.NONE,      is_synced=False),  # data not synced yet
    Day(date(2026, 6, 25), Plan.RUN,  Actual.NONE,      is_synced=True),   # scheduled, nothing ran
]

for d in week:
    print(d.on, grade_day(d).value)

2026-06-20 done
2026-06-21 compliant
2026-06-22 done
2026-06-23 pending
2026-06-24 pending
2026-06-25 missed

Adherence then reads off those verdicts, and the key choice is what to leave out: pending days are unsettled, so counting them either way is a guess. They drop out of the denominator entirely instead of dragging the rate down.

from grading.grade import grade_day, Verdict

def adherence(days) -> float | None:
    """Share of graded days that succeeded; None when every day is still pending."""
    verdicts = [grade_day(d) for d in days]
    graded = [v for v in verdicts if v is not Verdict.PENDING]  # pending isn't a judgment yet
    if not graded:
        return None                       # nothing settled, so there is no rate to report
    good = sum(v in {Verdict.DONE, Verdict.COMPLIANT} for v in graded)
    return good / len(graded)

The float | None return uses PEP 604 union syntax, which needs Python 3.10 or newer; on older runtimes write Optional[float] instead. A pending day no longer counts against you, and it no longer needs a backfill job to “correct” the rate once the data lands; the next grade is simply right.
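A quick worked example makes the denominator choice concrete. This sketch takes verdict strings directly (adherence_from_verdicts and naive are illustrative names, not part of the module) and uses the six verdicts from the week above.

```python
from typing import List, Optional


def adherence_from_verdicts(verdicts: List[str]) -> Optional[float]:
    """Same rule as adherence(), applied to already-computed verdict strings."""
    graded = [v for v in verdicts if v != "pending"]   # pending drops out entirely
    if not graded:
        return None
    good = sum(v in ("done", "compliant") for v in graded)
    return good / len(graded)


week = ["done", "compliant", "done", "pending", "pending", "missed"]

rate = adherence_from_verdicts(week)
# For contrast: what the rate would be if pending days counted as failures.
naive = sum(v in ("done", "compliant") for v in week) / len(week)

print(rate, naive)  # 0.75 0.5
```

Three good days out of four graded gives 0.75. Counting the two pending days as failures would report 0.5 for the same week, and that number would then shift on its own once the data landed.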

One verdict, computed once

The most useful catch came from review, not tests. The backend was clean and unit-tested, but a second-look review flagged that the plan tab colored each row by recomputing a pace miss on its own, independent of the verdict the grader had already produced. With walks now counting as done on easy days, a completed walk has a pace far slower than a run target, so that done day would have rendered red anyway. Two parts of the system computed “did this day succeed,” and they disagreed.

That is the single-source-of-truth problem in miniature: every fact should have one authoritative home, and everything else should read from it rather than keep its own copy (Single source of truth). It is also the core of how React tells you to structure state, where you derive values from the source instead of storing or recomputing them separately so they cannot fall out of sync (React: Choosing the State Structure), and it is Don’t Repeat Yourself applied to a decision rather than to text (Don’t repeat yourself). So the row reads the verdict and nothing else.

// PlanRow.tsx — color comes from the verdict, never recomputed from pace

// Mirrors the four verdicts grade_day(...).value can actually return.
// (The server enum also has "partial", but grade_day collapses it to "pending", so it never reaches the client.)
type Verdict = "done" | "compliant" | "pending" | "missed";

interface GradedDay {
  verdict: Verdict;   // the one authoritative grade from the server
  // ...plus whatever the row renders: date, pace, distance, etc.
}

const COLOR: Record<Verdict, string> = {
  done: "row--green",
  compliant: "row--green",
  pending: "row--gray",
  missed: "row--red",
};

function PlanRow({ day }: { day: GradedDay }) {
  // day.verdict was produced once by grade_day on the server (its enum .value).
  // This row does not look at pace or distance to second-guess it.
  return <tr className={COLOR[day.verdict]}>{/* cells */}</tr>;
}

After the change I confirmed it with a screenshot: Saturday’s walk shows green at a 17-minute mile, today grades done, and adherence reads 100 percent. The unit tests had passed the whole time, because the grader was right; what they could not see was a second component re-deriving the grader’s decision and getting a different answer. That inconsistency lives between layers, not inside one, which is exactly the gap a fresh-eyes review is for.
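To make the disagreement concrete, here is a sketch of the two ways to color a row, written in Python to match the grader examples. color_from_pace is a hypothetical stand-in for the old per-row recompute, and the 10-minute target pace is an assumed number for illustration; only the 17-minute walk comes from the real case.

```python
TARGET_PACE_MIN_PER_MILE = 10.0   # assumed run target, illustration only

COLOR = {
    "done": "row--green",
    "compliant": "row--green",
    "pending": "row--gray",
    "missed": "row--red",
}


def color_from_pace(pace_min_per_mile: float) -> str:
    """Hypothetical old behavior: the row re-derives success from pace."""
    on_target = pace_min_per_mile <= TARGET_PACE_MIN_PER_MILE
    return "row--green" if on_target else "row--red"


def color_from_verdict(verdict: str) -> str:
    """Current behavior: the row only looks up the grader's verdict."""
    return COLOR[verdict]


# An easy-day recovery walk: graded done, but far slower than a run target.
walk_row = {"verdict": "done", "pace_min_per_mile": 17.0}

recomputed = color_from_pace(walk_row["pace_min_per_mile"])
from_verdict = color_from_verdict(walk_row["verdict"])

print(recomputed, from_verdict)  # row--red row--green
```

Each function is correct on its own terms, which is why unit tests on either side would pass. The bug only exists when both are allowed to answer the same question.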

Verify the boundaries

The decisions worth defending are the boundary cases, so each became a test. A finished run grades immediately even when its date is today, a rest day is compliant, a partial run is pending rather than half credit, an un-synced day holds pending rather than a false miss, and a synced scheduled day with nothing recorded is a real miss.

# tests/test_grade_day.py
from datetime import date
from grading.model import Day, Plan, Actual
from grading.grade import grade_day, Verdict

def day(plan, actual, *, synced=True):
    return Day(date(2026, 6, 23), plan, actual, is_synced=synced)

def test_grade_ignores_the_date_field():
    # the real "today" guard: two days identical except `on`, one dated today and
    # one in the past, must grade the same. This is the test that fails the moment
    # grading reads the calendar again.
    today = Day(date.today(),     Plan.RUN, Actual.COMPLETED, is_synced=True)
    past  = Day(date(2020, 1, 1), Plan.RUN, Actual.COMPLETED, is_synced=True)
    assert grade_day(today) == grade_day(past) == Verdict.DONE

def test_completed_run_grades_immediately():
    # a finished run must not sit at pending, and sync state can't change that
    assert grade_day(day(Plan.RUN, Actual.COMPLETED)) is Verdict.DONE
    assert grade_day(day(Plan.RUN, Actual.COMPLETED, synced=False)) is Verdict.DONE

def test_walk_counts_as_done_on_an_easy_day():
    # the headline rule, and that it's scoped to easy days
    assert grade_day(day(Plan.EASY, Actual.WALK)) is Verdict.DONE
    assert grade_day(day(Plan.RUN,  Actual.WALK)) is Verdict.MISSED

def test_rest_day_is_compliant_not_missed():
    # rest grades compliant regardless of sync state
    assert grade_day(day(Plan.REST, Actual.NONE)) is Verdict.COMPLIANT
    assert grade_day(day(Plan.REST, Actual.NONE, synced=False)) is Verdict.COMPLIANT

def test_partial_run_is_pending_not_half_credit():
    assert grade_day(day(Plan.RUN, Actual.PARTIAL)) is Verdict.PENDING

def test_unsynced_day_holds_pending_not_a_false_miss():
    assert grade_day(day(Plan.RUN, Actual.NONE, synced=False)) is Verdict.PENDING

def test_synced_scheduled_nothing_is_a_real_miss():
    assert grade_day(day(Plan.RUN, Actual.NONE, synced=True)) is Verdict.MISSED

Each test names a case the old date-based logic got wrong. The one that does the real guarding is test_grade_ignores_the_date_field: it grades two otherwise-identical days, one dated today and one years in the past, and asserts the same verdict, so it fails the moment grading reads the calendar again. The grading is honest now, and that date-independence is the first thing the tests pin down.

The grading reads outcomes correctly. The natural follow-on is letting the coach fold a recovery walk into its read of the week, not just score it correctly, so an easy walk shapes the next day’s recommendation instead of only landing as a green row.

Sources

Apache Flink: Timely Stream Processing — processing time tracks the machine’s wall clock and drifts from the data’s own event time, which is exactly the gap at “today.”
Decision table — a compact mapping from input conditions to actions; the shape behind grading by (plan, actual) with no transitions or stored state.
Single source of truth — each fact has one authoritative home; everything else reads from it.
React: Choosing the State Structure — derive values from the source instead of recomputing them separately so they cannot desync.
Don’t repeat yourself — a single decision should be expressed in exactly one place.

Changelog

fix: outcome-based plan grading + recovery walks count on easy days (0.8.0) (20b19e0)
docs: design for training-plan grading fixes (today-pending + walks) (f93db60)