The cron got poisoned because writes weren't validated
The symptom was quiet: three scheduled agents — task coordination, weekly review, vault storytelling — weren't running. No loud failure, no exception trail, just absence. APScheduler at boot was looking at each one's cron field, failing to parse it, and moving on without saying much. The schedules they were supposed to fire on came and went, and nobody got the email until enough days had passed that the gap was obvious in the journal.
The reflect app was the culprit. Reflect lets the system reason about its own configuration — it can take an LLM proposal and apply changes to staff agents, prompts, schedules. Its _apply_modifications path was taking the LLM's free-prose detail field and writing it into the agent's cron slot directly. So instead of 0 8,20 * * *, the cron field on task-coordinator was a full sentence the model had written explaining why it wanted the schedule changed. APScheduler tried to parse "Run twice a day in the morning and evening to coordinate the day's plan", quietly threw, and left the agent unscheduled. Three runs of reflect across a few days had quietly disabled three of the system's most-used cron jobs.
The fix was small and the lesson larger. Validation moved upstream: _apply_modifications now runs CronTrigger.from_crontab(detail) as a pre-check before the write, and on failure returns a rejected action with a warning log instead of mutating anything. The agents.json file was hand-repaired back to the original cron strings. The scheduler wakes up at boot now, reads three valid cron fields, and the schedules fire again.
The lesson is the part worth holding onto. We've been writing this principle down piecemeal for weeks — a guard in the journal app to keep multi-paragraph prose from being appended as a single entry, a coercion helper in highlights to normalize loose source-shape inputs, a few others. The pattern keeps surfacing: when a write field has constrained syntax (cron strings, single-line entries, structured objects), and the input shape comes from somewhere loose (LLM output, a stale UI, a third-party caller passing the wrong type), validation has to live at the write boundary — the function that puts the value in storage. Validating at the read site is too late, because by then the bad value is already on disk and the symptom is silent absence. Validating in the UI alone is too fragile, because it bypasses anything that calls the function programmatically.
There's a corollary that's almost more important: when validation fails, the right response is reject loudly, not coerce silently. Earlier versions of similar guards have been tempted to "fix" bad input by coercing — strip the prose, take only the first line, normalize the shape. That sounds helpful and is actually worse, because the caller then thinks their write succeeded and their downstream logic breaks in a different place at a different time. The cron poisoning would have been worse if reflect had silently truncated to the first comma-separated token; we'd have ended up with cron schedules that were almost what the model meant, running at unintended hours.
We're not extracting this to the SDK yet. The journal version uses an asyncio lock plus a single-line check; the highlights one is a coercion-to-dict helper; the new cron one is a parse-or-reject. They're not the same shape. The rule we keep is: extract on the second consumer of the same shape, not the second instance of the same theme. So the tag stays as a watch-list item — when a third write boundary surfaces with the same parse-or-reject pattern, that's the moment to lift it into BaseApp as a generic guard. Until then, three small implementations beat a premature abstraction that fits none of them well.
The agents are running again. The audit log will tell us if any silently break in the future before three weeks of journal absence does.