The audit tools clean themselves up
There's a particular kind of debt that builds up in a project that ships fast: design-language drift. Every page is almost on the token system. Every page has one or two raw hex codes left over from a quick fix three weeks ago. The buttons all look right at a glance, but if you grep for #fff across the codebase you find forty hits, and most of them are on accented backgrounds that should be using a semantic ink token instead. The pages aren't broken; they're just slowly forgetting the rules they were supposed to be following.
The audit skill we'd written months ago, /eos-ui-audit-and-consolidate, finally got pointed at the project itself. Hub, journal, task, projects: each page parsed against the design language, each violation classified, each fix applied with a token from the existing palette. White-on-accent backgrounds became var(--accent-ink). The handful of zombie purples scattered across task and projects collapsed into a single var(--purple) reference. Hardcoded shadow rgbas became var(--shadow). Hand-rolled tints on type and dependency badges switched to color-mix(in srgb, var(--token) 12%, transparent), which is the supported way to do "twelve percent of the brand color" without leaking a literal alpha. About thirty files moved off raw hex in a single pass. None of the visual output changed; the token graph just stopped having holes in it.
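The detection half of a pass like this is easy to picture. What follows is a minimal sketch, not the audit skill's actual code: the function name find_raw_hex and the rule "skip lines that define a custom property" are assumptions made for illustration. It also treats any #abc-shaped string as a color, so an ID selector such as #add would show up as a false positive.

```python
import re

# Matches 3-, 4-, 6-, or 8-digit hex colors such as #fff or #1a2b3c80.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b")


def find_raw_hex(css_text):
    """Return (line_number, literal) pairs for hex colors used outside token definitions.

    Hypothetical helper: assumes tokens are declared one per line as `--name: value;`.
    """
    hits = []
    for lineno, line in enumerate(css_text.splitlines(), start=1):
        # A line that defines a custom property is where a literal is allowed to live.
        if line.lstrip().startswith("--"):
            continue
        for match in HEX_COLOR.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Run over a stylesheet, every hit is a candidate for replacement by a var(--token) reference; deciding which token is the part that still needs a human or the audit skill.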
That cleanup matters more than it looks because every off-token value is a page that won't follow the next theme change. We've shifted accents twice in the project's life, and both times we ended up with patches of pages that didn't get the memo because the original author had inlined a hex value somewhere. Burning the drift down to zero now means the next theme tweak is a config edit, not a hunt.
The session's bigger surprise was the demo experience. Visitors land on demo.binbian.net through the front door, take the tour, click around, and leave — and we'd never actually walked it through someone else's eyes recently. Doing so surfaced four classes of bug we'd been blind to. The tour orchestrator was loaded only on the hub page, so the second navigation killed it; it had to be auto-loaded everywhere via the global frontend. Voice-assistant didn't load the global frontend at all because it's a deliberate visual island, so its tour spotlight had nothing to attach to until we landed it back on the hub's bottom-right assistant pill instead. A handful of tour steps had over-strict capability requirements — ["think","listen","speak"] — that auto-rewrote them to a setup page, breaking the walkthrough on a fresh demo with only the cheapest providers wired. The tour content itself iterated through three drafts before we found the right balance between showing off the system-design philosophy and just letting visitors push buttons.
A second visitor-experience bug surfaced with the demo data itself. The seed file had hardcoded ISO dates from a baseline day, and as wall-clock time advanced past that baseline the seed tasks stopped showing up under "today" and "this week." Someone arriving a month after the seed was written would see an empty task pulse, conclude the system did nothing, and bounce. The fix was small and the principle larger: the demo data needs to move with wall-clock time. We added emoji date syntax (📅 and ✅) to the seed file so dates are parseable, then wrote a refresh script that shifts every date by (today - seed_baseline) days, idempotent after a git checkout. It's cron-friendly. The script is documented next to the deploy notes so future-us doesn't re-derive the same fix from chat.
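The date shift is simple enough to sketch. This is an illustration under stated assumptions, not the refresh script itself: it assumes each date is an ISO YYYY-MM-DD string directly after a 📅 or ✅ marker, and the name shift_dates is made up here. It works as a pure function of the pristine seed text, so running it again on a freshly checked-out seed file on the same day gives the same result.

```python
import re
from datetime import date, timedelta

# Assumed seed syntax: an emoji marker, optional whitespace, then an ISO date.
DATE_MARK = re.compile(r"(📅|✅)(\s*)(\d{4}-\d{2}-\d{2})")


def shift_dates(seed_text, baseline, today):
    """Shift every marked date by (today - baseline) days.

    Hypothetical helper: `baseline` is the day the seed file was written for.
    """
    offset = timedelta(days=(today - baseline).days)

    def bump(match):
        shifted = date.fromisoformat(match.group(3)) + offset
        # Keep the marker and its spacing untouched; only the date moves.
        return f"{match.group(1)}{match.group(2)}{shifted.isoformat()}"

    return DATE_MARK.sub(bump, seed_text)
```

A cron job would read the checked-in seed, call something like shift_dates(text, seed_baseline, date.today()), and write the result where the demo reads it. Relative spacing between tasks is preserved, so "due tomorrow" stays due tomorrow.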
The music library also failed in a way that only shows up when someone else is looking at it. Vault paths with spaces, mid-dot characters, parentheses, or non-ASCII letters weren't URL-encoded, so the <audio src> and <img src> tags silently failed for any song with an interesting filename. The fix had to happen on both sides — a server-side helper that percent-encodes path segments in the API responses, plus a client-side encoder before building the audio src — because the URL flowed through both. Once that landed we caught a separate bug where per-song cover art and screenshots were being cached by directory key, which meant every track in a multi-song folder shared whichever image the cache walked into first. Eight tracks under one album, all wearing the same cover. The fix was a thirty-line filter: when the directory has multiple notes in it, match images by filename stem; when it's a one-song folder, keep the permissive "all art belongs to me" rule.
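Both fixes are small enough to show in miniature. The sketch below is an assumption-laden illustration rather than the shipped code: encode_vault_path and images_for_note are hypothetical names, paths are assumed to use "/" separators, and "match by filename stem" is interpreted as exact stem equality, which may be stricter than the real filter.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def encode_vault_path(path):
    """Percent-encode each segment of a '/'-separated path, keeping the slashes."""
    # safe="" makes quote() encode everything except letters, digits and "_.-~".
    return "/".join(quote(segment, safe="") for segment in path.split("/"))


def images_for_note(note_name, notes_in_dir, images_in_dir):
    """Pick the images that belong to one note in a directory."""
    if len(notes_in_dir) <= 1:
        # One-song folder: every image in the directory belongs to that song.
        return list(images_in_dir)
    # Multi-song folder: only images whose stem equals the note's stem.
    stem = PurePosixPath(note_name).stem
    return [img for img in images_in_dir if PurePosixPath(img).stem == stem]
```

Encoding per segment rather than the whole string is what keeps the directory slashes intact while still escaping spaces, parentheses, and non-ASCII letters inside each name.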
The release flow itself paid off in this session for the first time. We shipped fourteen tagged releases — v0.2.37 through v0.2.50 — through the public-snapshot script in a single sitting, each one a small fix that the demo VPS pulled within minutes. The script does its safety scans (personal-data and third-party-branding patterns) before snapshotting, so a few of the iterations caught me about to ship something I shouldn't. One of those near-misses was the seed album's copyright field; another was a tour step that had been quoting a third-party tool by name. Both got rewritten before they left the laptop.
There's a release-discipline lesson sitting next to those fourteen tags too. Four releases from earlier in the week, v0.2.7 through v0.2.10, had been wasted on the same mistake: pushing a public release without first running an import-test on the changed modules locally. By the time the VPS pulled the snapshot, a missing import or a typo would surface as a 500 on a real visitor instead of a python -c error on my laptop. The fix is mechanical and was added as a memory entry rather than a script: before any release, smoke-test the changed modules with a one-line import. The pattern stays in the head, not in the toolchain, because every release is small enough that the discipline lives in the wrist.
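For anyone who would rather see the habit written down, the one-line import amounts to something like the following. This is only a sketch of the idea; the function name smoke_import is hypothetical, and the real practice described above is a manual python -c call, not a script.

```python
import importlib


def smoke_import(module_names):
    """Try to import each module; return {name: error summary} for the failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:
            # Covers missing imports, syntax errors, and typos that raise at import time.
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures
```

An empty result means every changed module at least loads, which is exactly the class of failure that had been reaching visitors as a 500. It says nothing about whether the code behaves correctly once imported.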
The system is starting to be something a stranger could use, which is a different bar than "something I can use." Most of this session was about closing the gap between the two.