Prose flows
.flow.md is the lead authoring surface. Describe the scenario the way
you’d type it into a ticket. The planner compiles prose to a deterministic
IR, commits the result alongside as .flow.json, and the runtime executes
that cache like any hand-authored YAML flow.
# Sign in and see today's notifications
Sign in with the seeded test user, dismiss the onboarding modal, and
assert that the home screen shows today's notifications. Take a visual
snapshot called "home-after-login".
That is the whole flow file. No imports, no selectors, no setup boilerplate.
File structure
A .flow.md file has three parts; only the title is required.
---
hints:
  effort: high
  preferRoles: ['button', 'link']
fixtures:
  email: pm@example.com
---
# Login smoke
Type the email into the email field and tap Sign in. Wait for the
welcome greeting.
Front-matter (optional)
YAML between --- fences. Three keys:
- fixtures: free-form key/value pairs surfaced to the planner. Useful for inline test data that does not deserve a separate fixtures/ file.
- hints.model / hints.effort / hints.preferRoles: planner overrides. effort: high lets the planner think longer; preferRoles: ['button'] biases targets toward role-tagged elements.
- Unknown keys are rejected at parse time.
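The key rules above can be sketched as a small check over front-matter that has already been parsed from YAML into a plain object. This is an illustration only: the function name `unknownFrontMatterKeys` is hypothetical, and it assumes that unknown keys nested under hints are rejected the same way as unknown top-level keys, which the text does not state explicitly.

```typescript
// Keys named in this section. Anything else counts as unknown.
const TOP_LEVEL_KEYS = new Set(["fixtures", "hints"]);
const HINT_KEYS = new Set(["model", "effort", "preferRoles"]);

// Hypothetical helper: returns the paths of unknown keys (empty when valid).
// `fixtures` is free-form, so its contents are never inspected.
function unknownFrontMatterKeys(frontMatter: Record<string, unknown>): string[] {
  const unknown: string[] = [];
  for (const key of Object.keys(frontMatter)) {
    if (!TOP_LEVEL_KEYS.has(key)) unknown.push(key);
  }
  // Assumption: the "unknown keys are rejected" rule also applies under hints.
  const hints = frontMatter["hints"];
  if (typeof hints === "object" && hints !== null) {
    for (const key of Object.keys(hints)) {
      if (!HINT_KEYS.has(key)) unknown.push(`hints.${key}`);
    }
  }
  return unknown;
}
```

A parser built along these lines would fail with an error whenever the returned list is non-empty.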
Title
The first # Heading line. It becomes Flow.name in the IR.
Body
Everything after the title is handed to the LLM planner verbatim. Write prose; the planner figures out the steps.
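To make the three-part structure concrete, here is a minimal sketch of splitting a .flow.md source into front-matter, title, and body. The names `FlowSource` and `splitFlowSource` are hypothetical and the real parser is not shown in this document; the sketch leaves the YAML unparsed and trims blank lines around the body for convenience.

```typescript
interface FlowSource {
  frontMatter: string | null; // raw YAML between the --- fences, unparsed
  title: string;              // text of the first "# Heading" line
  body: string;               // everything after the title
}

// Hypothetical helper mirroring the structure described above.
function splitFlowSource(source: string): FlowSource {
  const lines = source.split(/\r?\n/);
  let start = 0;
  let frontMatter: string | null = null;

  // Optional front-matter: a leading --- fence closed by a second --- line.
  if (lines[0]?.trim() === "---") {
    const close = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
    if (close === -1) throw new Error("unterminated front-matter fence");
    frontMatter = lines.slice(1, close).join("\n");
    start = close + 1;
  }

  // The title is the first level-one heading after the front-matter.
  const titleIndex = lines.findIndex((line, i) => i >= start && /^# \S/.test(line));
  if (titleIndex === -1) throw new Error("a .flow.md file requires a # title");

  return {
    frontMatter,
    title: lines[titleIndex].slice(2).trim(),
    // Surrounding blank lines are trimmed here; the prose itself is untouched.
    body: lines.slice(titleIndex + 1).join("\n").trim(),
  };
}
```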
How the planner compiles prose to IR
The planner takes three inputs: prose body, an element-graph snapshot of
the screen the flow starts on, and the current planner version. It emits
a SemanticPlan — the IR plus a _meta block recording how the plan
was produced.
prose body ─┐
snapshot   ─┼──▶ planner ──▶ SemanticPlan { steps, _meta }
version    ─┘                      │
                                   ▼
                     flows/login.flow.json (committed)
Each prose sentence maps to one or more IR steps. “Tap Sign In” becomes a
tap step; “if a What’s New modal appears, dismiss it” becomes an
optional step (see IR reference). The planner only
emits IR variants the runtime understands — it cannot invent step kinds.
Capturing the snapshot is a one-liner the CLI handles for you:
klera plan flows/login.flow.md --snapshot snap.json
If snap.json does not exist, the CLI starts the runtime, captures the
current screen, and writes the snapshot. Subsequent klera plan calls
reuse it unless the screen has materially changed.
The .flow.json cache
Treat the cache like a lockfile. It is generated, committed, and reviewed alongside the prose change in the same PR — adopters never hand-edit it.
- login.flow.md
- login.flow.json
_meta carries fingerprints of every input the planner saw:
{
  "_meta": {
    "model": "anthropic:claude-sonnet-4-7",
    "promptHash": "sha256:a4f9…",
    "snapshotHash": "sha256:1e2c…",
    "plannerVersion": "0.1.0",
    "combined": "sha256:7b09…",
    "fixturesUsed": ["users.regular"]
  },
  "steps": [
    { "tap": { "testID": "login-email" } },
    /* … */
  ]
}
The combined hash is hash(prose + element_graph + planner_version).
When any of the three changes, the cache is stale.
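One way to picture the combined fingerprint is the sketch below. The exact encoding the planner uses is not specified here, so this assumes each input is hashed separately with SHA-256 and the three digests are joined with newlines before a final hash; `sha256` and `combinedHash` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Formats a SHA-256 digest the way the _meta example does ("sha256:<hex>").
function sha256(text: string): string {
  return "sha256:" + createHash("sha256").update(text, "utf8").digest("hex");
}

// Assumption: per-input digests are joined with newlines so that field
// boundaries stay unambiguous. The real planner may combine inputs differently.
function combinedHash(prose: string, elementGraphJson: string, plannerVersion: string): string {
  const parts = [sha256(prose), sha256(elementGraphJson), sha256(plannerVersion)];
  return sha256(parts.join("\n"));
}
```

Because every input feeds the final digest, changing the prose, the snapshot, or the planner version yields a different combined value, which is what makes a stale cache detectable.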
Staleness detection
The CLI knows three states:
| State | Meaning |
|---|---|
| fresh | combined matches a recompute against the current prose + snapshot + planner version. |
| stale | At least one input changed. CI gates on this. |
| missing-combined | A pre-ADR-0054 cache. Treated as stale; regenerate once and the new field lands. |
klera compile flows/login.flow.md --check # exits 1 if stale, 0 if fresh
klera compile flows/login.flow.md --force # always regenerate
klera compile flows/login.flow.md --diff out.md # write a Markdown step-list diff
klera compile --all --check # gate every flow at once (CI)
klera compile --all # batch regenerate every flow
klera compile is the canonical way to bring a stale cache up to date.
klera plan covers the same ground but is biased toward first-time
generation; once you have a .flow.json committed, prefer compile.
CI typically runs klera compile --all --check as a PR gate. The
optional klera ci scaffold flag compileMode: 'auto' instead
regenerates stale caches and posts a Markdown step-list diff as a
PR comment for review.
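The three states and the --check exit codes described in this section can be sketched as follows. The names `CacheMeta`, `classifyCache`, and `checkExitCode` are hypothetical, and `recomputed` stands for a combined hash recalculated from the current prose, snapshot, and planner version by whatever means the CLI uses.

```typescript
type CacheState = "fresh" | "stale" | "missing-combined";

// Only the field this check needs; other _meta fields are omitted.
interface CacheMeta {
  combined?: string;
}

// Classifies a cache against a freshly recomputed combined hash.
function classifyCache(meta: CacheMeta, recomputed: string): CacheState {
  if (meta.combined === undefined) return "missing-combined";
  return meta.combined === recomputed ? "fresh" : "stale";
}

// Mirrors the documented --check contract: 0 if fresh, 1 if stale.
// missing-combined is treated as stale, as the table states.
function checkExitCode(state: CacheState): 0 | 1 {
  return state === "fresh" ? 0 : 1;
}
```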
Run-time drift recovery vs recompile
The cache fingerprints what the planner saw at compile time; the runtime sees what the screen actually looks like at run time. They can disagree.
| Symptom | Resolution |
|---|---|
| Selector drift (testID renamed, button moved a pixel) | Matcher self-heals via the strategy ladder. No replan, no recompile. |
| One-off optional surface (What’s New modal, A/B test) | Runtime replans the remaining steps in-memory. No cache rewrite. |
| Whole-screen redesign (the planner’s snapshot is wrong) | Recompile: delete snap.json, rerun klera compile --force. |
| Prose intent itself changed | Edit the prose, run klera compile, commit both files together. |
Runtime replanning is on by default and bounded — three rungs of recovery,
and every replan attempt is recorded in the report’s matcher trace.
Replans never rewrite the on-disk .flow.json; PR diffs stay deterministic.
--strict mode
klera run flows/login.flow.md --strict
--strict disables runtime replanning entirely. Intent drift surfaces as
a hard failure with the same matcher diagnostics a YAML flow produces.
This is the expected mode for CI — the cache committed on the PR is what
runs, full stop.
Runtime replanning is a debugging affordance for local iteration. CI
should always run --strict. If --strict fails and the local
non-strict run passes, you have a stale cache; run klera compile
and commit the result.
How the planner uses the snapshot
The element-graph snapshot is a JSON tree of every accessible node the
runtime saw on the starting screen — testID, accessibilityLabel,
role, text, frame, parent. The planner projects only IR-relevant
fields (no internal id, no fiber bookkeeping) into the prompt so the
LLM cannot cite handles that won’t exist at run time.
The snapshot has two jobs:
- Disambiguate references. “Tap the Sign In button” is one node; “tap the second Sign In link” is a different one. The snapshot tells the planner which is which.
- Anti-hallucination grounding. The planner is instructed to prefer
testIDs and labels that appear in the snapshot. Targets that don’t appear get rejected by a semantic-check pass before the cache is written; the retry loop carries the rejection back to the LLM.
The snapshot itself participates in the cache key — change the screen substantially and the cache goes stale, even if the prose did not change.
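The grounding check can be illustrated with a short sketch. Everything here is an assumption made for illustration: the node shape uses a children array rather than the parent field mentioned above, and `knownHandles` and `rejectUngrounded` are hypothetical names, not the planner's actual semantic-check pass.

```typescript
// Hypothetical snapshot node; only the fields this check needs.
interface SnapshotNode {
  testID?: string;
  accessibilityLabel?: string;
  children?: SnapshotNode[];
}

// Hypothetical target as cited by a planned step.
interface Target {
  testID?: string;
  accessibilityLabel?: string;
}

// Collects every testID and label that appears anywhere in the snapshot.
function knownHandles(root: SnapshotNode): Set<string> {
  const handles = new Set<string>();
  const stack: SnapshotNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (node.testID) handles.add(`testID:${node.testID}`);
    if (node.accessibilityLabel) handles.add(`label:${node.accessibilityLabel}`);
    stack.push(...(node.children ?? []));
  }
  return handles;
}

// Returns the targets whose handles do not appear in the snapshot. In the
// flow described above, a non-empty result would go back to the LLM on retry.
function rejectUngrounded(targets: Target[], root: SnapshotNode): Target[] {
  const handles = knownHandles(root);
  return targets.filter((target) => {
    const byId = target.testID !== undefined && handles.has(`testID:${target.testID}`);
    const byLabel =
      target.accessibilityLabel !== undefined && handles.has(`label:${target.accessibilityLabel}`);
    return !(byId || byLabel);
  });
}
```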
Planner transports
Four transports produce a bit-identical SemanticPlan cache. Adopters
pick based on what auth they already have.
API
klera plan flows/login.flow.md --snapshot snap.json
Default transport. Calls the Anthropic API directly. Needs
ANTHROPIC_API_KEY in the environment. Deterministic, headless,
ideal for CI.
The _meta.model field on the cached IR records which transport produced
the plan: "anthropic:<model-id>", "manual",
"manual:claude-3-5-sonnet" (when you pass a custom modelTag), or
"mcp:host" / "mcp:server". Triage and the HTML report viewer surface
this in the report header.
See planner transports for the full transport reference.
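As an illustration of how tooling might read the _meta.model tags listed above, the sketch below splits a tag into a transport and an optional detail. The names `ModelTag` and `parseModelTag` are hypothetical, and only the tag shapes quoted in this section are handled; any other prefix is treated as unrecognized.

```typescript
type Transport = "api" | "manual" | "mcp";

interface ModelTag {
  transport: Transport;
  detail: string | null; // model id, custom modelTag, or MCP side; null when absent
}

// Hypothetical parser for the tag shapes quoted in this section.
function parseModelTag(tag: string): ModelTag {
  const separator = tag.indexOf(":");
  const prefix = separator === -1 ? tag : tag.slice(0, separator);
  const detail = separator === -1 ? null : tag.slice(separator + 1);
  switch (prefix) {
    case "anthropic":
      return { transport: "api", detail };
    case "manual":
      return { transport: "manual", detail };
    case "mcp":
      return { transport: "mcp", detail };
    default:
      throw new Error(`unrecognized _meta.model tag: ${tag}`);
  }
}
```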
What klera run does with the cache
klera run flows/login.flow.md never calls the LLM. It loads the sibling
.flow.json, validates it via Zod, and executes the IR. If the cache is
missing the runner errors with the exact klera compile command needed
to generate it.
klera run flows/login.flow.md
klera run flows/login.flow.md --strict # CI mode
klera run flows/login.flow.md --watch # iterate on prose; re-runs on save
Watch mode hooks Metro’s file watcher (or the @klera/metro-plugin) so
saving the .flow.md triggers a debounced klera compile followed by a
re-run against the same attached bridge. Iteration latency drops by an
order of magnitude vs cold-run-per-edit. See
watch mode for details.
Optional steps and conditionals
Prose conditionals like “if X appears, do Y” compile to an optional
IR step. The matcher evaluates the predicate against the runtime element
graph and only runs the inner step when it matches:
{
  "optional": {
    "when": { "visible": { "testID": "whats-new-modal" } },
    "do": { "tap": { "testID": "whats-new-dismiss" } }
  }
}
Optional steps are flat: they cannot be nested inside another optional. Express compound conditionals as multiple sequential optionals.
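A minimal sketch of how an optional step might be evaluated, assuming the runtime element graph is reduced to a set of visible testIDs and that tap is the only other step kind. The `Step` type and `runStep` function are hypothetical simplifications of the IR, not the runtime's actual matcher.

```typescript
interface Target {
  testID: string;
}

// Simplified IR: a tap, or an optional wrapping exactly one inner step.
type Step =
  | { tap: Target }
  | { optional: { when: { visible: Target }; do: Step } };

// Executes one step against the currently visible testIDs, appending
// performed actions to `log` instead of driving a real device.
function runStep(step: Step, visible: Set<string>, log: string[]): void {
  if ("optional" in step) {
    const { when, do: inner } = step.optional;
    // Optionals are flat: reject nesting before evaluating the predicate.
    if ("optional" in inner) throw new Error("optional steps cannot be nested");
    if (visible.has(when.visible.testID)) runStep(inner, visible, log);
    return;
  }
  log.push(`tap ${step.tap.testID}`);
}
```

With the What's New example, the dismiss tap is logged only when whats-new-modal is in the visible set; otherwise the step is skipped without failing.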
Next steps
- YAML flows — the power-user escape hatch and what prose compiles into.
- IR reference — every step kind the planner can emit.
- Fixtures and secrets — committed test data and credential handling.
- Planner transports — wiring up local CLIs, manual paste, MCP.