
Why klera

Anyone who writes tickets can author end-to-end tests.

klera is LLM-native E2E testing for Expo / React Native. PMs, founders, and engineers describe scenarios in plain prose; klera compiles them into self-healing flows that run against real simulators and devices. If you know Detox or Maestro, klera is what they look like when prose is the authoring surface and the IR is plumbing.

The thesis

In the future, nobody will write E2E tests by hand. The people who already describe scenarios in tickets (PMs, designers, founders) are the largest underserved authoring cohort in mobile testing today. Closing the gap between the prose they already write and the tests that actually run is what compounds.

Write a flow the way you’d describe it to a coworker:

  # Sign in and see today's notifications

  Sign in with the seeded test user, dismiss the onboarding modal, and assert that the home screen shows today's notifications. Take a visual snapshot called "home-after-login".

That is the whole flow file (flows/login.flow.md). klera compiles it into a deterministic IR cache (.flow.json) that is committed alongside it. The cache is plumbing: adopters never hand-edit it. On every prose change, CI regenerates the cache and posts a visual flow diff for review.
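One way to picture "regenerate on every prose change" is a staleness check that compares a hash of the prose against a hash stored in the cache. The sketch below is illustrative only: the `FlowCache` shape, the `sourceHash` field, and both function names are hypothetical, and the page does not say how klera actually detects a prose change. FNV-1a stands in for a real cryptographic digest to keep the example dependency-free.

```typescript
// Hypothetical cache shape; klera's real .flow.json schema is not shown on this page.
interface FlowCache {
  sourceHash: string; // hash of the prose the cache was compiled from (assumed field)
  steps: unknown[];   // compiled IR steps, opaque for this sketch
}

// 32-bit FNV-1a over UTF-16 code units. A stand-in for a cryptographic digest
// such as SHA-256; chosen only so the sketch needs no imports.
function hashProse(prose: string): string {
  // Normalise line endings and outer whitespace so cosmetic edits do not invalidate the cache.
  const normalised = prose.replace(/\r\n/g, "\n").trim();
  let hash = 0x811c9dc5;
  for (let i = 0; i < normalised.length; i++) {
    hash ^= normalised.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// The cache is stale when it is missing or was compiled from different prose.
function isCacheStale(prose: string, cache: FlowCache | null): boolean {
  return cache === null || cache.sourceHash !== hashProse(prose);
}
```

Under these assumptions, a CI job would call `isCacheStale` for each .flow.md file and recompile only the flows whose prose changed.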

What’s broken with hand-authored E2E

Three failure modes show up in every shop that runs a Detox / Appium suite past the first redesign:

  1. Selectors break. A button gets a new testID and 40 tests go red overnight. Ownership scatters; people stop trusting the suite.
  2. Failures are unreadable. “Element not visible” tells you nothing about whether the app shipped a regression, whether the matcher gave up too early, or whether the API is just slow today.
  3. The authoring surface excludes the people who know what to test. PMs file tickets in prose. QA writes test cases in prose. Engineers re-encode that prose into Java / TS / Swift selectors by hand.

klera attacks all three: prose authoring, a self-healing matcher, and auto-triage on every failure.

Prose-primary authoring

The lead authoring surface is .flow.md. The LLM planner compiles it into the same Zod-validated IR that hand-authored YAML produces, then commits the result as a paired .flow.json cache. Prose stays the source of truth; the cache is plumbing.

You can compile flows three ways without ever setting an API key:

  • Local coding-agent CLI: claude, codex, or gemini on your PATH compiles flows under your existing subscription. klera init auto-detects which one you have.
  • Manual paste: klera plan --manual writes the prompt to disk. Paste it into any chat-style LLM, then paste the response back via klera plan --apply-response.
  • MCP routing: point Cursor or Claude Code at @klera/mcp and the editor’s ambient LLM compiles via plan_flow.

Power-user escape hatch: hand-author YAML, or run your existing Maestro YAML directly via the compatibility loader. Both paths share the same executor, matcher, drivers, and reports.

The self-healing matcher

Every step resolves through a strategy ladder:

  1. testID (exact match) — fastest, most stable
  2. accessibilityLabel (exact match) — preserves semantics across redesigns
  3. role + text — matches “the Sign In button” without coupling to a specific node
  4. Fuzzy text — last-resort tolerance for copy tweaks

When the first strategy fails, the matcher walks down the ladder. Every attempt is recorded; nothing silently mutates your committed flow. Drift recovery is bounded: at most three fallback rungs, with an explicit --strict mode that disables it entirely.

runtime · v2 redesign
looking for "Sign In"
not where it was last time
found it labelled "Log in"
drift saved for review → __drift__/sign-in.json
flow passed · 1 drift, 0 failures

This is why klera flows survive a redesign. A button that moved from a <TouchableOpacity testID="signin"> to a <Pressable role="button">Sign In</Pressable> still resolves — strategy 1 fails, strategy 3 wins, the run is green, the matcher trace records the drift for review.

Drift is recorded; it is not auto-applied. The committed flow is unchanged. Adopters review drift in the report and decide whether to update the prose, leave it, or chase the underlying redesign.
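The ladder described above can be sketched in a few lines. This is a minimal illustration, not klera's implementation: the `UiNode`, `Target`, and `MatchResult` shapes, the `resolve` function, and the fuzzy-text threshold (edit distance of at most 2) are all assumptions made for the example. Strict mode is modelled here as "try only the first rung", which is one reading of "disables drift recovery".

```typescript
// Hypothetical shapes; klera's real element graph and trace formats are not shown here.
interface UiNode { testID?: string; accessibilityLabel?: string; role?: string; text?: string; }
interface Target { testID?: string; accessibilityLabel?: string; role?: string; text?: string; }
type Strategy = "testID" | "accessibilityLabel" | "role+text" | "fuzzyText";
interface Attempt { strategy: Strategy; matched: boolean; }
interface MatchResult {
  node: UiNode | null;
  strategy: Strategy | null;
  drifted: boolean; // true when a rung below testID was needed
  trace: Attempt[]; // every attempt is recorded, including failed ones
}

// Levenshtein edit distance using a single rolling row.
function editDistance(a: string, b: string): number {
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      row[j] = Math.min(above + 1, row[j - 1] + 1, diag + (a[i - 1] === b[j - 1] ? 0 : 1));
      diag = above;
    }
  }
  return row[b.length];
}

// The four rungs, in the order listed above.
const ladder: Array<{ strategy: Strategy; test: (n: UiNode, t: Target) => boolean }> = [
  { strategy: "testID", test: (n, t) => t.testID !== undefined && n.testID === t.testID },
  { strategy: "accessibilityLabel", test: (n, t) =>
      t.accessibilityLabel !== undefined && n.accessibilityLabel === t.accessibilityLabel },
  { strategy: "role+text", test: (n, t) =>
      t.role !== undefined && t.text !== undefined && n.role === t.role && n.text === t.text },
  { strategy: "fuzzyText", test: (n, t) =>
      t.text !== undefined && n.text !== undefined &&
      editDistance(n.text.toLowerCase(), t.text.toLowerCase()) <= 2 }, // threshold is an assumption
];

// Walk the ladder top to bottom; never mutate the target, only report what matched.
function resolve(tree: UiNode[], target: Target, strict = false): MatchResult {
  const trace: Attempt[] = [];
  const rungs = strict ? ladder.slice(0, 1) : ladder;
  for (const rung of rungs) {
    const node = tree.find((n) => rung.test(n, target)) ?? null;
    trace.push({ strategy: rung.strategy, matched: node !== null });
    if (node !== null) {
      return { node, strategy: rung.strategy, drifted: rung.strategy !== "testID", trace };
    }
  }
  return { node: null, strategy: null, drifted: false, trace };
}
```

With a target of `{ testID: "signin", role: "button", text: "Sign In" }` and a tree that contains only `{ role: "button", text: "Sign In" }`, this sketch fails on the first two rungs and resolves on role + text, mirroring the redesign example above. The target itself is never rewritten, which is the property the drift-review step relies on.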

Auto-triage on every failure

When a flow fails, klera classifies the failure into one of four verdicts before the report lands on your desk:

  • regression — matcher exhausted the ladder, no workable replan. Carries a suspect commit list (git log -200 against the implicated source files).
  • drift — planner found an equivalent target the matcher missed. Carries a proposed test update.
  • flake — synchronisation gate timed out, planner agrees with the cached IR. Retry candidate.
  • data — value-mismatch error. Test fixture or seed data disagreement.

A deterministic classifier picks the verdict from the matcher trace + IR diff; an LLM narrates it into PM and engineer prose. PNG triplets (actual / baseline / diff) and the matcher trace ship inline.

failure · flows/checkout-android.flow.md · step 4 of 6 · failed at 00:12.4 · 2026-04-29
last frame · captured
verdict
regression · drift · flake · data

The runtime tapped “Place order”, but the next screen never mounted. The element graph shows the button transitioning to disabled — no navigation event followed.

suspect commit
a1c4f29 · checkout: gate submit on payment-method validity · @miyu · 2h ago · packages/checkout/src/PlaceOrderButton.tsx · first flow run after this commit to fail
proposed fix · pick a payment method before tapping
- Tap “Place order” and confirm the order receipt appears.
+ Pick a saved card, tap “Place order”, and confirm the order receipt appears.
Open PR with this fix · View element graph · __failure-evidence__/checkout-android/14-22

The escape hatches: --no-triage and KLERA_NO_TRIAGE=1.
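The four verdicts map naturally onto a small deterministic function. The sketch below is an assumption-laden illustration: the `FailureEvidence` fields, the `classify` and `triageEnabled` names, and the order in which the rules are checked are hypothetical, derived only from the verdict descriptions above. Only the `--no-triage` flag and the `KLERA_NO_TRIAGE=1` variable come from this page.

```typescript
type Verdict = "regression" | "drift" | "flake" | "data";

// Hypothetical evidence record distilled from the matcher trace and the IR diff.
interface FailureEvidence {
  errorKind: "matcher-exhausted" | "sync-timeout" | "value-mismatch";
  replanFoundEquivalentTarget: boolean; // planner found a target the matcher missed
  replanMatchesCachedIr: boolean;       // planner agrees with the cached IR
}

// Deterministic: the same evidence always yields the same verdict.
function classify(e: FailureEvidence): Verdict {
  if (e.errorKind === "value-mismatch") return "data";
  if (e.replanFoundEquivalentTarget) return "drift";
  if (e.errorKind === "sync-timeout" && e.replanMatchesCachedIr) return "flake";
  // Assumption: anything left over, including an exhausted ladder with no
  // workable replan, is treated as a regression.
  return "regression";
}

// Triage is skipped when either escape hatch is set.
function triageEnabled(argv: string[], env: Record<string, string | undefined>): boolean {
  return !argv.includes("--no-triage") && env["KLERA_NO_TRIAGE"] !== "1";
}
```

Keeping the verdict rule-based and leaving only the narration to an LLM means two runs with identical evidence cannot disagree on the classification.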

Who klera is for

  • PMs and QA who already describe scenarios in tickets. Author in prose; let CI compile.
  • Engineers who maintain the suite. Get the matcher trace, the suspect commit, and the source-link denormalisation on every failure.
  • Platform teams standardising test infra across multiple Expo / React Native apps. Ship OpenTelemetry to your existing observability stack with one env var.

If your team writes tests in JavaScript or Swift today, note that klera is declarative-only by design: there is no JS / Swift authoring path. Code-based steps remain the territory of Detox and Appium.
