Rendered DOM snapshots (post-hydration), ordinality, and Jest-style assertions for AI web agents.
"Think Jest for AI web agents — deterministic selection, pass/fail verification, explainable failures."
Not HTML parsing. Sentience snapshots the rendered DOM + layout from a live browser after SPA hydration. It works on JS-heavy sites because it captures post-hydration state.
Local-first by default. When the page is unstable, we resnapshot, then reset to a checkpoint. Only if snapshot attempts are exhausted do we use the optional vision fallback or model escalation.
Most tools help agents load pages or read content.
SentienceAPI helps agents act — and verify — reliably.
Modern LLMs are good at deciding what they want to do.
They are unreliable at deciding where to do it — and whether it worked.
Sentience fixes this with semantic snapshots, ordinal selection, and built-in assertions.
Failure modes are explicit: resnapshot → reset → fallback → escalate.
Sentience works entirely locally with the browser extension, or you can add the optional Gateway for ML-powered reranking.
Extension-only • Free
Works on SPAs because snapshots are taken from the live rendered page, not static HTML.
Best for: Development, testing, cost-sensitive production, frameworks like browser-use
Cloud API • Pro/Enterprise
Best for: Production agents, maximum accuracy, team collaboration, observability
Each snapshot contains ~0.6–1.2k tokens per step — enough for your LLM to make deterministic decisions.
With structured snapshots, 3B-class models become viable. Larger models (7B/14B+) still help with planning and recovery, but they're no longer required just to operate a browser.
{
"id": 42,
"role": "button",
"text": "Add to Cart",
"importance": 95,
"bbox": { "x": 320, "y": 480, "width": 120, "height": 40 },
"in_viewport": true,
"is_occluded": false,
"visual_cues": {
"is_primary": true,
"is_clickable": true,
"background_color_name": "green"
},
// Ordinal fields for "click 3rd result" queries
"group_key": "480_main",
"group_index": 2,
"in_dominant_group": true,
// Layout detection
"layout": {
"grid_id": 1,
"grid_pos": { "row_index": 0, "col_index": 2 },
"region": "main"
}
}

{
"snapshot_confidence": 0.92,
"stability_reasons": [], // empty = stable
// On unstable pages:
// "stability_reasons": ["dom_unstable", "layout_shifting"]
}

id – stable element identifier
role – button, link, input, etc.
text – visible label
bbox – exact pixel coordinates
in_viewport – currently visible
is_occluded – covered by overlay
importance – relevance score
is_primary – main CTA
group_key – geometric bucket
group_index – position in group
in_dominant_group – main content
grid_pos – row/column indices

~0.6–1.2k tokens per snapshot — compare to 10–50k+ tokens for vision-based approaches
Your agent (LLM + logic) decides what it wants to do:
Sentience does not replace planning or reasoning.
Using the Sentience SDK (Python or TypeScript), your agent:
snapshot()
This snapshot is not raw HTML. Not screenshots by default. It's the rendered DOM + layout signals captured from the live page after hydration.
A lightweight browser extension captures from the rendered page:
No inference. No guessing. Ground truth from the live browser.
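For example, the perception step is a single call. A minimal sketch, assuming an extension-only constructor with no API key and an elements accessor on the snapshot (both are assumptions; the full SDK example appears further down):

# Minimal perception sketch (Python SDK). Constructor arguments and the
# `elements` accessor are assumptions; field names match the schema above.
from sentience import SentienceBrowser, snapshot

browser = SentienceBrowser()      # assumed: local, extension-only mode
browser.start()
browser.page.goto("https://shop.example.com")

snap = snapshot(browser)          # rendered DOM + layout, captured post-hydration
for el in snap.elements:          # assumed accessor for the element list
    print(el.id, el.role, el.text, el.importance)

browser.close()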
Your agent selects a target from the snapshot and executes:
click("Add to Cart")type("search input", "query")group_index=2 for "3rd result"row_index=0, col_index=2Actions execute exactly where intended — no coordinate guessing.
When actions fail or pages are unstable, Sentience follows a deterministic escalation path:
Assertions remain the verifier throughout — pass or fail, not "maybe."
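A sketch of that loop, built from the documented pieces (snapshot_confidence, stability_reasons, re-snapshotting); the checkpoint and vision hooks are hypothetical placeholders your agent would supply, not SDK calls:

# Escalation sketch: resnapshot -> reset -> fallback -> escalate.
from sentience import snapshot

def stable_snapshot(browser, attempts=3, min_confidence=0.8):
    """Resnapshot until the page looks stable, or give up."""
    for _ in range(attempts):
        snap = snapshot(browser)
        # snapshot_confidence / stability_reasons mirror the metadata block
        # above (attribute access is assumed)
        if snap.snapshot_confidence >= min_confidence and not snap.stability_reasons:
            return snap
    return None

def perceive(browser, reset_to_checkpoint, vision_fallback):
    # reset_to_checkpoint / vision_fallback are hypothetical hooks supplied by
    # your agent, not SDK calls
    snap = stable_snapshot(browser)          # 1. resnapshot
    if snap is None:
        reset_to_checkpoint(browser)         # 2. reset to a known-good checkpoint
        snap = stable_snapshot(browser)
    if snap is None:
        snap = vision_fallback(browser)      # 3. optional vision fallback
        # 4. ...and only then escalate to a larger model for recovery planning
    return snap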
Like Jest for web automation — assert expectations, not hope:
assert_("Order confirmed") — verify text appearsassert_("cart badge", text="3") — verify contentassert_done("checkout complete") — task completionAssertions use the same semantic snapshot — deterministic, traceable verification.
Want to see this in action?
Run a live example using the Sentience SDK — no setup required.
👉 Try it live
Browser infrastructure gives you a place to run code.
Sentience gives your agent certainty about where to act.
Without grounded action selection, agents still guess.
Scrapers parse static HTML. Sentience snapshots the rendered DOM after SPA hydration.
Scrapers don't tell agents:
Reading ≠ acting. Sentience is for agents that must interact.
Already using browser-use? Sentience integrates seamlessly via BrowserUseAdapter — just swap your backend and get semantic snapshots + assertions.
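A rough wiring sketch, assuming the adapter is importable from the Python SDK; the import path and the exact attachment point on the browser-use side are assumptions:

# Hypothetical wiring sketch for browser-use integration. BrowserUseAdapter is
# the documented entry point; its import path and constructor arguments are
# assumptions.
from sentience import SentienceBrowser
from sentience.adapters import BrowserUseAdapter   # import path assumed

backend = BrowserUseAdapter(SentienceBrowser())
# Hand `backend` to your browser-use Agent wherever it currently receives its
# browser backend; steps then execute against semantic snapshots, and you can
# add assert_ checks after each action.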
Instead of pixels (~10–50k tokens) or raw DOM, your agent gets a compact semantic snapshot:
Every step is recorded automatically — use local JSON traces or Sentience Studio:
These traces power:
When something fails, you get a reasoned failure artifact — not a vague LLM apology.
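For local runs, the trace is plain JSON you can inspect yourself. A sketch, assuming a JSON Lines file and purely illustrative per-step field names:

# Hypothetical trace inspection sketch. The file path and the per-step field
# names ("step", "action", "target", "result") are assumptions.
import json
from pathlib import Path

for line in Path("traces/latest.jsonl").read_text().splitlines():
    record = json.loads(line)
    print(record.get("step"), record.get("action"), record.get("target"), record.get("result"))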
If your agent only reads text, Sentience is unnecessary.
If your agent must click, type, scroll, or submit — Sentience is the missing layer.
If you're building agents that must act, SentienceAPI is the missing layer.
Explore interactive SDK examples or test the API directly with real automation scenarios
Navigate to a login page, find email/password fields semantically, and submit the form.
# No selectors. No vision. Stable semantic targets.
from sentience import SentienceBrowser, snapshot, find, click, type_text, wait_for

# Initialize browser with API key
browser = SentienceBrowser(api_key="sk_live_...")
browser.start()

# Navigate to login page
browser.page.goto("https://example.com/login")

# PERCEPTION: Find elements semantically
snap = snapshot(browser)
email_field = find(snap, "role=textbox text~'email'")
password_field = find(snap, "role=textbox text~'password'")
submit_btn = find(snap, "role=button text~'sign in'")

# ACTION: Interact with the page
type_text(browser, email_field.id, "user@example.com")
type_text(browser, password_field.id, "secure_password")
click(browser, submit_btn.id)

# VERIFICATION: Wait for navigation
wait_for(browser, "role=heading text~'Dashboard'", timeout=5.0)

print("✅ Login successful!")
browser.close()

Find elements by role, text, and visual cues - not fragile CSS selectors
Intelligent filtering reduces token usage by up to 73% vs vision models
Same input produces same output every time - no random failures
SentienceAPI focuses on execution intelligence. Browser runtimes and navigation engines are intentionally decoupled.