
Sentience SDK Overview

Sentience is like Jest for AI web agents. It provides semantic snapshots and assertions so agents can act deterministically and verify results — without vision models or brittle CSS selectors.

The SDK runs standalone with Playwright or embeds under agent frameworks like browser-use. Gateway API is optional — the core features work locally with just the browser extension.

How It Works

Sentience operates in two modes:

Mode A: Local (Extension-Only)

  1. SDK calls the browser extension to collect raw + semantic geometry
  2. SDK produces a snapshot prompt (structured text, ~0.6–1.2k tokens)
  3. Agent selects an action (click, type, scroll)
  4. SDK executes via the backend protocol
  5. SDK runs assert_ checks and records the trace

No server calls required. Works offline and with local LLMs.
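The five steps above can be sketched as a simple loop. Everything here (StubBackend, pick_action, the element shapes) is a hypothetical stand-in to illustrate the flow, not the Sentience API:

```python
def collect_snapshot(backend):
    # Step 1: the extension returns elements with semantic geometry
    return backend.snapshot()

def to_prompt(elements):
    # Step 2: compact structured text (~0.6-1.2k tokens)
    return "\n".join(f"{e['id']}|{e['role']}|{e['text']}|{e['importance']}" for e in elements)

def pick_action(elements):
    # Step 3: stand-in "agent" policy: click the most important button
    target = max((e for e in elements if e["role"] == "button"), key=lambda e: e["importance"])
    return {"action": "click", "id": target["id"]}

class StubBackend:
    """Hypothetical backend; the real SDK talks to the browser extension."""
    def snapshot(self):
        return [
            {"id": 42, "role": "button", "text": "Add to Cart", "importance": 95},
            {"id": 43, "role": "link", "text": "Product Details", "importance": 70},
        ]
    def execute(self, action):
        self.last = action  # Step 4: dispatch via the backend protocol

backend = StubBackend()
elements = collect_snapshot(backend)
prompt = to_prompt(elements)
action = pick_action(elements)
backend.execute(action)
# Step 5: a trivial assert_-style check over a fresh snapshot
assert any(e["role"] == "button" for e in collect_snapshot(backend))
print(action)  # {'action': 'click', 'id': 42}
```

The real loop would record each step to the trace as well; the stub keeps only the last executed action.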

Mode B: Gateway (Optional)

Same flow as above, with an additional Gateway API call between steps 1 and 2.

Gateway is optional — use it for production accuracy boosts, not required for development.

What's in a Semantic Snapshot

Each snapshot contains elements with these key fields:

Field               Description
id                  Stable element identifier
role                Semantic role (button, link, input, etc.)
text                Normalized visible text
importance          Relevance score (0–100)
bbox                Bounding box coordinates
doc_y / center_y    Document and viewport Y positions
group_key           Geometric bucket for ordinal grouping
group_index         Position within group (0-indexed)
in_dominant_group   Whether element is in main content area
in_viewport         Currently visible on screen
is_occluded         Covered by overlay/modal
href                Link URL (for anchor elements)
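One way to model these fields is as a typed record. This sketch uses Python's TypedDict with names taken from the table above; the SDK's actual types may differ:

```python
from typing import TypedDict

class SnapshotElement(TypedDict, total=False):
    """One element from a semantic snapshot (fields as listed above)."""
    id: int                  # stable element identifier
    role: str                # button, link, input, ...
    text: str                # normalized visible text
    importance: int          # relevance score, 0-100
    bbox: tuple[float, float, float, float]  # bounding box coordinates
    doc_y: float             # document Y position
    center_y: float          # viewport Y position
    group_key: str           # geometric bucket for ordinal grouping
    group_index: int         # position within group, 0-indexed
    in_dominant_group: bool  # in main content area
    in_viewport: bool        # currently visible on screen
    is_occluded: bool        # covered by overlay/modal
    href: str                # link URL (anchor elements only)

el: SnapshotElement = {"id": 42, "role": "button", "text": "Add to Cart", "importance": 95}
print(el["text"])  # Add to Cart
```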

Compact format example:

ID|role|text|imp|docYq|ord|DG|href
42|button|Add to Cart|95|480|0|true|
43|link|Product Details|70|520|1|true|/products/123
44|button|Buy Now|90|480|2|true|
45|link|Reviews|60|560|3|true|#reviews
46|input|Search|80|100|0|false|

This structured format lets LLMs make deterministic selections without parsing raw HTML.
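A few lines of Python can parse the compact format back into records. This is only a sketch of the wire format shown above; the SDK emits and consumes this format for you:

```python
def parse_compact(snapshot: str) -> list[dict]:
    """Parse ID|role|text|imp|docYq|ord|DG|href lines into dicts."""
    header, *rows = snapshot.strip().splitlines()
    keys = header.split("|")
    out = []
    for row in rows:
        rec = dict(zip(keys, row.split("|")))
        # Coerce the numeric and boolean columns
        for k in ("ID", "imp", "docYq", "ord"):
            rec[k] = int(rec[k])
        rec["DG"] = rec["DG"] == "true"
        out.append(rec)
    return out

snapshot = """ID|role|text|imp|docYq|ord|DG|href
42|button|Add to Cart|95|480|0|true|
43|link|Product Details|70|520|1|true|/products/123"""

elements = parse_compact(snapshot)
print(elements[0]["text"])   # Add to Cart
print(elements[1]["href"])   # /products/123
```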

Token Usage

Semantic snapshots are designed for token efficiency: instead of raw HTML, the prompt contains one compact line per element, keeping the whole snapshot in the ~0.6–1.2k token range noted above.

Token savings vary by page complexity, but the bounded format consistently keeps prompts small enough for local LLMs like Qwen 2.5 3B.
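As a rough illustration, using the common (and only approximate) ~4 characters per token heuristic, even a 100-element snapshot lands near the lower end of the stated token range:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English-like text.
    # Real tokenizers vary; this is for illustration only.
    return max(1, len(text) // 4)

line = "42|button|Add to Cart|95|480|0|true|"
page = "\n".join(line for _ in range(100))  # a 100-element snapshot

print(estimate_tokens(line))         # 9: one compact line costs a handful of tokens
print(estimate_tokens(page) < 2000)  # True: well within a small prompt budget
```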

Assertions (assert_)

Assertions are machine-verifiable checks over fresh snapshots. They help agents know when they're "done" and prevent drift.

Available assertions:

Assertion                        Description
assert_(selector)                Verify element exists and is visible
assert_(selector, text="...")    Verify element contains expected text
assert_done(condition)           Mark task as complete when condition met
url_matches(pattern)             Check current URL matches regex
url_contains(substring)          Check URL contains string
exists(selector)                 Element exists in snapshot

Example usage:

await runtime.snapshot()
runtime.assert_(exists("role=button"), label="has_buttons")
runtime.assert_(url_contains("checkout"), label="on_checkout")

if runtime.assert_done(exists("text~'Order confirmed'"), label="complete"):
    print("Task finished!")

browser-use Integration

Sentience integrates with browser-use via a backend adapter, so an existing browser-use session gains semantic snapshots and assert_ checks.

Installation:

pip install "sentience[browser-use]"

Usage:

from browser_use import BrowserSession, BrowserProfile
from sentience import get_extension_dir
from sentience.backends import BrowserUseAdapter
from sentience.agent_runtime import AgentRuntime

# Set up browser-use with the Sentience extension loaded
profile = BrowserProfile(args=[f"--load-extension={get_extension_dir()}"])
session = BrowserSession(browser_profile=profile)
await session.start()

# Create the backend adapter
adapter = BrowserUseAdapter(session)
backend = await adapter.create_backend()

# Create a runtime for assertions
# (tracer: a trace recorder created elsewhere; see Traces and Replay)
runtime = AgentRuntime(backend=backend, tracer=tracer)
await runtime.snapshot()
runtime.assert_(exists("role=button"), label="ready")

See the browser-use integration guide for full details.


Traces and Replay

Every agent run produces a trace: a step-by-step record of the actions taken and the assertion results, which can be replayed and inspected after the run.

Traces can be stored locally (JSON/JSONL) or uploaded to Sentience Studio for visual debugging.
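A local trace can be as simple as one JSON object per step appended to a JSONL file. This sketches the storage idea only; the SDK's actual trace schema may differ:

```python
import json
from pathlib import Path

def append_step(path: Path, step: dict) -> None:
    """Append one step record to a JSONL trace file."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(step) + "\n")

def load_trace(path: Path) -> list[dict]:
    """Read the whole trace back for replay or debugging."""
    return [json.loads(line) for line in path.read_text(encoding="utf-8").splitlines()]

trace = Path("run_trace.jsonl")
trace.unlink(missing_ok=True)  # start a fresh run
append_step(trace, {"step": 1, "action": "click", "target": 42, "ok": True})
append_step(trace, {"step": 2, "assert": "on_checkout", "passed": True})
print(len(load_trace(trace)))  # 2
```

Appending line-by-line means a crashed run still leaves a readable partial trace.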

Open Source Repositories

Next Steps