Docs/SDK/Pydantic AI Integration

Sentience × PydanticAI (Python): User Manual

Use PydanticAI as the orchestration layer while keeping Sentience as the browser capability layer with typed tools, bounded context, and action verification.

PydanticAI docs: Pydantic AI

Table of Contents

  1. What You Get
  2. Installation
  3. Integration Surface
  4. Concept: Dependency Injection
  5. Tool Reference
  6. What Each Tool Is For
  7. Quickstart: PydanticAI User Adds Sentience
  8. Example: Typed Extraction
  9. Example: Self-Correcting Click with Guard
  10. Example: Navigate → Snapshot → Scroll → Click
  11. Example: Clicking by Text Coordinates
  12. Tracing & Observability
  13. Troubleshooting

What You Get


Installation

From the Python SDK:

pip install "sentienceapi[pydanticai]"

Integration Surface

Sentience provides a small integration layer:

Imports:

from sentience import AsyncSentienceBrowser
from sentience.integrations.pydanticai import SentiencePydanticDeps, register_sentience_tools

Concept: Dependency Injection

PydanticAI passes dependencies through ctx.deps. We inject:

deps = SentiencePydanticDeps(browser=browser, tracer=tracer)
result = await agent.run("...", deps=deps)
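Because the deps ride along on ctx.deps, your own tools can share the same browser session as the registered Sentience tools. A hypothetical custom tool (current_url is not part of the SDK; it assumes the browser exposes a Playwright-style page.url):

```python
from pydantic_ai import Agent, RunContext
from sentience.integrations.pydanticai import SentiencePydanticDeps

agent = Agent("openai:gpt-5", deps_type=SentiencePydanticDeps, output_type=str)


@agent.tool
async def current_url(ctx: RunContext[SentiencePydanticDeps]) -> str:
    """Hypothetical custom tool: report the URL of the shared Sentience browser."""
    return ctx.deps.browser.page.url
```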

Tool Reference

Registered tools include:

Observe

| Tool | Description |
|---|---|
| `snapshot_state(limit=50, include_screenshot=False)` | Bounded `BrowserState` (`url`, `elements[]`) |
| `read_page(format="text"\|"markdown"\|"raw")` | Returns `ReadResult` |

Act

| Tool | Description |
|---|---|
| `click(element_id)` | Click a specific element by ID |
| `type_text(element_id, text)` | Type text into element |
| `press_key(key)` | Send a keypress (e.g., `"Enter"`) |
| `scroll_to(element_id, behavior, block)` | Scroll element into view |
| `navigate(url)` | Navigate to URL |
| `click_rect(x, y, width, height, button, click_count)` | Click by pixel coordinates |

Locate by Text

| Tool | Description |
|---|---|
| `find_text_rect(text, case_sensitive=False, whole_word=False, max_results=10)` | Find text coordinates on page |

Verify / Guard

| Tool | Description |
|---|---|
| `verify_url_matches(pattern)` | Check URL contains pattern |
| `verify_text_present(text, format, case_sensitive)` | Check text appears on page |
| `assert_eventually_url_matches(pattern, timeout_s, poll_s)` | Wait for URL to match pattern |

Notes:


What Each Tool Is For

Observe

snapshot_state(...)

read_page(...)

Act

Async vs Sync (Important)

click(element_id)

type_text(element_id, text)

press_key(key)

scroll_to(element_id, ...)

click_rect(x, y, width, height, ...)

Verify / Guard (How to Make Agents Reliable)

These are best used after an action to confirm the browser is now in the expected state.

verify_url_matches(pattern)

verify_text_present(text, ...)

assert_eventually_url_matches(pattern, timeout_s=..., poll_s=...)
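The act-then-verify loop behind these guards is framework-agnostic. A plain-Python sketch (no Sentience calls; in practice `act` would wrap `click(element_id)` and `verify` would wrap `verify_url_matches(pattern)`):

```python
import asyncio
from typing import Awaitable, Callable


async def act_and_verify(
    act: Callable[[], Awaitable[None]],
    verify: Callable[[], Awaitable[bool]],
    retries: int = 3,
    poll_s: float = 0.5,
) -> bool:
    """Run an action, check a guard, and retry the action until the guard passes."""
    for _ in range(retries):
        await act()
        if await verify():
            return True
        await asyncio.sleep(poll_s)  # give the page time to settle before retrying
    return False
```

This mirrors what `assert_eventually_url_matches` does for the URL guard, generalized to any action/guard pair.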


Quickstart: PydanticAI User Adds Sentience

This is the minimal working pattern:

import asyncio
from pydantic import BaseModel
from pydantic_ai import Agent

from sentience import AsyncSentienceBrowser
from sentience.integrations.pydanticai import SentiencePydanticDeps, register_sentience_tools


class PageSummary(BaseModel):
  url: str
  headline: str


async def main():
  browser = AsyncSentienceBrowser(headless=False)
  await browser.start()
  await browser.page.goto("https://example.com")

  agent = Agent(
      "openai:gpt-5",
      deps_type=SentiencePydanticDeps,
      output_type=PageSummary,
      instructions="Use the Sentience tools to read the page and return a typed summary.",
  )
  register_sentience_tools(agent)

  deps = SentiencePydanticDeps(browser=browser)
  result = await agent.run("Return the url and the main headline.", deps=deps)
  print(result.output)

  await browser.close()


if __name__ == "__main__":
  asyncio.run(main())

Example: Typed Extraction

This pattern is ideal when you care about validated structured data.

See also: sdk-python/examples/pydantic_ai/pydantic_ai_typed_extraction.py

High-level approach:
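A minimal sketch of the pattern, reusing the quickstart setup (the Product model and its fields are illustrative, not part of the SDK):

```python
from pydantic import BaseModel
from pydantic_ai import Agent
from sentience.integrations.pydanticai import SentiencePydanticDeps, register_sentience_tools


class Product(BaseModel):  # illustrative output schema
    name: str
    price: str


agent = Agent(
    "openai:gpt-5",
    deps_type=SentiencePydanticDeps,
    output_type=Product,  # PydanticAI validates the LLM output against this model
    instructions=(
        "Call read_page(format='markdown'), then extract the product name and price. "
        "Only report values that actually appear on the page."
    ),
)
register_sentience_tools(agent)

# result = await agent.run("Extract the product.", deps=SentiencePydanticDeps(browser=browser))
# result.output is a validated Product instance
```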


Example: Self-Correcting Click with Guard

See also: sdk-python/examples/pydantic_ai/pydantic_ai_self_correcting_click.py

Pattern:
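The guard-driven retry can be pushed into the agent's instructions. A sketch (the instruction text is illustrative, not a prescribed prompt):

```python
from pydantic_ai import Agent
from sentience.integrations.pydanticai import SentiencePydanticDeps, register_sentience_tools

agent = Agent(
    "openai:gpt-5",
    deps_type=SentiencePydanticDeps,
    output_type=str,
    instructions=(
        "1) snapshot_state() and choose the target element_id. "
        "2) click(element_id). "
        "3) Confirm with verify_url_matches(...) or verify_text_present(...). "
        "4) If the guard fails, snapshot_state() again, choose a different element, and retry once."
    ),
)
register_sentience_tools(agent)
```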


Example: Navigate → Snapshot → Scroll → Click

This is a common "reliable interaction" sequence when the target element is off-screen:

Concrete (copy/paste) example:

import asyncio
from pydantic_ai import Agent

from sentience import AsyncSentienceBrowser
from sentience.integrations.pydanticai import SentiencePydanticDeps, register_sentience_tools


async def main():
  browser = AsyncSentienceBrowser(headless=False)
  await browser.start()

  agent = Agent(
      "openai:gpt-5",
      deps_type=SentiencePydanticDeps,
      output_type=str,
      instructions=(
          "Use these tools in order: "
          "navigate(url), snapshot_state(), scroll_to(element_id), click(element_id), "
          "then assert_eventually_url_matches(...) if navigation is expected."
      ),
  )
  register_sentience_tools(agent)

  deps = SentiencePydanticDeps(browser=browser)
  result = await agent.run(
      "Go to https://example.com, find a link, scroll to it if needed, click it, and confirm URL changed.",
      deps=deps,
  )
  print(result.output)

  await browser.close()


if __name__ == "__main__":
  asyncio.run(main())

Example: Clicking by Text Coordinates

Use find_text_rect("Sign In") when the best handle is visible text.

from pydantic_ai import Agent

# ... create browser + agent + register tools ...

# In your agent instructions, encourage:
# 1) find_text_rect("Sign In")
# 2) click_rect(...) using the returned coordinates

Concrete (copy/paste) example (direct tool calls, no LLM decision-making):

import asyncio
from pydantic_ai import Agent

from sentience import AsyncSentienceBrowser
from sentience.integrations.pydanticai import SentiencePydanticDeps, register_sentience_tools


async def main():
  browser = AsyncSentienceBrowser(headless=False)
  await browser.start()
  await browser.goto("https://example.com")

  agent = Agent(
      "openai:gpt-5",
      deps_type=SentiencePydanticDeps,
      output_type=str,
      instructions="You may call Sentience tools, but the Python code will also demonstrate direct tool usage.",
  )
  tools = register_sentience_tools(agent)

  # Build a minimal stand-in for PydanticAI's RunContext so the registered
  # tools can be called directly, outside an agent run.
  ctx = type("Ctx", (), {})()
  ctx.deps = SentiencePydanticDeps(browser=browser)

  # 1) Locate text on screen
  matches = await tools["find_text_rect"](ctx, "Sign In")
  if matches.status != "success" or not matches.results:
      raise RuntimeError(f"Text not found: {matches.error}")

  # 2) Click the first in-viewport match by rectangle
  m0 = next((m for m in matches.results if m.in_viewport), matches.results[0])
  await tools["click_rect"](
      ctx,
      x=m0.rect.x,
      y=m0.rect.y,
      width=m0.rect.width,
      height=m0.rect.height,
  )

  await browser.close()


if __name__ == "__main__":
  asyncio.run(main())

Notes:


Tracing & Observability

How Tracing Works

When you pass a tracer via SentiencePydanticDeps(..., tracer=tracer), each tool call emits structured trace events:

This gives you a clean, replayable timeline of what the agent actually did in the browser, separate from PydanticAI's orchestration layer.

Local vs Cloud Tracing

Sentience tracing supports two modes:

Local tracing writes JSONL to disk (JsonlTraceSink) for debugging and development:

from sentience import create_tracer
from sentience.integrations.pydanticai import SentiencePydanticDeps

# Create local tracer
tracer = create_tracer(run_id="pydanticai-demo")
deps = SentiencePydanticDeps(browser=browser, tracer=tracer)

result = await agent.run("...", deps=deps)

# Always close to flush events
tracer.close()

Cloud tracing (Pro/Enterprise) buffers JSONL locally and uploads once on tracer.close():

from sentience import create_tracer
from sentience.integrations.pydanticai import SentiencePydanticDeps

# Create cloud tracer
tracer = create_tracer(
  api_key="sk_pro_...",
  upload_trace=True,
  goal="PydanticAI + Sentience run",
  agent_type="PydanticAI",
)
deps = SentiencePydanticDeps(browser=browser, tracer=tracer)

result = await agent.run("...", deps=deps)

# Uploads trace on close
tracer.close()

Orchestration vs Browser Tracing

Key insight: Your framework (PydanticAI) owns LLM orchestration, while Sentience owns browser execution + structured state.

You can (and often should) instrument both:

This dual-layer observability gives you complete visibility into both what the agent decided and what it actually did in the browser.
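One way to wire both layers, assuming Pydantic Logfire on the orchestration side (instrument=True enables PydanticAI's OpenTelemetry instrumentation; treat the exact flags as version-dependent):

```python
import logfire
from pydantic_ai import Agent
from sentience import create_tracer
from sentience.integrations.pydanticai import SentiencePydanticDeps, register_sentience_tools

logfire.configure()  # orchestration layer: model calls and tool-call spans

agent = Agent(
    "openai:gpt-5",
    deps_type=SentiencePydanticDeps,
    output_type=str,
    instrument=True,  # emit OpenTelemetry spans for the agent run
)
register_sentience_tools(agent)

tracer = create_tracer(run_id="dual-layer-demo")  # browser layer: Sentience events
# run with deps=SentiencePydanticDeps(browser=browser, tracer=tracer), then tracer.close()
```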


Troubleshooting

| Issue | Solution |
|---|---|
| `window.sentience` is not available | Ensure the Sentience extension is loaded and injected into the Playwright session. |
| Tool calls succeed but nothing changes | Add guards: `verify_url_matches`, `verify_text_present`, and/or `assert_eventually_url_matches`. |
| Extraction is flaky | Prefer `read_page(format="markdown")` for extraction and keep `snapshot_state(limit=50)` for interaction targeting. |

Additional Resources


Last updated: January 2026