Python SDK Reference

Complete API reference for the Sentience Python SDK, covering every function with its parameters, return values, and examples.

Browser Setup

SentienceBrowser(api_key, api_url, headless)

Creates a browser instance with the Sentience extension loaded.

Parameters:

  • api_key (str, optional): Your Sentience API subscription key
  • api_url (str, optional): Custom API endpoint
  • headless (bool, optional): Run browser in headless mode

Example:

# Free tier (local processing only)
browser = SentienceBrowser(headless=False)

# Pro tier (server-side processing)
browser = SentienceBrowser(api_key="sk_...", headless=True)

# Context manager (auto-closes)
with SentienceBrowser(api_key="sk_...") as browser:
    browser.page.goto("https://example.com")
    # Browser automatically closes when done

Snapshot API

snapshot(browser, screenshot, limit, filter, use_api)

Captures the current page state and returns all interactive elements with semantic information.

Credit Consumption

When api_key is provided, this calls the server-side /v1/snapshot endpoint, which consumes 1 credit per call (metered billing).

Parameters:

  • browser (SentienceBrowser): Browser instance
  • screenshot (bool or dict, optional): Capture screenshot. True for PNG, or {"format": "jpeg", "quality": 80}
  • limit (int, optional): Maximum elements to return (default: 50 with the server API; all elements with local processing)
  • filter (dict, optional): Filter options (min_area, allowed_roles, min_z_index)
  • use_api (bool, optional): Force the server API (True) or local processing (False)

Returns:

Snapshot object with: elements, url, viewport, timestamp, screenshot

Example:

# Basic snapshot
snap = snapshot(browser)

# With screenshot and limit
snap = snapshot(browser, screenshot=True, limit=200)

# Force local processing (no credits)
snap = snapshot(browser, use_api=False)

# With filtering
snap = snapshot(browser, filter={
    "min_area": 100,
    "allowed_roles": ["button", "link"]
})

Element Properties:

  • id: Unique identifier for clicking
  • role: Semantic role (button, link, textbox, etc.)
  • text: Visible text content
  • importance: AI importance score (0-1000)
  • bbox: Bounding box (x, y, width, height)
  • visual_cues: Visual analysis (is_primary, is_clickable, background_color)
  • in_viewport: Whether the element is inside the viewport
  • is_occluded: Whether the element is covered by another element
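
As an illustration of how these properties combine, here is a minimal sketch that keeps only elements that are on screen and uncovered, then ranks them by importance. FakeElement and actionable are hypothetical stand-ins written for this example, not part of the SDK; real elements come from snapshot().

```python
from dataclasses import dataclass


@dataclass
class FakeElement:
    """Hypothetical stand-in mirroring a few of the documented element properties."""
    id: int
    role: str
    text: str
    importance: int
    in_viewport: bool
    is_occluded: bool


def actionable(elements):
    """Keep elements that are in the viewport and not occluded, highest importance first."""
    usable = [e for e in elements if e.in_viewport and not e.is_occluded]
    return sorted(usable, key=lambda e: e.importance, reverse=True)


elements = [
    FakeElement(1, "button", "Submit", 900, True, False),
    FakeElement(2, "link", "Terms", 300, True, True),     # covered by something
    FakeElement(3, "button", "Cancel", 650, False, False),  # scrolled out of view
    FakeElement(4, "textbox", "Email", 700, True, False),
]

ranked = actionable(elements)
print([e.id for e in ranked])  # [1, 4]
```

Filtering on in_viewport and is_occluded before acting is one way to avoid clicking elements a user could not actually reach.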

find(snapshot, selector)

Finds the single best matching element (by importance score).

Example:

button = find(snap, "role=button text~'Submit'")
if button:
    print(f"Found: {button.text} (importance: {button.importance})")

query(snapshot, selector)

Finds all matching elements (returns list).

Example:

all_buttons = query(snap, "role=button")
print(f"Found {len(all_buttons)} buttons")

Query DSL Operators

  Operator   Meaning        Example
  =          Exact match    "role=button"
  !=         Not equal      "role!=link"
  ~          Contains       "text~'sign in'"
  ^=         Starts with    "text^='Add'"
  $=         Ends with      "text$='Cart'"
  > / <      Comparisons    "importance>500"

Query Examples:

# Find by role
button = find(snap, "role=button")

# Find by text (contains, case-insensitive)
link = find(snap, "role=link text~'more info'")

# Multiple conditions (AND)
primary_btn = find(snap, "role=button clickable=true importance>800")

# Spatial filtering
top_left = find(snap, "bbox.x<=100 bbox.y<=200")

# Visibility checks
visible_link = find(snap, "role=link visible=true in_viewport=true")
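
To make the operator semantics above concrete, here is a rough sketch of how such selectors could be parsed and evaluated against plain dictionaries. It is not the SDK's implementation: parse and matches are hypothetical helpers, elements are modeled as flat dicts, and dotted fields (bbox.x) and boolean values (clickable=true) are not handled.

```python
import re

# One condition: a field name, an operator, then a quoted or bare value.
_CONDITION = re.compile(r"([\w.]+)\s*(!=|\^=|\$=|<=|>=|=|~|<|>)\s*('[^']*'|\S+)")


def parse(selector):
    """Split a selector string into (field, operator, value) triples."""
    return [(f, op, v.strip("'")) for f, op, v in _CONDITION.findall(selector)]


def matches(element, selector):
    """Return True if the element dict satisfies every condition (AND semantics)."""
    for field, op, value in parse(selector):
        actual = element.get(field)
        if actual is None:
            return False
        if op in ("<", ">", "<=", ">="):
            a, b = float(actual), float(value)
            ok = {"<": a < b, ">": a > b, "<=": a <= b, ">=": a >= b}[op]
        else:
            a, b = str(actual), value
            ok = {
                "=": a == b,
                "!=": a != b,
                "~": b.lower() in a.lower(),  # contains, case-insensitive
                "^=": a.startswith(b),
                "$=": a.endswith(b),
            }[op]
        if not ok:
            return False
    return True


element = {"role": "button", "text": "Add to Cart", "importance": 820}
print(matches(element, "role=button text~'add to' importance>500"))  # True
print(matches(element, "role!=button"))                              # False
```

Space-separated conditions are combined with AND, which is why the first selector matches only when all three conditions hold.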

Action API

click(browser, element_id, use_mouse, take_snapshot)

Clicks an element by ID. Uses realistic mouse simulation by default.

Parameters:

  • browser (SentienceBrowser): Browser instance
  • element_id (int): Element ID from snapshot
  • use_mouse (bool): If True (default), click with Playwright's mouse.click(); if False, use a JavaScript click
  • take_snapshot (bool): Capture snapshot after click

Returns:

ActionResult with: success, duration_ms, outcome, url_changed

Example:

result = click(browser, button.id)
if result.success:
    print(f"Click succeeded: {result.outcome}")
    if result.url_changed:
        print(f"Navigated to: {browser.page.url}")

click_rect(browser, rect, highlight, highlight_duration, take_snapshot)

Clicks at the center of a rectangle. Shows a red border as visual feedback unless highlight=False.

Example:

# Click at specific coordinates
click_rect(browser, {"x": 100, "y": 200, "w": 50, "h": 30})

# Click using element's bounding box
snap = snapshot(browser)
element = find(snap, "role=button")
if element:  # skip the click when nothing matched
    click_rect(browser, {
        "x": element.bbox.x,
        "y": element.bbox.y,
        "w": element.bbox.width,
        "h": element.bbox.height
    })

# Without highlight (for headless/CI)
click_rect(browser, {"x": 100, "y": 200, "w": 50, "h": 30}, highlight=False)
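
For reference, the center of a rectangle in this x/y/w/h form is simple arithmetic. The helper below is a hypothetical illustration of that calculation, not an SDK function, and it assumes x and y denote the top-left corner.

```python
def rect_center(rect):
    """Return the (x, y) midpoint of a rect given as x, y, w, h (x, y = top-left corner)."""
    return (rect["x"] + rect["w"] / 2, rect["y"] + rect["h"] / 2)


center = rect_center({"x": 100, "y": 200, "w": 50, "h": 30})
print(center)  # (125.0, 215.0)
```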

type_text(browser, element_id, text, take_snapshot)

Types text into an input field.

Example:

# Find input and type
snap = snapshot(browser)
email_input = find(snap, "role=textbox")
type_text(browser, email_input.id, "user@example.com")

press(browser, key)

Presses a keyboard key (Enter, Escape, Tab, etc.).

Example:

press(browser, "Enter")   # Submit form
press(browser, "Escape")  # Close modal

wait_for(browser, selector, timeout, interval, use_api)

Waits for an element matching the selector to appear. If interval is not set, the polling interval is chosen automatically: 0.25s for local processing, 1.5s for the server API.

Parameters:

  • browser (SentienceBrowser): Browser instance
  • selector (str): Query DSL selector
  • timeout (float): Maximum time to wait in seconds (default: 10.0)
  • interval (float, optional): Polling interval (auto: 0.25s local, 1.5s remote)
  • use_api (bool, optional): Force server API or local extension

Returns:

WaitResult with: found, element, duration_ms, timeout

Example:

# Wait for button to appear
result = wait_for(browser, "role=button text~'Submit'", timeout=5.0)
if result.found:
    print(f"Found after {result.duration_ms}ms")
    click(browser, result.element.id)

# Wait for clickable element
result = wait_for(browser, "clickable=true", timeout=10.0)

# Wait with custom interval (local processing)
result = wait_for(browser, "role=button", timeout=5.0, interval=0.5, use_api=False)
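
The timeout and interval parameters describe a polling loop. The sketch below shows that general pattern in plain Python; poll_until is a hypothetical helper, not the SDK's wait_for, and the returned dict only borrows the WaitResult field names (treating timeout as a boolean flag is an assumption).

```python
import time


def poll_until(check, timeout=10.0, interval=0.25):
    """Call check() every `interval` seconds until it returns a truthy value or `timeout` elapses."""
    start = time.monotonic()
    while True:
        value = check()
        elapsed = time.monotonic() - start
        duration_ms = int(elapsed * 1000)
        if value:
            return {"found": True, "element": value, "duration_ms": duration_ms, "timeout": False}
        if elapsed >= timeout:
            return {"found": False, "element": None, "duration_ms": duration_ms, "timeout": True}
        time.sleep(interval)


# Simulated page: the "element" only shows up on the third poll.
attempts = {"count": 0}


def fake_lookup():
    attempts["count"] += 1
    return "submit-button" if attempts["count"] >= 3 else None


result = poll_until(fake_lookup, timeout=1.0, interval=0.01)
print(result["found"], attempts["count"])  # True 3
```

A shorter interval finds the element sooner but polls more often, which is presumably why the default interval is longer when each poll goes to the server.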

Expect API (Assertions)

expect(browser, selector)

Creates an assertion builder for fluent testing.

Methods:

  • .to_exist(timeout=5.0): Assert element exists
  • .to_be_visible(timeout=5.0): Assert element is visible
  • .to_have_text(text, timeout=5.0): Assert element contains text
  • .to_have_count(n, timeout=5.0): Assert query returns N elements

Example:

# Assertions
expect(browser, "role=button text='Submit'").to_exist(timeout=5.0)
expect(browser, "role=heading").to_be_visible()
expect(browser, "role=button").to_have_text("Submit")
expect(browser, "role=link").to_have_count(10)

Content Reading API

read(browser, format, enhance_markdown)

Extracts page content as text, markdown, or raw HTML.

Parameters:

  • format (str): Output format - "raw", "text", or "markdown" (default)
  • enhance_markdown (bool): Use markdownify for better conversion

Example:

# Get markdown content
result = read(browser, format="markdown")
print(result["content"])

# Get plain text
result = read(browser, format="text")
print(result["content"])

# Get raw HTML
result = read(browser, format="raw")
html = result["content"]

Screenshot API

screenshot(browser, format, quality)

Captures a screenshot of the current page.

Parameters:

  • format (str): Image format - "png" or "jpeg"
  • quality (int): JPEG quality (1-100, default: 80)

Example:

import base64

# Capture PNG screenshot
data_url = screenshot(browser, format="png")

# Save to file
image_data = base64.b64decode(data_url.split(",")[1])
with open("screenshot.png", "wb") as f:
    f.write(image_data)

# JPEG with quality control
data_url = screenshot(browser, format="jpeg", quality=85)
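
The example above splits the return value on a comma, which suggests a base64 data URL. Assuming the form data:<mime>;base64,<payload>, a slightly more defensive decoder might look like the sketch below; decode_data_url is a hypothetical helper, not part of the SDK.

```python
import base64


def decode_data_url(data_url):
    """Split a base64 data URL into (mime_type, raw_bytes)."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


# Build a tiny sample data URL so the sketch runs without a browser.
sample = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode("ascii")
mime, raw = decode_data_url(sample)
print(mime, len(raw))  # image/png 4
```

Returning the MIME type alongside the bytes lets the caller pick a matching file extension instead of hard-coding one.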

Next Steps