Semantic Geometry is the foundational feature of Sentience SDK that helps AI agents perceive and interact with web pages. Ordinality and layout support extend this by enabling agents to understand element positions when users specify goals like "click the first result", "select the 2nd item", or "choose the first product with rating >= 4".
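When users give AI agents positional instructions, the agent needs to know which elements make up the list the user means, what order those elements appear in visually, and which attributes (such as a price or rating) belong to which item.
Sentience SDK solves this with two complementary features: ordinality support (dominant group detection and position indexing) and layout detection (grids, regions, and parent-child relationships).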
Layout detection is the critical missing link that allows an LLM to reliably answer "list + constraint" queries like "Find the first product with rating >= 4".
Without layout detection, an LLM sees a flat soup of text nodes and has to "guess" which rating belongs to which product based on proximity. With layout support, that soup transforms into structured objects, making the task trivial.
The biggest challenge for an LLM on a web page is knowing where one item ends and the next begins.
Without Layout: The LLM sees a text sequence like:
...[Product A]... [Price]... [3.5 Stars]... [Product B]...
It might incorrectly assume "3.5 stars" belongs to Product B if the DOM structure is messy.
With Layout: You can present the LLM with structured objects:
```json
{
  "grid_id": 101,
  "label": "Product Card",
  "children": [
    { "text": "Wireless Headphones", "role": "title" },
    { "text": "4.0", "role": "rating" }
  ]
}
```
The rating is explicitly linked to its parent product - no guessing required.
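Once cards are structured like the object above, a "list + constraint" query reduces to a plain filter. The following is a minimal sketch, assuming the cards arrive as dictionaries in that shape and are already sorted in visual order. `card_field` and `first_card_matching` are hypothetical helper names, not part of the SDK.

```python
from typing import Optional


def card_field(card: dict, role: str) -> Optional[str]:
    """Return the text of the first child with the given role, or None."""
    for child in card.get("children", []):
        if child.get("role") == role:
            return child.get("text")
    return None


def first_card_matching(cards: list[dict], min_rating: float) -> Optional[dict]:
    """Return the first card (in list order) whose rating is >= min_rating."""
    for card in cards:
        rating_text = card_field(card, "rating")
        if rating_text is None:
            continue  # card has no rating child; skip it
        try:
            rating = float(rating_text)
        except ValueError:
            continue  # rating text is not a plain number; skip it
        if rating >= min_rating:
            return card
    return None


# Illustrative data only, shaped like the structured object above.
cards = [
    {"grid_id": 101, "label": "Product Card", "children": [
        {"text": "USB Cable", "role": "title"},
        {"text": "3.5", "role": "rating"},
    ]},
    {"grid_id": 101, "label": "Product Card", "children": [
        {"text": "Wireless Headphones", "role": "title"},
        {"text": "4.0", "role": "rating"},
    ]},
]

match = first_card_matching(cards, min_rating=4.0)
print(card_field(match, "title") if match else "no match")
```

Because each rating is read from its own card's children, a rating can never be attributed to a neighboring product.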
"First" is ambiguous in a responsive grid.
Without Layout: DOM order often differs from visual order (e.g., in masonry layouts or flex containers with flex-direction: column). The "first" product in the DOM might actually sit in the top-right corner visually.
With Layout: The grid detection algorithm sorts items by visual rows first, then columns. This guarantees that "first" means "top-left," matching human intuition.
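One way to get a row-major visual order is to cluster items into rows by their vertical centers, then sort each row left to right. The sketch below illustrates that idea and is not the SDK's actual algorithm. The `Item` class, the `row_tolerance` parameter, and `visual_order` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    center_x: float
    center_y: float


def visual_order(items: list[Item], row_tolerance: float = 10.0) -> list[Item]:
    """Sort items top-to-bottom by row, then left-to-right within each row.

    An item joins the current row when its center_y lies within
    row_tolerance of that row's first item; otherwise it starts a new row.
    """
    rows: list[list[Item]] = []
    for item in sorted(items, key=lambda i: i.center_y):
        if rows and abs(item.center_y - rows[-1][0].center_y) <= row_tolerance:
            rows[-1].append(item)
        else:
            rows.append([item])
    ordered: list[Item] = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda i: i.center_x))
    return ordered


# DOM order here runs down the columns, but visually A and B share the top row.
dom_order = [
    Item("A", center_x=100, center_y=50),
    Item("C", center_x=100, center_y=250),
    Item("B", center_x=300, center_y=52),
    Item("D", center_x=300, center_y=251),
]

print([i.name for i in visual_order(dom_order)])
```

The tolerance absorbs small vertical offsets between cards in the same row; a fixed pixel value is a simplification and would likely need tuning per page.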
| Feature | Benefit for "List + Constraints" |
|---|---|
| Dominant Group | Filters out noise (nav, footer) so the LLM only checks the relevant list |
| Container Inference | Solves the "Association Problem" - knowing which price/rating belongs to which item |
| Grid Sorting | Solves the "Ordering Problem" - correctly identifying the "first" item visually |
The SDK automatically identifies the main content group on a page. This is typically the primary list or grid that users want to interact with (search results, product listings, article feeds).
Each element gets a group_key that indicates which visual group it belongs to. The most common group is marked as the dominant_group_key in the snapshot.
```python
from sentience import SentienceBrowser, snapshot

with SentienceBrowser() as browser:
    browser.page.goto("https://news.ycombinator.com")
    snap = snapshot(browser)

    # The dominant group is the main content area
    print(f"Dominant group: {snap.dominant_group_key}")

    # Find elements in the dominant group
    main_items = [e for e in snap.elements if e.in_dominant_group]
    print(f"Found {len(main_items)} items in main content")
```

Each element in a group has a group_index (0-based position). This enables selecting elements by ordinal position:
```python
# Get elements sorted by position in the dominant group
dominant_elements = sorted(
    [e for e in snap.elements if e.in_dominant_group],
    key=lambda e: e.group_index or 0
)

# Select by position
first_item = dominant_elements[0]   # "click the first result"
second_item = dominant_elements[1]  # "select the 2nd item"
last_item = dominant_elements[-1]   # "click the last one"

print(f"First item: {first_item.text}")
print(f"Last item: {last_item.text}")
```

Each element includes position data for ordinal selection:
| Field | Type | Description |
|---|---|---|
| center_x | number | X coordinate of element center (viewport-relative) |
| center_y | number | Y coordinate of element center (viewport-relative) |
| doc_y | number | Absolute Y position in document (includes scroll offset) |
| group_key | string | Geometric bucket key for grouping (format: x{bucket}-h{bucket}) |
| group_index | number | Position within group (0-indexed, sorted by doc_y) |
| in_dominant_group | boolean | Whether element is in the main content group |
| href | string | Hyperlink URL (for link elements) |
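The relationship between these fields can be shown with a small sketch. It assumes, purely for illustration, that the x bucket comes from an element's left edge and the h bucket from its height, each divided by a fixed bucket size; the SDK's real bucketing rules are not described here. `Box`, `BUCKET_PX`, `group_key_for`, and `assign_groups` are hypothetical names.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional

BUCKET_PX = 50  # assumed bucket size, not an SDK constant


@dataclass
class Box:
    text: str
    x: float
    height: float
    doc_y: float
    group_key: str = ""
    group_index: Optional[int] = None
    in_dominant_group: bool = False


def group_key_for(box: Box) -> str:
    """Build a key in the x{bucket}-h{bucket} format from left edge and height."""
    return f"x{int(box.x // BUCKET_PX)}-h{int(box.height // BUCKET_PX)}"


def assign_groups(boxes: list[Box]) -> Optional[str]:
    """Fill in group_key, group_index, in_dominant_group; return the dominant key."""
    if not boxes:
        return None
    groups = defaultdict(list)
    for box in boxes:
        box.group_key = group_key_for(box)
        groups[box.group_key].append(box)
    # group_index is the 0-based rank by doc_y within each group
    for members in groups.values():
        for index, box in enumerate(sorted(members, key=lambda b: b.doc_y)):
            box.group_index = index
    # The most common group key is treated as the dominant group
    dominant_key = Counter(b.group_key for b in boxes).most_common(1)[0][0]
    for box in boxes:
        box.in_dominant_group = box.group_key == dominant_key
    return dominant_key


boxes = [
    Box("Logo", x=10, height=40, doc_y=5),
    Box("Result 2", x=120, height=80, doc_y=400),
    Box("Result 1", x=120, height=80, doc_y=300),
    Box("Result 3", x=125, height=85, doc_y=500),
]
dominant = assign_groups(boxes)
print(dominant, [(b.text, b.group_index) for b in boxes if b.in_dominant_group])
```

Elements with a similar left edge and height land in the same bucket, which is why repeated list rows tend to form the largest group while one-off elements such as a logo do not.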
Layout detection provides detailed grid and region information for complex page structures.
Elements may include a layout field with geometric metadata:
| Field | Type | Description |
|---|---|---|
| grid_id | number | Unique ID for the grid this element belongs to |
| grid_pos | GridPosition | Row and column indices (0-based) |
| parent_index | number | Index of inferred parent element in the elements array |
| children_indices | number[] | List of child element indices (capped at 30) |
| region | string | Page region: header, nav, main, aside, or footer |
| grid_confidence | number | Confidence score for grid assignment (0.0-1.0) |
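Because parent_index and children_indices refer to positions in the same flat elements array, a card can be reassembled by following those indices. Here is a minimal sketch that uses plain dictionaries in place of SDK element objects; the sample data and the `collect_card` helper are illustrative assumptions.

```python
def collect_card(elements: list[dict], parent_idx: int) -> dict:
    """Gather a parent element and the texts of its inferred children."""
    parent = elements[parent_idx]
    layout = parent.get("layout") or {}  # layout may be missing or None
    children = []
    for idx in layout.get("children_indices", []):
        if 0 <= idx < len(elements):  # ignore indices outside the array
            children.append(elements[idx]["text"])
    return {"text": parent["text"], "children": children}


# Illustrative flat elements array: one card container with two children.
elements = [
    {"text": "Product Card", "layout": {"grid_id": 101, "children_indices": [1, 2]}},
    {"text": "Wireless Headphones", "layout": {"parent_index": 0}},
    {"text": "4.0", "layout": {"parent_index": 0}},
    {"text": "Footer link", "layout": None},
]

card = collect_card(elements, 0)
print(card)
```

Since children_indices is capped at 30, a very large container may list only part of its contents; scanning for elements whose parent_index points at the container is one possible fallback.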
Get bounding boxes and metadata for detected grids:
```python
from sentience import SentienceBrowser, snapshot

with SentienceBrowser() as browser:
    browser.page.goto("https://example.com/products")
    snap = snapshot(browser)

    # Get all detected grids
    all_grids = snap.get_grid_bounds()
    for grid in all_grids:
        print(f"Grid {grid.grid_id}: {grid.item_count} items")
        print(f" Size: {grid.row_count} rows x {grid.col_count} cols")
        print(f" Position: ({grid.bbox.x}, {grid.bbox.y})")
        print(f" Label: {grid.label}")  # e.g., "product_grid", "search_results"
        print(f" Is dominant: {grid.is_dominant}")

    # Get a specific grid by ID
    main_grid = snap.get_grid_bounds(grid_id=0)
    if main_grid:
        grid = main_grid[0]
        print(f"Main grid: {grid.bbox.width}x{grid.bbox.height} pixels")
```

| Property | Type | Description |
|---|---|---|
| grid_id | number | Unique identifier for the grid |
| bbox | BBox | Bounding box (x, y, width, height) in document coordinates |
| row_count | number | Number of rows in the grid |
| col_count | number | Number of columns in the grid |
| item_count | number | Total number of items in the grid |
| label | string \| null | Inferred semantic label (see below) |
| is_dominant | boolean | Whether this is the main content grid |
The SDK automatically infers grid labels based on content patterns:
| Label | Detected When |
|---|---|
| product_grid | Price patterns ($, €, £), "Add to cart", ratings |
| search_results | Snippets, ellipses, mostly links |
| article_feed | Timestamps ("2 hours ago"), bylines, dates |
| navigation | Short text, homogeneous links, nav keywords |
| button_grid | All elements are buttons |
| link_list | 80%+ of elements are links |
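To make the table concrete, the sketch below applies three of these rules to a list of items. It is an illustration only: the rule order, the regular expression, the item shape (`role` and `text` keys), and the `infer_label` name are assumptions, and the SDK's own heuristics are likely more involved.

```python
import re
from typing import Optional

PRICE_RE = re.compile(r"[$€£]\s?\d")  # currency symbol followed by a digit


def infer_label(items: list[dict]) -> Optional[str]:
    """Guess a grid label from item roles and text; None if nothing matches."""
    if not items:
        return None
    roles = [item.get("role", "") for item in items]
    texts = [item.get("text", "") for item in items]
    # button_grid: every element is a button
    if all(role == "button" for role in roles):
        return "button_grid"
    # product_grid: any price pattern or an "Add to cart" phrase
    if any(PRICE_RE.search(t) or "add to cart" in t.lower() for t in texts):
        return "product_grid"
    # link_list: at least 80% of elements are links
    link_share = sum(role == "link" for role in roles) / len(roles)
    if link_share >= 0.8:
        return "link_list"
    return None


products = [
    {"role": "link", "text": "Wireless Headphones $59.99"},
    {"role": "button", "text": "Add to cart"},
]
print(infer_label(products))
```

Checking the more specific rules before the generic link-share rule keeps a product grid made mostly of links from being labeled link_list; that precedence is a design choice of this sketch.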
Access individual element positions within a grid:
```python
# Access element layout data
for elem in snap.elements:
    if elem.layout and elem.layout.grid_id is not None:
        print(f"Element '{elem.text}' is in grid {elem.layout.grid_id}")
        if elem.layout.grid_pos:
            row = elem.layout.grid_pos.row_index
            col = elem.layout.grid_pos.col_index
            print(f" Position: row {row}, column {col}")
        if elem.layout.region:
            print(f" Region: {elem.layout.region}")
```

Click the first search result:

```python
from sentience import SentienceBrowser, snapshot, click

with SentienceBrowser() as browser:
    browser.page.goto("https://google.com")
    # ... perform search ...
    snap = snapshot(browser)

    # Find all items in the dominant group (search results)
    results = sorted(
        [e for e in snap.elements if e.in_dominant_group],
        key=lambda e: e.group_index or 0
    )
    if results:
        first_result = results[0]
        click(browser, first_result.id)
        print(f"Clicked: {first_result.text}")
```

Select an item by grid position:

```python
# Find product at row 1, column 2 (0-indexed)
target_row, target_col = 1, 2
for elem in snap.elements:
    if elem.layout and elem.layout.grid_pos:
        pos = elem.layout.grid_pos
        if pos.row_index == target_row and pos.col_index == target_col:
            print(f"Found product at ({target_row}, {target_col}): {elem.text}")
            click(browser, elem.id)
            break
```

Filter elements by page region:

```python
# Get only elements in the main content area
main_elements = [
    e for e in snap.elements
    if e.layout and e.layout.region == "main"
]

# Get navigation links
nav_links = [
    e for e in snap.elements
    if e.layout and e.layout.region == "nav"
]

print(f"Main content: {len(main_elements)} elements")
print(f"Navigation: {len(nav_links)} links")
```

- layout field is optional and may not be present in all snapshots
- children_indices is capped at 30 elements to prevent large payloads
- Confidence scores (grid_confidence, region_confidence) indicate detection reliability
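Since layout may be absent and confidence scores vary, it can help to route all layout access through one defensive helper. The sketch below uses `SimpleNamespace` stand-ins rather than real SDK elements; the `reliable_grid_id` name and the 0.5 threshold are assumptions, not SDK defaults.

```python
from types import SimpleNamespace
from typing import Optional


def reliable_grid_id(elem, min_confidence: float = 0.5) -> Optional[int]:
    """Return the element's grid_id only if layout exists and is confident enough."""
    layout = getattr(elem, "layout", None)
    if layout is None or getattr(layout, "grid_id", None) is None:
        return None
    confidence = getattr(layout, "grid_confidence", None)
    if confidence is None or confidence < min_confidence:
        return None
    return layout.grid_id


# Stand-in elements: confident grid member, low-confidence member, no layout.
confident = SimpleNamespace(layout=SimpleNamespace(grid_id=3, grid_confidence=0.9))
shaky = SimpleNamespace(layout=SimpleNamespace(grid_id=3, grid_confidence=0.2))
no_layout = SimpleNamespace(layout=None)

print([reliable_grid_id(e) for e in (confident, shaky, no_layout)])
```

When the helper returns None, falling back to group_index ordering in the dominant group is one reasonable option.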