Agent commerce
Sell through Claude, ChatGPT, Operator, Perplexity, Gemini, and whoever's next.
A real Claude agent shops your store. We show you the exact step where it quits — the variant selector, the cart, the checkout — so the next customer doesn’t walk.
serge · new journey test
Test prompt
Compare the 2 best rated black leather backpacks on yourstore.com that are in stock and available for flash delivery.
Test #4127
yourstore.com/
ok · Navigated to homepage
yourstore.com/search?q=black+leather+backpack
ok · Searched ‘black leather backpack’
yourstore.com/search?q=black+leather+backpack&availability=in_stock
ok · Filtered to in-stock items
yourstore.com/products/commuter-backpack-20l
warn · Opened Commuter Backpack 20L
yourstore.com/products/commuter-backpack-20l#variant
fail · Tried to select size ‘L’
Agent gave up here — Variant selector has no accessible name — Claude can’t tell it’s a dropdown.
- <div onClick={selectSize}>Size</div>
+ <select aria-label='Size'>…</select>
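A minimal sketch, in TypeScript, of the kind of check behind this failure. The `ElementInfo` shape and the `describeProblem` function are hypothetical illustrations, not Serge's API, and the naming rule is deliberately simplified (real accessible-name computation also considers text content, `aria-labelledby`, and more).

```typescript
// Hypothetical, simplified model of an element as an agent might see it.
interface ElementInfo {
  tag: string;               // e.g. "div", "select", "button"
  role?: string;             // explicit ARIA role, if any
  ariaLabel?: string;        // aria-label attribute, if any
  hasLabelElement?: boolean; // true if a <label> is associated with it
  hasClickHandler?: boolean; // true if script attaches a click handler
}

// Tags that expose an interactive role without extra ARIA (simplified list).
const NATIVE_CONTROLS = ["select", "button", "input", "textarea"];

// Returns a short problem description, or null if nothing is flagged.
function describeProblem(el: ElementInfo): string | null {
  const exposesRole =
    NATIVE_CONTROLS.indexOf(el.tag) !== -1 || el.role !== undefined;
  const hasName = Boolean(el.ariaLabel) || Boolean(el.hasLabelElement);

  if (el.hasClickHandler && !exposesRole) {
    return "clickable element exposes no role";
  }
  if (exposesRole && !hasName) {
    return "control has no accessible name";
  }
  return null;
}

// The two sides of the diff above, expressed in the simplified model.
const beforeFix: ElementInfo = { tag: "div", hasClickHandler: true };
const afterFix: ElementInfo = { tag: "select", ariaLabel: "Size" };

console.log(describeProblem(beforeFix));
console.log(describeProblem(afterFix));
```

Under these assumptions the `<div onClick>` version is flagged and the `<select aria-label='Size'>` version is not.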
What is Serge
Serge is the journey-test product for AI agent shoppers.
Serge runs real AI agents — Claude, ChatGPT, Operator, Perplexity, Gemini — on your live e-commerce storefront. You type a buying task in plain English, Serge dispatches the agent, and the report comes back in about 5 seconds with screenshots, the exact step the agent quit at, and a paste-ready fix snippet.
Two-way mirror · same page, two views
The page your shopper sees. The page an agent sees.
A product page is two documents at once. The human view renders for the eye. The agent view renders for the accessibility tree. Serge measures the gap between them.
Commuter Backpack 20L
Water-resistant · padded sleeve · two outer pockets
<img alt="?">
↑ missing alt
no schema.org/ImageObject
<h1>Commuter Backpack 20L</h1>
<p>Water-resistant · padded sleeve · two outer pockets</p>
<data value="79">CHF 79</data>
<div onClick>Add to cart</div>
↑ not a button · agent skips
<link rel="canonical" href="https://yourstore.com/products/laptop-x-15" />
Every issue flagged on the right is the reason the agent fails to complete the task the human started.
The cockpit
Sessions are noise. Issues are what you fix.
Every failing journey test, every recurring blocker across tests, rolls up into a single object: an Issue. Ranked by revenue at risk, scoped to the platform and page where it bites, with the fix already drafted.
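The roll-up described above could be sketched as follows. This is an illustration under stated assumptions, not Serge's data model: the `FailingTest` and `Issue` shapes, the `rollUp` function, and the choice to sum a per-test revenue estimate are all hypothetical, and the source does not say how revenue at risk is estimated.

```typescript
// Hypothetical shape of one failing journey test.
interface FailingTest {
  blocker: string;       // normalized description of what stopped the agent
  page: string;          // page or page group where it happened
  agent: string;         // agent platform that ran the test
  revenueAtRisk: number; // assumed per-test estimate, in CHF
}

// Hypothetical shape of the aggregated Issue.
interface Issue {
  blocker: string;
  pages: string[];
  agents: string[];
  failingTests: number;
  revenueAtRisk: number;
}

// Group failing tests by blocker, then rank by summed revenue at risk.
function rollUp(tests: FailingTest[]): Issue[] {
  const byBlocker = new Map<string, Issue>();
  for (const t of tests) {
    let issue = byBlocker.get(t.blocker);
    if (!issue) {
      issue = { blocker: t.blocker, pages: [], agents: [], failingTests: 0, revenueAtRisk: 0 };
      byBlocker.set(t.blocker, issue);
    }
    if (issue.pages.indexOf(t.page) === -1) issue.pages.push(t.page);
    if (issue.agents.indexOf(t.agent) === -1) issue.agents.push(t.agent);
    issue.failingTests += 1;
    issue.revenueAtRisk += t.revenueAtRisk;
  }
  // Highest revenue at risk first.
  return Array.from(byBlocker.values()).sort(
    (a, b) => b.revenueAtRisk - a.revenueAtRisk
  );
}

// Made-up sample data, for illustration only.
const sampleIssues = rollUp([
  { blocker: "cart total renders late", page: "/cart", agent: "Claude", revenueAtRisk: 100 },
  { blocker: "variant selector not accessible", page: "PDP", agent: "ChatGPT", revenueAtRisk: 200 },
  { blocker: "variant selector not accessible", page: "PDP", agent: "Claude", revenueAtRisk: 150 },
]);

for (const issue of sampleIssues) {
  console.log(issue.blocker, issue.failingTests, issue.revenueAtRisk);
}
```

Grouping by a normalized blocker string is the simplest possible key; a real system would likely need fuzzier matching to recognise the same blocker across differently worded traces.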
Demo data · Substrate sample store
12 open issues from this week’s journey tests
Ranked by estimated revenue at risk
| Issue | Affected | At risk | Status |
|---|---|---|---|
| Variant selector not exposed as an accessible control (PDPs · ChatGPT + Claude) | 47 failing tests | CHF 12,400 | Fix available · awaiting verification · Reproduced 3× |
| Cart total renders client-side after a 600ms hydration gap (/cart · all agents) | 31 failing tests | CHF 8,900 | Triaged · fix queued · Reproduced 2× |
| Search results page returns inventory only via XHR after first paint (/search · ChatGPT) | 24 failing tests | CHF 5,300 | Detected · pending replay · Reproduced 1× |
+ 9 more issues this week
Open the cockpit →
How it works
How an Agent Journey Test runs.
Three steps from prompt to fix. No staging environment, no scripted scenarios, no recorded sessions — a real agent runs against your live storefront on demand.
Type the prompt
Tell Serge the buying task in plain English. "Find a black leather backpack under 150 and add it to the cart." Pick the agent (Claude, ChatGPT, Operator) and click Run.
No scripted selectors. No scenario files. No staging environment.
Watch the agent try
The agent runs against your live storefront in real time. Serge streams every step — the URL navigated to, the elements the agent tried, the accessibility tree it parsed, the exact moment it quit.
Streams the result in ~5 seconds. Screenshots per step.
Ship the fix
Each failure carries a paste-ready fix snippet — replace `<div onClick>` with `<select aria-label>`, expose role/state on the variant selector, add machine-readable inventory. Re-run the test to verify the agent now succeeds.
Re-run loop closes the failure → fix → verify cycle in minutes.
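The three steps above can be pictured as operations on a step-by-step trace. The sketch below is a hypothetical model, not Serge's report format: the `Step` type and the `firstFailure` helper are assumptions, the sample trace mirrors the demo test shown earlier on this page, and the re-run trace is invented to show what a passing result would look like.

```typescript
type StepStatus = "ok" | "warn" | "fail";

// Hypothetical shape of one streamed step.
interface Step {
  url: string;
  action: string;
  status: StepStatus;
}

// Find the step where the agent quit, or null if it completed the journey.
function firstFailure(trace: Step[]): { index: number; step: Step } | null {
  for (let i = 0; i < trace.length; i++) {
    const step = trace[i];
    if (step && step.status === "fail") {
      return { index: i, step };
    }
  }
  return null;
}

// Trace modelled on the demo test earlier on this page.
const firstRun: Step[] = [
  { url: "yourstore.com/", action: "Navigated to homepage", status: "ok" },
  { url: "yourstore.com/search?q=black+leather+backpack", action: "Searched 'black leather backpack'", status: "ok" },
  { url: "yourstore.com/search?q=black+leather+backpack&availability=in_stock", action: "Filtered to in-stock items", status: "ok" },
  { url: "yourstore.com/products/commuter-backpack-20l", action: "Opened Commuter Backpack 20L", status: "warn" },
  { url: "yourstore.com/products/commuter-backpack-20l#variant", action: "Tried to select size 'L'", status: "fail" },
];

// Invented re-run after the fix: the last step now succeeds.
const reRun: Step[] = firstRun.map((s) =>
  s.status === "fail" ? { url: s.url, action: s.action, status: "ok" as StepStatus } : s
);

const failure = firstFailure(firstRun);
console.log(failure ? `quit at step ${failure.index + 1}: ${failure.step.action}` : "completed");
console.log(firstFailure(reRun) === null ? "verified" : "still failing");
```

Treating "verified" as "no failing step in the re-run" is the simplest reading of the failure, fix, verify cycle; warnings are left in place here because the source does not say how they are handled.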
Versus how you test today
Your existing test stack can't see what AI agents do.
How teams check agent behavior today
All of them have a place. None of them run a real AI agent on your live storefront.
| | Today's tooling | Serge |
|---|---|---|
| What it tests | Humans clicking, or a hand-coded test script with fixed selectors | Real AI agents — Claude, ChatGPT, Operator, Perplexity, Gemini — on your live storefront |
| What it catches | Bugs a human or a script can reproduce | Failures that only happen because an agent reads the DOM differently — no role, no accessible name, no keyboard |
| Where it runs | Staging, CI, recorded human sessions — not where the agent actually shops | Your live storefront, on demand, every release |
| Agent coverage | None — none of these tools run an AI agent | Every major agent platform, side by side, on the same prompt |
| What you ship after | A bug ticket, queued for next sprint | A fix snippet ready to paste + a re-run that confirms the agent now completes the task |
| When you find the failure | After a customer complains — or never | Before the next AI shopper arrives |
Why this is different
Manual QA tests what humans do. E2E scripts test what your selectors do. Session replay records what humans did. None of them test what an AI agent does when a real customer asks it to buy from your store. Serge does — with the actual agents on the actual page.
Adjacent categories Serge does not replace: accessibility audits (axe / Deque) · GEO visibility (Athena, Profound) · attribution (Dreamdata, HockeyStack)
Pricing
Start with one store, one journey test, one clear failure.
Public pricing for teams running journey tests on their own storefronts. Larger retailers can start with a hands-on pilot while the product is still early.
Free
Free
Scan your store, run one sponsored journey test.
Pro
CHF 159 / mo
Daily journey tests, alerts, 12-month retention, PAYG add-ons.
Agency
Contact
Multi-workspace, multi-site, extra seats, distribution deals.
Pilot program
Want hands-on setup, weekly reviews, or custom reporting?
A small pilot is available for teams that want founder support while the product is still early — setup help, live replay walkthroughs, and tighter feedback loops than self-serve.
Talk to us about a pilot
Full tier details and feature comparison → /pricing
Stop losing AI-assisted shoppers because your storefront isn't agent-readable.
Run a real journey test on your store. See the failure. Ship the fix in the same meeting.
Start by running a journey test on your live storefront. Book the founder walkthrough when you want a multi-task pilot, a shared cockpit for the team, and the first fix list out of the same meeting.
What you leave with
A real agent trace on your store, the first structural blockers ranked by revenue at risk, and a clear answer on whether to wire Serge into your stack now.
Founder walkthrough
01
Run a live journey test on your site.
02
Inspect the blockers — replay, reasoning, fix.
03
Leave with the first fix list and the install path.
FAQ