Agent commerce

Sell through Claude, ChatGPT, Operator, Perplexity, Gemini, and whoever's next.

A real Claude agent shops your store. We show you the exact step where it quits — the variant selector, the cart, the checkout — so the next customer doesn’t walk.

serge · new journey test

yourstore workspace

Test prompt

Compare the 2 best rated black leather backpacks on yourstore.com that are in stock and available for flash delivery.

Agent: Claude · computer-use

Test #4127

yourstore.com
4.7s
  1. yourstore.com/

    ok

    Navigated to homepage

  2. yourstore.com/search?q=black+leather+backpack

    ok

    Searched ‘black leather backpack’

  3. yourstore.com/search?q=black+leather+backpack&availability=in_stock

    ok

    Filtered to in-stock items

  4. yourstore.com/products/commuter-backpack-20l

    warn

    Opened Commuter Backpack 20L

  5. yourstore.com/products/commuter-backpack-20l#variant

    fail

    Tried to select size ‘L’

    Agent gave up here: the variant selector has no accessible name, so Claude can’t tell it’s a dropdown.

    - <div onClick={selectSize}>Size</div>
    + <select aria-label='Size'>…</select>

What is Serge

Serge is the journey-test product for AI agent shoppers.

Serge runs real AI agents — Claude, ChatGPT, Operator, Perplexity, Gemini — on your live e-commerce storefront. You type a buying task in plain English, Serge dispatches the agent, and the report comes back in about 5 seconds with screenshots, the exact step the agent quit at, and a paste-ready fix snippet.

Two-way mirror · same page, two views

The page your shopper sees. The page an agent sees.

A product page is two documents at once. The human view renders for the eye. The agent view renders for the accessibility tree. Serge measures the gap between them.

type a domain · enter to scan

Commuter Backpack 20L

Water-resistant · padded sleeve · two outer pockets

CHF 79
Add to cart
<img alt="?">

missing alt

no schema.org/ImageObject

<h1>Commuter Backpack 20L</h1>
<p>Water-resistant · padded sleeve · two outer pockets</p>
<data value="79">CHF 79</data>
<div onClick>Add to cart</div>

↑ not a button · agent skips

<link rel="canonical" href="https://yourstore.com/products/laptop-x-15" />

◀ HUMAN VIEW
AGENT VIEW ▶

Every issue flagged on the right is the reason the agent fails to complete the task the human started.
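
As a rough illustration of the gap between the two views, the sketch below flags elements that look fine to a shopper but expose little to an agent. It is a simplified check under stated assumptions, not Serge's actual method: the `PageNode` shape, the `findGaps` function, and the sample data are all hypothetical, and real accessible-name computation follows the much more detailed WAI-ARIA rules.

```typescript
// Hypothetical, simplified node shape used only for this sketch.
interface PageNode {
  tag: string;                   // e.g. "div", "button", "select", "img"
  attrs: Record<string, string>; // static attributes from the markup
  text: string;                  // visible text content
  clickable: boolean;            // true if a click handler is attached
}

// Tags that carry an interactive role natively.
const NATIVE_CONTROLS = new Set(["a", "button", "select", "input", "textarea"]);

// Returns a description of each element a human can use but an agent may miss.
function findGaps(nodes: PageNode[]): string[] {
  const found: string[] = [];
  for (const node of nodes) {
    const hasRole = NATIVE_CONTROLS.has(node.tag) || "role" in node.attrs;
    if (node.clickable && !hasRole) {
      found.push(`<${node.tag}> "${node.text}": clickable but exposes no role`);
    }
    const alt = node.attrs["alt"];
    if (node.tag === "img" && (alt === undefined || alt.trim() === "")) {
      found.push("<img>: missing alt text");
    }
  }
  return found;
}

// Sample data loosely modelled on the product page above (assumed values).
const productPage: PageNode[] = [
  { tag: "h1", attrs: {}, text: "Commuter Backpack 20L", clickable: false },
  { tag: "img", attrs: {}, text: "", clickable: false },
  { tag: "div", attrs: {}, text: "Add to cart", clickable: true },
];

const gaps = findGaps(productPage);
console.log(gaps);
```

Here the heading passes, while the image and the `<div>` used as a button are both reported.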

The cockpit

Sessions are noise. Issues are what you fix.

Every failing journey test, every recurring blocker across tests, rolls up into a single object: an Issue. Ranked by revenue at risk, scoped to the platform and page where it bites, with the fix already drafted.

Demo data · Substrate sample store

12 open issues from this week’s journey tests

Ranked by estimated revenue at risk

  • Variant selector not exposed as an accessible control

    PDPs · ChatGPT + Claude

    Affected: 47 failing tests · At risk: CHF 12,400

    Fix available · awaiting verification · Reproduced 3×

  • Cart total renders client-side after a 600ms hydration gap

    /cart · all agents

    Affected: 31 failing tests · At risk: CHF 8,900

    Triaged · fix queued · Reproduced 2×

  • Search results page returns inventory only via XHR after first paint

    /search · ChatGPT

    Affected: 24 failing tests · At risk: CHF 5,300

    Detected · pending replay · Reproduced 1×

+ 9 more issues this week
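
The roll-up described above can be sketched as a group-and-sort. This is a minimal illustration, not Serge's implementation: the `FailedTest` and `Issue` shapes, the `rollUp` function, and the per-test revenue figures are assumptions made for the example, and how revenue at risk is actually estimated is not specified here.

```typescript
// Hypothetical record for one failing journey test.
interface FailedTest {
  blocker: string;       // normalized description of what stopped the agent
  revenueAtRisk: number; // assumed per-test estimate, in CHF
}

// Hypothetical aggregated issue.
interface Issue {
  blocker: string;
  failingTests: number;
  revenueAtRisk: number;
}

// Groups failing tests by blocker, sums their estimates,
// and returns issues sorted by revenue at risk, highest first.
function rollUp(tests: FailedTest[]): Issue[] {
  const byBlocker = new Map<string, Issue>();
  for (const t of tests) {
    const issue =
      byBlocker.get(t.blocker) ??
      { blocker: t.blocker, failingTests: 0, revenueAtRisk: 0 };
    issue.failingTests += 1;
    issue.revenueAtRisk += t.revenueAtRisk;
    byBlocker.set(t.blocker, issue);
  }
  return Array.from(byBlocker.values()).sort(
    (a, b) => b.revenueAtRisk - a.revenueAtRisk,
  );
}

// Made-up sample input.
const issues = rollUp([
  { blocker: "cart total hydration gap", revenueAtRisk: 300 },
  { blocker: "variant selector not accessible", revenueAtRisk: 250 },
  { blocker: "variant selector not accessible", revenueAtRisk: 250 },
]);
console.log(issues);
```

With this input the variant selector issue ranks first: two failing tests add up to more revenue at risk than the single cart failure.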

Open the cockpit

How it works

How an Agent Journey Test runs.

Three steps from prompt to fix. No staging environment, no scripted scenarios, no recorded sessions — a real agent runs against your live storefront on demand.

  1. Type the prompt

    Tell Serge the buying task in plain English. "Find a black leather backpack under 150 and add it to the cart." Pick the agent (Claude, ChatGPT, Operator) and click Run.

    No scripted selectors. No scenario files. No staging environment.

  2. Watch the agent try

    The agent runs against your live storefront in real time. Serge streams every step — the URL navigated to, the elements the agent tried, the accessibility tree it parsed, the exact moment it quit.

    Streams the result in ~5 seconds. Screenshots per step.

  3. Ship the fix

    Each failure carries a paste-ready fix snippet — replace `<div onClick>` with `<select aria-label>`, expose role/state on the variant selector, add machine-readable inventory. Re-run the test to verify the agent now succeeds.

    Re-run loop closes the failure → fix → verify cycle in minutes.
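
To make the step stream and the re-run check concrete, here is a small sketch of how a trace might be represented and how the quit step could be located. The `Step` type and the `findQuitStep` and `isVerified` helpers are hypothetical names chosen for this example; Serge's real report format is not documented here. The sample trace mirrors Test #4127 shown earlier.

```typescript
type StepStatus = "ok" | "warn" | "fail";

// Hypothetical shape for one streamed step.
interface Step {
  url: string;
  status: StepStatus;
  action: string;
}

// Returns the first failing step and its 1-based position, or null if none failed.
function findQuitStep(steps: Step[]): { position: number; step: Step } | null {
  const step = steps.find((s) => s.status === "fail");
  if (step === undefined) return null;
  return { position: steps.indexOf(step) + 1, step };
}

// A re-run counts as verified here when no step fails; warnings are tolerated.
function isVerified(rerun: Step[]): boolean {
  return findQuitStep(rerun) === null;
}

const trace: Step[] = [
  { url: "yourstore.com/", status: "ok", action: "Navigated to homepage" },
  { url: "yourstore.com/search?q=black+leather+backpack", status: "ok", action: "Searched 'black leather backpack'" },
  { url: "yourstore.com/search?q=black+leather+backpack&availability=in_stock", status: "ok", action: "Filtered to in-stock items" },
  { url: "yourstore.com/products/commuter-backpack-20l", status: "warn", action: "Opened Commuter Backpack 20L" },
  { url: "yourstore.com/products/commuter-backpack-20l#variant", status: "fail", action: "Tried to select size 'L'" },
];

const quit = findQuitStep(trace);
console.log(quit ? `Agent quit at step ${quit.position}: ${quit.step.action}` : "Agent completed the task");
```

Treating warnings as non-blocking is a design choice made for the sketch; a stricter loop could require every step to be `ok` before calling a fix verified.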

Versus how you test today

Your existing test stack can't see what AI agents do.

How teams check agent behavior today

Manual QA · Playwright · Cypress · Hotjar / FullStory replay · BrowserStack · Datadog Synthetics

All of them have a place. None of them run a real AI agent on your live storefront.

Today's tooling vs. Serge

What it tests
    Today's tooling: Humans clicking, or a hand-coded test script with fixed selectors
    Serge: Real AI agents (Claude, ChatGPT, Operator, Perplexity, Gemini) on your live storefront

What it catches
    Today's tooling: Bugs a human or a script can reproduce
    Serge: Failures that only happen because an agent reads the DOM differently: no role, no accessible name, no keyboard

Where it runs
    Today's tooling: Staging, CI, recorded human sessions; not where the agent actually shops
    Serge: Your live storefront, on demand, every release

Agent coverage
    Today's tooling: None; none of these tools run an AI agent
    Serge: Every major agent platform, side by side, on the same prompt

What you ship after
    Today's tooling: A bug ticket, queued for next sprint
    Serge: A fix snippet ready to paste + a re-run that confirms the agent now completes the task

When you find the failure
    Today's tooling: After a customer complains, or never
    Serge: Before the next AI shopper arrives

Why this is different

Manual QA tests what humans do. E2E scripts test what your selectors do. Session replay records what humans did. None of them test what an AI agent does when a real customer asks it to buy from your store. Serge does — with the actual agents on the actual page.

Adjacent categories Serge does not replace: accessibility audits (axe / Deque), GEO visibility (Athena, Profound), and attribution (Dreamdata, HockeyStack).

Pricing

Start with one store, one journey test, one clear failure.

Public pricing for teams running journey tests on their own storefronts. Larger retailers can start with a hands-on pilot while the product is still early.

Free

Free

Scan your store, run one sponsored journey test.

Pro

CHF 159 / mo

Daily journey tests, alerts, 12-month retention, PAYG add-ons.

Agency

Contact

Multi-workspace, multi-site, extra seats, distribution deals.

Pilot program

Want hands-on setup, weekly reviews, or custom reporting?

A small pilot is available for teams that want founder support while the product is still early — setup help, live replay walkthroughs, and tighter feedback loops than self-serve.

Talk to us about a pilot

Full tier details and feature comparison → /pricing

Stop losing AI-assisted shoppers because your storefront isn't agent-readable.

Run a real journey test on your store. See the failure. Ship the fix in the same meeting.

Start by running a journey test on your live storefront. Book the founder walkthrough when you want a multi-task pilot, a shared cockpit for the team, and the first fix list out of the same meeting.

What you leave with

A real agent trace on your store, the first structural blockers ranked by revenue at risk, and a clear answer on whether to wire Serge into your stack now.

Founder walkthrough

01

Run a live journey test on your site.

02

Inspect the blockers — replay, reasoning, fix.

03

Leave with the first fix list and the install path.

FAQ

Common questions