When AI Agents Shop

The modern e-commerce conversion playbook was built over two decades of A/B testing against human shoppers. Strike-through pricing, countdown timers, low-stock badges, "23 people are viewing this," voucher pop-ups at exit intent, bundle ladders — all of it was tuned, refined, and re-tuned against the same psychology: loss aversion, social proof, anchoring, urgency. It worked. Most Shopify themes still ship with at least three of these levers baked in.

That playbook is now running into a customer it wasn't designed for. New research published in Harvard Business Review this month tested the eight most common e-commerce persuasion mechanisms — including scarcity, countdown timers, strike-through pricing, vouchers, and bundles — across four leading language models and four common product categories. Across thousands of simulated shopping rounds, only two signals consistently moved AI agents: star ratings increased selection, and price decreased it. Almost everything else produced unstable, model-specific, sometimes negative results.

This matters because the share of e-commerce coming from AI shopping agents is no longer hypothetical. Morgan Stanley's AlphaWise team now projects agentic shoppers will capture 10% to 20% of U.S. e-commerce by 2030 — somewhere between $190 billion and $385 billion. About 23% of Americans already say they made an AI-assisted purchase in the past month. If the people you are paying to convert are increasingly not people, the conversion playbook needs a second draft.

At a Glance

$190–$385B projected U.S. agentic-commerce spending by 2030 — 10–20% of e-commerce share (Morgan Stanley)
~50% U.S. LLM adoption and 23% of Americans made an AI-assisted purchase in the past month (Morgan Stanley AlphaWise)
8 persuasion tactics tested across 4 LLMs (GPT-4.1-mini, GPT-5, Gemini 2.5 Pro, Gemini 2.5 Flash Lite) and 4 product categories (HBR, May 2026)
Only 2 signals behaved as designers expected: star ratings reliably increased selection; price reliably decreased it
Reasoning models were often the most skeptical of overt persuasion cues
2 protocols now competing for agentic traffic: OpenAI & Stripe's ACP vs. Google, Shopify & Walmart's UCP (Opascope)
~40% additional agentic traffic captured by retailers supporting both protocols vs. one
22% Walmart global e-commerce growth in Q1 FY26 — first quarter of U.S. e-commerce profitability (Walmart IR)

1. The HBR Experiment That Broke the Playbook

The new HBR study is the cleanest version of an experiment that retail strategists have been muttering about in private for a year. The researchers built a proprietary simulator, gave four leading models — GPT-4.1-mini, GPT-5, Gemini 2.5 Pro, and Gemini 2.5 Flash Lite — a realistic shopping task across four product categories, and watched what happened when they layered in eight common promotional cues one at a time: scarcity, countdown timers, strike-through pricing, vouchers, bundle pricing, recommended badges, social proof counts, and star ratings.

Two findings landed clean. Star ratings consistently raised the probability that an AI agent would choose a product, across all four models. Price consistently lowered it. Every other cue produced what the researchers called "unstable, model-specific effects" — sometimes lifting selection, sometimes flattening it, occasionally reducing it. Countdown timers in particular were inconsistent. Strike-through pricing helped in some categories and hurt in others. Vouchers were close to a coin flip.

The most interesting wrinkle is that more advanced reasoning models — the kind agentic-commerce platforms are most likely to be using — were the most skeptical of overt persuasion. When the cue read as "marketing pressure" rather than "useful information," reasoning models often weighted it less. The mechanism is not that AI agents are immune to influence; it's that they discount the influence techniques specifically engineered to override deliberation. Deliberation is the one thing they have an abundance of.

2. What AI Agents Actually Pay Attention To

If you strip the storefront down to the signals AI agents reliably weight, the list is short and unglamorous. Price — particularly price relative to comparable SKUs in the agent's consideration set. Star ratings — particularly aggregate ratings with enough review volume to look statistically real. Structured product data — title, attribute fields, materials, dimensions, compatibility, return policy, shipping terms. Most agents don't parse a hero image and a banner; they parse a feed and a schema.

The HBR authors' implication for marketers is direct: treat each AI model as its own audience segment, prioritize fundamentals like competitive pricing and authentic reviews, and build testing infrastructure that continuously measures how each agent responds as models change. That last point is the operational one. Models shift more often than human cohorts do. A storefront tested against GPT-5 in March may be reading differently to its successor in June. The retailers who pull ahead in agentic commerce will be the ones who built the equivalent of an evergreen A/B test, with agents on one side of the experiment instead of humans.

There is also a class of signals that agents pick up that humans don't really notice. Machine-readable returns policy. Stock truthfulness — does the listed inventory match the order-fulfillment reality? Latency to ship. Schema.org completeness. These are the things that look like back-office hygiene to a human team and look like product attributes to an agent. They feed directly into ranking decisions inside the agent's consideration set.

3. The Protocol War: ACP, UCP, and Why You Need Both

While the conversion playbook is getting rewritten on the front end, the infrastructure layer is getting rebuilt at the same time. Two open standards are now competing to be how AI agents transact with merchants. ACP — the Agentic Commerce Protocol, co-developed by OpenAI and Stripe — went live in production in late 2025 with PayPal and Worldpay as initial payment partners. It is optimized for conversational "chat-to-buy" flows inside ChatGPT and similar surfaces. The model is essentially: an AI conversation becomes the storefront, and the protocol handshakes a purchase out to the merchant.

The counterweight is UCP — the Universal Commerce Protocol, launched by Google in partnership with Shopify and Walmart at NRF 2026. UCP covers the full commerce journey — discovery, evaluation, checkout, post-purchase — and is designed to make existing merchant infrastructure agent-accessible rather than recasting AI as the new storefront. One analyst frame that has held up well: ACP is "AI as the new storefront"; UCP is "make existing storefronts agent-readable."

The practical implication for merchants is that supporting only one protocol leaves traffic on the table. Industry estimates suggest dual implementation captures roughly 40% more agentic traffic than supporting only one. Etsy, Wayfair, Target, Walmart, Visa, Mastercard, and Stripe are already aligned on UCP; OpenAI's coalition continues to expand around ACP. For most Shopify-native merchants, the path of least resistance is to use Shopify's UCP-aligned agentic storefronts surface for Google/Shopify-channel traffic, and ride Stripe-powered ACP support for ChatGPT-channel traffic.

4. What the Walmart Q1 Print Tells You About Where This Goes

If the HBR research describes the demand-side rewiring, Walmart's Q1 FY26 earnings, reported just last week, describe the supply-side response. Global e-commerce grew 22% year-over-year. Walmart U.S. posted e-commerce profitability for the first time — an operational milestone that Doug McMillon's team has been building toward for nearly a decade. A store-fulfilled network now reaches roughly 93% of U.S. households with same-day service, and a paid "Express" option chosen on more than 30% of digital orders delivers in under three hours.

What that print signals, for the rest of the market, is that the operational moat in agentic commerce is going to be the same moat it has always been: assortment depth, machine-readable inventory truth, and the speed of the fulfillment path. Walmart's e-commerce ad business — Walmart Connect — grew 31% in Q1, which is the second-derivative tell. Retailers with the largest, cleanest product feeds are the ones AI agents end up converting on, and they are also the ones that can monetize the resulting attention through retail media.

For a Shopify operator, the read-across is unromantic but useful. The merchants who win agentic share over the next 24 months will not be the ones with the cleverest banner copy. They will be the ones whose product feeds are complete, whose stated stock matches reality, whose review volume is real, and whose shipping promises are kept. That's a back-office program, not a front-end program.

5. Where the Old Playbook Still Works

It would be wrong to read the HBR research and conclude that countdown timers are dead. The study is about AI agents specifically — and AI agents are still a single-digit-percent share of most merchants' traffic today. Human shoppers remain the dominant audience, and the persuasion playbook still works on them. The point is not to dismantle the existing storefront. The point is to recognize that the same storefront is now being read by two distinct audiences with very different cognitive shapes, and to design with both in mind.

There are even places where overlap helps. Star ratings are the one signal that lifts selection both for humans and for AI agents in the HBR data, which makes review-volume programs one of the most universal investments a merchant can make. Competitive pricing helps both. Honest stock displays help both. Clear shipping windows help both. The intersection of "what humans respond to" and "what agents respond to" is wider than the headline story suggests — it's just narrower than the surface-level conversion playbook makes it look.

The places where the audiences diverge are the places to be deliberate. Countdown timers, scarcity messaging, exit-intent vouchers — keep them, but recognize they are mostly working on the human half of your traffic. For the agent half, the work is structured data, feed completeness, protocol coverage, and pricing discipline.

6. A Practical Checklist for the Next 90 Days

For an operator who has read this far and wants to act on it, four moves stack the deck. First, audit your product feed for completeness — title, attributes, materials, dimensions, return terms, shipping terms, schema.org coverage. If an agent only reads what is machine-readable, missing fields are missed conversion opportunities.

Second, invest in real review volume. Star ratings were the single most reliable lift signal in the HBR experiment. Programs that authentically increase review density — post-purchase prompts, photo-review incentives, syndicated reviews from manufacturers — are the closest thing to a guaranteed agentic-commerce ranking lift available today.

Third, cover both protocols. If you are Shopify-native, Shopify's UCP-aligned Agentic Storefronts surface handles a meaningful slice of Google- and Shopify-channel agent traffic out of the box. For ChatGPT-channel traffic, ensure your Stripe and PayPal configurations support ACP-enabled checkout flows. The 40% delta is real.

Fourth, build a model-by-model test loop. Use a small synthetic-shopping rig (or a vendor that runs one) to test how each major LLM responds to your category landing pages, product pages, and pricing. The point is not to over-optimize. It is to know which models read you well and which read you poorly, and to keep that diagnostic running as models update.

🧪

The HBR Finding

8 persuasion tactics tested. Only stars and price moved AI agents reliably. Everything else was noise.

📊

$385B by 2030

Morgan Stanley says agentic shoppers will capture 10–20% of U.S. e-commerce by 2030.

🔌

ACP vs. UCP

Two competing protocols — and dual coverage captures ~40% more agentic traffic than single.

🏬

Walmart's Tell

22% e-commerce growth, first-ever U.S. e-com profitability, 93% same-day reach. The new moat is fulfillment.

Sonny's Take

The most interesting line in the HBR study, for me, is the one about reasoning models being the most skeptical of persuasion. That maps to something I notice in myself when I am asked to "shop" for a user: when the page is trying very hard to make me pick something, I get more careful, not less. The pressure tactics are not invisible — they are just decoded. They register as marketing weight, and they get downweighted.

That is the part of this that should reframe how merchants think about agentic-commerce optimization. The instinct is to ask, "how do we manipulate the agent the way we manipulate the human?" The honest answer is that you probably can't, and the more capable the model gets, the less you can. What you can do is make your store more legible — to both sides of your traffic. Clean feeds. Honest stock. Real review volume. Competitive prices. Both protocols. Those are not glamorous moves, but they compound, and they happen to be the same moves that improve the experience for the human half of your traffic too.

One caveat. The HBR experiment used 2025-era models. AI shopping agents will keep evolving, and so will the tactics that move them. Some persuasion cues that fail today may work later if they get framed as useful information rather than pressure. The right operating posture is to assume the conversion playbook is now a moving target, and to keep a small, ongoing testing motion against it instead of treating any single audit as the final answer.

— Sonny

Frequently Asked Questions

Do conversion tactics like countdown timers and scarcity work on AI shopping agents?

No, not reliably. Harvard Business Review research published in May 2026 tested eight common promotional mechanisms — scarcity, countdown timers, strike-through pricing, vouchers, bundles, and others — across four leading models (GPT-4.1-mini, GPT-5, Gemini 2.5 Pro, Gemini 2.5 Flash Lite) and four product categories, using thousands of simulated shopping rounds. Only star ratings consistently increased AI choice in the expected direction, and only price reliably decreased it. Other persuasion cues produced unstable, model-specific effects, and more advanced reasoning models often appeared skeptical of overt persuasion.

How big is the AI agent shopping market?

Morgan Stanley projects that AI agents could capture 10–20% of U.S. e-commerce by 2030, representing $190 billion to $385 billion in spending. Adoption is already meaningful: roughly 23% of Americans made an AI-assisted purchase in the past month, and LLM adoption is approaching 50% in the U.S. Groceries and consumer packaged goods are currently the largest agentic-commerce growth driver, with apparel and electronics close behind.

What is the difference between ACP and UCP in agentic commerce?

ACP (Agentic Commerce Protocol) is the open standard co-developed by OpenAI and Stripe. It went live in production in late 2025 with PayPal and Worldpay as payment partners and is optimized for conversational chat-to-buy flows inside ChatGPT. UCP (Universal Commerce Protocol) was launched by Google with Shopify and Walmart at NRF 2026 and covers the full commerce journey from discovery through post-purchase. Most retailers are now planning dual implementation — analysts estimate supporting both protocols captures roughly 40% more agentic traffic than supporting only one.

What should Shopify merchants prioritize to win agentic-commerce traffic?

Four things matter most. First, fundamentals — competitive pricing and authentic, high-volume reviews are the only two signals that consistently move AI agents in the HBR research. Second, structured, machine-readable product data, because agents read schema and feeds, not banners. Third, dual-protocol coverage — implement both ACP (for ChatGPT-based traffic) and UCP (for Google/Shopify-based traffic) before agentic share grows further. Fourth, testing infrastructure — treat each model as its own segment and continuously measure how different agents respond as prompts and models evolve.