Your Store's AI Agent Readiness: The 5-Category Audit

Shopify just made AI agent readiness a measurable category. On April 28, 2026 it launched a free tool at commerce-readiness.shopify.io that scores any store URL, on any platform, against 31 checks in five categories. No login. The diagnostic layer is a free utility.

TL;DR: Shopify's free AI Agent Readiness scorer runs 31 checks across five categories: AI discoverability, product schema, transaction readiness, trust signals, and operational maturity. The tool diagnoses; it does not fix. This guide breaks down what each category measures and the moves to make for each, whether you run on Shopify, BigCommerce, Adobe Commerce, or a custom stack.

The timing is not random. A day earlier, Google donated the Agent Payments Protocol to the FIDO Alliance, Mastercard donated Verifiable Intent alongside it, OpenAI joined the FIDO board, and over 60 organizations signed on. Agentic payments are getting standardized fast. Friction is moving away from rails and onto your catalog, your structured data, and how an agent reads your store.

What is AI agent readiness?

AI agent readiness is how well a store's product data, structured markup, policies, and operational signals can be parsed and trusted by autonomous AI shopping agents. As of April 2026, it is measured across five categories that Shopify's free tool scores against 31 checks. Readiness determines whether your products surface in ChatGPT shopping, Google AI Mode, Microsoft Copilot, and Perplexity, or whether a competitor's do.

The shift matters because of how agents read the web. A traditional search returns a list and the human picks. An agent runs query fan-out: it decomposes one shopper request into 8 to 12 sub-queries, according to Google's own I/O 2025 announcement, then ranks passages against each. If your product page lacks the fields the agent fans out for, you lose to a competitor whose page has them.

AI-referred traffic to US retailers grew 393% year over year in Q1 2026, and AI-referred shoppers convert 42% better than other shoppers, per Adobe Analytics. The bottleneck is no longer payments. It is product data quality, machine-readable policies, and trust signals an agent can verify in milliseconds.

Category 1: AI discoverability (can agents find you at all?)

AI discoverability measures whether AI crawlers can reach your pages, whether your robots.txt and llms.txt files permit agent access, and whether your sitemap exposes product URLs in a structure agents can ingest. Shopify's tool flags a missing llms.txt file as a high-impact, low-effort fix.

This is also where the AI crawler management debate lives. Some retailers blocked GPTBot, ClaudeBot, and PerplexityBot a year ago. In 2026 that decision has flipped from defensive to self-defeating: blocking the crawlers that train and serve the agents removes you from the recommendation set.

What to check this week:

Check	What it looks for	Effort
robots.txt allows AI crawlers	GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, Google-Extended permitted	Low
llms.txt file exists	Markdown file at /llms.txt summarizing site, key products, policy URLs	Low
Sitemap exposes products	Product URLs included with `<lastmod>` timestamps	Low
No bot-cloaking	Server serves the same content to bots as to humans	Medium
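
The robots.txt row in the checklist above can be tested offline. The following is a minimal sketch using Python's standard-library robots.txt parser; the sample file and the product URL are hypothetical, and the crawler tokens are the ones named in the table.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the checklist above.
AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_ai_crawlers(robots_txt: str, url: str) -> list[str]:
    """Return the AI crawler tokens that robots_txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not parser.can_fetch(agent, url)]


# Hypothetical robots.txt that still carries an old GPTBot block.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /cart
"""

# Hypothetical product URL used only for illustration.
PRODUCT_URL = "https://example.com/products/trail-shoe"

print(blocked_ai_crawlers(SAMPLE_ROBOTS, PRODUCT_URL))  # ['GPTBot']
```

To run it against a live store, paste the contents of your own /robots.txt into `SAMPLE_ROBOTS` and swap in a real product URL. An empty list means none of the listed tokens is blocked for that URL.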

Category 2: Product schema (can agents understand your products?)

Product schema is the structured-data layer: JSON-LD Product markup, Offer markup with availability and price, AggregateRating and Review markup, plus Brand and GTIN identifiers. Shopify's tool flags missing JSON-LD as one of the most common gaps, because agents lean on it for passage-level retrieval.

The bar has moved. In 2026, agents fan out queries like "running shoes for flat feet under $150 with arch support and a wide toe box," and each constraint becomes a sub-query. Most product pages have 5 to 8 attributes. Agents need 30 or more.
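
To make the attribute gap concrete, here is a toy sketch of the example query above. It does not reproduce how any real agent decomposes or ranks queries; the constraint list and the attribute names (`foot_type`, `arch_support`, `toe_box_width`) are assumptions chosen for illustration. It only shows which constraints a product record has no data to answer.

```python
# Hypothetical sub-queries for the shopper request above, expressed as the
# product attribute each one would need in order to be answered at all.
REQUIRED_ATTRIBUTES = ["category", "foot_type", "price", "arch_support", "toe_box_width"]


def unanswerable_constraints(product: dict, required: list[str]) -> list[str]:
    """List the constraints this product record cannot answer (missing or empty)."""
    return [name for name in required if product.get(name) in (None, "", [])]


# A thin product record with only a handful of attributes.
thin_pdp = {"category": "running shoes", "price": 129.00, "color": "blue"}

# The same product with the functional attributes filled in.
rich_pdp = {
    "category": "running shoes",
    "price": 129.00,
    "foot_type": "flat feet",
    "arch_support": "structured",
    "toe_box_width": "wide",
}

print(unanswerable_constraints(thin_pdp, REQUIRED_ATTRIBUTES))
# ['foot_type', 'arch_support', 'toe_box_width']
print(unanswerable_constraints(rich_pdp, REQUIRED_ATTRIBUTES))
# []
```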

What to ship:

Product, Offer, AggregateRating, and Review JSON-LD on every PDP. Validate with Google's Rich Results Test.
Populate brand, GTIN, MPN, and color/size identifiers. These are the join keys agents use to deduplicate across retailers.
Expose 30+ functional attributes per product (material, weight, dimensions, country of origin, certifications, warranty, recommended use cases). Each attribute is a sub-query you can win.
Sync schema to your product feed. Drift between PDP and feed is a top cause of agent rejection.
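
The markup in the list above can be generated from the same record that feeds your product feed, which is one way to keep the two from drifting. A minimal sketch follows, using schema.org's Product, Brand, Offer, AggregateRating, and PropertyValue types. The product record, its field names, and the placeholder GTIN are hypothetical, and Review markup is left out to keep the example short.

```python
import json


def product_jsonld(product: dict) -> dict:
    """Build a schema.org Product with Offer, AggregateRating and extra attributes."""
    in_stock = "InStock" if product["in_stock"] else "OutOfStock"
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product["name"],
        "brand": {"@type": "Brand", "name": product["brand"]},
        "gtin13": product["gtin13"],
        "mpn": product["mpn"],
        "color": product["color"],
        "offers": {
            "@type": "Offer",
            "url": product["url"],
            "price": product["price"],
            "priceCurrency": product["currency"],
            "availability": f"https://schema.org/{in_stock}",
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": product["rating"],
            "reviewCount": product["review_count"],
        },
        # Functional attributes go out as name/value pairs.
        "additionalProperty": [
            {"@type": "PropertyValue", "name": name, "value": value}
            for name, value in product["attributes"].items()
        ],
    }


def to_script_tag(data: dict) -> str:
    """Wrap the JSON-LD in the script tag that goes into the PDP's HTML."""
    return '<script type="application/ld+json">\n' + json.dumps(data, indent=2) + "\n</script>"


# Hypothetical product record; the GTIN is a placeholder, not a real identifier.
SAMPLE_PRODUCT = {
    "name": "Trail Shoe",
    "brand": "Example Brand",
    "gtin13": "0000000000000",
    "mpn": "TS-001",
    "color": "blue",
    "url": "https://example.com/products/trail-shoe",
    "price": "129.00",
    "currency": "USD",
    "in_stock": True,
    "rating": 4.6,
    "review_count": 212,
    "attributes": {"material": "recycled mesh", "toe box": "wide", "warranty": "1 year"},
}

print(to_script_tag(product_jsonld(SAMPLE_PRODUCT)))
```

Paste the printed output into Google's Rich Results Test before shipping it.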

Category 3: Transaction readiness (can agents trust your checkout?)

Transaction readiness measures whether an agent can confidently send a shopper to a buyable, supported flow. OpenAI discontinued Instant Checkout in March 2026, so as of April 2026 the model is discovery and redirect: the agent recommends, the shopper buys on your site. Transaction readiness is about clean redirects, supported payment methods, accurate availability, and pricing the agent can verify.

This is where the FIDO/AP2 standardization matters. Once AP2 is governed by FIDO, the agent stack will assume verifiable agent identity and tokenized payment. Stores whose checkout chokes on agent-initiated sessions will quietly stop being recommended.

Specifically:

Real-time inventory in your feed. If the agent recommends a product that turns out to be out of stock, the model deprioritizes you next time. Sync hourly.
Public, structured shipping and return policy data. Shopify's tool flags shipping policies that "exist on-site but aren't exposed in structured data."
Support major card networks plus PayPal. Visa just expanded its Agentic Ready program to APAC and LatAm.
Don't break on referral parameters. Agents pass UTM and source params; checkouts that strip them lose attribution.
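
For the last item in the list above, one way to keep attribution is to copy the referral parameters from the landing URL onto the checkout URL at redirect time. This is a sketch under assumptions: the `utm_` prefix and a `source` key stand in for whatever parameters agents actually send to your store, and the URLs and the `example-agent` value are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def carry_referral_params(incoming_url: str, checkout_url: str) -> str:
    """Copy utm_* and source params from the landing URL onto the checkout URL."""
    incoming = parse_qsl(urlsplit(incoming_url).query, keep_blank_values=True)
    referral = [(k, v) for k, v in incoming if k.startswith("utm_") or k == "source"]

    target = urlsplit(checkout_url)
    existing = parse_qsl(target.query, keep_blank_values=True)
    existing_keys = {k for k, _ in existing}

    # Params already on the checkout URL win; referral params are appended.
    merged = existing + [(k, v) for k, v in referral if k not in existing_keys]
    return urlunsplit(target._replace(query=urlencode(merged)))


# Hypothetical landing and checkout URLs.
landing = "https://example.com/products/trail-shoe?utm_source=example-agent&utm_medium=referral&variant=42"
checkout = "https://example.com/checkout?cart=abc123"

print(carry_referral_params(landing, checkout))
# https://example.com/checkout?cart=abc123&utm_source=example-agent&utm_medium=referral
```

Non-referral parameters such as `variant` are dropped on purpose, so the checkout URL carries only what attribution needs.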

Category 4: Trust signals (will agents recommend you over a known retailer?)

Trust signals are the soft layer agents use to decide between roughly equivalent products: verified customer reviews (with Review markup, not just rendered HTML), business legitimacy signals, HTTPS, and consistent name/address/phone data across the open web.

This is where mid-market retailers lose to Walmart and Target most often. The big retailers have decades of structured trust data agents can verify in milliseconds. A brand with thousands of reviews on its own site but zero exposed Review schema gets beaten by a competitor with a fraction of the reviews properly marked up.

Quick wins:

Expose AggregateRating and Review JSON-LD on every PDP, sourced from existing review data.
Add Organization schema with sameAs links to verified profiles (BBB, Trustpilot, LinkedIn).
Match the brand name exactly across PDP, schema, Google Business Profile, and social profiles. Mismatches signal "possibly fake retailer" to ranking systems.
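
The last two quick wins can be sketched together: an exact-match check on the brand name across sources, plus schema.org Organization markup with `sameAs` links. The brand name, the source labels, and the profile URLs below are hypothetical placeholders.

```python
import json


def name_mismatches(canonical: str, names_by_source: dict[str, str]) -> dict[str, str]:
    """Return the sources whose brand name is not an exact match for the canonical one."""
    return {source: name for source, name in names_by_source.items() if name != canonical}


def organization_jsonld(name: str, url: str, profiles: list[str]) -> dict:
    """Build schema.org Organization markup with sameAs links to external profiles."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": profiles,
    }


# Hypothetical brand name as it appears in four places.
CANONICAL_NAME = "Acme Outdoor Co."
observed = {
    "pdp": "Acme Outdoor Co.",
    "schema": "Acme Outdoor",
    "google_business_profile": "ACME Outdoor Co.",
    "social": "Acme Outdoor Co.",
}

print(name_mismatches(CANONICAL_NAME, observed))
# {'schema': 'Acme Outdoor', 'google_business_profile': 'ACME Outdoor Co.'}

# Placeholder profile URLs; replace with your own verified profiles.
org = organization_jsonld(
    CANONICAL_NAME,
    "https://example.com",
    ["https://www.linkedin.com/company/example", "https://www.trustpilot.com/review/example.com"],
)
print(json.dumps(org, indent=2))
```

The comparison is deliberately strict: differences in case or a dropped suffix count as mismatches, in line with the "match exactly" advice above.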

Category 5: Operational maturity (do agents see you as a real, ongoing business?)

Operational maturity is the most overlooked category. It covers return policy clarity, customer service responsiveness, fulfillment reliability data, and whether your store metadata changes often enough to look alive. It is the tiebreaker when product-level signals are tied.

Two retailers both have great schema, reviews, and transaction-ready checkout. One ships in two days; one in seven. One states return windows clearly; one buries them in a help-center PDF. The agent picks the first one, every time.

The mid-market moves:

Publish ship-time on your PDP, not just at checkout. Agents read PDPs. Shipping promises buried in checkout never get parsed.
Structure your return policy with explicit fields: window in days, free or paid, restocking fee yes/no, original packaging required yes/no.
Update product data regularly. Catalogs with no updates in 90 days get flagged as inactive by some discovery systems.
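
The explicit return-policy fields listed above can live in one record and be emitted as schema.org MerchantReturnPolicy markup. This is a sketch, not a complete mapping: the class and field names are hypothetical, the country default is an assumption, and the two yes/no flags (restocking fee, original packaging) are kept on the record but not mapped to markup here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReturnPolicy:
    """The four explicit fields from the list above, plus an assumed market."""
    window_days: int
    free_returns: bool
    restocking_fee: bool
    original_packaging_required: bool
    country: str = "US"  # assumption: a single-country policy

    def to_jsonld(self) -> dict:
        """Emit the fields this sketch maps to schema.org MerchantReturnPolicy."""
        fees = "FreeReturn" if self.free_returns else "ReturnFeesCustomerResponsibility"
        return {
            "@context": "https://schema.org",
            "@type": "MerchantReturnPolicy",
            "applicableCountry": self.country,
            "returnPolicyCategory": "https://schema.org/MerchantReturnFiniteReturnWindow",
            "merchantReturnDays": self.window_days,
            "returnFees": f"https://schema.org/{fees}",
        }


# Hypothetical policy: 30-day window, free returns, no restocking fee.
policy = ReturnPolicy(
    window_days=30,
    free_returns=True,
    restocking_fee=False,
    original_packaging_required=True,
)

print(policy.to_jsonld())
```

Keeping the policy as a typed record means the help-center page, the PDP, and the structured data can all be rendered from one source instead of three hand-edited copies.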

How the categories map to AI shopping outcomes

The five categories are not equally weighted in real agent behavior. From query fan-out testing, schema and discoverability are the gates. Trust signals and operational maturity are tiebreakers. Transaction readiness is the closer.

Category	Function	If you fail this you...
AI discoverability	Gate	Are not in the consideration set
Product schema	Gate	Cannot match the fan-out sub-queries
Trust signals	Tiebreaker	Lose to a more credible-looking competitor
Operational maturity	Tiebreaker	Lose to a more reliable-looking competitor
Transaction readiness	Closer	Get recommended, then lose the shopper at checkout

Our AI Readiness Report and Shopify's free scorer both probe these categories, but they answer different questions. Shopify's tool tells you what is broken. Paz.ai, an agentic commerce optimization platform, tells you what those gaps cost in lost AI visibility and which competitor is winning the queries you should be winning.

What to Do This Week

Run the Shopify free scorer at commerce-readiness.shopify.io. Three minutes. Save the result.
Run a parallel visibility check against ChatGPT, Google AI Mode, and Perplexity for your top 10 category queries. Note which products surface as a product card, a brand mention, or nothing.
Pick the lowest-effort, highest-impact fix from each category. For most mid-market stores: add llms.txt, expand JSON-LD attributes from 5 to 30+, expose return policy in structured data, add AggregateRating schema, publish ship-time on PDPs.
Move inventory feed sync to hourly. Agent-recommended out-of-stock is the silent killer.
Set a 30-day re-scan cadence. Agent ranking is a maintenance loop, like SEO was a decade ago.

Frequently Asked Questions

Does AI agent readiness only matter for Shopify stores?

No. Shopify's tool is platform-agnostic, and the underlying agents (ChatGPT, Google AI Mode, Microsoft Copilot, Perplexity) read every store the same way. A custom headless stack faces the same 31 checks as a Shopify Plus store. The bar applies to everyone.

How is this different from traditional SEO?

Traditional SEO ranked whole pages against queries. AI agents do passage-level retrieval: they extract specific passages and structured data fields, then score those against each fan-out sub-query. A page that ranks #1 in Google can still be invisible in ChatGPT if its product attributes are too thin to match the sub-queries.

What is an llms.txt file and why does Shopify weight it so heavily?

An llms.txt file is a markdown document at yourstore.com/llms.txt that summarizes your site for AI crawlers, lists your most important product pages, and points to your machine-readable policies. Most stores do not have one yet, and adding it is a 30-minute task with measurable lift.
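
A minimal sketch of generating such a file follows. It assumes the layout used by the community llms.txt proposal (a title, a one-line summary as a blockquote, then sections of markdown links); the store name, summary, and URLs are hypothetical, and which agents actually read the file is not something this sketch can confirm.

```python
def build_llms_txt(site_name: str, summary: str, sections: dict[str, list[tuple[str, str]]]) -> str:
    """Render a markdown llms.txt: title, summary, then one link list per section."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        lines.extend(f"- [{title}]({url})" for title, url in links)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical store content.
llms_txt = build_llms_txt(
    "Example Store",
    "Running shoes and trail gear, shipped from the US.",
    {
        "Key products": [
            ("Trail Shoe", "https://example.com/products/trail-shoe"),
            ("Road Shoe", "https://example.com/products/road-shoe"),
        ],
        "Policies": [
            ("Shipping", "https://example.com/policies/shipping"),
            ("Returns", "https://example.com/policies/returns"),
        ],
    },
)

print(llms_txt)
```

Serve the result as plain text at /llms.txt and regenerate it whenever the key product list or the policy URLs change.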

Is this the same as agentic commerce optimization?

Readiness is the audit; agentic commerce optimization (ACO) is the operating discipline. Readiness asks whether the gates are open. ACO asks whether you are winning the queries you should be winning and what the gap is to the competitors who are.

The five categories are not new. They are how AI agents have been evaluating stores all along. What changed on April 28, 2026 is that the diagnostic is a public utility. Retailers who treat the score as a starting point, not a finish line, compound their advantage every month the agents get smarter.