A research blog from the builder’s desk
Preface: this gets exponentially nerdy below. Bring coffee.
How I ended up here
This started in my headphones. I was listening to a deep dive on Google’s history and the economics of search. The reminder landed hard: search is the best software business ever built because a single query fans out to millions of advertisers competing for intent. The auction converts that competition into money without wrecking usefulness.
Now the interface is changing. People ask natural questions and expect a composed answer or a next step, not a page of links. That isn’t just a UI swing. Under the hood, modern systems assemble answers from a small working set of sources and tools. If something never enters that set, it cannot shape the answer or the follow-up. That observation is what kicked off this post.
I wanted to write down a clean, technical way to finance answer-first search that keeps the answer trustworthy. The idea is simple: sell bounded admission probability into the model’s working set and sell one clearly labeled follow-up. Nothing touches the model’s voice.
Why AI search matters right now
User behavior shifted. When questions are narrow (“is this adapter compatible”, “what size fits under this airline seat”), people prefer an answer that cites sources and lets them finish the task. They stop scanning ten tabs.
Publishers noticed. If the answer is composed from your content but doesn’t send traffic, the old bargain breaks. Source attribution and which sources are allowed into the working set suddenly determine who gets paid.
Advertisers noticed. Keyword targeting is less visible in an answer box. The predictable lever is no longer “bid on this exact phrase” but “be allowed into the set the model uses” and “be the one follow-up that launches the tool.”
New players smell opportunity. Answer engines and AI modes are shipping fast. Incumbents are threading a needle: keep answers useful, show citations, let ads appear without quietly editing the response. Startups are experimenting with sponsored follow-ups that feel like product steps, not banners.
All roads lead to the same bottleneck: what gets into the grounding set and what the user is asked to do next.
A quick mental model
- A user issues a query with context (history, locale, device).
- The system predicts an intent distribution over tasks such as learn, compare, buy, navigate, support.
- A retriever pulls a candidate pool C.
- A selector builds a grounding set G ⊆ C.
- The model composes an answer conditioned on G and proposes a few follow-ups F.

The product goal is to reduce uncertainty and move the task forward: choose G and F to maximize

Σ_{d ∈ G} û(d | q)

subject to latency and trust. û(d | q) is a learned proxy for user value.
Two places are fair to charge for value:
- Admission to G - a bounded nudge that lets high-quality items get considered.
- One sponsored follow-up - a labeled, optional steer that launches a useful action.
We’ll also allow a single sponsored citation block in the answer, clearly marked and only alongside organic citations. That’s visibility, not authorship.
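To make the loop concrete, here is a minimal Python skeleton of a single answer turn. Every function is a stub with a toy body; the names (predict_intent, retrieve, select_grounding, and so on) are mine for illustration, not any product's API:

```python
def predict_intent(query):
    # Stub: a real system predicts a distribution over tasks.
    return {"compare": 0.6, "buy": 0.3, "learn": 0.1}

def retrieve(query):
    # Stub: a real retriever returns a candidate pool C.
    return ["doc_a", "doc_b", "doc_c"]

def select_grounding(candidates, intent, K=10):
    # Stub: the selector builds the grounding set G (organic + capped sponsored).
    return candidates[:K]

def compose(query, grounding):
    # Stub: the model writes the answer conditioned on G, with citations.
    return f"answer to {query!r}, grounded in {grounding}"

def propose_followups(query, intent):
    # Stub: a few candidate next steps F; at most one may be sponsored.
    return ["open the sizing tool", "compare prices"]

def answer_turn(query):
    intent = predict_intent(query)
    candidates = retrieve(query)
    grounding = select_grounding(candidates, intent)
    return compose(query, grounding), propose_followups(query, intent)
```

Everything that follows is about two of these steps: who gets into G, and which follow-up gets the sponsored slot.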
Sponsored retrieval: selling inclusion without selling out
Goal: let strong items buy a bounded boost in recall probability into G. Money should never be able to force a weak or off-intent item into the answer.
Define:
- s(q, d) - base similarity between query q and item d.
- û(d | q) - predicted contribution to answer utility for d.
- K - number of organic candidates considered.
- τ - quality floor.
- ρ - cap on the fraction of sponsored items allowed into G.
- î - predicted intent.
- h(î, d) - intent-match score.
- b_d - bid.
- g(b) = log(1 + b) - concave boost so money saturates.
- θ_î, β_î - per-intent gate parameters.
Eligibility and cap
An item d may enter the sponsored pool C_s only if

s(q, d) + β_î · g(b_d) · h(î, d) ≥ θ_î and û(d | q) ≥ τ (1)

with a strict cap |C_s ∩ G| ≤ ρK. In parallel, the organic pool C_o contains the top-K items by s(q, d) that also pass û(d | q) ≥ τ.
How sponsored items compete
Inside C_s, rank by a quality-adjusted score

score(d) = b_d · û(d | q)^γ, 0 < γ ≤ 1 (2)

and merge the top of C_s into G under the cap ρK. No item, paid or organic, enters G if it fails the floor τ.
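The gate, the score, and the capped merge fit in a few lines. This is a sketch: it assumes g(b) = log(1 + b) from the parameter list later in the post and represents items as plain dicts; the mechanics follow equations (1) and (2):

```python
import math

def eligible(s, b, h, u, theta, beta, tau):
    """Equation (1): money can nudge an already-relevant item over the
    per-intent gate theta, but nothing enters below the quality floor tau."""
    return s + beta * math.log(1 + b) * h >= theta and u >= tau

def sponsored_rank(pool, gamma=0.5):
    """Equation (2): score = bid * u^gamma; gamma < 1 squashes noisy u."""
    return sorted(pool, key=lambda d: d["bid"] * d["u"] ** gamma, reverse=True)

def merge(organic, sponsored, K, rho):
    """Merge the top sponsored items into G under the strict cap rho*K."""
    cap = int(rho * K)
    seen = {d["id"] for d in organic}
    paid = [d for d in sponsored if d["id"] not in seen][:cap]
    return organic[:K] + paid
```

Note that the bid never appears in `merge`: money affects ordering inside the sponsored pool, not the cap or the floor.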
Plain-English read of the math
- Equation (1) says money can nudge you across a bar if you are already relevant and useful.
- Equation (2) says quality multiplies your bid, and the “squash” exponent γ keeps rankings stable when the quality model is a bit noisy.
What triggers a charge
For retrieval, the right unit is CPC-ground: charge only when d is in G and the composer actually uses it (detected by citation or span attribution). You pay for impact, not exposure.
How the price is set
Use generalized second price adapted to the quality-weighted order. If advertiser j sits above j+1, the per-event price is

p_j = b_{j+1} · (û_{j+1} / û_j)^γ (3)

Add a reserve to protect utility when a sponsored item d_j displaces the K-th organic candidate d_K:

r_j = λ · max(0, û(d_K | q) − û(d_j | q)) / π_j (4)

where π_j estimates the chance d_j genuinely influences the answer. The final price is max(p_j, r_j). If d_j is as useful as d_K, the reserve adds nothing.
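Equations (3) and (4) as code. One interpretation choice to flag: since the charge fires per use-event, this sketch divides the displaced-utility gap by π so the reserve recovers that gap in expectation across use events; λ stays a policy knob:

```python
def gsp_price(b_next, u_next, u_win, gamma=0.5):
    """Equation (3): the winner pays the smallest amount that would still
    beat the next quality-adjusted score: p = b_next * (u_next / u_win)^gamma."""
    return b_next * (u_next / u_win) ** gamma

def reserve(u_organic_kth, u_win, pi, lam=1.0):
    """Equation (4): price the utility gap vs. the displaced K-th organic
    item, scaled by 1/pi (pi = chance the item influences the answer) so the
    per-use charge covers the gap in expectation. Zero if no gap."""
    return lam * max(0.0, u_organic_kth - u_win) / pi

def per_event_price(b_next, u_next, u_win, u_organic_kth, pi,
                    gamma=0.5, lam=1.0):
    """Final per-event price: max of the GSP price and the reserve."""
    return max(gsp_price(b_next, u_next, u_win, gamma),
               reserve(u_organic_kth, u_win, pi, lam))
```

A sponsored item that is at least as useful as the organic item it displaces pays pure GSP; the reserve only bites when admission costs the user something.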
One sponsored follow-up: selling a steer, not speech
The system proposes a few follow-ups F. We allow one sponsored option, picked by

f* = argmax_f b_f · ê(f | q)^γ (5)

where ê(f | q) is predicted engagement, and priced via the same GSP logic as (3) with ê in place of û, charged on CPC-engage. If the follow-up launches a tool with a measurable outcome, support CPA-tool. The answer remains editorial; the steer is transparent and optional.
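The one-sponsored-follow-up auction is the same machinery, one line shorter. A sketch assuming each candidate carries a bid and a predicted-engagement score ê:

```python
def pick_sponsored_followup(candidates, gamma=0.5):
    """Equation (5): pick argmax of bid * e^gamma among eligible follow-ups,
    priced GSP-style against the runner-up. Needs at least two candidates."""
    ranked = sorted(candidates, key=lambda f: f["bid"] * f["e"] ** gamma,
                    reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    price = runner_up["bid"] * (runner_up["e"] / winner["e"]) ** gamma
    return winner["id"], round(price, 2)
```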
Flowcharts
Figure 1 - Query to answer with the two places we charge
flowchart LR
Q[Query & context] --> I{Predict intent}
I --> R[Retrieve candidates]
R --> Co["Organic pool (top-K, û ≥ τ)"]
ADB[Targets & bids] --> ELG[Eligibility gate]
I --> ELG
R --> ELG
ELG --> Cs[Sponsored pool]
Co --> M[Merge under cap ρ]
Cs --> M
M --> G[Grounding set G]
G --> LLM[Compose answer]
LLM --> ANS[Answer + citations]
I --> FGEN[Generate follow-ups]
ADB --> FGEN
FGEN --> SFF[Pick 1 sponsored follow-up]
ANS --> OUT[User view]
SFF --> OUT
Figure 2 - Sponsored retrieval: ranking and pricing
flowchart LR
subgraph Ads
A1[Doc d1, bid b1]; A2[Doc d2, bid b2]; A3[Doc d3, bid b3]
end
A1 --> U["Predict û(d|q)"]
A2 --> U
A3 --> U
U --> SCORE[score = b * û^γ]
SCORE --> RANK[Order within Cs]
RANK --> GSP[Price vs next score]
RANK --> RSV[Reserve from Δû vs organic Kth]
GSP --> PRICE[max(GSP, Reserve)]
RSV --> PRICE
PRICE --> PICK[Admit ≤ ρK if û ≥ τ]
Figure 3 - What fires a CPC-ground charge
flowchart LR
G[Grounding set] --> TRACE[Attribution: spans/citations]
TRACE --> USED{Used in answer?}
USED -->|Yes| BILL[CPC-ground]
USED -->|No| NOCHG[No charge]
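Figure 3 as a billing function. Detecting “used in the answer” is the hard part in production; this sketch stubs it as membership in the answer’s citation trace:

```python
def charge_event(doc_id, grounding, cited_ids, price):
    """CPC-ground: bill the auction-cleared price only if the doc entered G
    AND the composer actually used it (stub: appears in the citation trace)."""
    used = doc_id in grounding and doc_id in cited_ids
    return price if used else 0.0
```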
Worked examples
A. Enterprise comparison
Query: “compare enterprise LLM gateways with per-request policy and cost controls.” Parameters: K = 10, cap ρ = 0.2 so at most 2 sponsored items, γ = 0.5, floor τ = 0.50. Three vendors bid with û = {0.72, 0.66, 0.44} and b = {$4.00, $5.50, $8.00}.

Scores from (2): score₁ = 4.00 · 0.72^0.5 ≈ 3.39, score₂ = 5.50 · 0.66^0.5 ≈ 4.47, score₃ = 8.00 · 0.44^0.5 ≈ 5.31.

Eligibility by (1): suppose each clears the per-intent gate θ_î. All three pass the gate, but vendor 3 fails the floor (0.44 < 0.50) and drops out. Two eligible remain: vendor 2 ahead of vendor 1.

Price for the winner (vendor 2) via (3):

p₂ = 4.00 · (0.72 / 0.66)^0.5 ≈ $4.18

charged on CPC-ground only when the evidence is actually used. Vendor 1 pays against the next competitor or the reserve in (4).

What to notice: the quality floor vetoed the biggest bid. Money helps good candidates get considered; it cannot smuggle in weak ones.
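For the skeptical, example A’s arithmetic checks out in a few lines, using the bids above and illustrative utilities û = (0.72, 0.66, 0.44):

```python
bids = [4.00, 5.50, 8.00]
u = [0.72, 0.66, 0.44]            # illustrative predicted utilities
tau, gamma = 0.50, 0.5

# Floor first: vendor 3 (u = 0.44 < tau) is vetoed despite the biggest bid.
eligible = [i for i in range(3) if u[i] >= tau]
scores = {i: bids[i] * u[i] ** gamma for i in eligible}
order = sorted(eligible, key=scores.get, reverse=True)

winner, runner = order[0], order[1]
price = bids[runner] * (u[runner] / u[winner]) ** gamma
print(order, round(price, 2))     # → [1, 0] 4.18 (vendor 2 wins, pays ≈ $4.18)
```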
B. Local service steer
Intent: book a technician. Two follow-ups compete with bids b = {$1.10, $1.60} and predicted engagement ê = {0.30, 0.40}. Take γ = 0.5.

Scores from (5): 1.10 · 0.30^0.5 ≈ 0.60 versus 1.60 · 0.40^0.5 ≈ 1.01.

The second wins. GSP price:

p = 1.10 · (0.30 / 0.40)^0.5 ≈ $0.95

charged on CPC-engage when clicked.
C. Shopping with a sponsored citation
Query: “ultralight daypacks under 20L that fit under an airline seat.” The retriever proposes a candidate pool C. Two brands pass eligibility and the floor, so with a cap ρK of at least 2, both may be in G. The answer summarizes volumes, weights, and seat-fit rules, cites three organic sources and one clearly labeled sponsored source, and offers a single sponsored follow-up that opens a size-guide tool. Charges: CPC-ground for admitted brand docs when used, CPC-cite for the sponsored citation click, and CPA-tool for running the guide.
What’s happening in the market, mapped to this model
- Incumbents are folding composed answers into results and experimenting with ads inside those answers. The healthy version of that looks exactly like the math above: a tight cap on paid items entering , a visible sponsored block that co-exists with organic citations, and one optional next step.
- Answer engines are converging on sponsored follow-ups as the natural ad unit for chat. It’s easy to explain, measurable, and honest about what it is: a suggestion you can ignore.
- Publishers will push hard for clear attribution and revenue share when their content grounds a sponsored block. That’s reasonable; the mechanism above makes it natural.
- Advertisers will evolve toward two lines they actually care about: CPC-ground (pay when you were in the set that shaped the answer) and Cost-per-Steer (pay when your follow-up moved the task).
None of this requires exotic theory. It’s the classic sponsored-search playbook, moved from “slot on a page” to “slot in the working set.”
What each side gets
- Users get answers first, with citations, and one optional nudge. Money never edits the wording.
- Publishers get paid when their verified content grounds a sponsored block and are rewarded for useful, current material.
- Advertisers buy presence at the decision point, priced by impact. Investing in better content reduces their effective price via û in (3).
- Builders get a mechanism that is easy to explain, debuggable, and tunable via a few interpretable knobs.
Parameters worth starting with
K = 12 # organic candidates
ρ (rho) = 0.20 # cap: ≤ 20% of G may be sponsored
τ (tau) = 0.50 # minimum predicted utility
θ (theta) = per-intent gate
β (beta) = per-intent bid weight
γ (gamma) = 0.5 # squashing in ranking
g(bid) = log(1 + bid) # diminishing returns
h(intent,d) = [0..1] # intent match
Units = CPC-ground, CPC-engage, optional CPA-tool
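If you’d rather start from code, the same knobs as a small Python config object (a sketch; names and defaults mirror the list above):

```python
from dataclasses import dataclass, field
import math

@dataclass
class AuctionConfig:
    K: int = 12              # organic candidates
    rho: float = 0.20        # cap: at most 20% of G may be sponsored
    tau: float = 0.50        # quality floor on predicted utility
    gamma: float = 0.5       # squashing exponent in ranking
    theta: dict = field(default_factory=dict)  # per-intent gate
    beta: dict = field(default_factory=dict)   # per-intent bid weight

    def g(self, bid: float) -> float:
        """Concave boost: diminishing returns on money."""
        return math.log(1 + bid)

    @property
    def paid_cap(self) -> int:
        """Strict cap on sponsored slots in G (2 with the defaults)."""
        return int(self.rho * self.K)
```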
Closing
The old game sold position on a page. The new game sells bounded presence in the model’s working set and one transparent steer. Keep the floor high, keep the cap small, keep the labels obvious. If we hold that line, answer-first search can stay useful and still fund itself.
Thanks for reading. If you made it this far, you earned the right to dunk on my choice of γ.