Phone-in-hand demos: the $28K/day AI ad format
The ads spending $28,000 a day on this format aren’t using real people. Phone-in-hand. Tap, scroll, react. Three different "creators". A brunette at a restaurant, a fitness girl in her bedroom, an older guy in his living room. All rendered.
Phone-in-hand UGC that passes as organic content in a paid feed. 3-agent stack: flow agent scripts the phone demo sequence (what screen to show, when to tap, when to look up); face agent renders a different persona per demographic; screen agent swaps the phone content per product and keeps hand-to-screen interaction realistic. One format, dozens of demographic-matched variations per product. $28K/day in ad spend, no real creators.
Script the phone demo sequence beat by beat
What screen to show, when to tap, when to look up and react. Flow agent scripts the demo sequence beat-by-beat. The talking points sound casual enough to pass as organic, but every gesture is choreographed to land the app feature at the moment of peak attention.
Most operators ship demo ads with talking-head intros and tacked-on screen recordings. This format integrates the demo into the gesture sequence. The hand-tap-screen-react choreography IS the demo, not an interruption to it.
Apply this: Script the gestures, not just the dialogue. Every tap, every scroll, every look-up is part of the demo sequence.
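One way to picture a gesture script is as an ordered list of beats, each pairing a timestamp with a gesture, the screen on the phone, and an optional spoken line. The sketch below is a minimal illustration, not the flow agent's actual output format: the `Beat` fields, the gesture names, and the sample lines are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical gesture vocabulary, based on the taps, scrolls, and look-ups named above.
GESTURES = {"tap", "scroll", "look_up", "react"}


@dataclass(frozen=True)
class Beat:
    start_s: float  # when the beat begins, in seconds from the first frame
    gesture: str    # one of GESTURES
    screen: str     # which app screen is visible on the phone
    line: str = ""  # optional spoken line delivered during the beat


def validate(beats: list[Beat]) -> None:
    """Raise ValueError if beats are out of order or use an unknown gesture."""
    for prev, cur in zip(beats, beats[1:]):
        if cur.start_s <= prev.start_s:
            raise ValueError(f"beat at {cur.start_s}s is not after {prev.start_s}s")
    for beat in beats:
        if beat.gesture not in GESTURES:
            raise ValueError(f"unknown gesture: {beat.gesture}")


# Illustrative sequence: every gesture is scripted, not just the dialogue.
demo = [
    Beat(0.0, "look_up", "home", "Okay, I have to show you this."),
    Beat(1.5, "tap", "home"),
    Beat(3.0, "scroll", "feature"),
    Beat(5.0, "react", "feature", "Wait, that's actually useful."),
]
validate(demo)
```

Keeping the gesture and the screen in the same record is the point: the demo and the performance are one sequence, so a beat with no gesture stands out immediately.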
Match persona to demographic per product
Young woman for fashion. Fitness girl for health. Older male for wellness. Face agent renders a different persona per demographic, each in a setting that matches the product. Demographic-product fit drives the conversion math.
Most brands ship one demo to all demographics and let the algorithm sort. This stack ships demographic-matched demos and lets the algorithm reward the match. The CPM advantage shows up in week one.
Apply this: Render one demo per demographic, not one demo total. Demographic-product match is the conversion multiplier.
Swap phone content per product naturally
Screen agent swaps the phone content per product and generates the matching app UI. Hand-to-screen interaction realism stays constant. The touches land on real buttons, the scrolls match real screen layouts.
The interaction realism is what most operators miss. Render a fake screen and the touches land on nothing; the viewer’s brain catches the discontinuity in a fraction of a second. Render the real screen and the touches land on real buttons. The brain doesn’t flag it.
Apply this: Render the actual app screen, not a stylised mockup. The interaction realism is the trust primitive.
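"Touches land on real buttons" can be checked mechanically if the screen layout is known. The sketch below is a simple hit test under assumed data: the `Button` type, the pixel coordinates, and the 390 x 844 layout are hypothetical, and nothing here reflects how the screen agent itself works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Button:
    name: str
    x: float  # left edge, in screen pixels
    y: float  # top edge, in screen pixels
    w: float  # width
    h: float  # height

    def contains(self, px: float, py: float) -> bool:
        """True if the point (px, py) falls inside this button's bounds."""
        return self.x <= px <= self.x + self.w and self.y <= py <= self.y + self.h


def tap_target(buttons, px, py):
    """Return the name of the button under the tap, or None if it lands on nothing."""
    for button in buttons:
        if button.contains(px, py):
            return button.name
    return None


# Hypothetical layout for one app screen (390 x 844 px).
layout = [
    Button("add_to_cart", 20, 700, 350, 56),
    Button("back", 16, 60, 44, 44),
]

print(tap_target(layout, 200, 720))  # add_to_cart
print(tap_target(layout, 200, 400))  # None
```

A scripted tap that returns `None` is the discontinuity described above: the finger lands on empty screen.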
Pass as organic in a paid feed
It looks like someone’s friend casually recommending an app. That’s the point. The phone-in-hand format reads as user-discovery content even when it runs as paid acquisition. Viewers scroll past it less often because it reads as creator content, and ad delivery tends to reward that engagement.
Paid placements that read as paid placements get skipped. Paid placements that read as user content get watched. This format is engineered to read as user content from the first frame. The ad platform still knows it is a paid placement; what changes is how the viewer reads it and the engagement signal that follows.
Apply this: Ship paid ads in formats the algorithm reads as user content. The distribution advantage compounds with every dollar of ad spend.
Run dozens of variations per product
One format template, dozens of demographic-matched variations per product. The brand running this for one app gets one ad. The brand running it across 10 demographic-app combinations owns the ad inventory in 10 segments simultaneously.
Real-creator demo ads cost $2K each, which caps most brands at 3-5 variations per product. This stack ships 20-30 variations per product per week at cents per render, saturating every demographic the brand wants to target with no marginal contract cost.
Apply this: Map every product to every demographic. Render the matrix. The format scales because the contract layer collapses.
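"Render the matrix" is a cross product. The sketch below builds one render job per product-persona pair. The product names and the `job_id` format are placeholders invented for the example, and the personas are taken from the ones named earlier.

```python
from itertools import product

# Placeholder product names; personas follow the three described earlier.
products = ["fashion_app", "health_app", "wellness_app"]
personas = ["young_woman", "fitness_girl", "older_male"]


def render_matrix(apps, people):
    """Return one render job per (app, persona) combination."""
    return [
        {"product": app, "persona": persona, "job_id": f"{app}__{persona}"}
        for app, persona in product(apps, people)
    ]


jobs = render_matrix(products, personas)
print(len(jobs))  # 9
```

Three products across three personas gives nine jobs. Adding a persona adds one job per product, with no new contract attached to any of them.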
- "The ads spending $[X] a day on this format aren’t using real people"
- "Phone-in-hand. [Tap, scroll, react]. [N] different creators. All rendered"
- "It looks like someone’s friend casually recommending an app. That’s the point"
- "One format template producing dozens of demographic-matched variations per product"
- "The brands still hiring three creators for three demographics are paying for three schedules. The ones running this playbook are paying for one prompt"
What’s actually running underneath
- Flow agent (Claude): Scripts the phone demo sequence beat-by-beat. What screen to show, when to tap, when to look up and react. Talking points calibrated to pass as casual organic. Every gesture is choreographed.
- Face agent (Seedance 2.0): Renders a different persona per demographic. Young woman for fashion, fitness girl for health, older male for wellness. Each in a setting that matches the product. Demographic-product fit drives conversion.
- Screen agent: Swaps the phone content per product. Generates the matching app UI. Keeps hand-to-screen interaction realistic across every variation. Touches land on real buttons, scrolls match real layouts.
- Demographic matrix: One product × N demographics = N variations. The brand running this for one app gets one ad. The brand running it across 10 demographic-app combinations owns ad inventory in 10 segments at once.
$28K/day in ad spend running on three rendered AI creators showing three different apps. The same campaign at real-creator rates would cost $6-12K just in creator fees. Before ad spend.
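The fee comparison is simple arithmetic. The sketch below uses the $2K per creator quoted earlier; the $0.50 per render is purely an assumption standing in for "cents per render", since no exact figure is given.

```python
CREATOR_FEE_USD = 2_000  # per real-creator demo ad, the figure quoted earlier
RENDER_COST_USD = 0.50   # ASSUMPTION: stand-in for "cents per render"


def creator_cost(variations: int) -> int:
    """Creator fees for a given number of real-creator variations."""
    return variations * CREATOR_FEE_USD


def render_cost(variations: int) -> float:
    """Render cost for the same number of generated variations."""
    return variations * RENDER_COST_USD


for n in (3, 5, 20, 30):
    print(f"{n:>2} variations: creators ${creator_cost(n):,} vs renders ${render_cost(n):,.2f}")
```

Three creators at $2K each gives the $6K low end of the range above. The $12K high end would imply fees closer to $4K per creator, a figure inferred from the range rather than stated anywhere in the text.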
The brands still hiring three creators for three demographics are paying for three schedules, three contracts, and three rounds of feedback. The brands running this playbook are paying for one prompt. And shipping the demographic matrix in an afternoon.
Want this stack running for your brand?
Book a 15-min call. I’ll walk through how it adapts to your funnel.