The AI-First Product Org: What the Research Says and How to Think About Our Transition
Prepared for internal use  ·  March 2026  ·  Sources cited throughout
Most organizations are deploying AI tools without redesigning how work happens, and that's exactly why 95% of gen AI pilots produce zero P&L impact. The firms winning with AI aren't the ones moving fastest; they're the ones that redesigned their operating model before scaling their tooling.
Part I

What's Actually Happening at Leading Firms

The most-cited examples of "AI-first" product orgs share a pattern: they restructured around outcomes, not roles. Here's what that looks like in practice.

Duolingo: 10x throughput, but not without cost

In April 2025, Duolingo CEO Luis von Ahn announced a formal AI-first policy: headcount would only be approved if a team could prove it couldn't automate more of its work. The result was striking — their first 100 language courses took 12 years to ship; with AI, they shipped 148 new courses in roughly one year.1

The less-discussed part: significant public backlash, questions about content quality, and a workforce that felt the announcement before it had been properly contextualized. Speed without culture design is fragile.

What to take from it: AI can dramatically expand throughput on content and execution work. The announcement approach matters as much as the strategy.

Airtable: Fast and slow

CEO Howie Liu restructured Airtable's entire organization into two explicit tracks: "slow-thinking" teams for long-range architecture and positioning, and "fast-thinking" teams for iteration and daily shipping. He cited Kahneman directly. Liu is reportedly the company's largest internal user of AI inference credits; he modeled the behavior before mandating it.2

What to take from it: The fast/slow split is an honest acknowledgment that not all product work benefits from the same tempo. Start by identifying which work can accelerate, not by assuming all of it can.

Airbnb: AI in the interface, not in headcount reduction

Brian Chesky's January 2026 line is worth quoting directly: AI has not cut Airbnb's workforce "as much as I thought." Their strategy is building a highly personalized AI interface — "an ultimate concierge" — rather than automating their way to fewer people. Their most concrete signal: poaching Ahmad Al-Dahle, former head of Generative AI at Meta, as CTO.3

What to take from it: The firms with the most credible AI strategies are building AI into the product experience, not using it primarily as a cost-reduction mechanism.

Salesforce: New role taxonomy, not just tool adoption

Salesforce's 2024–2025 job postings reveal a structural shift in how they define product roles. Rather than hiring generalist PMs, they're hiring for distinct, specialized product disciplines.4

What to take from it: "PM" as a monolithic role description is already outdated at AI-native firms. It's worth asking how current role definitions hold up against this — particularly for roles that sit close to research and data workflows.

The counterexample: Microsoft

Microsoft cut ~15,000 jobs through 2025, with AI cited as a driver. Simultaneously, 80% of workers were bringing their own AI tools into the workplace, with bottom-up adoption far outpacing top-down rollout.5 The productivity gains and the layoffs happened in parallel, creating cultural damage that may prove costlier than the efficiency gains are worth.

What to take from it: Moving fast on both restructuring and tool adoption simultaneously is high-risk. A measured pace isn't timidity — it's reading the failure data correctly.

Part II

What the Research Actually Says

AI hits strategy harder than execution

Counterintuitively, AI most significantly impacts the highest-level PM work — not the administrative tail. Lenny Rachitsky's 2024 analysis identified the top AI-impacted PM activities as:6

  1. Developing product strategy and vision
  2. Setting goals and OKRs
  3. Creating product specifications
  4. Discovery and user research synthesis
  5. Building roadmaps

The skills that become more valuable with AI: stakeholder alignment, team leadership, product judgment, and communication. This is the important inversion. Most organizations deploy AI for ticket writing and release notes. The actual competitive leverage is in strategy synthesis and discovery — the work most PMs say they don't have enough time for.

AI is already competitive with human PMs on core tasks

In a July 2024 blind comparison study, Rachitsky had readers evaluate AI vs. human responses across three PM tasks without knowing which was which.7

Critically, even after learning that AI had produced the winning responses, respondents still preferred them. The researcher's note: "This represents the worst AI will ever be at these tasks."

The productivity numbers are real but unevenly distributed

McKinsey ran a controlled study of 40 product managers across four continents:8

  • 40% time savings on content-heavy PM tasks
  • 15% time savings on content-light tasks
  • improved self-reported satisfaction with work

Broader knowledge worker data from the St. Louis Fed (February 2025): AI users save approximately 5.4% of work hours weekly. Human-AI teams show 73% higher productivity for certain content tasks.9

Important counterweight: A February 2026 NBER study of 6,000 CEOs found the majority see little measurable operational impact from AI — re-surfacing the Solow Paradox. Individual productivity gains are real. Aggregate enterprise impact is not yet showing up in the data.10 This gap between individual gains and enterprise impact is worth planning for — it's normal, and it means measuring at the right level matters.

PM jobs are not going away — but they're polarizing

As of March 2026, there are 7,300+ open PM roles globally — the highest in over three years, up 75% from the 2023 low. AI PM roles are "absolutely exploding."11 But the distribution is shifting into a K-shape:12

Execution PO roles — the backlog manager, the sprint facilitator, the acceptance criteria writer — are the most structurally at risk. Not immediately, but directionally.

What endures in the PM role

Reforge, SVPG, and a March 2026 analysis of six years of practitioner content converge on the same set of AI-resistant PM competencies: the stakeholder alignment, team leadership, product judgment, and communication skills identified in Part II.13

Three new competencies now required:

  1. Evaluations (evals) — the ability to rigorously assess whether AI outputs are right. "The defining skill for AI PMs in 2025."
  2. Prototyping fluency — "If you're not prototyping, you're doing it wrong" (Aparna Chennapragada).
  3. Natural Language Experience (NLX) — designing AI conversations deliberately, the way UX designers approach visual interfaces. "NLX is the new UX."
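Of the three, evals are the most concrete to start practicing. A minimal sketch of what an eval harness can look like, in Python — here `run_model` is a hypothetical stand-in for a real model call, and the prompts and checks are illustrative, not drawn from any cited source:

```python
# Sketch of an eval harness: pair each prompt with a programmatic check
# on the model's output, then report the pass rate. `run_model` is a
# placeholder; a real implementation would call an LLM API here.

def run_model(prompt: str) -> str:
    canned = {
        "Summarize: revenue grew 12% YoY": "Revenue grew 12% year over year.",
        "List 3 risks of feature X": "1. Latency 2. Cost 3. Adoption",
    }
    return canned.get(prompt, "")

# Each eval is (prompt, check). Checks encode what "right" means for the task.
EVALS = [
    ("Summarize: revenue grew 12% YoY", lambda out: "12%" in out),
    ("List 3 risks of feature X", lambda out: out.count(".") >= 3),
]

def run_evals(evals):
    results = [(prompt, check(run_model(prompt))) for prompt, check in evals]
    passed = sum(ok for _, ok in results)
    return passed / len(results), results

score, details = run_evals(EVALS)
print(f"pass rate: {score:.0%}")
```

The point of the sketch is the discipline, not the code: a PM who can write checks like these can say, with evidence, whether an AI output is good enough to ship.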

Part III

The Three Failure Modes

Understanding what kills AI transitions matters more than knowing what success looks like — because the failure modes are more predictable.

Failure Mode 1: Pilots that never scale

MIT's 2025 NANDA study puts it starkly: 95% of gen AI pilots produced no business impact. Only 5% of organizations have successfully integrated AI into production at scale.14 The core problem: there is no repeatable path from proof-of-concept to standard operating procedure. HBR (March 2026) calls this the "proliferation of pilots" — isolated wins that never become how the org works.15

Worth considering: Any AI initiative — however promising in pilot — benefits from having an explicit owner of the path to standard practice. The question to ask early: who is responsible for making this how we work, not just a working demo?

Failure Mode 2: Optimizing one node without redesigning the system

McKinsey's March 2025 State of AI survey is clear: workflow redesign is the single biggest predictor of EBIT impact from AI. Only 21% of organizations have fundamentally redesigned at least one workflow.16 BCG's 2025 AI at Work survey (11 countries, 10,600 respondents) describes most companies as stuck in "Deploy" mode when the value is in "Reshape."17

HBR illustrates the trap with a car manufacturer that boosted software development productivity with AI but saw zero overall performance gain — the hardware production line became the bottleneck. You can't optimize one node without redesigning the system.18

Worth considering: Adding AI to existing workflows is a start, but the bigger question is which workflows themselves need to change. For us, that might mean the research synthesis process, the data review pipeline, or the PM-to-engineering handoff — nodes where redesign could compound the gains from tooling.

Failure Mode 3: AI amplifying weak judgment

Marty Cagan identifies the failure mode most specific to product orgs: "I'm especially excited by the combination of someone with very strong judgment and generative AI tools. But I'm also worried about the prospect of providing those same tools to people that do not have the necessary product foundation."19

AI doesn't make weak product thinking stronger. It makes it faster and more confident. An execution-focused PM who doesn't have strong product sense will use AI to produce confident, polished, wrong decisions more quickly.

Worth considering: Current org interviews aren't just org design input — they're also a baseline read on which PMs have the judgment foundation that makes AI acceleration safe. It may be worth sequencing expanded AI access based on that foundation, rather than rolling it out uniformly.

Part IV

How to Think About This in Our Context

The examples above — Duolingo, Airbnb, Salesforce — are useful reference points, but context shapes what "AI-first" should actually mean for a given organization. A few considerations specific to financial services and investment research are worth naming.

The trust bar is different. In financial services, our competitive position is built on methodological credibility — ratings, research quality, data integrity. An AI error in a consumer product strategy doc is embarrassing. An AI error in investment-relevant research influences decisions with real financial consequences. The governance rigor required is higher here, and that's a feature, not a bug: AI credibility, once established in this context, tends to be durable in a way that consumer-app credibility is not.

The analyst may be the product. At most tech companies, PMs build for users. In investment research, the analysts' judgment and methodology are what clients pay for. AI augmenting that judgment is a different proposition than AI replacing a task a PM was doing — it's a change to the core product itself. That deserves a different level of deliberateness about what gets automated and what stays human.

The regulatory context constrains speed, productively. Financial services firms operate under investment adviser regulations, FINRA requirements, and evolving SEC guidance on AI use in financial analysis. Governance infrastructure isn't optional here — which creates a forcing function to build it right rather than fast. Organizations that treat this as a constraint to route around tend to create more risk than they resolve.

Proprietary data is an underused asset. Firms in this space often have decades of structured investment data, rating histories, and domain-specific methodologies that most AI providers don't have access to. Fine-tuning models on that data — rather than relying solely on general-purpose foundation models — is a meaningful strategic option worth exploring explicitly.


Part V

A Way to Think About the Transition

The research points to a consistent sequencing pattern: deploy broadly but with a redesign mandate attached, redesign workflows before restructuring, and restructure only once there's evidence. The diagram below shows how that progression might look, and what changes at each stage.

Today · Pre-AI Baseline
  • PM/PO roles distinct
  • Backlog-driven delivery
  • AI use informal, uneven
  • No governance model
  Role: Execution PO · AI in workflow: Low

Phase 1 · 0–6 mo · Foundation
  • Standardized AI toolset
  • Time audit completed
  • Governance drafted
  • 5+ hrs training per person
  Role: Augmented PO/PM · AI in workflow: Building

Phase 2 · 6–12 mo · Augment
  • 2–3 workflows redesigned
  • PO scope expanded
  • Fast/slow tracks named
  • Outcome metrics in place
  Role: Full-Stack PM · AI in workflow: Embedded

Phase 3 · 12–24 mo · Restructure
  • Outcome-oriented squads
  • AI Platform capability
  • Roles rewritten for AI era
  • Headcount evolves naturally
  Role: Orchestrator PM · AI in workflow: Structural

Role evolution and AI integration across phases · not a mandate, a model

The phases below describe what to consider at each stage — the research is consistent that trying to skip stages is where most transitions break down.

Phase 1 Foundation Months 1–6
Goal: Get everyone using AI as a baseline, but immediately pair usage with workflow audit — not just tool adoption. Deploy, but only with a redesign mandate attached.
  • Standardize access — every PM and PO on the same AI toolset (Claude/ChatGPT for writing and synthesis, a coding assistant for PMs who touch technical specs). Standardization before customization, per BCG: the "shadow AI" cost of undertooling is that people find their own tools anyway.17
  • Baseline the time audit — understanding where PMs actually spend their time today is essential before deploying AI. McKinsey's data shows the biggest time savings land on content-heavy tasks — writing PRDs, synthesizing research, drafting roadmaps. Mapping that before deployment helps direct AI where it will have the most impact.8
  • Establish governance before it's needed — one option is to define what AI can and cannot do in customer-facing work, research outputs, and regulated content before those questions arise in practice. HBR's 5Rs framework offers a useful starting structure: assign Roles, Responsibilities, Rituals, Resources, and Results measures to each AI capability before it ships.20 The organizations that achieved 50–60% faster delivery times did it because they built this infrastructure early.
  • 5 hours of training as the floor — BCG's survey shows regular AI adoption is sharply higher for employees who receive at least 5 hours of training plus access to in-person coaching.17 It sounds obvious, but most organizations skip it.
  • Resist restructuring before there's data. Organizations that restructure before they have real AI usage data are making design decisions on assumption. A few months of real usage — combined with qualitative interviews on where time actually goes — gives a much stronger basis for org design choices.
What success looks like: Every PM is using AI weekly for at least one workflow task. You have a baseline time allocation per role. You have a governance document that is live and tested.
Phase 2 Augment Months 6–12
Goal: Identify the 2–3 highest-leverage workflow redesigns and execute them fully, rather than adding AI to 10 workflows at 10% depth. Reshape — redesign specific workflows, not just add tools.
  • Pick your highest-leverage workflows and fully redesign them. Based on the research, the candidates at Morningstar are likely: (1) PM discovery synthesis — from raw user interviews and analyst feedback to opportunity brief; (2) PRD and spec creation — from strategic intent to structured requirements; (3) data quality escalation — identifying and triaging discrepancies in underlying research data. This is where the 40% time savings lives.8
  • Shift PO scope deliberately. Execution-only PO work (acceptance criteria, sprint board management, release notes) is where AI displacement risk is highest. Use this phase to expand PO scope — more discovery access, more direct customer contact, more ownership of metrics. The goal is to evolve the role before external pressure forces it. Reforge's "Full-Stack PM" trajectory is the right direction.21
  • Introduce the fast/slow distinction. Some squads run BAU work with high execution volume — that's the "fast" track. Some squads are working on 12-month strategic bets — that's the "slow" track. The AI tooling, oversight model, and PM profile for each are different. Name this distinction explicitly.2
  • Measure outcomes, not adoption. Track: time from insight to spec, cycle time from spec to deployed feature, PM time spent on strategic vs. administrative work. McKinsey found workflow redesign is the biggest predictor of EBIT impact — not AI tool usage rates.16
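The outcome metrics above are cheap to compute once milestone dates are captured per feature. A hedged sketch in Python — the field names and sample dates are illustrative, not from any cited study:

```python
# Sketch: compute median cycle times (insight -> spec, spec -> deployed)
# from per-feature milestone dates. Field names are hypothetical.
from datetime import date

features = [
    {"insight": date(2026, 1, 5), "spec": date(2026, 1, 19), "deployed": date(2026, 2, 20)},
    {"insight": date(2026, 1, 12), "spec": date(2026, 1, 22), "deployed": date(2026, 2, 15)},
]

def median_days(rows, start, end):
    """Median number of days between two milestones across features."""
    spans = sorted((r[end] - r[start]).days for r in rows)
    mid = len(spans) // 2
    return spans[mid] if len(spans) % 2 else (spans[mid - 1] + spans[mid]) / 2

insight_to_spec = median_days(features, "insight", "spec")
spec_to_deploy = median_days(features, "spec", "deployed")
print(insight_to_spec, spec_to_deploy)
```

Tracking these two medians quarter over quarter, alongside PM time split between strategic and administrative work, gives the before/after evidence Phase 3 decisions should rest on.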
What success looks like: 2–3 workflows are genuinely redesigned (not just tool-augmented). You have measurable before/after data. PO roles are expanding, not contracting.
Phase 3 Restructure Months 12–24
Goal: Make structural decisions based on data from Phases 1 and 2, not on aspiration. Let the org design follow the evidence.
  • Reorganize around outcomes, not features. If Phase 2 worked, you'll have evidence of which squads are operating outcome-oriented and which are still feature factories. The structural move is to make outcome-orientation the default, eliminate the PM/PO handoff waste, and give Full-Stack PMs accountability for the full customer journey.22
  • Consider whether to build internal AI Platform capability. Depending on scale and data assets, there may be a case for an internal AI Platform team — not to build foundation models, but to manage AI capabilities specific to our research methodology, data, and analyst workflow. That team would own fine-tuning, evals, and governance at the model level.
  • Rewrite role expectations for PMs. The new competency stack — evals, prototyping fluency, NLX design — should be in job descriptions and performance frameworks. Gartner predicts 75% of hiring processes will include AI proficiency testing by 2027.23 Move on this earlier than required, not after.
  • Address the headcount question honestly. Duolingo's experience is instructive: the question is not whether AI changes headcount ratios — it does — but whether you lead with that message before you've earned the culture trust to deliver it. The transition away from execution-heavy PO roles should happen through natural attrition, internal role evolution, and retraining investment — not through a restructuring announcement. BCG's data shows the biggest adoption killer is employees perceiving AI as a job threat.17
What success looks like: The org design reflects how work actually happens with AI embedded. PMs spend the majority of their time on strategy, discovery, and judgment — not documentation and coordination.

What Doesn't Change

Marty Cagan's most under-discussed point: "The underlying principles of great product work remain valid." AI is an enabling technology — it enhances productivity, reduces the cost of testing ideas in discovery, and raises the bar on product decisions. It does not change what a great product org is trying to do.24

For us specifically, what doesn't change is the strategy itself.

The companies failing at AI transitions are the ones treating AI as a strategy. AI is a capability. The strategy is still the same thing it always was: build something people need, build it well, and keep building it better.
Appendix

Interview Question Guide

Structured questions for PM/PO interviews. Designed to surface time allocation, AI readiness, and decision-making scope — the inputs needed to inform org design options. Print or use as a digital form.

A — Day-to-Day & Time Allocation

Listen for: how much is strategy/vision vs. backlog management vs. stakeholder coordination. Probe if the answer is vague.

Listen for: high execution + high stakeholder usually signals an Execution PO profile. High discovery + strategy = Full-Stack or Strategic PM.

Often reveals the gap between current state and what they believe their highest-value work is.

B — Focus & Decision-Making

Listen for: whether they led with data, intuition, or consensus-building. Whether they actually owned it or facilitated others owning it.

Time horizon is a strong signal of role scope. POs tend to operate in "now." Strategic PMs tend to operate in "later."

Discovery muscle is one of the most important signals. Low customer contact = execution-focused, internal-facing role.

C — AI Usage & Readiness

Don't prompt with tool names — let them list what they're using. "Occasionally" almost always means less than once a week.

Specificity is the signal here. Vague answers ("it helps me write things") vs. specific workflow integration ("I paste my user interview transcripts and ask it to extract themes") are very different maturity signals.

This surfaces both imagination and current frustrations. Also distinguishes people who've thought about it from those who haven't.

This is the trust question. Listen for: threat, curiosity, excitement, skepticism. The answer shapes what kind of support this person needs.

D — Org & Structure

Work initiation is a direct proxy for how much strategic ownership the PM/PO has vs. how reactive they are.

Often surfaces the most useful data in the whole interview. Let them talk.

Interviewer Assessment (fill in after)
Sources
1. TechCrunch, "Duolingo launches 148 courses created with AI after sharing plans to replace contractors," April 30, 2025. techcrunch.com
2. Organizational Physics (Lex Sisney), "When It Comes to AI Transformation, Airtable Is Half Right," September 21, 2025. organizationalphysics.com
3. CNBC, "Airbnb poaches former Meta GenAI leader to be new technology chief," January 14, 2026. cnbc.com; Skift, "Airbnb's Chesky on AI Cutting Jobs: 'Not as Much as I Thought,'" January 14, 2026. skift.com
4. Salesforce Careers, Agentforce PM role postings, Q4 2024–Q1 2025. careers.salesforce.com
5. CNBC, "AI job cuts: Amazon, Microsoft and more cite AI for 2025 layoffs," December 21, 2025. cnbc.com
6. Lenny Rachitsky, "How AI Will Impact Product Management," Lenny's Newsletter, April 9, 2024. lennysnewsletter.com
7. Lenny Rachitsky, "How Close Is AI to Replacing Product Managers?" Lenny's Newsletter, July 9, 2024. lennysnewsletter.com
8. McKinsey, "How Generative AI Could Accelerate Software Product Time to Market," May 2024. mckinsey.com
9. St. Louis Fed, "The Impact of Generative AI on Work Productivity," February 2025. stlouisfed.org
10. Fortune, "Thousands of CEOs just admitted AI had no impact on employment or productivity" (citing NBER study of 6,000 executives), February 17, 2026. fortune.com
11. Lenny Rachitsky, "State of the Product Job Market in Early 2026," Lenny's Newsletter, March 24, 2026. lennysnewsletter.com
12. Mike Clark, "The Great Reshuffling: How AI is Polarizing PM Roles," AgentsToday #16, August 17, 2025. agentstoday.substack.com
13. David Haberlah, "What 638 Practitioner Voices Reveal About PMs' AI Transformation," Medium, March 2026. medium.com
14. Eira Hayward, Mind the Product, "Why Most AI Products Fail: Key Findings from MIT's 2025 AI Report," August 22, 2025. Source study: MIT/NANDA, "The GenAI Divide: The State of AI in Business 2025." mindtheproduct.com
15. Karim R. Lakhani, Jared Spataro, and Jen Stave, "The 'Last Mile' Problem Slowing AI Transformation," HBR, March 9, 2026. hbr.org
16. McKinsey QuantumBlack (Alex Singla, Alexander Sukharevsky, Lareina Yee), "The State of AI: How Organizations Are Rewiring to Capture Value," March 2025. mckinsey.com
17. BCG, "AI at Work 2025: Momentum Builds, but Gaps Remain," Third Edition, June 2025. bcg.com
18. Jin Li, Feng Zhu, and Pascal Hua, "Overcoming the Organizational Barriers to AI Adoption," HBR, November 11, 2025. hbr.org
19. Marty Cagan, SVPG, "AI Product Management 2 Years In," December 30, 2024 (updated May 27, 2025). svpg.com
20. Ayelet Israeli and Eva Ascarza, "Most AI Initiatives Fail. This 5-Part Framework Can Help," HBR, November 20, 2025. hbr.org
21. Reforge, "How AI Changes Product Management: Same Role, New Possibilities." reforge.com
22. Reforge, "AI Native Product Teams." reforge.com
23. Gartner, "Top Predictions for IT Organizations and Users in 2026 and Beyond," October 21, 2025. gartner.com
24. Marty Cagan, SVPG, "INSPIRED in the Generative AI Era," May 5, 2025. svpg.com
Additional sources: IMD AI Maturity Index 2025 (Yokoi, Wade, Shan) — imd.org · Gartner AI Maturity Survey, June 2025 — gartner.com · Springer MRQ, "Where Does AI Play a Major Role in NPD and PM?" 2025 — springer.com · TechCrunch, "Notion launches AI agents at $500M ARR," Sept 2025 — techcrunch.com