Case Study 1: Planning an AI-Accelerated MVP

A Six-Week Sprint to Build a Customer Feedback Platform

The Scenario

Mira Chen is the CTO of Insightful, a startup that has just closed a seed round of $1.2 million. The investors have given the team a clear mandate: build and launch a minimum viable product (MVP) of their customer feedback platform within six weeks. The MVP must be functional enough to onboard three pilot customers who have signed letters of intent.

Mira has a team of three developers (including herself), a product designer, and a part-time product manager. All three developers are experienced with AI coding tools -- Mira uses Claude Code as her primary tool, one developer uses Cursor, and the third uses a combination of GitHub Copilot and Claude Code. The product designer uses AI for generating UI mockups, and the product manager uses AI for requirements documentation.

The customer feedback platform, called FeedbackLoop, must support the following core features for the MVP:

  1. Customer Portal: A branded web interface where end-users submit feedback (feature requests, bug reports, praise)
  2. Admin Dashboard: An internal dashboard for Insightful's clients to view, categorize, and prioritize feedback
  3. Email Integration: Automated email notifications for new feedback, status changes, and weekly digests
  4. Analytics: Basic charts showing feedback volume, categories, sentiment trends, and response times
  5. User Management: Multi-tenant architecture with role-based access control
  6. API: A REST API for third-party integrations

Six weeks. Three developers. Six major features. Without AI tools, this would be a 12-16 week project. With AI tools, Mira believes it is possible -- but only with meticulous planning.


Phase 1: Requirements and Task Decomposition (Week 0 -- Days 1-3)

Mira begins by using AI to synthesize the requirements from three sources: the investor pitch deck, the pilot customer interviews (conducted by the product manager), and the competitive analysis.

She feeds the product manager's interview notes into Claude Code with the following prompt:

I have interview notes from three pilot customers for our feedback platform.
Please:
1. Identify the top 10 features mentioned across all interviews
2. Flag contradictions between customers
3. Categorize features as Must-Have, Nice-to-Have, or Future
4. Generate user stories for each Must-Have feature

The AI identifies a critical contradiction: Customer A wants all feedback to be public (visible to other customers), while Customer C insists that feedback must be private by default. Mira schedules a call with both customers to resolve this before development begins. They agree on a configurable privacy setting -- a compromise that would have been discovered much later without the AI-assisted synthesis.

With validated requirements in hand, Mira decomposes the six features into AI-atomic tasks. She uses the task decomposer tool (see code/example-01-task-decomposer.py) to structure the decomposition:

Customer Portal (14 AI-atomic tasks)
  • Feedback submission form with rich text editor
  • File attachment upload (images, screenshots)
  • Feedback list view with filtering and sorting
  • Individual feedback detail view
  • Status tracking for submitted feedback
  • Branded theming system (CSS variables)
  • Responsive mobile layout
  • ...and 7 more subtasks

Admin Dashboard (12 AI-atomic tasks)
  • Feedback queue with bulk actions
  • Category management (create, edit, merge)
  • Priority assignment and sorting
  • Status workflow (New, In Review, Planned, In Progress, Done)
  • Quick response templates
  • ...and 7 more subtasks

In total, she identifies 68 AI-atomic tasks across the six features. She then classifies each task by AI acceleration tier:

Feature             Total Tasks   Tier 1   Tier 2   Tier 3
Customer Portal          14         10        3        1
Admin Dashboard          12          8        3        1
Email Integration        10          7        2        1
Analytics                 8          4        3        1
User Management          12          6        4        2
API                      12          9        2        1
Total                    68         44       17        7

The classification reveals that 65% of tasks are Tier 1 (highly AI-acceleratable), 25% are Tier 2, and only 10% are Tier 3. This is a favorable mix for an AI-accelerated project.
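The decomposition and tier tallies can be represented with a small structure like the one sketched below. The names (`Task`, `tier_summary`) are illustrative, not taken from example-01-task-decomposer.py:

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Task:
    feature: str   # e.g. "Customer Portal"
    name: str      # e.g. "Feedback submission form"
    tier: int      # 1 = highly AI-acceleratable, 2 = partial, 3 = human-led

def tier_summary(tasks):
    """Return per-tier counts and each tier's share as a whole percentage."""
    counts = Counter(t.tier for t in tasks)
    total = len(tasks)
    shares = {tier: round(100 * n / total) for tier, n in counts.items()}
    return counts, shares

# With the case-study totals (44 Tier 1, 17 Tier 2, 7 Tier 3 of 68 tasks):
tasks = ([Task("x", "t", 1)] * 44
         + [Task("x", "t", 2)] * 17
         + [Task("x", "t", 3)] * 7)
counts, shares = tier_summary(tasks)
print(shares)  # {1: 65, 2: 25, 3: 10}
```

Keeping the tier label on each task (rather than per feature) lets the same summary be recomputed per sprint when tasks are reshuffled.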


Phase 2: Estimation (Week 0 -- Days 3-4)

Mira uses the Three-Point AI Estimation Method for each feature group. For the Customer Portal, she estimates:

Scenario        Without AI   With AI
Optimistic        80 hours   18 hours
Realistic        120 hours   35 hours
Pessimistic      180 hours   70 hours
PERT Estimate    123 hours   37 hours

She repeats this for all six features:

Feature             Traditional PERT   AI-Augmented PERT   Acceleration
Customer Portal           123 hours            37 hours           3.3x
Admin Dashboard           110 hours            32 hours           3.4x
Email Integration          65 hours            18 hours           3.6x
Analytics                  85 hours            30 hours           2.8x
User Management            95 hours            35 hours           2.7x
API                        80 hours            20 hours           4.0x
Total                     558 hours           172 hours           3.2x
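The PERT figures above follow the standard three-point formula, (O + 4M + P) / 6, which weights the estimate toward the realistic case. A quick check of the Customer Portal row:

```python
def pert(optimistic, realistic, pessimistic):
    """Classic three-point (PERT) estimate in hours."""
    return (optimistic + 4 * realistic + pessimistic) / 6

# Customer Portal, traditional path: matches the 123 hours in the table
without_ai = pert(80, 120, 180)
print(round(without_ai))  # 123

# Acceleration factor as reported: traditional PERT / AI-augmented PERT
print(round(without_ai / 37, 1))  # 3.3
```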

With three developers working 40-hour weeks for six weeks, they have 720 available hours. The AI-augmented estimate of 172 implementation hours uses only 24% of available capacity. This leaves significant room for:

  • Code review: 40 hours (estimated)
  • Testing and QA: 50 hours
  • Design review and iteration: 30 hours
  • Deployment and DevOps: 25 hours
  • Buffer for AI underperformance: 50 hours
  • Stakeholder communication: 20 hours

Total planned utilization: 387 hours (54% of capacity). Mira deliberately keeps utilization below 60% to account for the inherent uncertainty in AI-augmented estimates.
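The capacity math above is simple enough to check directly:

```python
developers, hours_per_week, weeks = 3, 40, 6
capacity = developers * hours_per_week * weeks   # 720 available hours

planned = {
    "implementation": 172,
    "code review": 40,
    "testing and QA": 50,
    "design review": 30,
    "deployment and devops": 25,
    "buffer": 50,
    "stakeholder communication": 20,
}
total = sum(planned.values())
print(capacity, total, round(100 * total / capacity))  # 720 387 54
```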


Phase 3: Sprint Planning (Week 0 -- Day 5)

Mira organizes the six weeks into three two-week sprints:

Sprint 1 (Weeks 1-2): Foundation and Core Features
  • Database schema and data models (Tier 1)
  • Authentication and multi-tenant architecture (Tier 2)
  • API scaffolding for all endpoints (Tier 1)
  • Customer Portal: feedback submission and list view (Tier 1)
  • Admin Dashboard: feedback queue and categorization (Tier 1)
  • Key decision: resolve architecture for multi-tenancy before AI generates code

Sprint 2 (Weeks 3-4): Feature Completion
  • Email integration (Tier 1)
  • Analytics charts and dashboard (Tier 2)
  • User management and role-based access (Tier 2)
  • API completion and documentation (Tier 1)
  • Customer Portal: remaining features (Tier 1/2)
  • Admin Dashboard: remaining features (Tier 1/2)

Sprint 3 (Weeks 5-6): Polish, Testing, and Launch
  • Comprehensive testing (integration, E2E) (Tier 1 for generation, Tier 3 for strategy)
  • Performance optimization (Tier 2)
  • Security audit (Tier 3)
  • Pilot customer onboarding (Tier 3)
  • Bug fixes and polish (Tier 2)
  • Deployment to production (Tier 3)

Notice that Sprint 1 is heaviest on Tier 1 tasks, Sprint 2 mixes tiers, and Sprint 3 is heaviest on Tier 3 tasks. This is deliberate: Mira wants to front-load the AI-acceleratable work to build momentum and identify integration issues early.


Phase 4: Risk Management

Mira identifies the following risks and mitigations:

Risk                                          Probability   Impact   Mitigation
AI generates inconsistent multi-tenant code   High          High     Document tenant isolation pattern before starting; include in every prompt
Email deliverability issues                   Medium        High     Use established service (SendGrid); allocate testing time in Sprint 2
Pilot customer requirements change            Medium        Medium   Weekly check-in calls; protect Sprint 3 for changes
AI tool outage during critical sprint         Low           High     All developers trained on at least two AI tools; manual coding as fallback
Performance issues at scale                   Medium        Medium   Load testing in Sprint 3; defer optimization if needed
Team burnout from compressed timeline         Medium        High     Strict 40-hour weeks; buffer time in plan; celebrate sprint completions

The highest risk is architectural inconsistency from AI-generated code. Mira addresses this by spending the first day of Sprint 1 writing a comprehensive CLAUDE.md file that documents:
  • Database naming conventions
  • API response format standards
  • Error handling patterns
  • Authentication middleware structure
  • Multi-tenant data isolation approach

This file becomes the "architectural constitution" that every developer includes in their AI prompts.
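One way to make the tenant-isolation rule concrete enough for AI tools to follow consistently is to document it as a code pattern rather than prose. A hypothetical sketch (the case study does not reproduce the actual CLAUDE.md contents):

```python
import sqlite3

def get_feedback(conn: sqlite3.Connection, tenant_id: int, status: str):
    """Pattern: every query against a tenant-owned table MUST filter by
    tenant_id, taken from the authenticated session (never from request
    input), and MUST use parameterized placeholders."""
    cur = conn.execute(
        "SELECT id, title, status FROM feedback "
        "WHERE tenant_id = ? AND status = ?",
        (tenant_id, status),
    )
    return cur.fetchall()

# Demo with an in-memory database and two tenants:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feedback (id INTEGER, tenant_id INTEGER, title TEXT, status TEXT)")
conn.execute("INSERT INTO feedback VALUES (1, 1, 'Dark mode', 'New'), (2, 2, 'SSO', 'New')")
print(get_feedback(conn, tenant_id=1, status="New"))  # only tenant 1's row
```

A worked example like this gives every developer an identical snippet to paste into prompts, which is harder to misinterpret than a one-line convention.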


Phase 5: Execution and Adaptation

Sprint 1 Results: The team completes all planned work plus two tasks originally scheduled for Sprint 2. The AI acceleration factor for Sprint 1 is 3.8x -- higher than the estimated 3.2x, primarily because the Tier 1 tasks were even more AI-friendly than expected. Code review takes longer than planned (48 hours versus the roughly 25 budgeted for the sprint) because the sheer volume of generated code exceeds expectations. Mira adjusts Sprint 2 planning to allocate more review time.

One significant issue emerges: the three developers' AI tools generate subtly different patterns for error handling. Despite the CLAUDE.md file, Claude Code tends to raise exceptions while Cursor-generated code tends to return error dictionaries. Mira calls a 30-minute alignment meeting and updates the architectural document with explicit error-handling examples. From Sprint 2 onward, all generated code follows the same pattern.
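The convention the team settled on could look like the following sketch (illustrative only; the case study does not show Insightful's actual pattern). The key is that the document shows both the approved pattern and the rejected one:

```python
class AppError(Exception):
    """Base application error. Convention: RAISE exceptions
    (Claude Code's tendency) -- never RETURN error dictionaries
    (Cursor's tendency)."""
    status_code = 500

class NotFoundError(AppError):
    status_code = 404

def get_feedback_item(items: dict, item_id: int):
    # Approved: raise a typed exception the middleware can map to HTTP.
    # Rejected: `return {"error": "not found"}`
    if item_id not in items:
        raise NotFoundError(f"feedback {item_id} not found")
    return items[item_id]
```

With a single base class, one piece of middleware can translate any `AppError` into a consistent HTTP response, so the choice of raising versus returning only has to be made once.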

Sprint 2 Results: Velocity is slightly lower than Sprint 1 (as expected, given more Tier 2 tasks), but all planned work is completed. The email integration requires more manual work than estimated because the SendGrid API has edge cases that AI handles poorly. The analytics feature, conversely, is completed faster than estimated because AI excels at generating chart configurations and data aggregation queries.

Mira sends the following status update to investors:

"We are on track for MVP launch at the end of Week 6. Sprint 1 and Sprint 2 are complete with all core features functional. Sprint 3 will focus on testing, security, and pilot customer onboarding. Current AI acceleration is tracking at 3.4x our baseline estimates, giving us comfortable buffer for Sprint 3 polish."

Sprint 3 Results: The security audit reveals three vulnerabilities in AI-generated code: an SQL injection risk in a search query (the AI used string formatting instead of parameterized queries), an authorization bypass in the admin API (the AI forgot to check tenant isolation on one endpoint), and exposed stack traces in error responses. All three are fixed within two days thanks to the buffer time Mira built into the plan.
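The search-query vulnerability and its fix follow a well-known pattern (illustrative code, not the actual FeedbackLoop source):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feedback (title TEXT)")
conn.execute("INSERT INTO feedback VALUES ('Dark mode'), ('Export to CSV')")

term = "mode"

# Vulnerable pattern the audit flagged: string formatting builds the SQL,
# so a crafted term such as "x' OR '1'='1" escapes the quoting entirely.
# rows = conn.execute(
#     f"SELECT title FROM feedback WHERE title LIKE '%{term}%'").fetchall()

# Fixed pattern: a parameterized placeholder; the driver handles escaping,
# and the search term can never change the query's structure.
rows = conn.execute(
    "SELECT title FROM feedback WHERE title LIKE ?", (f"%{term}%",)
).fetchall()
print(rows)  # [('Dark mode',)]
```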

Pilot customer onboarding proceeds on schedule. Two of the three customers request minor customizations that the team implements using AI in under a day each.


Outcome

FeedbackLoop launches on Day 42, one day ahead of the six-week deadline. All three pilot customers are onboarded and providing feedback through the platform. The final metrics:

Metric                        Planned   Actual
Total development hours           172      158
Code review hours                  40       62
Testing hours                      50       45
Buffer hours used                  50       22
Defects found in testing           --       18
Defects found post-launch          --        4
Overall acceleration factor      3.2x     3.5x

Key Lessons:

  1. Front-load architectural decisions. The CLAUDE.md file prevented most (but not all) consistency issues. Investing the first day in architecture documentation paid dividends throughout the project.

  2. Plan for code review expansion. Code review took 55% more time than estimated. In future projects, Mira will allocate 1 hour of review for every 3 hours of AI-assisted development.

  3. AI acceleration varies by feature. The API and email integration (mostly boilerplate) hit 4x acceleration. Analytics (requiring custom logic) hit only 2.5x. Tier-based estimation correctly predicted this variance.

  4. Security requires extra vigilance. Three security vulnerabilities in AI-generated code reinforced the need for dedicated security review, regardless of the time pressure.

  5. Buffer time is essential. Even with better-than-expected acceleration, the 50 hours of buffer proved necessary for code review overflow and security fixes. Mira used 22 of the 50 buffer hours, confirming that her 54% utilization target was appropriate.

  6. The biggest risk was consistency, not speed. AI delivered on speed but required active management to maintain architectural consistency across three developers using three different AI tools.


Discussion Questions

  1. If the team had only two developers instead of three, would the six-week timeline still be feasible? What adjustments would you make?

  2. Mira chose to keep utilization below 60% of available capacity. In a traditional project, this might seem wasteful. Was this the right decision? What would have happened if she planned at 80% utilization?

  3. The security vulnerabilities were caught in Sprint 3 testing. What could the team have done to catch them earlier? How would you modify the sprint plan to address security continuously rather than at the end?

  4. How would you adapt this plan if one of the three developers had no experience with AI coding tools?

  5. The investors were given a single timeline (six weeks). How would you have presented the Three Timelines to them, and what would your aggressive and conservative estimates have been?