Automated Experimentation System Design

Most product teams run 2–5 experiments per month. Companies that compound their growth the fastest run 20–100. The difference is not team size — it's a system. An automated experimentation system handles the repetitive parts of the experiment lifecycle so PMs and engineers can focus on hypothesis quality and decision-making. This skill designs that system.

---

Context

The experiment lifecycle (what can be automated vs. what requires human judgment):
| Stage | Automatable? | What automation does |
| --- | --- | --- |
| Hypothesis generation | Partially | AI surfaces anomalies and patterns that suggest experiment ideas |
| Experiment design | Partially | Automated sample size calculation, duration recommendation |
| Instrumentation check | Yes | Validates that required events are being logged before launch |
| Traffic allocation | Yes | Automated random assignment, exposure logging |
| Significance monitoring | Yes | Tracks p-value, flags when significance is reached |
| Early stopping | Partially | Alerts when guardrail metrics are violated; human decides to stop |
| Result analysis | Partially | Calculates stats, segments, generates report draft |
| Decision | No | Human must decide — automation presents the evidence |
| Learning capture | Partially | AI extracts and stores the learning; human validates |

---

Step 1 — Define the experimentation system scope

Ask:

  • What types of experiments does the team run?
  • How many experiments does the team run per month today?
  • What is the current bottleneck?
  • What analytics infrastructure exists?
  • What experiment tooling exists?
  • What is the target experiment velocity?
---

Step 2 — Design the hypothesis pipeline

AI-powered idea sources: metric anomaly detection, churn signal mining, user feedback clustering, competitor change monitoring.
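The anomaly-detection source can be sketched as a trailing-window z-score scan over a daily metric series. This is a minimal illustration, not a prescribed API — the function name, window size, and threshold are all assumptions:

```python
from statistics import mean, stdev

def flag_metric_anomalies(daily_values, window=14, z_threshold=3.0):
    """Flag days whose metric deviates sharply from the trailing window.

    Each flagged day is a candidate prompt for a hypothesis, e.g.
    "conversion dropped hard on day 20 -- what changed, and can we
    design an experiment around the suspected cause?"
    """
    anomalies = []
    for i in range(window, len(daily_values)):
        trailing = daily_values[i - window:i]
        mu, sigma = mean(trailing), stdev(trailing)
        if sigma == 0:
            continue  # flat window: no meaningful z-score
        z = (daily_values[i] - mu) / sigma
        if abs(z) >= z_threshold:
            anomalies.append((i, daily_values[i], round(z, 2)))
    return anomalies

# A stable signup-conversion series with one sharp drop on day 20
series = [0.041, 0.042, 0.040, 0.043, 0.041, 0.042, 0.040, 0.041,
          0.042, 0.043, 0.041, 0.040, 0.042, 0.041, 0.042, 0.040,
          0.041, 0.043, 0.042, 0.041, 0.028]
anomalies = flag_metric_anomalies(series)
```

A production version would run per-metric and per-segment, but the shape is the same: detect, then route the anomaly into the hypothesis backlog rather than auto-launching anything.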

---

Step 3 — Design the pre-launch automation

Automated instrumentation check, sample size calculation, and deterministic assignment.

---

Step 4 — Design the in-flight monitoring

Significance monitoring, guardrail metric alerts, peeking protection, and SRM detection.
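SRM (Sample Ratio Mismatch) detection is the simplest of these to automate: compare the observed arm counts against the configured allocation with a one-degree-of-freedom chi-square test. A minimal sketch — the strict alpha of 0.001 is a common convention for SRM alerts, not a requirement:

```python
from math import erfc, sqrt

def srm_check(control_n, treatment_n, expected_ratio=0.5, alpha=0.001):
    """Sample Ratio Mismatch check via a 1-df chi-square test.

    A tiny p-value means the observed split is implausible under the
    configured allocation -- assignment or exposure logging is broken,
    and the experiment's results cannot be trusted.
    """
    total = control_n + treatment_n
    expected_control = total * expected_ratio
    expected_treatment = total * (1 - expected_ratio)
    chi2 = ((control_n - expected_control) ** 2 / expected_control
            + (treatment_n - expected_treatment) ** 2 / expected_treatment)
    p_value = erfc(sqrt(chi2 / 2))  # survival function of chi-square(1 df)
    return p_value, p_value < alpha

# A 50/50 allocation that came back 50,000 vs 48,500 exposures
p, mismatch = srm_check(50_000, 48_500)
```

An SRM alert should page a human, not auto-stop the experiment — stopping is the "partially automatable" row in the lifecycle table.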

---

Step 5 — Design the result analysis automation

Auto-generated experiment reports with AI interpretation. PM confirms the decision.
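The statistical core of an auto-generated report for a conversion experiment can be sketched as a two-proportion z-test. The output structure is an assumption — the point is that the system computes and presents evidence, while the decision field stays empty until a human fills it:

```python
from math import erfc, sqrt

def analyze_experiment(control_conv, control_n, treatment_conv, treatment_n):
    """Draft result summary for a conversion experiment
    (pooled two-proportion z-test, two-sided p-value)."""
    p_c = control_conv / control_n
    p_t = treatment_conv / treatment_n
    p_pool = (control_conv + treatment_conv) / (control_n + treatment_n)
    se = sqrt(p_pool * (1 - p_pool) * (1 / control_n + 1 / treatment_n))
    z = (p_t - p_c) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided
    return {
        "control_rate": round(p_c, 4),
        "treatment_rate": round(p_t, 4),
        "relative_lift": round((p_t - p_c) / p_c, 4),
        "p_value": round(p_value, 4),
        "significant_at_05": p_value < 0.05,
        "decision": None,  # deliberately left for the PM
    }

report = analyze_experiment(400, 10_000, 480, 10_000)
```

Segmentation and the AI-written interpretation layer sit on top of this, but both remain drafts attached to the same report object.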

---

Step 6 — Design the learning capture system

Experiment knowledge base with AI-powered retrieval to prevent duplicate experiments and compound learnings.
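The duplicate-prevention retrieval can be sketched with plain token overlap. A real system would use embeddings; Jaccard similarity over keywords is enough to show the mechanism, and the entry schema here is an assumption:

```python
def tokenize(text):
    return set(text.lower().replace(",", " ").replace(".", " ").split())

def find_similar_experiments(new_hypothesis, knowledge_base, top_k=3):
    """Rank past experiments by token overlap with a new hypothesis,
    so obvious duplicates surface before an experiment is re-run."""
    query = tokenize(new_hypothesis)
    scored = []
    for entry in knowledge_base:
        doc = tokenize(entry["hypothesis"] + " " + entry["learning"])
        jaccard = len(query & doc) / len(query | doc)
        scored.append((jaccard, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for score, entry in scored[:top_k] if score > 0]

kb = [
    {"hypothesis": "shorter signup form increases conversion",
     "learning": "removing two fields lifted signup conversion 6%"},
    {"hypothesis": "annual pricing banner increases upgrades",
     "learning": "no significant effect on upgrades"},
]
matches = find_similar_experiments("does a shorter signup form improve conversion", kb)
```

Retrieval is what makes the knowledge base compound: it runs automatically when a new hypothesis enters the pipeline, not only when someone remembers to search.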

---

Step 7 — Output the automated experimentation system design

Implementation roadmap:

  • Phase 1 (30 days): Pre-launch automation
  • Phase 2 (60 days): In-flight monitoring
  • Phase 3 (90 days): Result analysis automation
  • Phase 4 (120 days): Learning capture

---

Quality check before delivering

  • Bottleneck is identified — system is designed to fix the actual constraint
  • Instrumentation check is a hard BLOCK — not a warning
  • Peeking protection is explicit — early stopping rules are defined
  • SRM detection is included
  • AI interpretation of results is a draft — PM decision is not automated
  • Learning capture includes retrieval — not just storage

Suggested next step: Build the instrumentation check first. It permanently eliminates the most common cause of failed experiments: launching without the events needed to measure the outcome.