Model Card Review and Writing
A model card is the technical and ethical documentation of an AI model — what it was trained on, what it's good at, what it isn't, what risks it carries, and how it should and shouldn't be used. For PMs, model cards serve two purposes: evaluating a third-party model before building on it, and documenting an internally-built or fine-tuned model before sharing it. This skill covers both.
---
Context
Why model cards matter for PMs:
Third-party model cards tell you what a model was trained on and tested against, which in turn tells you where it is likely to fail for your specific use case. An LLM trained mostly on English text will tend to underperform on multilingual tasks even if the provider's benchmarks look strong. Likewise, strong results on academic benchmarks may not carry over to your domain.
Internally, a model card is the accountability document that describes what you built, who it was built for, what it was tested on, and what the known limitations are. Without one, no one can make an informed decision about using or deploying the model.
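One way to test the point about third-party benchmarks against your own use case is to score the model on a small labelled sample of your own data and break the result down by subgroup. The sketch below is illustrative only: the `accuracy_by_group` helper, the record fields, and the toy records are assumptions made for this example, not real results.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key="language"):
    """Return {group: accuracy} for a list of evaluation records.

    Each record is a dict with the hypothetical keys `group_key`,
    "expected", and "predicted".
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = record[group_key]
        total[group] += 1
        if record["predicted"] == record["expected"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Toy, made-up records purely to show the shape of the input.
sample = [
    {"language": "en", "expected": "refund", "predicted": "refund"},
    {"language": "en", "expected": "cancel", "predicted": "cancel"},
    {"language": "de", "expected": "refund", "predicted": "refund"},
    {"language": "de", "expected": "cancel", "predicted": "refund"},
]

by_language = accuracy_by_group(sample)
for language, accuracy in sorted(by_language.items()):
    print(f"{language}: {accuracy:.0%}")
```

A single headline accuracy over these four records would hide the gap between the two groups, which is why the breakdown matters when comparing a card's benchmarks with your own traffic.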
The standard model card sections (from Mitchell et al., 2019, the original paper): Model details / Intended use / Factors / Metrics / Evaluation data / Training data / Quantitative analyses / Ethical considerations / Caveats and recommendations
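The section list above can double as a quick completeness check. The following is a minimal sketch, assuming the card has already been split into a dict of section name to section text; the `missing_sections` helper and the placeholder card are hypothetical and exist only for illustration.

```python
# The nine section names as listed in the text above.
STANDARD_SECTIONS = [
    "Model details",
    "Intended use",
    "Factors",
    "Metrics",
    "Evaluation data",
    "Training data",
    "Quantitative analyses",
    "Ethical considerations",
    "Caveats and recommendations",
]


def missing_sections(card):
    """Return standard sections that are absent or empty in `card`.

    `card` is a hypothetical dict mapping section name -> section text.
    Matching is case-insensitive on the section name.
    """
    present = {
        name.strip().lower()
        for name, text in card.items()
        if text and text.strip()
    }
    return [s for s in STANDARD_SECTIONS if s.lower() not in present]


# Placeholder card: two sections filled in, one present but empty.
example_card = {
    "Model details": "Placeholder text",
    "Intended use": "Placeholder text",
    "Metrics": "   ",
}
missing = missing_sections(example_card)
print(missing)
```

Treating a whitespace-only section as missing is a deliberate choice here: a heading with nothing under it tells a reviewer no more than an absent one.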
---
Step 1 — Mode selection
This skill has two modes:
MODE A — REVIEW an existing model card (from a third-party provider or another team) → Go to Step 2
MODE B — WRITE a model card (for an internally-built or fine-tuned model) → Go to Step 3
---
Step 2 — Review an existing model card (MODE A)
Use this checklist to evaluate a model card for completeness and relevance to your use case:
```
MODEL CARD REVIEW CHECKLIST: [Model name]
Reviewer: [PM name] Date: [date] Use case: [Your specific use case]
SECTION 1 — MODEL IDENTITY:
[ ] Model name and version are explicitly stated
[ ] Model architecture type is described
[ ] Provider / creator is identified
[ ] Release date or version date is stated
[ ] License terms are clearly stated
SECTION 2 — INTENDED USE:
[ ] Primary intended use cases are listed
[ ] Out-of-scope uses are listed
[ ] User population is described
SECTION 3 — TRAINING DATA:
[ ] Training data sources are described
[ ] Training data cutoff date is stated
[ ] Known biases in training data are disclosed
SECTION 4 — EVALUATION:
[ ] Benchmark datasets are named
[ ] Performance metrics are reported
[ ] Disaggregated metrics by subgroup are reported
SECTION 5 — LIMITATIONS:
[ ] Known failure modes are disclosed
[ ] Known biases are disclosed
[ ] Context length limitations are stated
SECTION 6 — RISKS AND MITIGATIONS:
[ ] Foreseeable misuses are described
[ ] Recommended mitigations are described
PM ASSESSMENT:
Gaps identified: [What the model card doesn't tell you that you need to know]
Concerns for your use case: [Specific limitations or risks]
Decision: [Suitable / Suitable with mitigations / Not suitable]
```
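If reviews are tracked in a script or notebook rather than on paper, the checklist can be held as data so that unchecked items feed directly into "Gaps identified". The sketch below is one possible shape, not a prescribed tool: the `ChecklistSection` class and `summarize` function are hypothetical names, the tick marks are made up, and only two of the six sections are shown for brevity. It deliberately stops short of computing the Decision, which remains a PM judgment.

```python
from dataclasses import dataclass, field


@dataclass
class ChecklistSection:
    title: str
    items: dict = field(default_factory=dict)  # item text -> True/False

    def unchecked(self):
        """Return the item texts that have not been ticked."""
        return [text for text, done in self.items.items() if not done]


def summarize(sections):
    """Return (checked, total, gaps); gaps is a list of (title, item)."""
    gaps = [(s.title, item) for s in sections for item in s.unchecked()]
    total = sum(len(s.items) for s in sections)
    return total - len(gaps), total, gaps


# Two of the six checklist sections, with made-up tick marks.
review = [
    ChecklistSection("INTENDED USE", {
        "Primary intended use cases are listed": True,
        "Out-of-scope uses are listed": False,
        "User population is described": True,
    }),
    ChecklistSection("EVALUATION", {
        "Benchmark datasets are named": True,
        "Performance metrics are reported": True,
        "Disaggregated metrics by subgroup are reported": False,
    }),
]

checked, total, gaps = summarize(review)
print(f"{checked}/{total} items checked")
for title, item in gaps:
    print(f"Gap in {title}: {item}")
```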
---
Step 3 — Write a model card (MODE B)
Use this template to document an internally-built, fine-tuned, or adapted model covering: Model details, Intended use, Training and fine-tuning data, Evaluation, Limitations and risks, Recommended mitigations, Caveats, and Changelog.
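As a starting point, the eight sections named above can be generated as an empty skeleton so that no section is silently dropped. This is a minimal sketch under assumptions: the `render_card_skeleton` function, the uppercase heading style, and the "[TODO]" marker are illustrative choices, not a required format.

```python
# Section names taken from the template description above.
CARD_SECTIONS = [
    "Model details",
    "Intended use",
    "Training and fine-tuning data",
    "Evaluation",
    "Limitations and risks",
    "Recommended mitigations",
    "Caveats",
    "Changelog",
]


def render_card_skeleton(model_name, content=None):
    """Return a plain-text model card with every section present.

    Sections without content get a visible "[TODO]" marker so that
    gaps are obvious to a reviewer rather than silently omitted.
    """
    content = content or {}
    lines = [f"MODEL CARD: {model_name}", ""]
    for section in CARD_SECTIONS:
        lines.append(section.upper())
        lines.append(content.get(section, "[TODO]"))
        lines.append("")
    return "\n".join(lines)


skeleton = render_card_skeleton(
    "[Model name]",
    {"Intended use": "[Primary use cases and out-of-scope uses]"},
)
print(skeleton)
```

Searching the output for "[TODO]" before sharing the card gives a cheap check that every section has been addressed.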
Quality check before delivering
MODE A (review):
MODE B (write):