Embedding Model Selection and Specification
Embedding models convert text (or other content) into vectors — the numerical representations that power semantic search, RAG retrieval, and similarity matching. Choosing the wrong model means rebuilding your entire index when you switch. This skill selects the right embedding model for the task and writes the specification that makes the choice stick.
---
Context
What embedding models do: An embedding model takes a piece of text and outputs a fixed-length array of numbers (a vector) that represents its meaning. Similar texts produce similar vectors, so semantic similarity between texts can be measured as distance between their vectors.
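A minimal sketch of what "similar vectors" means in practice, using hand-made 3-dimensional vectors. Real models output hundreds or thousands of dimensions; the vectors and "texts" here are illustrative, not actual model output.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend embeddings: the first two "texts" are about the same topic,
# the third is not, so the first pair scores higher.
cat = [0.9, 0.1, 0.2]
kitten = [0.85, 0.15, 0.25]
invoice = [0.1, 0.9, 0.8]

assert cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice)
```

This is the comparison a vector database runs at query time, just over thousands of dimensions and millions of stored vectors.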
The embedding model decision axes: provider model (API vs. self-hosted), vector dimensionality, cost per token, context length, language coverage, and retrieval quality. The steps below weigh each axis in turn.

---
Step 1 — Define the embedding use case
Ask: content type, query pattern, languages, domain, expected volume, and downstream task.
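One way to make Step 1 concrete is to capture the answers in a small structure that later steps can check candidates against. The field names and example values below are illustrative, not a required schema.

```python
from dataclasses import dataclass

@dataclass
class EmbeddingUseCase:
    content_type: str        # e.g. "support tickets", "legal PDFs"
    query_pattern: str       # e.g. "short natural-language questions"
    languages: list[str]     # e.g. ["en", "de"]
    domain: str              # e.g. "general", "biomedical"
    monthly_tokens: int      # expected embedding volume per month
    downstream_task: str     # "retrieval", "clustering", "classification"

# Example use case (hypothetical values):
use_case = EmbeddingUseCase(
    content_type="support tickets",
    query_pattern="short questions",
    languages=["en", "de"],
    domain="general",
    monthly_tokens=50_000_000,
    downstream_task="retrieval",
)
```

Writing the use case down before looking at models keeps the evaluation in Step 2 honest: each candidate is scored against these fields rather than on general reputation.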
Step 2 — Evaluate candidate models
API-based: OpenAI text-embedding-3-small (1536d, $0.02/1M tokens), OpenAI text-embedding-3-large (3072d, $0.13/1M tokens), Cohere Embed v3 (1024d, best multilingual). Open-source: nomic-embed-text-v1.5 (768d, long context, free), BAAI/bge-m3 (1024d, 100+ languages).

Step 3 — Define the model selection criteria and decision
Primary factors: multilingual requirement, data sovereignty, volume cost, context length needs, quality vs. cost tradeoff.
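The volume-cost and multilingual factors can be applied mechanically. In the sketch below, only the two OpenAI prices come from Step 2; the Cohere price and the multilingual flags are illustrative assumptions — verify current vendor pricing and language support before deciding.

```python
# Candidate table: prices in USD per 1M tokens. OpenAI prices are from
# Step 2; the Cohere price and all multilingual flags are assumptions.
CANDIDATES = {
    "text-embedding-3-small": {"usd_per_1m_tokens": 0.02, "multilingual": False},
    "text-embedding-3-large": {"usd_per_1m_tokens": 0.13, "multilingual": False},
    "cohere-embed-v3":        {"usd_per_1m_tokens": 0.10, "multilingual": True},
    "nomic-embed-text-v1.5":  {"usd_per_1m_tokens": 0.0,  "multilingual": False},
}

def monthly_cost(usd_per_1m: float, monthly_tokens: int) -> float:
    """Embedding cost per month at the expected token volume."""
    return usd_per_1m * monthly_tokens / 1_000_000

def shortlist(candidates: dict, monthly_tokens: int, need_multilingual: bool) -> list[str]:
    """Drop candidates that fail a hard requirement, then sort by cost."""
    return sorted(
        (name for name, c in candidates.items()
         if c["multilingual"] or not need_multilingual),
        key=lambda n: monthly_cost(candidates[n]["usd_per_1m_tokens"], monthly_tokens),
    )
```

Hard requirements (multilingual support, data sovereignty) filter first; soft tradeoffs (quality vs. cost) rank what remains.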
Step 4 — Write the embedding model specification
Include: model name and version (pinned), input preprocessing, batching configuration, embedding storage schema, similarity metric and thresholds, and refresh policy.
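The specification fields above can be sketched as a plain structure that both the indexing and query pipelines load, so the two sides can never drift apart. Every value below is illustrative, not a recommendation.

```python
# Example embedding specification (all values hypothetical). Pinning the
# model and version matters most: embeddings from different models or
# versions are not comparable, so any change forces a full re-index.
EMBEDDING_SPEC = {
    "model": "text-embedding-3-small",
    "model_version": "<pin exact model revision here>",
    "preprocessing": {
        "strip_html": True,
        "chunk_tokens": 512,
        "chunk_overlap_tokens": 64,
    },
    "batching": {"batch_size": 256, "max_retries": 3},
    "storage": {"dimensions": 1536, "dtype": "float32", "index": "hnsw"},
    "similarity": {"metric": "cosine", "min_score": 0.75},
    "refresh_policy": "re-embed a document on content change; full index rebuild on any model change",
}
```

Keeping the similarity metric and threshold in the spec, not in query code, prevents the common failure where retrieval silently uses a different metric than the one the threshold was tuned for.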