Glossary

What are embeddings?

Embeddings are numerical vectors that capture the meaning of text or other data, so that items with similar meaning sit close together in the vector space.


An embedding is a list of numbers — a vector — that represents the meaning of a piece of data such as a sentence, a document, an image, or a product. A model produces the embedding so that semantically similar items land close together in the vector space and dissimilar items land far apart. "Cancel my subscription" and "how do I close my account" end up near each other even though they share no words, because the embedding captures meaning rather than surface text.

Embeddings are produced by a model trained for the purpose, typically separate from the generative model that writes answers. You pass text in and get a fixed-length vector out; a similarity or distance measure between two vectors (often cosine similarity, where a higher score means closer meaning) indicates how related their meanings are. Because the representation is numerical, you can do mathematics on meaning: find nearest neighbours, cluster related items, detect duplicates, or measure how far a new input sits from anything the system has seen before.
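To make the comparison step concrete, here is a minimal sketch of cosine similarity in plain Python. The three vectors are hand-written, hypothetical 4-dimensional values chosen for illustration, not the output of any real embedding model; real embeddings usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical vectors standing in for model output (assumption, for illustration).
cancel_subscription = [0.9, 0.1, 0.0, 0.2]  # "Cancel my subscription"
close_account = [0.8, 0.2, 0.1, 0.3]        # "how do I close my account"
weather_today = [0.0, 0.1, 0.9, 0.1]        # an unrelated sentence

related = cosine_similarity(cancel_subscription, close_account)
unrelated = cosine_similarity(cancel_subscription, weather_today)
print(f"related: {related:.3f}, unrelated: {unrelated:.3f}")
```

With these toy values the two account-related sentences score close to 1 and the unrelated one scores close to 0, which is the property a nearest-neighbour search relies on.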

In production, embeddings are the foundation of semantic search and retrieval-augmented generation: you embed your documents, store them in a vector database, and at query time embed the question to find the most relevant passages. The same mechanism powers recommendations, deduplication, classification, and anomaly detection. Practical concerns include choosing an embedding model suited to your domain and languages, keeping embeddings consistent (re-embedding everything if you change models), and managing the cost of embedding large corpora.
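The embed, store, and query flow described above can be sketched as follows. Everything here is an assumption made for illustration: `toy_embed` is a hypothetical keyword counter standing in for a real embedding model, and `InMemoryIndex` is a brute-force stand-in for a vector database, which would normally use an approximate nearest-neighbour index rather than scanning every vector.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]

# Hypothetical topic keywords; a real embedding model learns its dimensions.
TOPICS = {
    "billing": {"cancel", "subscription", "refund", "invoice", "account", "close"},
    "shipping": {"delivery", "package", "shipping", "track"},
    "login": {"password", "login", "reset", "sign"},
}


def toy_embed(text: str) -> Vector:
    """Stand-in for an embedding model: one dimension per topic, keyword counts."""
    words = set(text.lower().replace("?", "").split())
    # The small offset avoids an all-zero vector for text with no known keywords.
    return [len(words & keywords) + 0.01 for keywords in TOPICS.values()]


def normalize(v: Vector) -> Vector:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


class InMemoryIndex:
    """Toy vector store: keeps unit vectors in a list and scans all of them."""

    def __init__(self, embed: Callable[[str], Vector]):
        self.embed = embed
        self.items: List[Tuple[str, Vector]] = []

    def add(self, text: str) -> None:
        self.items.append((text, normalize(self.embed(text))))

    def search(self, query: str, k: int = 3) -> List[Tuple[float, str]]:
        # The query must be embedded with the same model as the documents.
        q = normalize(self.embed(query))
        scored = [(sum(a * b for a, b in zip(q, vec)), text) for text, vec in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


index = InMemoryIndex(toy_embed)
for passage in [
    "How to cancel a subscription and get a refund",
    "Track your package delivery",
    "Reset your password",
]:
    index.add(passage)

results = index.search("how do I close my account", k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Because the index stores vectors from one embedding function, swapping in a different model means rebuilding the index from the original text, which is the re-embedding cost mentioned above.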

Embeddings matter because they are how machines compare meaning at scale, which is what makes retrieval, search, and grounding possible in the first place. They are also a common, quiet failure point: if the embedding model doesn't capture the distinctions that matter in your domain — say, the difference between two similar legal clauses — retrieval will surface near-misses and the answers built on them will be subtly wrong. Choosing and evaluating the embedding model is therefore as consequential as choosing the language model it feeds.
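One common way to evaluate an embedding model for retrieval is recall@k on a small set of labelled queries: for each query, check how many of the passages a human marked as relevant appear in the top k results. The sketch below assumes you already have ranked result ids per query; the clause ids and the two rankings are invented for illustration and are not measurements of any real model.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant ids that appear in the first k ranked results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("need at least one relevant id per query")
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def mean_recall_at_k(runs, k):
    """Average recall@k over (ranked_ids, relevant_ids) pairs, one pair per query."""
    return sum(recall_at_k(ranked, relevant, k) for ranked, relevant in runs) / len(runs)


# Hypothetical rankings from two candidate embedding models on the same two queries.
runs_model_a = [
    (["clause_7", "clause_2", "clause_9"], {"clause_7"}),
    (["clause_4", "clause_3", "clause_1"], {"clause_3"}),
]
runs_model_b = [
    (["clause_2", "clause_9", "clause_7"], {"clause_7"}),
    (["clause_1", "clause_4", "clause_3"], {"clause_3"}),
]

for name, runs in [("model_a", runs_model_a), ("model_b", runs_model_b)]:
    print(name, [mean_recall_at_k(runs, k) for k in (1, 2, 3)])
```

In this made-up data both models eventually retrieve the right clause, but the second only does so at rank 3, so any pipeline that passes just the top one or two passages to the language model would be answering from near-misses.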

Related: What is a vector database?

Related: What is RAG?

Related: Data & platform engineering

Related: AI consulting
