Vector Embedding

A way of converting the meaning of text into numbers that an AI system can compare and search.

Vector embedding, embedding, vector representation

Definition

A vector embedding is a numerical representation of text, an image, or other data as a sequence of numbers: a point in a mathematical space whose position captures the item's meaning and its relationships to other items.

What is it?

A vector embedding is a conversion of text or other data into a sequence of numbers: a vector in a mathematical space. The numbers are not arbitrary; they are calculated by a language model that has learned which words, sentences, and concepts are semantically similar to each other. Texts with similar meanings are positioned close together in that vector space; texts with very different meanings are positioned far apart.

Embeddings are the technical layer that makes semantic search, RAG, and recommendation systems possible. Without embeddings, a computer treats text as a sequence of characters; with embeddings, it treats text as a point in a space of meaning.

Why it matters for SMEs

Embeddings are what let AI systems compare text by meaning rather than by exact wording. For SMEs, that matters as soon as you want to use AI for searching, comparing, or retrieving information based on content.

  • Embeddings allow an AI agent to compare documents, candidate profiles, or emails based on how similar their content is, without requiring you to use the exact same words as the source.
  • They form the basis of RAG systems: when business knowledge is stored as embeddings in a vector database, an AI model can retrieve the relevant passages and give current and specific answers rather than generic ones.
  • Quality control and classification of unstructured input, such as complaints, quote requests, or support tickets, become automatable because embeddings recognise semantic categories.

As an SME owner you will not typically work with embeddings directly, but every AI tool that searches, compares, or retrieves based on meaning uses them under the hood.

How it works

An embedding is created by passing a piece of text through an embedding model: a neural network trained to capture meaning relationships. The output is a vector, typically consisting of hundreds to thousands of numbers.

  1. A piece of text (word, sentence, paragraph, or document) is passed to the embedding model.
  2. The model converts the text into a fixed sequence of numbers: the embedding vector.
  3. This vector is stored in a vector database or memory.
  4. When a comparison or query is made, the new text is converted in the same way.
  5. The similarity between the two vectors is calculated, typically as cosine similarity: the smaller the angle between the vectors, the higher the score and the more closely the meanings align.
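
The comparison in step 5 can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product implements it: the three-number vectors and their variable names are invented for the example, whereas real embeddings come from an embedding model and have hundreds to thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional vectors, invented for illustration only.
notice_period = [0.9, 0.1, 0.2]
cancel_contract = [0.8, 0.2, 0.3]
paint_colour = [0.1, 0.9, 0.1]

sim_related = cosine_similarity(notice_period, cancel_contract)
sim_unrelated = cosine_similarity(notice_period, paint_colour)

# The two tenancy-related vectors point in nearly the same direction,
# so their score is higher than the score for the unrelated vector.
print(sim_related > sim_unrelated)
```

A score close to 1 means the vectors point in almost the same direction, and a score close to 0 means they are nearly at right angles.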

Commonly used embedding models include OpenAI embeddings (text-embedding-3-small, text-embedding-3-large), Cohere Embed, and open-source models from Hugging Face. The choice of embedding model partly determines how well semantic similarity is recognised for your specific language and domain.

Example in practice

Picture a housing association that wants tenants to be able to ask questions about their tenancy agreement through a chat interface. The administrator loads all tenancy agreements into the system: each contract is split into clauses and each clause is converted into an embedding. When a tenant asks "what is the notice period?" that question is also converted into an embedding. The system finds the clauses whose embeddings are closest to the query's embedding, locates the relevant notice provisions, and passes them to a language model that formulates a clear answer.
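
The retrieval step in this scenario might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the clause texts, the three-number vectors standing in for real embeddings, and the function name `top_k_clauses`. In a real system the vectors would come from an embedding model and the clauses from the actual contracts.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_clauses(query_vector, clause_index, k=2):
    """Return the k clause texts whose vectors are most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in clause_index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical clauses with invented stand-in vectors (not real embeddings).
clause_index = [
    ("Either party may terminate with one month's notice.", [0.9, 0.1, 0.1]),
    ("Rent is due on the first day of each month.", [0.1, 0.9, 0.1]),
    ("Pets are allowed with written permission.", [0.1, 0.1, 0.9]),
]

# Stand-in for the embedding of the question "what is the notice period?"
query_vector = [0.8, 0.2, 0.1]

best = top_k_clauses(query_vector, clause_index, k=1)
print(best[0])
```

The selected clauses, not the whole contract, are what gets passed to the language model along with the tenant's question.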

Comparison and misconceptions

An embedding is the vector representation of a specific piece of text; a vector database is the system that stores embeddings and makes them searchable. The difference is between a single data point and the library that manages all data points. Both are needed for a working semantic search system.
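
To make the distinction concrete, here is a toy sketch in which an embedding is just a list of numbers and the "database" is the object that stores many of them and answers similarity queries. The class name `TinyVectorStore`, the item ids, and the vectors are all hypothetical. Production vector databases typically add persistence, metadata filtering, and approximate nearest-neighbour indexes so that search stays fast over large collections; this version simply scans every item.

```python
import math


def normalise(vector):
    """Scale a vector to length 1 so a dot product equals cosine similarity."""
    length = math.sqrt(sum(x * x for x in vector))
    if length == 0:
        raise ValueError("cannot normalise a zero vector")
    return [x / length for x in vector]


class TinyVectorStore:
    """In-memory stand-in for a vector database (illustrative only)."""

    def __init__(self):
        self._items = {}  # item id -> unit-length vector

    def add(self, item_id, vector):
        # Normalising once at insert time keeps each search cheap.
        self._items[item_id] = normalise(vector)

    def search(self, query_vector, k=3):
        query = normalise(query_vector)
        scored = [
            (sum(q * v for q, v in zip(query, vector)), item_id)
            for item_id, vector in self._items.items()
        ]
        scored.sort(reverse=True)  # highest similarity first
        return [(item_id, round(score, 3)) for score, item_id in scored[:k]]


store = TinyVectorStore()
store.add("clause-notice", [0.9, 0.1, 0.1])  # invented stand-in vectors
store.add("clause-rent", [0.1, 0.9, 0.1])

results = store.search([0.8, 0.2, 0.1], k=1)
print(results[0][0])
```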

Frequently asked questions

What is a vector embedding?
A vector embedding is a numerical representation of a piece of text in which the meaning is captured as a sequence of numbers. Texts with similar meanings have vectors that sit close together. Embeddings are the technology behind semantic search and RAG.

How are embeddings created?
You send text to an embedding model, usually via an API (such as text-embedding-3-small from OpenAI). The model converts the text to a vector of hundreds to thousands of numbers (1,536 by default for that model). You store that vector in a vector database. For every search query, the query is also converted to a vector and compared with the stored vectors.

What is the difference between an embedding and a token?
A token is the smallest text unit a model processes: a piece averaging roughly four characters in English. An embedding is a mathematical representation of a piece of text, from a single word to a whole document, that summarises its semantic meaning in a single vector. Tokens are the units in which a model reads and generates text; embeddings are used for comparison and search.