What is it?
An embedding is a numerical vector, a sequence of numbers, that captures the meaning of a piece of text. Texts with similar content get vectors that sit close together in mathematical space; texts with different meanings sit further apart. This makes it possible to search by meaning rather than exact word match.
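To make "close together" concrete, here is a minimal sketch that measures closeness with cosine similarity, one common choice of measure. The three-number vectors are hand-made for illustration and do not come from a real embedding model; real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy vectors (assumption: not produced by any real model).
invoice = [0.9, 0.1, 0.0]
bill = [0.8, 0.2, 0.1]
holiday = [0.0, 0.2, 0.9]

# Related texts should score higher than unrelated ones.
print("invoice vs bill:   ", round(cosine_similarity(invoice, bill), 3))
print("invoice vs holiday:", round(cosine_similarity(invoice, holiday), 3))
```

With these toy values, the pair "invoice" and "bill" scores far higher than "invoice" and "holiday", which is the behaviour a real embedding model aims for on texts with related meaning.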
Embeddings are generated by embedding models, such as those offered by OpenAI or Cohere, and are typically stored in a vector database such as Pinecone, Weaviate, or pgvector (an extension for PostgreSQL). They form the basis of semantic search, retrieval-augmented generation (RAG, where a language model's answers are grounded in your own documents), and recommendation systems. For SMEs they become relevant as soon as you want to search large volumes of unstructured text: files, CVs, contracts, or emails.
Why it matters for SMEs
Traditional search functions fail when users phrase things differently from what is in the document. Embeddings address that problem by comparing the meaning of texts rather than their exact wording. That is why modern document assistants, candidate matching systems, and knowledge bases in SMEs search more reliably than a classic Ctrl+F.
- Search by meaning, not by words. A question like 'Which contract clause covers liability for delays?' finds the right passage even if the word 'liability' does not appear verbatim.
- Candidate matching without exact keywords. A recruiter searching for 'financial controller with ERP experience' also retrieves candidates who write 'SAP user' or 'accountant with systems experience' in their CV.
- Less manual document searching. Embeddings make it possible to render dozens or hundreds of documents semantically searchable without manual tagging or categorisation.
For any SME working with large volumes of text that need to be searchable and usable for AI, embeddings are a core technical building block.
How it works
Embeddings are generated by an embedding model that converts text into a vector of hundreds or thousands of numbers. That vector is stored in a vector database. When a search query arrives, the query is converted to a vector by the same model, and the database finds the stored vectors closest to it, returning the most relevant documents first.
- Process text: documents, CVs, contracts, or emails are split into manageable fragments (chunks).
- Generate embedding: each fragment is passed through an embedding model, which returns a numerical vector.
- Fill the vector database: the vectors are stored in a system like Pinecone, Weaviate, or pgvector, linked to the original text.
- Search: an incoming question or search term is also converted to a vector and compared with the stored vectors.
- Retrieve results: the fragments with the smallest distance to the query vector are returned as search results or as context for a language model.
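The five steps above can be sketched end to end in a few lines. This is an illustration under stated assumptions, not a production setup: `toy_embed` is a hypothetical stand-in that only counts words, so it matches on shared vocabulary rather than on meaning, and a plain Python list plays the role of the vector database. In a real system, both would be replaced by an embedding model and a store such as the ones named above.

```python
import math
import re
from collections import Counter

def toy_embed(text, vocabulary):
    """Hypothetical stand-in for an embedding model: word counts over a fixed vocabulary."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return [float(counts[word]) for word in vocabulary]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # an all-zero vector matches nothing

# 1. Process text: each string here is already one short chunk.
chunks = [
    "The supplier is liable for damages caused by late delivery.",
    "Invoices must be paid within thirty days of receipt.",
    "Either party may terminate the contract with three months notice.",
]

# 2. Generate embeddings: the toy model needs a vocabulary built from the chunks.
vocabulary = sorted({w for c in chunks for w in re.findall(r"[a-z]+", c.lower())})

# 3. Fill the "database": a list of (vector, original text) pairs.
index = [(toy_embed(chunk, vocabulary), chunk) for chunk in chunks]

def search(query, top_k=2):
    # 4. Search: embed the query the same way as the stored chunks.
    query_vector = toy_embed(query, vocabulary)
    scored = [(cosine_similarity(query_vector, vec), text) for vec, text in index]
    # 5. Retrieve results: highest similarity (smallest distance) first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

results = search("When must invoices be paid?")
for score, text in results:
    print(round(score, 2), text)
```

The structure stays the same when the toy parts are swapped out: the same model embeds both documents and queries, and each stored vector remains linked to its original text so that the text can be returned.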
The quality of the embeddings depends on the chosen model and the quality of the source data. Good chunking and clean documents matter at least as much as the technical implementation.
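As one illustration of chunking, the sketch below splits a text into fixed-size word windows that overlap, so that a sentence cut at a boundary still appears whole in a neighbouring chunk. The function name `chunk_words` and the default sizes are assumptions chosen for the example; suitable values depend on the documents and the embedding model, and many systems split on paragraphs or headings instead.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into chunks of chunk_size words; consecutive chunks share overlap words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

# Small demonstration with placeholder words.
sample = " ".join(f"word{i}" for i in range(12))
pieces = chunk_words(sample, chunk_size=5, overlap=2)
for piece in pieces:
    print(piece)
```

Each printed chunk starts with the last two words of the previous one, which is the overlap at work.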
Example in practice
Picture an accounting firm that has built up hundreds of client files over several years, filled with PDF reports, correspondence, and VAT summaries. Embeddings make all of these documents searchable by content. A staff member types: 'Which clients applied for a deferral on their corporation tax return in 2024?' The system compares the query vector with the document vectors and retrieves the relevant files, even when they do not contain the word 'deferral' literally.
Comparison and misconceptions
Embeddings and vector databases are sometimes treated as the same thing, but they play different roles: embeddings are the building block for semantic search, and a vector database is the storage for those embeddings. You need both. The embeddings determine how accurately meaning is captured; the vector database determines how quickly and efficiently you can search them.

