What is it?
A token is the smallest unit of text that an AI model processes. For English text in most models, a token averages around four characters, or about three-quarters of a word. A ten-word sentence therefore typically comes to thirteen to fifteen tokens.
Tokens are the basic unit of everything a language model does: every word you type is split into one or more tokens, and every word the model produces is built up from tokens. The limit of the context window, the cost of API usage, and the speed of a model (in tokens per second) are all measured in tokens.
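The rule of thumb above can be turned into a quick estimate. The sketch below is only a rough approximation, assuming the four-characters and three-quarters-of-a-word averages quoted in the text; the function names are made up for illustration, and an exact count requires the tokeniser of the specific model.

```python
import math

# Averages quoted in the text for English; not a property of any specific model.
CHARS_PER_TOKEN = 4
WORDS_PER_TOKEN = 0.75


def estimate_tokens_from_chars(text: str) -> int:
    """Estimate the token count from the number of characters."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text: str) -> int:
    """Estimate the token count from the number of whitespace-separated words."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


# A ten-word example sentence.
sentence = "Please send me the overdue invoices from last month today"
print("estimate from words:", estimate_tokens_from_words(sentence))
print("estimate from characters:", estimate_tokens_from_chars(sentence))
```

The two estimates will usually differ slightly, which is a reminder that both are averages and not measurements.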
Why it matters for SMEs
For an SME owner using AI tools, tokens determine two practical things: how much fits into a conversation and what it costs.
- The context window of a model, its maximum working memory per session, is capped in tokens. Long documents, extended email threads, or large datasets need to be split or summarised to stay within that limit.
- API usage for AI services is billed per token: the more text in and out, the higher the cost. Concise, well-targeted instructions are therefore also economically sensible.
- Speed and capacity planning depend on token volume: an agent processing thousands of invoices daily generates a high token volume that needs to be accounted for in the architecture.
You do not need to count tokens to work effectively with AI, but the concept helps explain why a model cannot process a long document in one pass, or why a particular task costs more than expected.
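To illustrate the first point, here is a minimal sketch of splitting a long text so that each piece stays within a token budget. It assumes the four-characters-per-token average as a stand-in for a real tokeniser; the function name and the budget are hypothetical, and a production setup would count tokens with the tokeniser of the model in use.

```python
def split_to_fit(text: str, max_tokens: int, chars_per_token: float = 4.0) -> list[str]:
    """Split text into word-aligned chunks whose estimated size stays within max_tokens.

    The size check is a character-based estimate, not an exact token count.
    A single word longer than the budget is kept whole in its own chunk.
    """
    max_chars = int(max_tokens * chars_per_token)
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for word in text.split():
        # The extra 1 is the space that joins this word to the previous one.
        added = len(word) + (1 if current else 0)
        if current and current_len + added > max_chars:
            chunks.append(" ".join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += added
    if current:
        chunks.append(" ".join(current))
    return chunks


# Example: a 100-word text split against a (deliberately tiny) 10-token budget.
long_text = "word " * 100
pieces = split_to_fit(long_text, max_tokens=10)
print(len(pieces), "chunks")
```

Splitting on word boundaries keeps the sketch simple; in practice it often works better to split on paragraphs or sections so that each chunk remains meaningful on its own.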
How it works
Language models work not with letters or words but with tokens, which are determined by a tokeniser: an algorithm that splits text into pieces the model recognises from its training data.
- The text you enter is split by the tokeniser into tokens, each a chunk the model treats as a single unit.
- Each token is assigned a numerical ID that the model uses when processing the input.
- The model generates output token by token, with each newly generated token influencing the next.
- The generated token IDs are translated back into readable text by the tokeniser.
Because the tokeniser differs by model, the same piece of text can contain a different number of tokens in GPT-4 than in Claude or Gemini. That matters when comparing budgets or limits across different AI services.
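The encode and decode steps, and the effect of different tokenisers, can be sketched with a toy example. The two vocabularies below are invented for illustration, and the greedy longest-match rule is a simplification: real tokenisers learn vocabularies of many thousands of entries from training data and use more elaborate algorithms. The generation step in between is not modelled here.

```python
def build_vocab(pieces: list[str]) -> dict[str, int]:
    """Assign each text piece a numerical ID (its position in the list)."""
    return {piece: idx for idx, piece in enumerate(pieces)}


def encode(text: str, vocab: dict[str, int]) -> list[int]:
    """Split text into token IDs, always taking the longest piece that matches."""
    ids = []
    pos = 0
    while pos < len(text):
        for length in range(len(text) - pos, 0, -1):
            piece = text[pos:pos + length]
            if piece in vocab:
                ids.append(vocab[piece])
                pos += length
                break
        else:
            raise ValueError(f"no token in the vocabulary matches at position {pos}")
    return ids


def decode(ids: list[int], vocab: dict[str, int]) -> str:
    """Translate token IDs back into readable text."""
    id_to_piece = {idx: piece for piece, idx in vocab.items()}
    return "".join(id_to_piece[i] for i in ids)


# Two hypothetical tokenisers: one with multi-character pieces, one with single characters only.
COARSE = build_vocab(["in", "voice", "s", " ", "a", "re", "due"])
FINE = build_vocab(list("invoces ardu"))

text = "invoices are due"
coarse_ids = encode(text, COARSE)
fine_ids = encode(text, FINE)
print(len(coarse_ids), "tokens with the coarse vocabulary")
print(len(fine_ids), "tokens with the fine vocabulary")
print(decode(coarse_ids, COARSE))
```

The same text yields a different token count under each vocabulary, which is the reason token budgets are not directly comparable between services.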
Example in practice
Picture an accounting practice processing dozens of purchase invoices every day via an AI agent. The agent reads each invoice, extracts the supplier, amount, and due date, and posts that data to the accounting package. Each A4 invoice contains roughly 400 to 600 tokens. At fifty invoices a day, the agent reads roughly 20,000 to 30,000 tokens daily for the invoice text alone, before counting its instructions and its output. That volume is useful to know when choosing an AI service and estimating monthly costs.
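The arithmetic behind such an estimate can be sketched as follows. The number of working days and the price per million tokens are placeholder assumptions, not figures from any provider, and only the invoice text is counted; instructions and generated output add to the total.

```python
def monthly_tokens(invoices_per_day: int, tokens_per_invoice: int, working_days: int) -> int:
    """Input tokens per month for the invoice text alone."""
    return invoices_per_day * tokens_per_invoice * working_days


def cost(tokens: int, price_per_million_tokens: float) -> float:
    """Cost of a token volume at a given price per million tokens."""
    return tokens / 1_000_000 * price_per_million_tokens


INVOICES_PER_DAY = 50          # from the example
WORKING_DAYS = 22              # assumption
PRICE_PER_MILLION = 3.00       # placeholder; check the price list of the chosen service

low = monthly_tokens(INVOICES_PER_DAY, 400, WORKING_DAYS)
high = monthly_tokens(INVOICES_PER_DAY, 600, WORKING_DAYS)
print(f"monthly input tokens: {low:,} to {high:,}")
print(f"estimated monthly cost: {cost(low, PRICE_PER_MILLION):.2f} to {cost(high, PRICE_PER_MILLION):.2f}")
```

Working with a low and a high figure, instead of a single average, gives a range that is easier to compare against a budget.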
Comparison and misconceptions
A token is not the same as a word, although treating it as one is the fastest mental model for everyday use. More precisely, since a token averages about three-quarters of a word, a word averages around one and a third tokens in English text. Tokens are the unit of measurement for AI models; words are the unit of measurement for people.

