What is it?
Top-p, also known as nucleus sampling, is a setting that guides an AI model's word selection by establishing a dynamic threshold. Rather than using a fixed number of candidate words, the system recalculates for each position which words qualify: the most probable words are added one by one until their combined probability reaches the set value, and that group forms the nucleus.
A top-p of 0.9 means the model chooses from the most probable words that together account for 90 per cent of the probability mass. Everything else is excluded, producing a filtered but not overly restricted selection.
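As a rough illustration of the 0.9 example, the Python sketch below builds the nucleus for a single position. The words and their probabilities are invented for illustration, and the helper name `nucleus` is hypothetical rather than part of any particular AI tool.

```python
def nucleus(probs, top_p):
    """Return the most probable words whose combined probability reaches top_p."""
    # Rank candidate words from most to least probable.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for word, prob in ranked:
        chosen.append(word)
        cumulative += prob
        if cumulative >= top_p:
            break  # the nucleus is complete; everything after this is excluded
    return chosen


# Made-up probabilities for the next word in a property description.
next_word_probs = {
    "spacious": 0.45,
    "bright": 0.30,
    "charming": 0.17,
    "modern": 0.05,
    "purple": 0.03,
}

print(nucleus(next_word_probs, 0.9))  # ['spacious', 'bright', 'charming']
```

With these numbers, the first three words already cover 92 per cent of the probability mass, so "modern" and "purple" fall outside the nucleus.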
Why it matters for SMEs
Top-p is a technical setting that most SME users never adjust manually, but it does influence the tone and variation of AI output. Understanding what it does helps with diagnosing problems and choosing the right pre-configured AI tool.
- A top-p that is too low makes the model repetitive: it cycles through the same phrasings, which is noticeable in texts that should sound lively, such as client communications or marketing copy.
- A top-p that is too high allows the model to make broader choices, increasing creativity but also raising the chance of unusual or incoherent sentences.
- Top-p works in combination with temperature, and together the two settings define the character of the output: temperature sharpens or softens the probability distribution, top-p determines how many options are considered at all.
For most business applications, the default settings are sufficient and top-p should only be adjusted when temperature alone does not produce the desired result.
How it works
Top-p works by finding, for each token position, the cutoff point at which the cumulative probability of the most likely tokens reaches the set value.
- The model calculates a probability distribution across all possible tokens for the next position.
- Tokens are ranked from most probable to least probable.
- The model adds up probabilities from the top down until the sum reaches the set top-p value.
- All tokens outside that boundary are excluded from selection.
- From the remaining tokens, the model picks the next one at random in proportion to their probabilities, which are themselves shaped by the temperature setting.
The result is a dynamic window that is narrower when the most probable option is already dominant, and broader when probabilities are more evenly spread. This keeps output coherent where a choice is obvious, while giving the model more latitude where multiple options are genuinely plausible.
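The steps above, and the way the window narrows or widens, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not how any specific product implements it: the function names `top_p_filter` and `sample` are hypothetical, the probabilities are invented, and the renormalisation step (rescaling the kept probabilities so they sum to 1 again) is an assumption about typical behaviour.

```python
import random


def top_p_filter(probs, top_p):
    """Keep the most probable tokens until their cumulative probability
    reaches top_p, then rescale the kept probabilities to sum to 1."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, prob in ranked:
        kept[token] = prob
        cumulative += prob
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: prob / total for token, prob in kept.items()}


def sample(probs, rng):
    """Draw one token at random, weighted by its probability."""
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]


# Invented distributions: one with a dominant option, one evenly spread.
peaked = {"the": 0.92, "a": 0.04, "this": 0.02, "one": 0.01, "that": 0.01}
flat = {"bright": 0.24, "spacious": 0.22, "modern": 0.20,
        "charming": 0.18, "quiet": 0.16}

rng = random.Random(0)  # fixed seed so repeated runs give the same draw
for name, dist in (("peaked", peaked), ("flat", flat)):
    window = top_p_filter(dist, 0.9)
    print(name, "window size:", len(window), "sampled:", sample(window, rng))
```

With the same top-p of 0.9, the peaked distribution leaves a window of one token, while the evenly spread one keeps all five candidates in play.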
Example in practice
Picture a real estate agent using an AI tool to write property descriptions. With a standard top-p of 0.9, the model produces readable, varied texts without generating incoherent sentences. If the agent lowers top-p to 0.5, the texts become safer but also more repetitive: the model keeps returning to the same descriptive phrases. Raising it to 0.99 can make the texts more creative but occasionally introduces an odd word choice that needs manual correction.
Comparison and misconceptions
Temperature reshapes the whole probability distribution like a global dial, making the likeliest tokens more or less dominant; top-p trims away the least likely options like a selection threshold. Together they define the model's creative range. The recommended starting point: adjust temperature first and leave top-p at its default until that proves insufficient.
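To make the difference concrete, the Python sketch below applies temperature first and then counts how many tokens a top-p of 0.9 would keep. It assumes the common convention that temperature divides the model's raw scores (logits) before they are turned into probabilities; the scores and the function names are invented for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or softened by temperature."""
    scaled = [value / temperature for value in logits.values()]
    peak = max(scaled)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return {token: e / total for token, e in zip(logits, exps)}


def nucleus_size(probs, top_p):
    """Count how many of the most probable tokens are needed to reach top_p."""
    cumulative = 0.0
    for count, prob in enumerate(sorted(probs.values(), reverse=True), start=1):
        cumulative += prob
        if cumulative >= top_p:
            return count
    return len(probs)


# Invented raw scores for five candidate tokens.
logits = {"bright": 3.0, "spacious": 2.0, "modern": 1.0,
          "quirky": 0.0, "purple": -1.0}

for temperature in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    top = max(probs.values())
    print(f"temperature={temperature}: top token {top:.2f}, "
          f"tokens kept at top-p 0.9: {nucleus_size(probs, 0.9)}")
```

With these scores, a low temperature concentrates probability on the leading token so fewer tokens are needed to reach 0.9, and a high temperature spreads it out so more tokens pass the same threshold. Top-p itself never changes the ranking; it only decides where the list is cut off.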

