Fine-tuning

What is it?

With fine-tuning you take an already trained language model, such as GPT-4o or an open-source variant like Mistral or Llama, and train it further on a limited set of examples that are typical of your task or domain. The model learns the desired output style, terminology, tone, or classification logic without losing its general knowledge.

Fine-tuning is not the same as training a model from scratch. The starting point is an existing, already capable model; fine-tuning refines its behaviour for a specific purpose with considerably less data and computing power. For SMEs, fine-tuning is relevant when you want a model to always write in a fixed style, consistently classify invoice lines, or format CV summaries according to your own template.

Why it matters for SMEs

A generic language model does many things well, but does not always perform consistently on specialised or highly structured tasks. If the model answers similarly worded questions differently each time, or if the tone or format does not match your house style, fine-tuning is an option to sharpen that. It is the choice for behaviour and style rather than knowledge storage.

Consistent output for repeating tasks. A fine-tuned model gives similar output for similar input, which is critical for invoice classification, summary generation, or standard correspondence.
Domain-specific terminology without explanation. When you write 'VVE contribution', 'inlener tariff', or 'ERP booking', a fine-tuned model immediately understands the context without you describing it in every prompt.
Less prompt engineering needed. By training behaviour into the model rather than directing it every time through long instructions, prompts become shorter and more reliable.

Fine-tuning is the right approach when the task is stable and repeating and you do not need continuously updated document sources. As soon as the answer depends on current or client-specific information you want to trace to a source, document grounding (RAG) is the better choice.

How it works

Fine-tuning requires a training set of input-output pairs that demonstrate the desired model behaviour. The more and higher-quality examples, the better the result. The process typically runs through a platform or API from the model provider, such as OpenAI's fine-tuning API or open-source tools like Hugging Face.

Compile data: gather hundreds to thousands of examples of the task, each consisting of an input and the ideal output the model should learn to produce.
Clean data: remove inconsistencies, correct errors, and ensure the examples are representative of all situations the model will encounter.
Start the training run: the data is passed through the base model via the fine-tuning API or training tool, adjusting the model parameters.
Evaluate: the fine-tuned model is tested on a separate evaluation set to assess whether the desired behaviour has been learned and whether it has not overfitted on the training data.
Deploy and monitor: the fine-tuned model replaces or supplements the generic model in the production environment, and performance is tracked to determine when retraining is needed.

Retraining is needed as soon as the task changes or the data becomes stale. Fine-tuning is not a one-off intervention but a maintenance cycle.

Example in practice

Picture a staffing agency that wants its AI assistant to always produce CV summaries in a fixed structure: key qualifications first, then relevant work experience, then availability. A generic model writes this differently each time. After fine-tuning on a few hundred approved summaries from the internal database, the model consistently delivers the correct format in the agency's house style, without the recruiter having to describe the structure in every prompt.

Comparison and misconceptions

Fine-tuning adjusts the model's behaviour and style based on training data; document grounding (RAG) lets the model answer from current, supplied documents. Fine-tuning is better for consistent formatting and stable task logic; RAG is better when the answer depends on changing or client-specific information you want to trace back to a source.

Frequently asked questions

What is fine-tuning and how does it differ from RAG?

Fine-tuning adjusts the model itself on your data; RAG gives the model your data as context with each question. Fine-tuning changes how the model thinks and writes; RAG changes what the model knows at the moment of the question. For business-specific knowledge that changes regularly, RAG is almost always the better choice.

When is fine-tuning worth it for an SME?

When you need a consistent writing style, terminology, or reasoning approach that you cannot reliably enforce through prompt instructions alone. Think of legal language, a strongly recognizable brand voice, or domain-specific classifications. Fine-tuning takes time, data, and budget; start only when the standard approach falls short.

How expensive and time-consuming is fine-tuning?

That depends on the model and the volume of training data. Fine-tuning smaller models via the OpenAI API is affordable; larger models and more data cost more. Budget a few hundred euros for a first experiment, plus the time to prepare the training data. The latter is often the most labour-intensive part.

What is it?

Why it matters for SMEs

How it works

Example in practice

Comparison and misconceptions

Frequently asked questions

Curious what AI
can do for your processes?

Stay up to date with the latest news
and developments in Agentic AI

Fine-tuning

What is it?

Why it matters for SMEs

How it works

Example in practice

Comparison and misconceptions

Frequently asked questions

Explore related terms

Curious what AI can do for your processes?

Stay up to date with the latest news and developments in Agentic AI

Curious what AI
can do for your processes?

Stay up to date with the latest news
and developments in Agentic AI