What is a deterministic model in hiring and why does it matter?

2026-07-02

Fredrik Östgren

Imagine rejecting a candidate on Monday who would have passed on Tuesday, with nothing changed but luck and timing. That is exactly what happens when the wrong kind of AI scores your candidates. The difference comes down to one word recruiters are about to hear a lot more often: deterministic. Here is what it means, in plain language, and why it should shape your next vendor conversation.

What is a deterministic model in hiring?

A deterministic model scores every candidate response against fixed, predefined criteria, so identical answers always receive identical scores. Think of it like a scoring rubric that has been turned into software. The criteria are set in advance, the weighting of each criterion is explicit, and the model applies them the same way for candidate number 1 and candidate number 10,000.

That predictability is the point. If a candidate asks why they scored what they did, there is a real answer: this response, measured against this criterion, produced this score. Nothing is left to chance, and nothing changes based on the time of day or which server handled the request.
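
To make the rubric idea concrete, here is a minimal Python sketch of a scorer with fixed criteria and explicit weights. Everything in it is an assumption made for illustration: the criterion names, the point values, and the crude keyword check are hypothetical and are not how any particular vendor's model works. What it shows is the shape of the idea: the same answer always produces the same score, and the breakdown returned with the score is the actual logic behind it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    points: int                 # explicit weight: points awarded when the criterion is met
    keywords: tuple[str, ...]   # hypothetical evidence the criterion looks for


# Hypothetical rubric, fixed before any candidate is scored. Points sum to 100.
RUBRIC = (
    Criterion("customer_focus", 50, ("customer", "client")),
    Criterion("teamwork", 30, ("team", "colleague")),
    Criterion("reliability", 20, ("deadline", "on time")),
)


def score_response(
    answer: str, rubric: tuple[Criterion, ...] = RUBRIC
) -> tuple[int, dict[str, int]]:
    """Return the total score and the per-criterion breakdown that produced it."""
    text = answer.lower()
    breakdown = {
        c.name: c.points if any(k in text for k in c.keywords) else 0
        for c in rubric
    }
    return sum(breakdown.values()), breakdown


answer = "I stayed late to help a customer and still met the deadline."
first = score_response(answer)
second = score_response(answer)
print(first)            # (70, {'customer_focus': 50, 'teamwork': 0, 'reliability': 20})
print(first == second)  # True: no randomness anywhere in the scoring path
```

A real assessment model would use far richer criteria than keyword matching, but the property that matters here is the same: the score is a pure function of the answer and the rubric.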

What is a probabilistic model?

A probabilistic model, a category that includes every large language model (LLM), generates its output by predicting likely words rather than applying fixed rules. That is what makes LLMs brilliant at conversation: they are creative, flexible, and human-sounding. But it also means their output naturally varies. Ask an LLM to score the same interview answer twice and you can get two different numbers, each accompanied by a confident-sounding justification.

For writing a job ad, that variability is harmless. For deciding which humans move forward in a hiring process, it is a serious problem.

What happens when you score candidates with an LLM?

When you score candidates with an LLM, the same person can pass or fail depending on luck. This is not theoretical. In June 2026, engineer Dan Kinsky independently tested a widely shared open-source AI screening tool built on LLM scoring. He ran his own CV through it 100 times. The scores ranged from 66 to 99 out of 100. With a pass threshold of 85, the identical CV would have been rejected roughly two thirds of the time. He described the result as hiring turned into a "luck filter."

Two details from his test matter for recruiters. First, lowering the model's randomness setting (usually called temperature), even to zero, did not make the scores consistent; the variability is built into how LLMs work, not a settings problem. Second, when he switched to a more powerful commercial model, the scores clustered tighter but still varied enough that the same CV kept landing on both sides of a realistic cutoff. Academic research has found the same pattern: LLMs produce meaningfully different rankings for identical candidate inputs across runs.
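
A repeat test like this is simple to summarize once you have the scores. The sketch below assumes you have already collected the scores from several runs on the same CV; the numbers in it are made-up placeholders, not data from the test described above, and the function name is hypothetical. It reports the spread of the scores and the share of runs that would have fallen below a pass threshold.

```python
def summarize_runs(scores: list[int], threshold: int) -> dict[str, float]:
    """Summarize repeated scores for one identical input against a pass threshold."""
    rejected = sum(1 for s in scores if s < threshold)
    return {
        "min": min(scores),
        "max": max(scores),
        "spread": max(scores) - min(scores),
        "rejection_rate": rejected / len(scores),
    }


# Placeholder scores for ten runs on the same CV (illustrative only).
hypothetical_scores = [66, 72, 78, 81, 84, 86, 90, 93, 97, 99]
summary = summarize_runs(hypothetical_scores, threshold=85)
print(summary)
# {'min': 66, 'max': 99, 'spread': 33, 'rejection_rate': 0.5}
```

For a deterministic scorer the spread is always 0, so the rejection rate for a given CV can only be 0 or 1. Anything in between means the outcome depends on which run the candidate happened to get.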

Why does this matter for bias and fairness?

Inconsistent scoring is unfair by definition, because a candidate's outcome no longer depends only on their answers. When identical input can produce different results, the process is arbitrary, and arbitrary processes cannot be audited for bias. You cannot measure whether a system treats groups of candidates differently if it does not even treat the same candidate consistently.

There is a second, subtler problem: explainability. When an LLM is asked to justify a score, it generates a plausible-sounding explanation after the fact. That narrative is not necessarily the actual reason behind the number. Deterministic models are different; the explanation shown to a recruiter is the exact logic that produced the score. Faithful, not just plausible.

Why does this matter legally?

Consistency and explainability are regulatory requirements, not nice-to-haves, because the EU AI Act classifies AI used in recruitment as high-risk. High-risk systems must be transparent enough for humans to interpret their outputs, maintain an appropriate level of accuracy, and operate under effective human oversight. A scoring system that cannot repeat its own results, or explain them faithfully, makes those obligations very hard to meet. If a rejected candidate or a regulator asks why a decision was made, "the model felt differently that day" is not a defensible answer.

Does deterministic mean the interview feels robotic?

No. The interview experience and the scoring are two separate layers, and only the scoring needs to be deterministic.

At Hubert, conversational AI handles the dialogue, so candidates get a warm, natural interview in chat or voice, while deterministic proprietary models handle the assessment behind it. The conversation feels human; the scoring is auditable. You do not have to choose between the two. Candidates seem to agree: structured AI interviews built this way earn 9/10 candidate satisfaction on average.

What should recruiters ask their AI vendor?

Ask one simple question first: if the same candidate gave the same answers twice, would they get the same score? If the vendor cannot give you a straight yes, assume the scoring is probabilistic, and that every promise about fairness, auditability, and compliance is built on unstable ground. Then follow up with two more questions: can you show me the exact criteria and weightings behind a score, and does a human always make the final decision?
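
If a vendor gives you access to a scoring endpoint or sandbox, the first question can be checked rather than taken on trust. The Python sketch below shows one way such a check might look. The two scorers are toy stand-ins written for this example: `fixed_scorer` is a hypothetical deterministic scorer, and `drifting_scorer` imitates run-to-run variation with a fixed cycle of offsets so the example behaves the same every time. In practice you would pass in a function that calls the vendor's actual scoring interface.

```python
from itertools import cycle
from typing import Callable


def is_repeatable(score_fn: Callable[[str], int], answer: str, runs: int = 20) -> bool:
    """Score the same answer several times and report whether every score matched."""
    scores = {score_fn(answer) for _ in range(runs)}
    return len(scores) == 1


def fixed_scorer(answer: str) -> int:
    # Toy deterministic rule (assumption): 10 points per word, capped at 100.
    return min(100, 10 * len(answer.split()))


_drift = cycle((0, -7, 5))  # stand-in for the variation a probabilistic scorer shows


def drifting_scorer(answer: str) -> int:
    # Same toy rule, plus an offset that changes from call to call.
    return fixed_scorer(answer) + next(_drift)


answer = "I led a team of five"
print(is_repeatable(fixed_scorer, answer))     # True
print(is_repeatable(drifting_scorer, answer))  # False
```

Passing this check does not prove a system is fair, but failing it tells you the scoring cannot be audited in the way the vendor may be claiming.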

Deterministic scoring is not a technical detail. It is the difference between a hiring process you can stand behind and one you have to hope nobody looks at too closely.