Text vs. voice AI screening: Which format is right for your hiring?

2026-06-03

Patricia Hyde

Text or voice? It sounds like a simple choice, but the format of your AI screening tool shapes candidate experience, completion rates, and recruiter workflow. Neither is universally better; the right choice depends on your roles, your candidates, and what you need the process to do.

Why does the format of your AI screening tool matter?

The format you choose directly affects how many candidates complete the screening and how well the process fits their situation. When a candidate receives an automated screening invitation, what happens next depends entirely on format. A text-based tool opens a chat on their phone or laptop. A voice agent conducts a spoken interview by phone or browser audio. Both deliver structured, consistent screening at scale. They just do it differently and for different contexts.

How does text-based conversational AI work in hiring?

Text-based conversational AI runs a structured interview through a written chat where candidates read each question and type their answer, at their own pace, on any device. There is no phone call to miss and no fixed time slot required. A candidate working a night shift can complete their interview at 11pm on their phone, and a scored shortlist is ready for the recruiter the next morning.

This format tends to work well for high-volume roles where candidates are balancing shift work, family commitments, or multiple jobs. Across Hubert text-based deployments, 96% of candidates complete their interview, around 70% on mobile, and more than 60% outside traditional office hours. Text also produces a verbatim record of every answer with a clean, auditable trail that makes it straightforward to show exactly what was asked, what was said, and how it was scored.

How do AI voice screening agents work and when are they the better fit?

Voice screening agents conduct interviews through a spoken exchange, either by phone or browser audio. The candidate speaks and the system assesses their response. For roles where verbal communication is the job, such as contact centre agents, sales staff, and other customer-facing positions, a voice screen is a direct test of what the role actually requires. Hearing how a candidate communicates under real conditions is genuinely useful data.

Voice is also a strong fit when candidates find written responses less natural, for example where the interview language is not their first language, or where accessibility for candidates with reading difficulties is a priority. In those contexts, voice removes barriers that text would introduce, and the conversation can feel more human and less transactional.

Which format produces better candidate experience?

Both formats can deliver strong candidate experience. However, what matters most is whether the format fits the candidate's situation. Hubert records a 9/10 average candidate satisfaction score across deployments in both chat and voice. Candidates consistently value being assessed on what they actually said, receiving clear questions, and completing the process on their own terms.

Text tends to score well with candidates who need flexibility: those who cannot take a call at a set time or who prefer to think through written answers. Voice tends to score well with candidates who find speaking more natural than typing, or who are applying for roles where they expect a conversation. Matching the format to the candidate population is the most reliable way to keep experience scores high across the board.

What do regulators expect from AI screening tools?

Regardless of format, recruitment AI is classified as high-risk under the EU AI Act, and both text and voice tools need to meet the same core requirements. Legal defensibility rests on consistent assessment, explainable scoring, and a complete audit trail.

Text-based AI produces a verbatim written record by design, which makes the audit trail straightforward to construct. Voice-based AI can meet the same standard when the transcription and scoring architecture is robust. The key is ensuring that what the candidate said is accurately captured and that the scoring is tied directly to that record. Whichever format you use, the underlying assessment needs to be deterministic: same input, same output, every time, with full explainability at the point of decision.
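To make "same input, same output" concrete, here is a minimal sketch of deterministic scoring tied to a verbatim record. It is an illustration only, not Hubert's scoring method: the keyword rubric, the competency names, and the score_answer function are all hypothetical, and a real assessment would use far richer criteria. The point is the shape: fixed rules, a stored transcript, the evidence behind each score, and a fingerprint that lets an auditor confirm the record has not changed.

```python
import hashlib
import json

# Hypothetical rubric (an assumption for illustration):
# each competency maps to keywords that earn one point each.
RUBRIC = {
    "customer_focus": ["customer", "listen", "resolve"],
    "reliability": ["on time", "shift", "deadline"],
}


def score_answer(transcript: str) -> dict:
    """Score a transcript with fixed rules and return an auditable record."""
    text = transcript.lower()
    # Record which keywords matched, so every point can be explained.
    matched = {
        competency: [kw for kw in keywords if kw in text]
        for competency, keywords in RUBRIC.items()
    }
    scores = {competency: len(hits) for competency, hits in matched.items()}
    record = {
        "transcript": transcript,  # verbatim answer, text or transcribed voice
        "matched": matched,        # evidence behind each score
        "scores": scores,
        "total": sum(scores.values()),
    }
    # A hash over the sorted record gives a stable fingerprint for the audit trail.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["fingerprint"] = hashlib.sha256(payload).hexdigest()
    return record
```

Calling score_answer twice on the same transcript returns identical records, including the fingerprint. No randomness or hidden state enters the scoring, which is the property the audit trail depends on, whether the transcript came from a chat or from a transcribed voice interview.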

How do you choose the right AI screening format for your hiring?

Start with your candidates and your roles. The format that fits them best is the right one. For high-volume roles where candidates need maximum scheduling flexibility, text tends to drive higher completion rates. For roles where verbal communication is the core competency, voice provides more direct evidence of job-relevant skills. Some organizations use both, for example text for one role type and voice for another, and adjust as they learn what works.

Hubert supports both chat and voice screening across 30+ languages, which means you are not locked into one format. The same structured, competency-based assessment runs across both, scored by deterministic models, delivered directly into your ATS as a ranked, auditable shortlist.

If you want to see how each format performs with your roles and candidate population, book a demo with the Hubert team.