VOICE AI CANDIDATE SCREENING

Screen 300 Candidates Per Hour With Adaptive Follow-Up Questioning

Ultravox speech-to-speech, Twilio Voice, and a LangGraph dialogue policy that branches on each candidate response. ATS-native scoring across Greenhouse, Lever, Workday, Ashby, and 4 more.

300 calls / hour
22 languages
Sub-300ms turn latency
Evidence-cited scoring

WHY RECRUITING OPS BREAKS AT SCALE

The Five Failure Modes of Phone Screening

We've worked with high-volume retail recruiters, BPO talent ops, and tech-screening teams. The breakdowns are the same.

Screening Backlog Crushes Time-to-Hire

Recruiters take 15-25 minutes per phone screen and triage 40-60 applicants per role

Inconsistent Phone Screens Across Recruiters

Each recruiter improvises questions, applies different rubrics, and writes free-form notes

Multilingual Pipelines Bottleneck on Recruiter Language Coverage

Roles in retail, BPO, and hospitality see candidates across 5-10 languages with no bench depth

Async Video Screens Have Brutal Drop-Off

Pre-recorded one-way video assessments see 45-60% candidate abandonment

High Cost-Per-Screen on High-Volume Roles

Retail, customer support, and seasonal hiring need thousands of screens per month

Screening Backlog Crushes Time-to-Hire

The Pain Point

Recruiters take 15-25 minutes per phone screen and triage 40-60 applicants per role

Business Impact

Top candidates accept other offers before reaching the hiring manager

Real Cost

$8K-15K per role lost to slow time-to-shortlist

Bottom Line: Manual phone screening doesn't scale with applicant volume, and async video screening trades scale for candidate experience.

THE STACK

Engineered for Phone-Quality Audio at Scale

Six named components, each picked for a specific failure mode of the typical voice-AI stack.

Ultravox Unified Speech-to-Speech

Sub-300ms turn-taking on real phone audio. No transcription hop in the critical path, so the conversation actually feels human.

Twilio Voice Telephony

Outbound and inbound calls with carrier-grade quality, MOS telemetry per call, and automatic fallback when packet loss spikes.
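One way to picture the per-call quality fallback is a small decision function over the telemetry. This is a minimal sketch, not the production logic: the thresholds, the `QualitySample` and `Action` names, and the rule that severe packet loss leads to a callback while milder degradation leads to a channel switch are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    SWITCH_CHANNEL = "switch_channel"
    CALLBACK = "callback"


@dataclass(frozen=True)
class QualitySample:
    mos: float               # mean opinion score, 1.0 (bad) to 5.0 (excellent)
    jitter_ms: float
    packet_loss_pct: float


# Hypothetical thresholds, chosen only for illustration.
MIN_MOS = 3.5
MAX_JITTER_MS = 30.0
MAX_PACKET_LOSS_PCT = 3.0
SEVERE_PACKET_LOSS_PCT = 10.0


def decide(sample: QualitySample) -> Action:
    """Map one telemetry sample to a fallback action."""
    # Assumed policy: a severe loss spike ends the attempt and schedules a callback.
    if sample.packet_loss_pct >= SEVERE_PACKET_LOSS_PCT:
        return Action.CALLBACK
    degraded = (
        sample.mos < MIN_MOS
        or sample.jitter_ms > MAX_JITTER_MS
        or sample.packet_loss_pct > MAX_PACKET_LOSS_PCT
    )
    # Assumed policy: milder degradation tries another channel first.
    return Action.SWITCH_CHANNEL if degraded else Action.CONTINUE
```

In practice a decision like this would likely be made over a rolling window of samples rather than a single reading, so that one noisy measurement does not interrupt a healthy call.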

LangGraph Adaptive Dialogue Policy

Dialogue is a state machine, not a fixed script. Senior engineers get deeper system-design probes; entry-level candidates stay on shorter branches.

Anthropic Claude Evaluator

Every scored criterion cites the exact candidate utterance that supports it. The PDF report is defensible against scorecard disputes and audit requests.
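The evidence-citation guarantee can be checked mechanically: a score is only accepted if its quoted evidence actually appears in something the candidate said. The sketch below shows one possible check, assuming a simple `CriterionScore` record and a case- and whitespace-insensitive substring match; both are illustrative choices, not the evaluator's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriterionScore:
    criterion: str
    score: int       # hypothetical 1-5 scale
    evidence: str    # quote attributed to the candidate


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences don't matter."""
    return " ".join(text.lower().split())


def uncited(scores, candidate_utterances):
    """Return the criteria whose evidence is not found in any candidate utterance."""
    haystack = [normalize(u) for u in candidate_utterances]
    missing = []
    for s in scores:
        quote = normalize(s.evidence)
        # An empty quote, or one that appears nowhere in the call, fails the check.
        if not quote or not any(quote in u for u in haystack):
            missing.append(s.criterion)
    return missing


utterances = [
    "I led the migration of our billing service to Kafka.",
    "I can work weekends.",
]
scores = [
    CriterionScore("system_design", 4, "led the migration of our billing service"),
    CriterionScore("availability", 5, "available every night"),
]
print(uncited(scores, utterances))  # ['availability']
```

A score that fails this check could be dropped or sent back for re-evaluation before the report is generated, which keeps unsupported claims out of the PDF.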

22-Language Native Support

Speech-to-speech in 22 languages including Spanish, Mandarin, Arabic, Hindi, Urdu, Tagalog, Portuguese, German, French, Japanese. Code-switching tracked mid-call.

ATS-Native Writeback

Scorecards, transcripts, and PDF reports push directly into Greenhouse, Lever, Workday Recruiting, BambooHR, iCIMS, JazzHR, Ashby, and SmartRecruiters.

ARCHITECTURE

The 7-Stage Screening Pipeline

From recruiter rubric to compliance-ready PDF — the actual flow, not a marketing diagram.

1

Recruiter Defines Role & Rubric

Recruiter posts the role and a weighted scoring rubric with expected evidence patterns per criterion. Templates exist for common role families.
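As a rough illustration of what a weighted rubric might look like as data, the sketch below models each criterion with a weight and its expected evidence patterns, then combines per-criterion scores into one normalized number. The field names, the 1-5 scale, and the example criteria are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    weight: float              # relative importance; weights need not sum to 1
    evidence_patterns: tuple   # phrases or behaviors the recruiter expects to hear


def weighted_score(rubric, scores, scale_max=5):
    """Combine per-criterion scores into a single value between 0 and 1."""
    total_weight = sum(c.weight for c in rubric)
    if total_weight <= 0:
        raise ValueError("rubric weights must sum to a positive number")
    earned = sum(c.weight * scores[c.name] for c in rubric)
    return earned / (total_weight * scale_max)


# Hypothetical rubric for a retail role.
rubric = [
    Criterion("availability", 3.0, ("weekends", "evening shifts")),
    Criterion("customer_handling", 1.0, ("resolved a complaint",)),
]
result = weighted_score(rubric, {"availability": 5, "customer_handling": 1})
```

Here `result` works out to (3 × 5 + 1 × 1) / (4 × 5) = 0.8. Normalizing by the total weight means recruiters can express importance in whatever units feel natural without keeping the weights summed to one.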

2

Calendar Agent Auto-Schedules

An OpenAI-powered scheduling agent reads candidate availability from Google Calendar or Outlook and books the slot without recruiter involvement.

3

Twilio Voice Places the Call

Outbound dial with carrier-grade quality. MOS, jitter, and packet loss monitored per call; degraded calls trigger callback or channel switch.

4

Ultravox Conversation Loop

Unified speech-to-speech runs the conversation with sub-300ms turn latency. Whisper is available as a fallback STT, and ElevenLabs provides premium brand TTS personas.

5

LangGraph Adaptive Follow-Up

Each response routes through an evaluator node. Strong answers branch deeper; weak signals trigger probing follow-ups; off-topic answers re-anchor to the rubric.

6

Real-Time Claude Scoring

Anthropic Claude scores against the rubric live, with every score tied to a cited candidate utterance. No post-hoc transcription review.

7

Compliance-Ready PDF + ATS Writeback

PDF report with evidence citations writes to the ATS. Slack notifies the recruiter when a candidate clears threshold. Sentry tracks production errors end-to-end.

Integration surface

ATS: Greenhouse, Lever, Workday Recruiting, BambooHR, iCIMS, JazzHR, Ashby, and SmartRecruiters. Calendars: Google Calendar and Outlook Calendar. Slack for recruiter notifications. Sentry for production error tracking across the Ultravox, Twilio, and LangGraph layers.

WHERE IT FITS

Four Hiring Pipelines Where Voice AI Wins

Not every role needs voice AI. These four pipelines are where the math works.

High-Volume Recruiting

Retail, hospitality, BPO, and customer support pipelines screening thousands of applicants per month per region.

Multilingual screens in 8-12 languages, availability and reliability scoring, automated reschedule flows for no-shows.

Tech Screening (Senior+)

Competency-based screens for senior backend, ML, and platform engineers before they hit the hiring manager loop.

System design probes, language and framework depth checks, behavioral evidence on prior shipped systems.

Professional Services Hiring

Consulting, accounting, and legal-ops associate pipelines where structured behavioral scoring drives the decision.

Case-style follow-ups, client-facing communication scoring, structured evidence for partner review.

Entry-Level Talent Pipelines

University recruiting, apprenticeships, and early-career programs with thousands of applicants per cohort.

Lower-anxiety conversational format outperforms async video on completion and candidate NPS by 30+ points.

MEASURED OUTCOMES

What Recruiting Ops Actually Sees

Numbers from the production pipeline, not pilot demos.

300 candidates / hour
Throughput at full concurrency across the Twilio + Ultravox pipeline

85% reduction
Time-to-shortlist vs. recruiter-led phone screens

22 languages
Native speech-to-speech with per-language rubric calibration

+12 candidate NPS
Versus asynchronous one-way video screening tools

IMPLEMENTATION

3-Week Pilot, 8-Week Production

Cognilium handles dialogue engineering, ATS integration, and production monitoring. You bring the rubric and the candidates.

1
Week 1

Discovery & Rubric Design

We capture your scoring criteria, evidence patterns, calibration examples, and ATS configuration, and interview recruiters to surface the implicit rubric.

2
Weeks 1-3

Pilot: One ATS, One Role, One Language

50-100 real candidate screens with side-by-side recruiter shadow scoring. The calibration report covers rubric drift and an adverse-impact baseline.

3
Weeks 4-8

Production Rollout

Multi-ATS integration, multilingual rubric pass with native-speaker calibration, Slack and calendar wiring, Sentry production monitoring.

4
Continuous

Ongoing Tuning

Monthly adverse-impact audit, rubric drift detection, latency and MOS regression monitoring, and quarterly model upgrades.
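The page does not say which metric the monthly adverse-impact audit uses. One common screening heuristic is the four-fifths rule from the US EEOC Uniform Guidelines, which flags any group whose selection rate falls below 80% of the highest group's rate. The sketch below assumes that rule; the function names and group labels are hypothetical.

```python
def selection_rates(outcomes):
    """outcomes maps group -> (passed, screened); returns group -> pass rate."""
    return {
        group: passed / screened
        for group, (passed, screened) in outcomes.items()
        if screened > 0
    }


def impact_ratios(outcomes):
    """Each group's selection rate divided by the highest group's rate."""
    rates = selection_rates(outcomes)
    if not rates:
        return {}
    top = max(rates.values())
    if top == 0:
        # Nobody passed in any group, so there is no disparity to measure.
        return {group: 1.0 for group in rates}
    return {group: rate / top for group, rate in rates.items()}


def flagged(outcomes, threshold=0.8):
    """Groups whose impact ratio falls below the four-fifths threshold."""
    ratios = impact_ratios(outcomes)
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Hypothetical monthly counts: (candidates who cleared threshold, candidates screened).
monthly = {"group_a": (50, 100), "group_b": (30, 100), "group_c": (45, 100)}
print(flagged(monthly))  # ['group_b']
```

The four-fifths rule is a rule of thumb rather than a statistical test, so a production audit would likely pair it with significance testing, especially when group sample sizes are small.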
ENGINEERING FAQ

The Questions Engineers Actually Ask

Latency, integration, bias, compliance, fallback behavior — all of it.

The dialogue policy runs as a LangGraph state machine. Each candidate response feeds an evaluator node powered by Anthropic Claude, which decides whether the answer cleared the rubric criterion, needs a probing follow-up, or should trigger a different competency branch. A senior backend engineer gets harder system-design follow-ups; an entry-level retail candidate stays on a shorter availability branch.
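The routing idea can be sketched in plain Python, without LangGraph itself. The example assumes the evaluator emits one of three verdicts (strong, weak, off-topic, as in the pipeline description) and that each role track has a hypothetical cap on how many deeper follow-ups it allows. The names `Verdict`, `next_step`, `MAX_DEPTH`, and the step labels are illustrative, not the production graph.

```python
from enum import Enum


class Verdict(Enum):
    STRONG = "strong"
    WEAK = "weak"
    OFF_TOPIC = "off_topic"


# Hypothetical per-track limits: senior tracks allow more deep follow-ups.
MAX_DEPTH = {"senior_backend": 3, "entry_retail": 1}


def next_step(verdict: Verdict, depth: int, max_depth: int) -> str:
    """Pick the next dialogue node from the evaluator's verdict."""
    if verdict is Verdict.OFF_TOPIC:
        return "re_anchor"        # steer back to the rubric criterion
    if verdict is Verdict.WEAK:
        return "probe"            # ask a probing follow-up
    # Strong answer: go deeper until the track's depth budget is used up.
    return "go_deeper" if depth < max_depth else "next_criterion"


def walk(track: str, verdicts):
    """Replay a sequence of verdicts and return the path taken."""
    depth, path = 0, []
    for verdict in verdicts:
        step = next_step(verdict, depth, MAX_DEPTH[track])
        path.append(step)
        if step == "go_deeper":
            depth += 1
        elif step == "next_criterion":
            depth = 0
    return path
```

In LangGraph, a function like `next_step` would typically serve as the routing function of a conditional edge leaving the evaluator node, with each returned label mapped to the node that asks the corresponding question.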
READY FOR A 3-WEEK PILOT?

Bring Us Your Rubric. We'll Bring the Voice AI.

50-100 real candidate screens in week three. Side-by-side recruiter calibration. ATS-native scorecards from day one.

50+ projects delivered
96% client satisfaction
Clients in US, UAE & Pakistan