Question 1

How does the voice AI handle adaptive follow-up questions instead of a fixed script?

Accepted Answer

The dialogue policy runs as a LangGraph state machine. Each candidate response feeds an evaluator node powered by Anthropic Claude, which decides whether the answer cleared the rubric criterion, needs a probing follow-up, or should trigger a different competency branch. The graph routes to one of several next-question nodes based on signal strength, so a senior backend engineer is asked harder system-design follow-ups while an entry-level retail candidate stays on a shorter availability-and-scheduling branch.
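To make the routing concrete, here is a minimal sketch of the decision the evaluator output feeds. It is plain Python and does not use the LangGraph API itself. The node names, the `ScreenState` fields, the role labels, and the 0.8 signal threshold are all illustrative assumptions, not the production configuration.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CLEARED = "cleared"              # answer met the rubric criterion
    NEEDS_PROBE = "needs_probe"      # answer was partial; ask a follow-up
    SWITCH_BRANCH = "switch_branch"  # move to a different competency


@dataclass
class ScreenState:
    role_level: str         # hypothetical labels, e.g. "senior" or "entry"
    competency: str         # competency branch currently being assessed
    verdict: Verdict        # evaluator output for the latest answer
    signal_strength: float  # assumed 0.0-1.0 confidence from the evaluator


def route_next_question(state: ScreenState) -> str:
    """Return the name of the next-question node for the current state."""
    if state.verdict is Verdict.SWITCH_BRANCH:
        return "switch_competency"
    if state.verdict is Verdict.NEEDS_PROBE:
        return f"probe_{state.competency}"
    # Criterion cleared: strong senior candidates get a harder follow-up,
    # everyone else moves on to the next rubric criterion.
    if state.role_level == "senior" and state.signal_strength >= 0.8:
        return f"deep_dive_{state.competency}"
    return "next_criterion"
```

In LangGraph, a function of this shape is the kind of callable that can be registered as a conditional edge, with each returned string mapped to a node.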

Question 2

Why Ultravox instead of a traditional STT plus LLM plus TTS stack?

Accepted Answer

Ultravox is unified speech-to-speech: audio in, audio out, no transcription hop in the critical path. That keeps turn-taking latency under 300 ms, the threshold below which candidates stop noticing that the AI is non-human. A separate Whisper plus LLM plus ElevenLabs pipeline typically lands at 800-1400 ms of turn latency on phone-quality audio. We keep Whisper available as fallback STT for audit transcripts and edge cases, and ElevenLabs for premium TTS personas when the role calls for a specific brand voice.

Question 3

Which ATS systems do you integrate with for scheduling and scoring writeback?

Accepted Answer

We integrate natively with Greenhouse, Lever, Workday Recruiting, BambooHR, iCIMS, JazzHR, Ashby, and SmartRecruiters. Candidate stage transitions, structured scorecards with evidence citations, and PDF transcripts are written back via each ATS API. Scheduling pulls from Google Calendar or Outlook depending on the recruiter, and recruiter notifications fire into Slack when a candidate clears the scoring threshold.

Question 4

How accurate is the scoring against our rubric, and how is bias controlled?

Accepted Answer

Scoring is rubric-driven, not vibes-driven. The recruiter defines criteria with weights and expected evidence patterns up front. The Claude evaluator must cite the exact candidate utterance that supports each score, and the final PDF report shows every score with its evidence citation. We strip demographic signals from the evaluation prompt, run an adverse-impact audit on aggregated scores monthly, and let the human recruiter override any score before the candidate advances.
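As a rough illustration of weighted, evidence-backed scoring, the sketch below computes a weighted average and rejects any score whose cited evidence does not appear verbatim in the transcript. The class names, the 0-5 score scale, and the substring check are assumptions made for this example, not the production evaluator.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    name: str
    weight: float  # relative importance set by the recruiter


@dataclass
class ScoredCriterion:
    name: str
    score: float   # assumed 0-5 scale
    evidence: str  # verbatim quote from the candidate


def weighted_score(rubric: list, scores: list, utterances: list) -> float:
    """Weighted average score; every score must cite a real utterance."""
    by_name = {s.name: s for s in scores}
    total_weight = sum(c.weight for c in rubric)
    if total_weight <= 0:
        raise ValueError("rubric weights must sum to a positive number")
    total = 0.0
    for criterion in rubric:
        scored = by_name.get(criterion.name)
        if scored is None:
            raise ValueError(f"missing score for {criterion.name}")
        quote = scored.evidence.strip()
        # Reject empty citations and quotes not found in the transcript.
        if not quote or not any(quote in u for u in utterances):
            raise ValueError(f"uncited score for {criterion.name}")
        total += criterion.weight * scored.score
    return total / total_weight
```

Failing loudly on an uncited score, rather than silently dropping it, keeps every number in the final report traceable to something the candidate actually said.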

Question 5

What does the 22-language support actually mean in practice?

Accepted Answer

Ultravox handles speech-to-speech directly in 22 languages including Spanish, Mandarin, Arabic, Hindi, Urdu, Tagalog, Portuguese, French, German, and Japanese. The same rubric and evaluator prompts are translated and validated per language with a native-speaker calibration pass during the 8-week production rollout. Candidates can also code-switch mid-call and the system tracks the dominant language without dropping the conversation.

Question 6

How do you handle call quality issues, drop-offs, and candidate reschedules?

Accepted Answer

Twilio Voice gives us inbound quality telemetry per call. If the mean opinion score (MOS) drops below a configured threshold or packet loss spikes, the system offers to call back or switch to a different channel. Drop-offs trigger an automated reschedule flow via the calendar agent, and candidates who request a human are routed to a recruiter queue with the partial transcript attached. Sentry tracks production errors across the Ultravox, Twilio, and LangGraph layers so issues surface before they affect candidate experience.
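The quality check can be pictured as a simple threshold rule over recent telemetry samples. The sketch below is illustrative only: the `CallQuality` fields are hypothetical rather than Twilio's own field names, the threshold values are placeholders, and requiring several consecutive bad samples is an assumed way to avoid reacting to a single blip.

```python
from dataclasses import dataclass


@dataclass
class CallQuality:
    mos: float              # mean opinion score, roughly 1.0 (bad) to 5.0 (good)
    packet_loss_pct: float  # packet loss over the sample window, in percent


MIN_MOS = 3.5              # hypothetical threshold
MAX_PACKET_LOSS_PCT = 3.0  # hypothetical threshold


def is_degraded(sample: CallQuality) -> bool:
    """A sample is degraded if either metric crosses its threshold."""
    return sample.mos < MIN_MOS or sample.packet_loss_pct > MAX_PACKET_LOSS_PCT


def next_action(recent: list, required_bad: int = 3) -> str:
    """Offer a fallback only after several consecutive degraded samples."""
    if len(recent) < required_bad:
        return "continue"
    if all(is_degraded(s) for s in recent[-required_bad:]):
        return "offer_callback_or_channel_switch"
    return "continue"
```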

Question 7

What is the pilot scope and how long until production?

Accepted Answer

The pilot runs three weeks with one ATS, one role family, and one language, and ends with 50-100 real candidate screens and a calibration report. The eight-week production rollout adds multi-ATS and multilingual support, the full rubric library, Slack and calendar integrations, and the compliance-ready PDF reporting pipeline. Cognilium handles all infrastructure, dialogue engineering, and ATS integration work.

Question 8

How is candidate data stored and what compliance posture do you support?

Accepted Answer

Audio recordings, transcripts, and evaluation artifacts are stored encrypted at rest with configurable retention windows per region. The pipeline supports EEOC-compliant audit logs in the US, GDPR data-subject access and deletion in the EU, and UAE PDPL requirements for our regional clients. PII can be redacted from transcripts before they hit the ATS if the customer requires it.
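For a sense of what transcript redaction can look like, here is a minimal regex-based sketch that masks email addresses and phone numbers. The patterns and the `[EMAIL]` and `[PHONE]` placeholders are assumptions for illustration. A production redactor would need to cover far more (names, addresses, ID numbers) and would likely not rely on regexes alone.

```python
import re

# Deliberately simple patterns; they will miss some formats and are
# meant only to illustrate the redaction step.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(text: str) -> str:
    """Replace email addresses and phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text
```

Running this over each utterance before the ATS writeback step would keep the original transcript intact in encrypted storage while the ATS copy carries only the masked text.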

Screen 300 Candidates Per Hour With Adaptive Follow-Up Questioning

The Five Failure Modes of Phone Screening

Screening Backlog Crushes Time-to-Hire

Inconsistent Phone Screens Across Recruiters

Multilingual Pipelines Bottleneck on Recruiter Language Coverage

Async Video Screens Have Brutal Drop-Off

High Cost-Per-Screen on High-Volume Roles

Screening Backlog Crushes Time-to-Hire

The Pain Point

Business Impact

Real Cost

Engineered for Phone-Quality Audio at Scale

Ultravox Unified Speech-to-Speech

Twilio Voice Telephony

LangGraph Adaptive Dialogue Policy

Anthropic Claude Evaluator

22-Language Native Support

ATS-Native Writeback

The 7-Stage Screening Pipeline

Recruiter Defines Role & Rubric

Calendar Agent Auto-Schedules

Twilio Voice Places the Call

Ultravox Conversation Loop

LangGraph Adaptive Follow-Up

Real-Time Claude Scoring

Compliance-Ready PDF + ATS Writeback

Integration surface

Four Hiring Pipelines Where Voice AI Wins

High-Volume Recruiting

Tech Screening (Senior+)

Professional Services Hiring

Entry-Level Talent Pipelines

What Recruiting Ops Actually Sees

3-Week Pilot, 8-Week Production

Discovery & Rubric Design

Pilot — One ATS, One Role, One Language

Production Rollout

Ongoing Tuning

The Questions Engineers Actually Ask

Bring Us Your Rubric. We'll Bring the Voice AI.