Full Stack AI Engineer

Engineering
Lahore, Pakistan
Full-time
PKR 300,000 - 600,000 / month
Posted: April 29, 2026 | Apply by: June 28, 2026

About the Role

Cognilium AI is a production-first AI product engineering company building reliable, scalable AI systems for startups and enterprises. Founded in 2019, we specialize in agentic AI, enterprise-grade RAG/NL2SQL, document intelligence, voice AI, and cloud-native data platforms — engineered for real users, real scale, and measurable ROI.

We're hiring a Full Stack AI Engineer to design, build, and operate production-grade AI systems end-to-end — not prototypes. You'll architect and ship multi-agent orchestration platforms, RAG pipelines, document intelligence systems, and AI-driven SaaS products using Python, FastAPI, Next.js, and React. You'll integrate LLMs from major providers, design retrieval and knowledge systems, and build full-stack interfaces — all with a strong focus on production reliability, cost control, and operational stability.

You'll work closely with the founder and the engineering team to turn business problems into reliable, scalable AI solutions. If you enjoy shipping AI that actually runs in production — and taking ownership beyond just writing code — this role is built for you.

Hiring process: 30-minute technical conversation → 1-week paid trial → offer. Decision within two weeks of the first conversation. No LeetCode. No whiteboards.

Responsibilities

  • Design, build, and deploy end-to-end generative AI applications
  • Implement multi-agent orchestration with supervisor routing, specialist agents, and tool use
  • Develop enterprise-grade RAG pipelines with hybrid retrieval, reranking, citations, and grounding checks
  • Build document intelligence pipelines for extraction, classification, and validation across structured and unstructured documents
  • Build high-performance, asynchronous APIs using Python and FastAPI
  • Design scalable microservices to expose AI capabilities
  • Engineer multi-tenant SaaS architectures with zero-trust isolation, RBAC, and audit trails
  • Build production frontends with Next.js, React, and TypeScript
  • Ship LLM features into existing tools — Office Add-ins, browser extensions, embedded iframes, dashboards
  • Implement real-time UX with SSE streaming, async polling, and live state synchronization
  • Integrate LLMs from OpenAI, Anthropic, Google, AWS Bedrock, and Azure OpenAI
  • Design prompts that minimize hallucinations and control latency and cost
  • Work with vector databases to power retrieval and semantic search
  • Implement LLM-as-judge evaluation, grounding checks, and anti-hallucination safeguards
  • Implement circuit breakers, two-tier retry, dead letter queues, and graceful degradation
  • Implement observability, structured logging, distributed tracing, and monitoring across AI services
  • Deploy AI services to cloud environments (AWS, GCP, or Azure) and containerize with Docker
  • Apply MLOps best practices around evaluation, monitoring, rollback, and cost governance
  • Manage infrastructure as code with Terraform, Bicep, or CDK
  • Translate business requirements into robust technical implementations and participate in architecture reviews

Requirements

  • 2–4 years of professional engineering experience, with at least 1 year shipping production AI/LLM systems
  • Strong backend experience with Python (FastAPI, async patterns) at production scale
  • Frontend experience with Next.js (15+), React (18+), and TypeScript
  • Practical experience developing applications using large language models (LLMs) from at least two providers (OpenAI, Anthropic, Google Gemini, AWS Bedrock, or Azure OpenAI)
  • Proven experience building generative AI or RAG-based systems in production
  • Hands-on experience with at least one agent framework: LangChain, LangGraph, Google ADK, CrewAI, or AWS Bedrock AgentCore
  • Solid understanding of RESTful API design, service architecture, and engineering best practices
  • Experience with vector databases (Pinecone, Qdrant, Weaviate, Chroma, AstraDB, Milvus, or pgvector)
  • Production deployment experience on at least one cloud (AWS, GCP, or Azure)
  • Comfort with Docker and CI/CD pipelines (GitHub Actions or similar)

Nice to Have

  • Multi-cloud experience — shipped on two of AWS / GCP / Azure
  • Voice AI integration (Twilio, Ultravox, Vapi, LiveKit, ElevenLabs, Whisper, Deepgram)
  • Office Add-in development (Office.js; Word, Excel, and Outlook plugins)
  • Browser extension development (Chrome MV3, WXT framework)
  • Workflow orchestration with n8n, Apache Airflow, or Temporal
  • Knowledge graph work (Neo4j, GraphRAG, entity resolution)
  • Authentication systems (JWT, OAuth, Clerk, Firebase Auth, Cognito)
  • Production RAG with citation grounding, anti-hallucination checks, and LLM-as-judge evaluation
  • Web scraping at scale (Selenium, Playwright, Camoufox, anti-detection techniques)
  • Open source contributions in the AI / agent space

Benefits & Perks

  • Competitive salary aligned with Pakistan's top tier for production AI engineers
  • Direct mentorship from a builder-led founding team
  • Real production work — your code runs for real users, at real scale, from day one
  • Multi-cloud, multi-stack exposure across diverse industries
  • Annual learning budget for AI conferences, courses, and certifications
  • Growth path to senior and lead architect roles
  • A collaborative and innovative work environment
  • Hiring decision within two weeks of the first conversation — no endless rounds, no LeetCode

Interested in this role?

Click below to apply through our application form.