Retail AI: sub-10% MAPE forecasts, BOPIS-aware replenishment, return fraud caught at the POS
We build inference systems for omnichannel retailers, DTC brands, grocery chains and QSR multi-unit operators - integrated with Shopify Plus, Adobe Commerce, Salesforce Commerce Cloud, NCR Voyix and Toast, writing back into Manhattan Active Omni, Oracle Retail RPAS, Blue Yonder and RELEX. PCI-DSS scope stays small. Tokens, not PANs, hit the lake.
Six AI systems we ship to omnichannel retail & DTC
Each is a production inference service wired into the merchant's commerce, POS, OMS and merchandising stack - not a notebook hand-off. Forecasts write back to RPAS / Blue Yonder / RELEX. Personalisation lands in Shopify Plus or SCC. Loss-prevention scores show up in the LP queue, not a CSV.
Demand forecasting at SKU × store × week
Sub-10% MAPE on mature SKUs
- Hierarchical Bayesian backbone (state-space at SKU level, partial pooling at category-store-cluster) + gradient-boosted residual on weather, promo, holiday, competitor pricing, and event signals
- New-item cold start: lookalike-SKU embedding on attribute vectors (category, price band, fabric, vendor, lifecycle); the prior is the weighted posterior of the k nearest neighbours, then updated on the first 2-4 weeks of actuals
- Hierarchical reconciliation across SKU → category → store → region so the planner's roll-up and the SKU-level forecast never disagree
- Writes to Oracle Retail RPAS, Blue Yonder Demand & Fulfillment, RELEX, o9 Solutions, or Aptos via their native APIs - no CSV exports, no Friday-night sneakernets
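The reconciliation bullet above can be illustrated with the simplest coherent scheme, bottom-up aggregation: every roll-up is computed as the sum of its SKU × store leaves, so the planner's totals cannot drift from the SKU-level forecast. This is a minimal sketch, not a statement of the production method (which may well use a different reconciliation technique); the function name, the mapping dictionaries and the sample figures are all hypothetical.

```python
from collections import defaultdict

def reconcile_bottom_up(leaf, sku_category, store_region):
    """Roll SKU x store forecasts up the hierarchy by summation.

    leaf:         {(sku, store): forecast_units}
    sku_category: {sku: category}      (assumed lookup)
    store_region: {store: region}      (assumed lookup)
    """
    category_store = defaultdict(float)
    store_total = defaultdict(float)
    region_total = defaultdict(float)
    for (sku, store), units in leaf.items():
        category_store[(sku_category[sku], store)] += units
        store_total[store] += units
        region_total[store_region[store]] += units
    return {
        "category_store": dict(category_store),
        "store": dict(store_total),
        "region": dict(region_total),
    }

# Illustrative numbers only.
leaf = {("s1", "A"): 10.0, ("s2", "A"): 5.0, ("s1", "B"): 7.0}
rollup = reconcile_bottom_up(
    leaf,
    sku_category={"s1": "tops", "s2": "denim"},
    store_region={"A": "north", "B": "north"},
)
```

Because every level is derived from the same leaves, the region total always equals the sum of its store totals by construction.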
Markdown, promo & price optimisation
Causal lift, not naive YoY
- Markdown cadence optimiser balances sell-through, GMROI, and end-of-season residual risk under both retail inventory method (RIM) and weighted-average-cost views, so finance sees the P&L impact before approval
- Promo lift estimated with geo / store-cluster holdouts (synthetic control or DiD with parallel-trends pretests); Bayesian structural time series where holdouts are politically impossible
- Dynamic pricing online with explicit margin floors, competitor-price guardrails (where legally permissible) and rate-limited movement so the price doesn't yo-yo on the PDP
- Promotion eligibility resolved through Talon.One or native Salesforce Promotion Engine - the model recommends, the rules engine adjudicates
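The dynamic-pricing guardrails above (margin floor, competitor guardrail, rate-limited movement) can be sketched as a single clamp applied to whatever price the model proposes. This is an illustrative sketch under stated assumptions: the function name, the 30% margin floor, the 5% undercut limit and the 3% step limit are hypothetical defaults, and the choice to let the margin floor win over the rate limit is one possible policy.

```python
def guarded_price(current, proposed, unit_cost, min_margin=0.30,
                  competitor_price=None, max_undercut=0.05, max_step=0.03):
    """Clamp a model-proposed price to margin, competitor and rate limits."""
    # Margin floor: a price p must satisfy (p - cost) / p >= min_margin.
    floor = unit_cost / (1.0 - min_margin)
    # Competitor guardrail: never undercut the reference price by more
    # than max_undercut (only applied when a reference price is supplied).
    if competitor_price is not None:
        floor = max(floor, competitor_price * (1.0 - max_undercut))
    # Rate limit: move at most max_step (as a fraction) per update.
    lo, hi = current * (1.0 - max_step), current * (1.0 + max_step)
    price = min(max(proposed, lo), hi)
    # If the floor and the rate limit conflict, the floor wins.
    return round(max(price, floor), 2)
```

For example, with a current price of 100.00 and a unit cost of 50.00, a proposal of 80.00 is held to 97.00 by the step limit, which keeps the PDP price from swinging between updates.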
Post-cookie personalisation & recommendations
Identity-stitched, clean-room safe
- Two-tower retriever (content + collaborative) with a learning-to-rank head; behavioural signals stitched to identity in the CDP (Segment, mParticle, Lytics, or a native Iceberg store at scale)
- Cold-visitor fallback to bestseller and content-based recommendations - we don't pretend cookie-less anonymous traffic gets the same lift as identified traffic; it doesn't, and conflating the two inflates uplift numbers
- Clean-room joins (LiveRamp, AWS Clean Rooms, Habu) for retail-media-network and CTV co-targeting on hashed PII rather than third-party cookies
- Privacy Sandbox Topics / Protected Audience hooks for long-tail open-web inventory
Conversational shopping grounded in inventory
No hallucinated stock, no phantom promos
- Tool-using agent: every price, stock, promo and shipping claim is a fresh call to Shopify Storefront API, SCC OCAPI, Adobe Commerce GraphQL, or the merchant OMS - 30-second TTL on reference data, zero TTL on availability
- Strict catalog-schema output validator strips any SKU the model invented before the message leaves the server
- Deterministic policy engine re-checks pricing, eligibility, and shipping windows post-LLM - the model proposes, the policy engine disposes
- Disclosure injection for regulated categories (alcohol age-gate, dietary-supplement claims, medical-retail FDA UDI)
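The output validator described above can be sketched as a post-LLM pass that checks every SKU-shaped token against the live catalog before the message is sent. This is a minimal illustration: the `SKU-` plus six digits format, the function name and the placeholder text are assumptions, and a real validator would work on structured tool output rather than free text wherever possible.

```python
import re

# Hypothetical SKU format used only for this sketch.
SKU_PATTERN = re.compile(r"\bSKU-\d{6}\b")

def strip_invented_skus(message, catalog_skus, placeholder="[item unavailable]"):
    """Replace SKU-shaped tokens that are not in the catalog.

    Returns the cleaned message and the list of rejected SKUs so they
    can be logged for review.
    """
    rejected = []

    def _check(match):
        sku = match.group(0)
        if sku in catalog_skus:
            return sku
        rejected.append(sku)
        return placeholder

    return SKU_PATTERN.sub(_check, message), rejected

cleaned, rejected = strip_invented_skus(
    "Try SKU-000123 or SKU-999999.", catalog_skus={"SKU-000123"}
)
```

Returning the rejected list alongside the cleaned text keeps the filter auditable: a rising rejection rate is a signal that retrieval or prompting needs attention.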
Loss prevention & return-fraud detection
Precision-tuned, calibrated to recovery economics
- POS exception monitoring: sweethearting, void abuse, refund-without-return, post-void scan, manager-override clustering - built on NCR Voyix, Toast, Lightspeed and Oracle Simphony event streams
- Return-fraud scoring: wardrobing (high-AOV apparel returned tag-attached within wear window), serial-returner identity graphs, empty-box / receipt-mismatch, channel-arbitrage between online and store
- Operating point tuned against a calibrated cost matrix - flag the top 2-3% for review where expected recovery exceeds 8x review labour cost; soft-decline the top 0.3%; hard decline only on corroborated ORC patterns
- Shrink analytics joining POS exceptions to WMS receiving and store-cycle-count variances to localise loss to register, shift, or vendor
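The operating-point rule above can be written down as a small routing function. The sketch assumes a calibrated fraud probability and a score percentile are already available for each return; the function name, the 97th-percentile review cut-off (the lower bound of "top 2-3%") and the requirement that a hard decline be both top 0.3% and ORC-corroborated are illustrative assumptions.

```python
def route_return(p_fraud, return_value, review_cost, score_percentile,
                 orc_corroborated=False, review_pct=97.0, decline_pct=99.7,
                 recovery_multiple=8.0):
    """Route a return using calibrated probability and review economics."""
    # Expected recovery if the return is reviewed and turns out fraudulent.
    expected_recovery = p_fraud * return_value
    if score_percentile >= decline_pct:
        # Hard decline only with corroborating ORC evidence.
        return "hard_decline" if orc_corroborated else "soft_decline"
    if (score_percentile >= review_pct
            and expected_recovery > recovery_multiple * review_cost):
        return "review"
    return "approve"
```

With a review labour cost of 12.00, a return worth 400.00 at a calibrated 0.6 fraud probability has an expected recovery of 240.00, which clears the 96.00 threshold; the same score on a 100.00 return does not, so it is approved without review.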
Visual search & catalog enrichment
Image-in, SKU-out - and clean PDPs at scale
- CLIP-style multimodal embeddings on product imagery + textual attributes; ANN index (FAISS, Vespa, or Pinecone) for sub-100ms search across 10M+ SKU catalogs
- Attribute extraction from supplier imagery - colour family, neckline, fit, heel height, material - to backfill the long tail of PDP facets that merchandisers never have time to tag
- Duplicate / near-duplicate detection across vendor feeds and marketplace listings to deduplicate the catalog before it hits search
- Image moderation and PDP-quality scoring (background, lighting, model presence) to enforce brand standards on UGC and dropshipped feeds
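Near-duplicate detection over embeddings, as described above, reduces to grouping listings whose vectors are closer than a threshold. The sketch below does this with exact pairwise cosine similarity and a union-find, which is quadratic and only suitable for small batches; at catalog scale the candidate pairs would come from the ANN index instead. The function names and the 0.95 threshold are assumptions.

```python
import math
from itertools import combinations

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def near_duplicate_groups(embeddings, threshold=0.95):
    """Group listing IDs whose embeddings meet the cosine threshold."""
    parent = {key: key for key in embeddings}

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(embeddings, 2):
        if cosine(embeddings[a], embeddings[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for key in embeddings:
        groups.setdefault(find(key), []).append(key)
    # Only groups with more than one member are duplicates.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

groups = near_duplicate_groups(
    {"a": [1.0, 0.0], "b": [0.99, 0.05], "c": [0.0, 1.0]}
)
```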
Retail-grade data infrastructure
Retail data is not generic event data. A unit on hold for BOPIS is not sellable. A return in transit is not on-hand. A markdown taken aggressively at end-of-season distorts the RIM cost ratio. Here is what we actually build.
Commerce & POS event ingestion
Native adapters for the systems that actually move stock.
- Shopify Plus (Storefront API, Admin GraphQL, webhooks, Bulk Operations API), Adobe Commerce (Magento 2 GraphQL + REST), Salesforce Commerce Cloud (OCAPI + B2C Commerce APIs), Commercetools, BigCommerce, Saleor
- NCR Voyix Aloha & Counterpoint, Toast (Orders / Inventory / Loyalty APIs), Square (Catalog / Inventory / Payments), Lightspeed Retail X-Series + Restaurant, Clover, Oracle Simphony / MICROS - event streams off the POS journal, not screen-scrape
- GS1 / GTIN / UPC barcode normalisation; NRF ARTS retail data model alignment for cross-platform analytics
- Backfill-safe: Bulk Operations for Shopify, journal replay for POS, watermark-based idempotency so a six-month restate doesn't double-count revenue
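One way to make a restate safe, in the spirit of the backfill bullet above, is to key every write on a stable event identity and upsert rather than append, while tracking a per-source watermark to know where a resumed load should start. The class below is a minimal in-memory sketch; the class name, the `(source, event_id)` key and the tuple layout are assumptions, and a real pipeline would do the same thing with a merge into the lake table.

```python
class RevenueLedger:
    """Idempotent revenue store keyed on (source, event_id)."""

    def __init__(self):
        self.rows = {}       # (source, event_id) -> amount
        self.watermark = {}  # source -> highest event timestamp applied

    def upsert(self, source, event_id, ts, amount):
        # Overwrite instead of append: replaying an event cannot double-count,
        # and a restated amount replaces the old one.
        self.rows[(source, event_id)] = amount
        self.watermark[source] = max(self.watermark.get(source, ts), ts)

    def replay(self, source, events):
        """events: iterable of (event_id, ts, amount) tuples."""
        for event_id, ts, amount in events:
            self.upsert(source, event_id, ts, amount)

    def revenue(self):
        return sum(self.rows.values())

ledger = RevenueLedger()
journal = [("o1", 100, 20.0), ("o2", 101, 30.0)]
ledger.replay("pos", journal)
ledger.replay("pos", journal)  # a full restate of the same journal
```

Replaying the journal a second time leaves the revenue total unchanged, which is the property the restate needs.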
Customer Data Platform & identity stitch
First-party identity that survives the cookie sunset.
- Deterministic stitch on logged-in email + phone + loyalty ID; probabilistic stitch on device-graph and session-graph signals as a fallback, used only for the explicitly anonymous tail
- Reference deploys on Segment, mParticle, Lytics, RudderStack - or an Iceberg + dbt-native CDP when scale or per-event cost makes a vendor uneconomic
- Clean-room readiness: LiveRamp, AWS Clean Rooms, Habu, Snowflake Data Clean Rooms - hashed-PII joins for retail-media co-targeting without leaking the customer table
- Consent state is a first-class column: CCPA / CPRA opt-out, GDPR consent, push / SMS / email channel granularity - every downstream activation honours the latest state
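Treating consent as a first-class column means every activation resolves the latest state per customer and channel before sending. The sketch below shows that resolution with a default-deny rule for customers who have no consent record; the function names and the event tuple layout are hypothetical, and the default-deny choice is an assumption rather than a legal determination.

```python
def latest_consent(events):
    """Resolve the most recent consent state per (customer, channel).

    events: iterable of (customer_id, channel, ts, granted) tuples,
    in any order.
    """
    state = {}
    for customer_id, channel, ts, granted in events:
        key = (customer_id, channel)
        if key not in state or ts >= state[key][0]:
            state[key] = (ts, granted)
    return {key: granted for key, (_, granted) in state.items()}

def eligible(audience, channel, consent):
    """Keep only customers with an explicit, current opt-in (default deny)."""
    return [c for c in audience if consent.get((c, channel), False)]

consent = latest_consent([
    ("c1", "sms", 1, True),
    ("c1", "sms", 5, False),  # later opt-out overrides the earlier opt-in
    ("c2", "sms", 2, True),
])
sms_audience = eligible(["c1", "c2", "c3"], "sms", consent)
```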
Order, inventory & OMS truth
Where is the unit? Available to whom? The answer is one query.
- Manhattan Active Omni as OMS source of truth - available-to-promise, inventory pool model, BOPIS / BORIS / ship-from-store rules surfaced as a queryable view, not a series of webhook handlers
- Oracle Retail RPAS, Blue Yonder Demand & Fulfillment, RELEX, o9 Solutions, Aptos - bidirectional integration for forecast write-back and on-hand read
- Real-time inventory across stores + DCs + 3PL + in-transit + reserved-for-BOPIS - single materialised view, sub-second freshness on hot SKUs
- Returns + RTV (return-to-vendor) flow normalised so returns-in-transit don't show as sellable until they're checked in
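The inventory rules above (a BOPIS hold is not sellable, a return in transit is not on-hand) can be captured in a small position model. This is a simplified sketch with hypothetical field and function names; a real available-to-promise calculation would also account for inventory pools, safety stock and channel rules.

```python
from dataclasses import dataclass

@dataclass
class Position:
    on_hand: int                 # physically checked-in units
    reserved_bopis: int = 0      # held for customer pickup
    returns_in_transit: int = 0  # announced returns, not yet checked in

def sellable(p: Position) -> int:
    """Units that can be promised: on-hand minus BOPIS holds.

    Returns in transit are deliberately excluded until check-in.
    """
    return max(p.on_hand - p.reserved_bopis, 0)

def check_in_return(p: Position, units: int) -> None:
    """Move returned units into on-hand once they are physically received."""
    if units > p.returns_in_transit:
        raise ValueError("cannot check in more units than are in transit")
    p.returns_in_transit -= units
    p.on_hand += units

pos = Position(on_hand=10, reserved_bopis=3, returns_in_transit=2)
before = sellable(pos)
check_in_return(pos, 2)
after = sellable(pos)
```

In this example the sellable count is 7 before the return is checked in and 9 afterwards; the two units in transit never count as stock until they arrive.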
Clickstream, search & session analytics
From page-view to revenue with lineage you can defend.
- Server-side event capture (GA4 Measurement Protocol, native first-party endpoints) - ad-blocker-resilient, hashed-PII compatible, consent-aware
- Search behaviour: query → click → cart → purchase joined on session_id, fed to the relevance ranker and zero-result-query backfill queue
- Session replay sampling (FullStory, LogRocket, or rrweb-native) joined to revenue cohorts so UX investigations have an outcome anchor
- Attribution: multi-touch + Markov-chain + media-mix model side-by-side so finance and growth argue from the same data, not three rival dashboards
Product catalog, PIM & enrichment
A merchandisable catalog at marketplace scale.
- PIM integrations: Akeneo, Salsify, Inriver, Plytix - bidirectional sync so attribute updates from ops land in Shopify Plus / SCC / Adobe Commerce without manual exports
- Attribute inference from imagery and copy to backfill the long tail - colour family, neckline, fit, material, room, age-range
- Deduplication and near-duplicate clustering across vendor feeds; price-and-availability conflict resolution rules per channel
- Localisation pipelines: copy + size charts + currency + tax + compliance copy per market, with translation memory and human-in-the-loop QA on regulated SKUs
PCI-DSS scope containment & deployment
Tokens reach the lake. PANs do not.
- P2PE-validated terminal flow where available; network tokenisation (Visa VTS, Mastercard MDES) elsewhere - analytics and ML see tokens, never PAN or CVV
- Cardholder data environment is a thin enclave: POS, gateway, KMS. Inference services that score the transaction live inside the CDE; everything else (forecasting, personalisation, returns) lives outside it
- Reference deploys on AWS (PrivateLink + KMS + Macie), Azure (Private Link + Key Vault HSM), GCP (VPC-SC + CMEK) with customer-managed keys
- CSRD / SB-253 / SB-261 emissions data captured at SKU and shipment level for Scope 3 reporting - sourced from the same lake, not a separate ESG spreadsheet
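The "tokens reach the lake, PANs do not" rule is easier to hold if ingestion actively rejects anything that looks like a card number. A common defence-in-depth check is a digit-run pattern combined with the Luhn checksum, sketched below. The function names and the regular expression are assumptions; a check like this complements tokenisation and scope design and does not replace them.

```python
import re

# 13-19 digits, optionally separated by single spaces or hyphens.
PAN_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")

def luhn_valid(digits):
    """Luhn checksum over a string of digits."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def contains_pan(value):
    """True if the text holds a Luhn-valid 13-19 digit run."""
    for match in PAN_CANDIDATE.finditer(value):
        digits = re.sub(r"\D", "", match.group(0))
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            return True
    return False
```

A record such as `"paid with 4111 1111 1111 1111"` (a well-known test card number) would be flagged and quarantined, while an ordinary order number that fails the checksum passes through.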
Ship retail AI that writes back into your stack - not a slide deck
Forecasts that land in Oracle Retail RPAS, Blue Yonder or RELEX. Replenishment that respects Manhattan Active Omni inventory pools. Personalisation grounded in your CDP and clean rooms. Loss-prevention scores in the LP queue. Talk to an engineer about your stack, your timelines, and your scope.