Production ML & GenAI Workloads at Scale

The Challenge

Enterprise retail decision-making requires diverse ML approaches: no single model type solves forecasting, personalization, pricing, and customer lifecycle management. The organization needed a production ML practice that could support a wide range of modeling techniques while keeping governance, experiment tracking, and deployment lifecycle management consistent across all of them.

My Approach

I built the production ML practice with a focus on selecting the right modeling technique for each business problem, with MLflow as the backbone for experiment tracking and model lifecycle management.

Modeling Techniques

The practice supports several advanced approaches:

GraphRAG: Combining knowledge graph relationships with retrieval-augmented generation. The graph supplies structured context (entities and the relationships between them) that flat document retrieval does not capture, and that context enriches the LLM's responses
Causal Modeling: Understanding cause-effect relationships for pricing and promotion decisions. Instead of just correlating price changes with sales, causal inference estimates the effect of the intervention itself, separated from confounding factors
Bandit Optimization: Multi-armed bandit approaches for dynamic decision-making under uncertainty, used for real-time personalization where exploration and exploitation must be balanced
Survival Modeling: Time-to-event analysis for customer lifecycle and churn prediction, modeling not just whether a customer will churn, but when
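
To make the exploration/exploitation balance in the bandit item concrete, here is a minimal sketch of Beta-Bernoulli Thompson sampling. It illustrates the general technique and is not the production implementation; the conversion rates, round count, and function name are hypothetical values chosen for the simulation.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling.

    true_rates are hidden per-arm conversion rates (hypothetical values);
    the algorithm only ever observes the resulting successes and failures.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one plausible rate per arm from its Beta posterior,
        # starting from a uniform Beta(1, 1) prior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled rate is highest. Uncertain arms
        # sometimes win the draw (exploration); well-measured strong
        # arms win most of the time (exploitation).
        arm = max(range(n_arms), key=samples.__getitem__)
        pulls[arm] += 1

        # Simulated customer response for the chosen arm.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return pulls, successes


pulls, successes = thompson_sampling([0.05, 0.10, 0.30], rounds=2000)
print("pulls per arm:", pulls)
```

As evidence accumulates, the posterior for the strongest arm tightens and it receives most of the traffic, while the weaker arms are still sampled occasionally in case the early evidence was misleading.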

MLOps Foundation

Every production model follows a consistent lifecycle:

Experiment tracking: All training runs, hyperparameters, and metrics tracked in MLflow
Model registry: Central source of truth for model versions, staging, and production promotion
Monitoring: Drift detection and performance degradation alerts
Governance: Model cards documenting purpose, training data, limitations, and owners
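
As an illustration of the monitoring step, the sketch below computes a Population Stability Index (PSI) between a training-time baseline and live values of one feature. PSI is assumed here as an example drift metric; the text above does not specify which metric the practice uses. The function name, bin count, and sample data are hypothetical, and the 0.25 alert threshold is a common rule of thumb rather than a fixed standard.

```python
import math


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a baseline sample and a live sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the baseline range; fall back to width 1.0
    # if the baseline is constant.
    width = (hi - lo) / n_bins or 1.0

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range go to the edge bins.
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical data: a baseline feature and a shifted live distribution.
baseline = [i / 100 for i in range(1000)]
shifted = [x + 3 for x in baseline]

psi_same = population_stability_index(baseline, baseline)
psi_shifted = population_stability_index(baseline, shifted)

ALERT_THRESHOLD = 0.25  # rule-of-thumb cutoff, an assumption
print("unchanged:", psi_same, "shifted:", psi_shifted)
print("alert:", psi_shifted > ALERT_THRESHOLD)
```

A check like this would typically run on a schedule per feature, with the result feeding the alerting described above.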

Key Decisions & Trade-offs

Right technique for the problem: Rather than defaulting to deep learning for everything, each use case gets the most appropriate approach: causal models for pricing, bandits for real-time decisions, survival models for lifecycle. This pragmatism improves both accuracy and interpretability.
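
The pricing case shows why the choice of technique matters. The sketch below contrasts a naive comparison with a difference-in-differences estimate, one common causal technique, used here for illustration only; it is not necessarily the method behind the pricing work. All sales figures are made up, and the estimate is only valid under the parallel-trends assumption, meaning both store groups would have moved alike without the price change.

```python
from statistics import mean

# Hypothetical weekly unit sales per store, before and after a price cut
# that was applied only to the "treated" stores.
treated_pre = [120, 130, 125, 125]
treated_post = [150, 160, 155, 155]
control_pre = [80, 90, 85, 85]
control_post = [90, 100, 95, 95]


def naive_estimate(treated_after, control_after):
    """Compare groups after the change only; ignores baseline differences."""
    return mean(treated_after) - mean(control_after)


def diff_in_diff(t_pre, t_post, c_pre, c_post):
    """Change in the treated group minus change in the control group.

    Assumes parallel trends: the control group's change stands in for what
    would have happened to the treated group without the price cut.
    """
    treated_change = mean(t_post) - mean(t_pre)
    control_change = mean(c_post) - mean(c_pre)
    return treated_change - control_change


naive = naive_estimate(treated_post, control_post)
did = diff_in_diff(treated_pre, treated_post, control_pre, control_post)

print("naive post-period gap:", naive)
print("difference-in-differences estimate:", did)
```

In this toy data the treated stores already sold more before the price cut, so the naive gap mixes that baseline difference and the market-wide trend into the estimate. The difference-in-differences figure removes both.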

GraphRAG over vanilla RAG: Adding the knowledge graph layer to retrieval-augmented generation was more complex to build, but the structured relationships it provides improve response quality for domain-specific queries, where a good answer often depends on how entities relate to one another rather than on any single document.
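
The difference between the two retrieval styles can be sketched in a few lines. This is a toy illustration under stated assumptions, not the production pipeline: keyword overlap stands in for embedding search, the documents and graph edges are invented, and no LLM call is made. The output is only the context that would be passed to the model.

```python
# Hypothetical product documents and knowledge-graph edges.
documents = {
    "blender_x": "Blender X is a 1200 watt countertop blender",
    "jar_kit": "Replacement jar kit with sealing lid",
    "warranty_2y": "Two year limited warranty covering motor defects",
}
edges = [
    ("blender_x", "compatible_with", "jar_kit"),
    ("blender_x", "covered_by", "warranty_2y"),
]


def keyword_retrieve(query, docs, top_k=1):
    """Flat retrieval: rank documents by words shared with the query."""
    terms = set(query.lower().split())

    def overlap(doc_id):
        return len(terms & set(docs[doc_id].lower().split()))

    return sorted(docs, key=overlap, reverse=True)[:top_k]


def graph_context(query, docs, graph_edges, top_k=1):
    """Flat retrieval plus one hop of graph expansion from the seed documents."""
    seeds = keyword_retrieve(query, docs, top_k)
    lines = [docs[s] for s in seeds]
    for src, relation, dst in graph_edges:
        if src in seeds and dst not in seeds:
            # Keep the relation label so the model sees how items connect.
            lines.append(f"{src} --{relation}--> {dst}: {docs[dst]}")
    return seeds, lines


query = "what accessories fit the blender x"
flat = keyword_retrieve(query, documents)
seeds, context = graph_context(query, documents, edges)

print("flat retrieval:", flat)
print("graph-expanded context:")
for line in context:
    print(" ", line)
```

Flat retrieval finds only the blender's own description, because the accessory document shares no words with the query. The graph hop adds the compatible jar kit and the warranty along with their relation labels, which is the structured context a vanilla RAG pipeline would miss.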

Impact

The production ML practice provides QVC Group with a governed, scalable approach to deploying diverse AI capabilities. Each technique is chosen for its fit to the business problem, not for its novelty.