LLMs
Demystifying LLM Architecture: From Attention to Production
A deep dive into transformer internals, attention mechanisms, KV-cache optimization, and serving LLMs at scale.
Deep dives into AI engineering, ML systems, and production machine learning
From collaborative filtering to deep learning recommenders — architecture patterns for millions of users.
Why Bayesian forecasting wins in business — uncertainty quantification and real deployment patterns.
Battle-tested patterns for ML systems — feature stores, model serving, and monitoring infrastructure.