Design, Cost-Model & Ship Machine-Learning Systems Like a Staff Engineer
12 Weeks. Live Online Classes. Instructor-led.
Master blue-green & canary roll-outs, observability, and failure playbooks
Draft production design docs that balance accuracy, latency & cost
Build feature stores, offline/online parity and reliability budgets
What will you learn?
Think like a Staff Engineer. Over 12 intensive weeks you’ll translate product goals into SLIs/SLOs, draft cost-aware architecture diagrams, and implement blue-green and canary roll-outs with Argo Rollouts. From feature stores and data lineage to drift dashboards and GDPR audit trails, you’ll learn the frameworks that underpin reliable, scalable machine-learning in production—then prove it with a peer-reviewed design doc and monitored service.
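To give a flavour of what turning a product goal into an SLO looks like in practice, here is a minimal sketch of an error-budget calculation. The `Slo` class, the function names, the 99.9% target, and the 30-day window are all illustrative assumptions, not course material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """A hypothetical availability SLO: fraction of good requests over a window."""
    target: float          # e.g. 0.999 means 99.9% of requests must succeed
    window_days: int = 30  # assumed rolling window, for documentation only


def error_budget(slo: Slo, total_requests: int) -> int:
    """Number of requests allowed to fail in the window without breaching the SLO."""
    return int(round(total_requests * (1.0 - slo.target)))


def budget_remaining(slo: Slo, total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative once the SLO is breached)."""
    budget = error_budget(slo, total_requests)
    if budget == 0:
        # No failures are allowed at all; any failure exhausts the budget.
        return 1.0 if failed_requests == 0 else float("-inf")
    return 1.0 - failed_requests / budget
```

With a 99.9% target and one million requests in the window, the budget is 1,000 failed requests; a team that has already seen 250 failures has three quarters of its budget left to spend on risky roll-outs.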
-
Translate product goals into ML feasibility studies, SLIs, SLOs, and target latency/throughput
Build a cost model comparing serverless, VM, and GPU cluster options
-
Stand up Feast or Delta Feature Store; guarantee offline/online parity
Enforce data tests & freshness SLAs with Great Expectations and OpenLineage
-
Draft high-level + low-level diagrams, scaling strategies, and failure modes
Run design-doc reviews and apply the RFC feedback loops used at FAANG companies and scale-ups
-
Implement blue-green, canary, and shadow deployments via Argo Rollouts & Istio
Automate CI/CD with GitHub Actions, MLflow Registry, and Terraform IaC
-
Instrument services with Prometheus + Grafana; set alert thresholds
Add drift detection, rollback playbooks, and audit trails for SOC-2 / GDPR
-
Team project: write a 10-page design doc & ship a costed, monitored ML service
Present to senior engineers & recruiters; receive actionable feedback and a reference letter
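As an illustration of the cost-model item in the curriculum above, the comparison between serverless, VM, and GPU options can be sketched in a few lines. Everything here is an assumption made for illustration: the `Workload` fields, the function names, the 730-hour month, and any prices you pass in are hypothetical and do not reflect a specific cloud provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical monthly inference workload."""
    requests_per_month: int
    seconds_per_request: float  # assumed average compute time per request


def serverless_cost(w: Workload, price_per_second: float, price_per_request: float) -> float:
    """Pay-per-use: billed compute seconds plus a flat per-request fee."""
    per_request = w.seconds_per_request * price_per_second + price_per_request
    return w.requests_per_month * per_request


def always_on_cost(instances: int, price_per_hour: float, hours_per_month: float = 730.0) -> float:
    """Fixed capacity (VMs or GPU nodes): billed per instance-hour regardless of traffic."""
    return instances * price_per_hour * hours_per_month


def cheapest(options: dict[str, float]) -> str:
    """Return the name of the option with the lowest monthly cost."""
    return min(options, key=options.get)
```

Build a dictionary such as `{"serverless": serverless_cost(...), "vm": always_on_cost(...), "gpu": always_on_cost(...)}` and pass it to `cheapest`. A fuller model would also account for latency targets, cold starts, and utilisation, which is where the trade-off analysis gets interesting.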
Who Should Enrol?
-
You’re shipping models already but need to design production-grade pipelines that hit uptime, latency and cost targets.
-
Become the architect of CI/CD, feature-store and observability layers instead of the firefighter who inherits them.
-
Add modern ML reference architectures to your toolbox and learn to cost-out GPU, feature-store and inference workloads.
-
Move from notebooks and PoCs to designing services that scale, auto-retrain and pass compliance audits.
-
Fast-track into a high-leverage ML systems role by mastering design docs, trade-off analysis and platform tooling.
Engineers and data scientists who must own the architecture, reliability, and cost of ML solutions—not just the code.
Prerequisites
Python & Git fluency, basic ML deployment experience. Our free Systems Thinking Sprint and Docker-K8s Mini-Camp bridge badges are included for all Intermediate-track graduates.
Career Pathways
-
Design end-to-end ML platforms that balance accuracy, latency, security and cost.
-
Own critical ML services, mentor teams and define technical direction.
-
Build observability, alerting and automated rollback for model pipelines at scale.
-
Integrate feature stores, vector DBs and auto-retraining into cohesive, cloud-native stacks.
-
Translate business goals into costed ML architectures and roadmaps that engineering can execute.
You’ll graduate with a publish-ready design doc, a costed architecture, and a live, monitored ML service—matching the deliverables senior-level job ads demand.
Amir Charkhi
Technology leader | Adjunct Professor | Founder
With 20+ years across energy, mining, finance, and government, Amir turns real-world data problems into production AI. He specialises in MLOps, cloud data engineering, and Python, and now shares that know-how as founder of AI Tech Institute and adjunct professor at UWA, where he designs hands-on courses in machine learning and LLMs.
Advanced: ML System Design
12 Weeks. Live Online Classes. Next Cohort 2nd September
Frequently Asked Questions
-
Beginner courses: none; we start with Python basics.
Intermediate & Advanced: ability to write simple Python scripts and use Git is expected.
-
Plan on 8–10 hours a week: two 3-hour live sessions and 2–4 hours of project work. Advanced tracks may require up to 10 hours for capstone milestones.
-
All sessions are recorded and posted within 12 hours. You’ll still have access to Slack/Discord to ask instructors questions.
-
New intakes launch roughly every 8 weeks. Each course page shows the exact start date and the “Apply-by” deadline.
-
Just a laptop with Chrome/Firefox and a stable internet connection. All coding happens in cloud JupyterLab or VS Code Dev Containers—no local installs.
-
Yes. 100% refund until the end of Week 2, no questions asked. After that, pro-rata refunds apply if you need to withdraw for documented reasons.
-
Absolutely. We issue invoices to companies and offer interest-free 3- or 6-month payment plans.
-
Live Q&A in every session, a 24-hour Slack response time from instructors, weekly office hours, and code reviews on your GitHub pull requests.