Machine Learning Systems Engineering

Data Science 255

3 units

Course Description

This course gives learners hands-on experience in data management and systems engineering using containers, cloud services, and the Kubernetes ecosystem, based on current industry practice. The course is project-based, with an emphasis on how production systems are used at leading technology-focused companies and organizations. During the course, learners will build a body of knowledge around data management, architectural design, developing batch and streaming data pipelines, scheduling, and data security, including access management and auditability. We’ll also cover how these tools are changing the technology landscape.

Student Learning Outcomes

  • Identify, construct, and measure metrics of system performance in order to optimize the cost and latency of serving inferences from machine learning models.

  • Demonstrate understanding of Kubernetes for management of machine learning models.

  • Describe the difference between monolithic and microservice architectures, and assess and select appropriate use cases for each.

  • Describe the differences between development and production systems, particularly for machine learning, where the boundaries are blurry.

  • Know when to leverage a cache for serving machine learning models to reduce load on production systems.

  • Understand continuous integration and continuous delivery (CI/CD) pipelines for automated code deployment, particularly for ML models.

  • Understand how stateful systems add complexities to systems engineering.

  • Understand how to serve machine learning models over an API in real-time.

Prerequisites

Data Science 205 and 207. MIDS students only. Familiarity with generating predictions from a trained machine learning model; the command line (Bash), Python, and Git; and networking concepts such as DNS. Working knowledge of SSH and network ports is also expected.

Last updated: November 12, 2021