Course Information

This is a project-based class in which students learn how to develop machine learning models for execution in resource-constrained environments such as embedded systems. The primary targets are embedded devices such as Arduino, Raspberry Pi, Jetson, and Edge TPUs.

The class is broken into lectures/readings, labs/assignments, and a final project. Throughout this class, students will learn techniques such as model quantization, knowledge distillation, and hybrid (embedded/cloud) inferencing, which are instrumental for building efficient machine learning models that can run on power- or resource-constrained devices. This class differs from other machine learning classes offered at Stanford in its focus on applying these models to applications that must run on embedded hardware or in other resource-constrained environments.
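As a rough illustration of the kind of technique named above, here is a minimal sketch of model quantization in plain Python. It assumes a simple affine (scale and zero-point) scheme over unsigned 8-bit integers; the function names `quantize` and `dequantize` are hypothetical and are not taken from the course materials or from any particular framework.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers using an affine (scale, zero-point) scheme.

    Illustrative only: real toolchains add per-channel scales, calibration, etc.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 when every value is zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    # Round to the nearest integer step, shift by the zero point, then clamp.
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized integers."""
    return [scale * (x - zero_point) for x in q]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.5, 1.0]
    q, scale, zero_point = quantize(weights)
    print("quantized:", q)
    print("recovered:", dequantize(q, scale, zero_point))
```

Each recovered value differs from the original by at most about one quantization step (`scale`), which is the accuracy traded for storing one byte per weight instead of four.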

Instructors and TAs

Zain Asgar

Office hours: TBA

Pete Warden

Office hours: TBA

Sachin Katti

Office hours: TBA

Hang Qiu

Office hours: TBA

Keyi Zhang

Gates 356
Office hours: TBA

Course Schedule

Week 1

Intros, objectives, etc.
Course logistics. Introduction to machine learning and its standard training and deployment practices. Introduction to TFLite.

Week 2

Introduction to CNN and LSTM
Introduction to CNNs and LSTMs and their mathematical implementation. Applications based on CNNs and LSTMs, and their general architectures.

Week 3

DNN on hardware
Model quantization. Discussion on efficient model design, such as MobileNet and YOLO.

Week 4

DNN on hardware (continued)
Hardware-related acceleration, such as SIMD.

Week 5

Efficient model design
Model compression and pruning. Introduction to fixed-point, N-ary, FP16, and BFloat representations.

Week 6

Edge accelerator
Introduction to edge machine learning accelerators, such as the Edge TPU. Discussion of accelerator design and its impact on model design and performance.

Week 7

Guest Lecture

Week 8

Guest Lecture

Week 9

Project Demo

Homework Assignments

HW Assignment 1: Machine Learning with Arduino and Edge TPU

We will build a speech recognition model on Arduino as well as an image detector on Google Edge TPU.

HW Assignment 2: Micro DNN (udnn) framework with SIMD

We will build a feature-rich DNN framework from scratch with SIMD support and deploy it on an ARM CPU.


Homework Assignments - 30%

  • Two Homework Sets

Reading Assignments/Class Participation - 20%

  • Reading Reflections
  • Lead group discussions on each paper

Final Project - 50%

  • Proposal - 10% (2% for initial, 8% for final)
  • Write up & Presentation - 20%
  • Results - 20% (relative to proposed)

Important Dates

Date              What's Due
Week 4 (Oct 11)   Preliminary final project proposal due
Week 6 (Oct 25)   Final project proposal due
Week 7 (Nov 1)    In-class show-and-tell on project progress
Week 9 (Nov 29)   Final project presentations due (last class)
Week 12 (Dec 6)   Final project report due