=================
About this course
=================

:Authors: Cao Tri DO
:Version: 2025-09

.. admonition:: Objectives
   :class: important

   This article gives you an overview of the course, its objectives, and the topics covered.

   Source: https://maven.com/marvelousmlops/mlops-with-databricks

Introduction
============

Do you want to know the right way to do MLOps on Databricks? This course is for you!

Course overview
===============

Implementing MLOps practices elevates data scientists and speeds up time to production; we have seen it throughout our careers. MLOps is not about which tools you use, but about how you use them to follow MLOps principles.

For any given machine learning model run or deployment, in any environment, it must be possible to look up unambiguously:

- the corresponding code/commit on git;
- the infrastructure used for training and serving;
- the environment used for training and serving;
- the ML model artifacts;
- the data used to train the model.

We teach you how to follow these principles using Databricks, and how to develop on Databricks following the best software engineering practices. We spent the last three years working with Databricks and figuring it out as new features appeared (such as Unity Catalog, model serving, feature serving, and Databricks Asset Bundles). It was not straightforward, given the lacking documentation and the notebook-first training materials available. In this course, we share all the knowledge we gained during our journey.

Prerequisites: Python experience and basic knowledge of git and CI/CD.

Topics covered
==============

- MLOps principles and components

  - MLOps toolbelt
  - Principles behind MLOps
  - Databricks MLOps components

- Developing on Databricks

  - Developing in Python: best software development principles
  - Databricks Connect & the VS Code extension
  - Databricks Folders
  - From a notebook to production-ready code

- Databricks Asset Bundles (DAB)

  - What is DAB?
  - Asset bundle components
  - Defining complex workflows in asset bundles
  - Using private packages in asset bundles

- Git branching strategy & Databricks environments

  - Databricks' recommended approach
  - CI/CD pipeline with GitHub Actions and Asset Bundles

- MLflow experiment tracking & registering models in Unity Catalog

  - MLflow components
  - Tracking experiments & searching for experiments
  - Custom models in MLflow
  - Registering models in Unity Catalog

- Model serving architectures

  - Overview of architectures and use cases
  - Feature serving
  - Model serving (with automatic feature lookup)

- Inference tables and lakehouse monitoring

  - What are inference tables?
  - Setting up a model evaluation pipeline
  - Data/model drift detection and lakehouse monitoring
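To make the traceability principles above concrete, here is a minimal sketch of collecting the metadata that should accompany every model run: the git commit, the execution environment, and the data version. The function name ``run_metadata`` and the tag keys are illustrative, not part of the course material; in practice you would attach these values as tags on an MLflow run.

.. code-block:: python

   import platform
   import subprocess


   def run_metadata(data_version: str) -> dict:
       """Collect the traceability fields for a model run:
       git commit, environment, and data version."""
       try:
           commit = subprocess.check_output(
               ["git", "rev-parse", "HEAD"], text=True
           ).strip()
       except (subprocess.CalledProcessError, FileNotFoundError):
           # Not inside a git repo (or git is missing): record that explicitly
           # rather than silently omitting the field.
           commit = "unknown"
       return {
           "git_commit": commit,
           "python_version": platform.python_version(),
           "data_version": data_version,
       }


   tags = run_metadata(data_version="2025-09-01")
   # These tags would then be attached to the run, e.g. with MLflow:
   #     with mlflow.start_run():
   #         mlflow.set_tags(tags)
   print(tags)

Logging this dictionary on every training and deployment run is what makes it possible to answer, months later, exactly which code, environment, and data produced a given model.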