Getting Started with Devbox for Data Science

1. What is Devbox and Why Use It?

  • Devbox is a command-line tool that allows you to create isolated, reproducible, and portable development environments without relying on Docker or virtual machines.

  • It is powered by Nix, but you do not need to learn the Nix language to use it.

  • For a data scientist, this means:

    • Easy sharing of dependencies (Python, ML libraries, etc.) in a single devbox.json file.

    • Consistency across machines: every teammate works with the same versions.

    • Ability to experiment with tools without polluting your global system.

2. Installing Devbox

  1. Open your terminal (Linux, macOS, or Windows with WSL2).

  2. Run the following command as a non-root user:

    curl -fsSL https://get.jetify.com/devbox | bash
    
  3. If Nix is not installed, Devbox will install it automatically.

3. Creating Your First Devbox Environment

  1. Initialize Devbox in your project directory:

    devbox init
    

    This generates a devbox.json file.

  2. Search for packages:

    devbox search uv
    
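  3. Add Python and uv (a fast Python package installer and environment manager):

    devbox add python uv

    Python libraries such as pandas are installed with uv in step 5 rather than with devbox add: Devbox packages are Nix packages, and Python libraries are generally not published there under their plain PyPI names.

  4. Start your environment:

    devbox shell

  5. Create and manage a Python environment with uv:

    uv venv .venv
    source .venv/bin/activate
    uv pip install pandas numpy scikit-learn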
    
  6. Exit the environment:

    exit
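
Before leaving the shell in step 6, it can help to confirm that the environment really contains what step 5 installed. The following is a minimal sketch, not part of Devbox or uv: the file name check_env.py and the helper names missing_modules and in_virtualenv are hypothetical, and the module list assumes the packages from step 5.

```python
"""Smoke test for the environment built above (hypothetical file: check_env.py)."""
import importlib.util
import sys


def missing_modules(module_names):
    """Return the top-level module names the current interpreter cannot find."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment such as .venv."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    # Import names, not PyPI names: scikit-learn is imported as "sklearn".
    required = ["numpy", "sklearn"]
    missing = missing_modules(required)
    print(f"interpreter: {sys.executable}")
    print(f"virtual environment active: {in_virtualenv()}")
    if missing:
        # A string argument to sys.exit is printed to stderr with exit status 1.
        sys.exit(f"missing modules: {', '.join(missing)}")
    print("all required modules found")
```

Run it inside devbox shell with the virtual environment activated, for example with python check_env.py.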
    

4. Automating with Scripts and Hooks

You can customize workflows in devbox.json:

    {
      "packages": [
        "python@latest",
        "uv@latest"
      ],
      "shell": {
        "init_hook": [
          "uv pip install -r requirements.txt"
        ],
        "scripts": {
          "run-notebook": "uv run jupyter notebook"
        }
      }
    }

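  • init_hook: commands that run each time you enter devbox shell, and before any devbox run script. The hook above assumes that a requirements.txt file and a .venv directory (created with uv venv) already exist in the project.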

  • scripts: custom commands launched with:

    devbox run run-notebook
    

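In this example, the init_hook installs the dependencies listed in requirements.txt every time you enter the environment, and devbox run run-notebook starts Jupyter Notebook on demand (Jupyter must be among the installed dependencies).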
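
To see what a given devbox.json will do before entering the shell, the file can be inspected with a few lines of Python. This is a sketch under assumptions: the helper name summarize_devbox_config and the file name summarize_devbox.py are hypothetical, and the code assumes the file is plain JSON (the standard json module rejects comments).

```python
"""Summarize a devbox.json file (hypothetical helper: summarize_devbox.py)."""
import json
import sys


def summarize_devbox_config(text):
    """Return the packages, init_hook commands, and script names found in devbox.json text."""
    config = json.loads(text)
    shell = config.get("shell", {})
    hook = shell.get("init_hook", [])
    # Defensive assumption: accept a single command string as well as a list of commands.
    if isinstance(hook, str):
        hook = [hook]
    return {
        # list() yields the names whether "packages" is a list or a name-to-version mapping.
        "packages": list(config.get("packages", [])),
        "init_hook": list(hook),
        "scripts": sorted(shell.get("scripts", {})),
    }


if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "devbox.json"
    with open(path, encoding="utf-8") as handle:
        summary = summarize_devbox_config(handle.read())
    for key, values in summary.items():
        print(f"{key}: {', '.join(values) or '(none)'}")
```

Run from the project directory, it prints one line each for the packages, the hook commands, and the script names that devbox run accepts.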

5. Managing Global Packages

Some tools are useful across all projects (like git or ripgrep). Install them globally:

    devbox global add ripgrep git

To make them available outside of devbox shell, add this to your .bashrc or .zshrc:

    eval "$(devbox global shellenv --init-hook)"

6. Best Practices for Data Scientists

| Step | Recommendation |
| - | - |
| 1 | Always version-control devbox.json and devbox.lock. |
| 2 | Use uv to manage Python environments and package installations. |
| 3 | Use init_hook to automate dependency installation with uv pip. |
| 4 | Use scripts for tasks like launching Jupyter or running tests. |
| 5 | Leverage devbox global for tools you need everywhere. |
| 6 | Explore Devbox templates (like Jupyter setups) to kickstart projects quickly. |
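
Recommendation 1 can be checked mechanically, for instance before a commit. The sketch below only tests that both files are present in a directory; it does not ask Git whether they are tracked. The helper name missing_devbox_files is hypothetical.

```python
"""Check that the Devbox files recommended for version control are present."""
import sys
from pathlib import Path


def missing_devbox_files(project_dir="."):
    """Return which of devbox.json and devbox.lock are absent from project_dir."""
    expected = ("devbox.json", "devbox.lock")
    root = Path(project_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_devbox_files(sys.argv[1] if len(sys.argv) > 1 else ".")
    if missing:
        # Exit with a non-zero status so the check can gate a commit or CI step.
        sys.exit(f"missing: {', '.join(missing)}")
    print("devbox.json and devbox.lock are present")
```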

Summary

Using Devbox with uv gives you a fast, reproducible way to manage Python environments for data science. Devbox pins the system-level tools in devbox.json and devbox.lock, while uv quickly installs Python packages into a project-local virtual environment, so every machine and teammate gets the same setup.