# Getting Started with Devbox for Data Science

## 1. What is Devbox and Why Use It?

* **Devbox** is a command-line tool for creating **isolated, reproducible, and portable development environments** without relying on Docker or virtual machines.
* It is powered by Nix, but you do not need to learn the Nix language to use it.
* For a **data scientist**, this means:
  * Easy sharing of dependencies (Python, ML tooling, etc.) through a single `devbox.json` file.
  * Consistency across machines: every teammate works with the same versions.
  * The ability to experiment with tools without polluting your global system.

## 2. Installing Devbox

1. Open your terminal (Linux, macOS, or Windows with WSL2).
2. Run the following command as a non-root user:

   ```bash
   curl -fsSL https://get.jetify.com/devbox | bash
   ```

3. If Nix is not installed, Devbox will install it automatically.

## 3. Creating Your First Devbox Environment

1. Initialize Devbox in your project directory:

   ```bash
   devbox init
   ```

   This generates a `devbox.json` file.

2. Search for packages:

   ```bash
   devbox search uv
   ```

3. Add Python and **uv** (a fast Python package installer and environment manager). Use whichever Python version your project targets:

   ```bash
   devbox add python@3.12 uv
   ```

   Python libraries such as pandas are installed with `uv` in step 5 rather than added as Devbox packages.

4. Start your environment:

   ```bash
   devbox shell
   ```

5. Create and manage a Python environment with `uv`, then install your data science libraries into it:

   ```bash
   uv venv .venv
   source .venv/bin/activate
   uv pip install pandas numpy scikit-learn
   ```

6. Exit the environment:

   ```bash
   exit
   ```

## 4. Automating with Scripts and Hooks

You can customize workflows in `devbox.json`:

```json
{
  "packages": [
    "python@3.12",
    "uv"
  ],
  "shell": {
    "init_hook": [
      "uv pip install -r requirements.txt"
    ],
    "scripts": {
      "run-notebook": "uv run jupyter notebook"
    }
  }
}
```

* **init\_hook**: runs commands each time you enter `devbox shell`. The hook shown here assumes that a `.venv` and a `requirements.txt` already exist in the project, because `uv pip install` needs a virtual environment to install into.
* **scripts**: custom commands launched with:

  ```bash
  devbox run run-notebook
  ```

For example, the `init_hook` above installs dependencies automatically when you enter the environment, and the `run-notebook` script starts Jupyter Notebook on demand.

## 5. Managing Global Packages

Some tools are useful across all projects (like `git` or `ripgrep`). Install them globally:

```bash
devbox global add ripgrep git
```

To make them available outside of `devbox shell`, add this to your `.bashrc` or `.zshrc`:

```bash
eval "$(devbox global shellenv --init-hook)"
```

## 6. Best Practices for Data Scientists

| Step | Recommendation |
| --- | --- |
| 1 | Always version-control `devbox.json` and `devbox.lock`. |
| 2 | Use `uv` to manage Python environments and package installations. |
| 3 | Use **init\_hook** to automate dependency installation with `uv pip`. |
| 4 | Use **scripts** for tasks like launching Jupyter or running tests. |
| 5 | Leverage `devbox global` for tools you need everywhere. |
| 6 | Explore **Devbox templates** (like Jupyter setups) to kickstart projects quickly. |

## Summary

Using **Devbox** with **uv** gives you a fast, reproducible way to manage Python environments for data science. Devbox pins the tools in `devbox.json` and `devbox.lock`, while `uv` handles the Python packages, so setups stay consistent across projects and machines.
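
To make the `devbox.json` layout from section 4 more concrete, here is a minimal Python sketch that loads such a file and lists its packages, init hook commands, and scripts. The helper name `summarize_devbox_config` and the `SAMPLE` values are illustrative assumptions and are not part of Devbox; only the `packages`, `shell.init_hook`, and `shell.scripts` keys come from the section 4 example. It also assumes `packages` is written as a list.

```python
import json


def summarize_devbox_config(text: str) -> dict:
    """Return the packages, init hook commands, and scripts from devbox.json text.

    Hypothetical helper for illustration only; Devbox does not ship it.
    Assumes "packages" is a JSON list, as in the section 4 example.
    """
    config = json.loads(text)
    shell = config.get("shell", {})

    # Accept a single command string as well as a list of commands.
    hook = shell.get("init_hook", [])
    if isinstance(hook, str):
        hook = [hook]

    return {
        "packages": list(config.get("packages", [])),
        "init_hook": list(hook),
        "scripts": dict(shell.get("scripts", {})),
    }


# Sample content in the same shape as the section 4 example (assumed values).
SAMPLE = """
{
  "packages": ["uv"],
  "shell": {
    "init_hook": ["uv pip install -r requirements.txt"],
    "scripts": {"run-notebook": "uv run jupyter notebook"}
  }
}
"""

if __name__ == "__main__":
    summary = summarize_devbox_config(SAMPLE)
    print("Packages:", ", ".join(summary["packages"]))
    for command in summary["init_hook"]:
        print("Runs on shell entry:", command)
    for name, command in summary["scripts"].items():
        print(f"devbox run {name} ->", command)
```

In a real project you could pass the contents of your own `devbox.json` to the helper instead of `SAMPLE`, for instance to review in a pull request which commands run on shell entry.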