Using pre-commit hooks with uv for Data Science

1. What is a pre-commit hook?

A hook is a small script that Git runs automatically at certain points in its workflow. A pre-commit hook runs after you type git commit but before the commit is recorded; if the hook fails, the commit is aborted.

The idea: automate checks or fixes before your code enters Git history.

Examples:

  • check Python code style (PEP 8),

  • automatically format code,

  • prevent committing large data files,

  • clean up Jupyter notebooks before committing.
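
To make the idea concrete, here is a minimal sketch in Python of the contract a hook follows: it is handed some file names, runs a check, and returns a non-zero exit code to block the commit. The pre-commit framework used later in this guide passes the staged file names to each hook in this way. The names find_oversized and run_hook and the 500 kB limit are assumptions made for this illustration, not part of any library.

```python
import os
import tempfile

MAX_KB = 500  # hypothetical size limit chosen for this sketch


def find_oversized(paths, max_kb=MAX_KB):
    """Return the paths whose size on disk exceeds max_kb kilobytes."""
    return [p for p in paths if os.path.getsize(p) > max_kb * 1024]


def run_hook(paths, max_kb=MAX_KB):
    """Return a hook-style exit code: 0 allows the commit, 1 blocks it."""
    too_big = find_oversized(paths, max_kb)
    for path in too_big:
        print(f"{path}: larger than {max_kb} kB, commit blocked")
    return 1 if too_big else 0


# Demo on two temporary files standing in for staged files.
with tempfile.TemporaryDirectory() as tmp:
    small = os.path.join(tmp, "clean.py")
    big = os.path.join(tmp, "raw_data.csv")
    with open(small, "w") as f:
        f.write("x = 1\n")
    with open(big, "wb") as f:
        f.write(b"0" * (600 * 1024))  # 600 kB, over the limit
    small_only = run_hook([small])
    with_big = run_hook([small, big])

print(small_only, with_big)
```

The demo prints one "commit blocked" line for raw_data.csv, followed by the two exit codes, 0 and 1. In practice you rarely write such scripts yourself; ready-made hooks cover most cases.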

2. Why should data scientists care?

When working in a team, clean code and reproducibility matter. Hooks help you:

  • avoid silly mistakes (typos, forgotten formatting),

  • keep the repo clean (notebooks without huge outputs),

  • save time (problems are fixed before review),

  • collaborate smoothly with engineers.

3. Install and initialize uv

If you don’t have uv yet, install it (the command below is for macOS and Linux):

curl -LsSf https://astral.sh/uv/install.sh | sh

Then create and enter a project:

uv init my-data-project
cd my-data-project

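This creates a pyproject.toml along with a few starter files. The virtual environment (.venv) is not created yet; uv builds it the first time you run a command such as uv add, uv sync or uv run. If the folder is not already inside a Git repository, uv init also initializes one, which the hooks below need.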

4. Add pre-commit with uv

Instead of pip install pre-commit, use:

uv add --dev pre-commit

This records pre-commit as a development dependency in your pyproject.toml and installs it into the project's virtual environment.

5. Initialize hooks in Git

Inside your project, run:

uv run pre-commit install

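From now on, every git commit runs the configured hooks on the files you have staged. If a hook fails or modifies a file, the commit is stopped; review the changes, stage them again with git add, and commit again.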
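
Under the hood, pre-commit install writes a small script to .git/hooks/pre-commit, which Git executes on each commit. A minimal sketch for checking that this file exists, assuming Git's default hooks directory; the function name hook_installed is made up for this example, and the demo uses a throwaway folder rather than a real repository.

```python
import tempfile
from pathlib import Path


def hook_installed(repo_root):
    """Return True if a pre-commit script exists in the default hooks folder."""
    return (Path(repo_root) / ".git" / "hooks" / "pre-commit").is_file()


# Demo on a temporary directory laid out like a repository.
with tempfile.TemporaryDirectory() as tmp:
    hooks_dir = Path(tmp) / ".git" / "hooks"
    hooks_dir.mkdir(parents=True)
    before = hook_installed(tmp)
    # Stand-in for the script that `pre-commit install` would write.
    (hooks_dir / "pre-commit").write_text("#!/bin/sh\n")
    after = hook_installed(tmp)

print(before, after)
```

In a real project you would call hook_installed(".") from the repo root; a False result usually means a teammate cloned the repo but has not run the install command yet.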

6. Configure .pre-commit-config.yaml

At the root of your repo, create a .pre-commit-config.yaml. Here’s a setup useful for data scientists:

repos:
  # Basic checks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-merge-conflict
      - id: check-added-large-files

  # Python code formatter (Black)
  - repo: https://github.com/psf/black
    rev: 23.9.1
    hooks:
      - id: black
        language_version: python3

  # Organize imports (isort); the black profile keeps its output compatible with Black
  - repo: https://github.com/pycqa/isort
    rev: 5.12.0
    hooks:
      - id: isort
        args: ["--profile", "black"]

  # Clean Jupyter notebooks
  - repo: https://github.com/kynan/nbstripout
    rev: 0.6.1
    hooks:
      - id: nbstripout

What these do:

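  • trailing-whitespace, end-of-file-fixer: strip trailing spaces and make sure each file ends with a single newline,

  • check-merge-conflict: block files that still contain merge conflict markers,

  • check-added-large-files: block newly added files above a size limit (500 kB by default),

  • black: auto-format Python code,

  • isort: sort and group imports,

  • nbstripout: remove execution outputs from Jupyter notebooks.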
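
To illustrate what stripping a notebook means, here is a simplified sketch that clears outputs and execution counts from a notebook held as a Python dict. It is not nbstripout's actual implementation (the real tool handles more, such as metadata); the function name strip_outputs and the sample notebook are assumptions for this example.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy via JSON round trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Hypothetical one-cell notebook in the nbformat 4 layout.
nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "source": ["print(1 + 1)"],
            "metadata": {},
            "execution_count": 3,
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        }
    ],
}

clean = strip_outputs(nb)
print(clean["cells"][0]["outputs"], clean["cells"][0]["execution_count"])
```

The source of each cell is untouched; only the results of running it are dropped, which keeps diffs small and avoids committing plots or data previews by accident.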

7. Run hooks manually

To run every hook against all files in the repo, not only the staged ones:

uv run pre-commit run --all-files

8. Best practices for data scientists

  • Keep pre-commit in dev dependencies only.

  • Document the setup step in your README so teammates also enable hooks after cloning:

    uv run pre-commit install

  • Regularly update hook versions; this rewrites each rev in .pre-commit-config.yaml to the latest tag:

    uv run pre-commit autoupdate

Conclusion

With uv and pre-commit hooks, you get:

  • faster installations than pip,

  • reproducible environments,

  • automatic code quality checks before commits.