Using pre-commit hooks with uv for Data Science
1. What is a pre-commit hook?
A hook is a small script that Git runs automatically at specific points in its workflow.
A pre-commit hook runs when you type git commit, before the commit is actually created. If the hook fails, the commit is aborted.
The idea: automate checks or fixes before your code enters Git history.
Examples:
check Python code style (PEP 8),
automatically format code,
prevent committing large data files,
clean up Jupyter notebooks before committing.
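To make the idea concrete, here is a minimal sketch of a hand-written hook in Python. Git runs an executable named pre-commit from the repository's hooks directory (.git/hooks by default) and aborts the commit if that script exits with a non-zero status. The helper names and the 500 KB limit are assumptions chosen for illustration; the pre-commit tool introduced below manages hooks like this for you.

```python
"""Illustrative pre-commit hook: refuse to commit files above a size limit."""
import os
import subprocess
import sys

MAX_KB = 500  # assumed limit for this sketch


def find_large_files(paths, max_kb=MAX_KB):
    """Return the paths whose size on disk exceeds max_kb kilobytes."""
    return [
        p for p in paths
        if os.path.isfile(p) and os.path.getsize(p) > max_kb * 1024
    ]


def staged_files():
    """List files staged for the next commit (added, copied, or modified)."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def main():
    """Report oversized staged files and return the hook's exit status."""
    too_big = find_large_files(staged_files())
    for path in too_big:
        print(f"{path} is larger than {MAX_KB} KB", file=sys.stderr)
    # A non-zero exit status tells Git to abort the commit.
    return 1 if too_big else 0

# To use this as a real hook, save it as .git/hooks/pre-commit with a
# "#!/usr/bin/env python3" first line, make it executable, and end the
# file with: sys.exit(main())
```

Writing and sharing scripts like this by hand gets tedious quickly, which is the problem the pre-commit framework solves.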
2. Why should data scientists care?
When working in a team, clean code and reproducibility matter. Hooks help you:
avoid silly mistakes (typos, forgotten formatting),
keep the repo clean (notebooks without huge outputs),
save time (problems are fixed before review),
collaborate smoothly with engineers.
3. Install and initialize uv
If you don’t have uv yet, install it (this command is for macOS and Linux):
curl -LsSf https://astral.sh/uv/install.sh | sh
Then create and enter a project:
uv init my-data-project
cd my-data-project
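This creates a pyproject.toml and a few starter files, and initializes a Git repository if the directory is not already inside one. The virtual environment (.venv) does not exist yet; uv creates it the first time you run a command such as uv add or uv run.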
4. Add pre-commit with uv
Instead of pip install pre-commit, use:
uv add --dev pre-commit
This records pre-commit as a development dependency in your pyproject.toml and installs it into the project's virtual environment.
5. Initialize hooks in Git
Inside your project, run:
uv run pre-commit install
From now on, every git commit will trigger the hooks you configure.
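If you want to confirm that the hook is in place, the sketch below checks for an executable pre-commit file in the hooks directory. It assumes the default layout (.git/hooks inside the project root); repositories that set core.hooksPath or use Git worktrees keep hooks elsewhere, so treat this as an illustration. The function name is hypothetical.

```python
import os
from pathlib import Path


def hook_is_installed(repo_root="."):
    """Return True if an executable pre-commit hook exists in .git/hooks."""
    hook = Path(repo_root) / ".git" / "hooks" / "pre-commit"
    return hook.is_file() and os.access(hook, os.X_OK)


# Run from the project root; True means git commit will trigger the hooks.
print(hook_is_installed())
```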
6. Configure .pre-commit-config.yaml
At the root of your repo, create a .pre-commit-config.yaml.
Here’s a setup useful for data scientists:
repos:
  # Basic checks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-merge-conflict
      - id: check-added-large-files

  # Python code formatter (Black)
  - repo: https://github.com/psf/black
    rev: 23.9.1
    hooks:
      - id: black
        language_version: python3

  # Organize imports (isort)
  - repo: https://github.com/pycqa/isort
    rev: 5.12.0
    hooks:
      - id: isort
        # Use isort's Black-compatible profile so the two hooks don't undo each other
        args: ["--profile", "black"]

  # Clean Jupyter notebooks
  - repo: https://github.com/kynan/nbstripout
    rev: 0.6.1
    hooks:
      - id: nbstripout
What these do:
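trailing-whitespace, end-of-file-fixer: keep files tidy,
check-merge-conflict: block files that still contain merge conflict markers,
check-added-large-files: block newly added files above a size limit (500 kB by default),
black: auto-format Python code,
isort: clean and order imports,
nbstripout: remove execution outputs from Jupyter notebooks.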
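To show what stripping a notebook means in practice, here is a simplified sketch that clears outputs and execution counters from a notebook's JSON structure. It illustrates the idea only and is not nbstripout's actual implementation; the function name and the sample notebook are made up for this example.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs and counters removed."""
    cleaned = json.loads(json.dumps(notebook))  # simple deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny notebook in the nbformat 4 layout, with one executed code cell.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Analysis"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["42\n"]}
            ],
            "source": ["print(6 * 7)"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

stripped = strip_outputs(notebook)
print(json.dumps(stripped["cells"][1], indent=2))
```

The source of each cell is untouched; only the parts that change on every run are removed, which keeps diffs small and avoids committing large embedded plots or tables.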
7. Run hooks manually
By default, hooks run only on the files staged for the commit. To check every file in the repository at once:
uv run pre-commit run --all-files
8. Best practices for data scientists
Keep pre-commit in dev dependencies only.
Document in your README:
uv run pre-commit install
so teammates also enable hooks.
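Regularly update hooks (this bumps each rev in .pre-commit-config.yaml to the hook repository's latest tag, so review and commit the change):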
uv run pre-commit autoupdate
Conclusion
With uv and pre-commit hooks, you get:
faster installations than pip,
reproducible environments,
automatic code quality checks before commits.