Introduction to Bandit for Data Science
1. What is Bandit?
Bandit is a static analysis tool for Python that automatically detects potential security issues. It scans your codebase for insecure patterns such as:
hardcoded passwords,
unsafe Python functions (eval, exec, …),
weak cryptographic practices,
possible code injection.
Its main goal is to improve your code security before production.
2. Installation with uv
This guide uses uv, a fast Python package and project manager, instead of pip.
You can install Bandit as a development dependency:
uv add --dev bandit
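This records Bandit as a development dependency: it is available while you work, and production installs that skip development dependencies (for example, uv sync --no-dev) leave it out. Because uv installs Bandit into the project's virtual environment, either activate that environment or prefix the commands in this guide with uv run (for example, uv run bandit -r .).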
3. First Security Scan
To analyze a single file:
bandit my_script.py
To analyze an entire project:
bandit -r .
-r means recursive: Bandit scans all .py files in the directory and its subdirectories.
4. Practical Example
Let’s consider a file example.py:
import os
# Bad practice: storing a password in plain text
password = "1234"
# Bad practice: using exec()
code = "print('Hello World')"
exec(code)
Run Bandit:
bandit -r example.py
Expected output:
Warning about password (hardcoded secret).
Warning about exec usage (potential code injection).
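One possible way to address both warnings is sketched below. It assumes the password can be supplied through an environment variable (the name APP_PASSWORD is hypothetical) and that the dynamically executed string can be replaced by an ordinary function call; neither assumption comes from the original example.

```python
import os

# Hypothetical environment variable name, chosen only for this sketch.
ENV_VAR_NAME = "APP_PASSWORD"


def get_password() -> str:
    """Read the password from the environment instead of the source code."""
    password = os.environ.get(ENV_VAR_NAME)
    if not password:
        raise RuntimeError(f"Set the {ENV_VAR_NAME} environment variable first")
    return password


def say_hello() -> str:
    """Ordinary function that replaces exec() on a code string."""
    return "Hello World"


if __name__ == "__main__":
    print(say_hello())
```

With the secret kept out of the file and no exec() call left, re-running Bandit on this version should no longer raise the two warnings above. The secret itself still has to be managed safely outside the code.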
5. Understanding the Bandit Report
Bandit outputs a report with several fields:
Severity: issue severity (LOW, MEDIUM, HIGH).
Confidence: detection confidence (LOW, MEDIUM, HIGH).
Issue: description and recommendations.
Location: file and line of the issue.
Example:
Severity: MEDIUM
Confidence: HIGH
Issue: [B102: exec_used] Use of exec detected.
Location: example.py:7
6. Output Formats
By default, Bandit prints plain text. You can also generate JSON or HTML reports for CI/CD pipelines:
bandit -r . -f json -o report.json
bandit -r . -f html -o report.html
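The JSON report is convenient to post-process. The sketch below summarizes a report written to report.json by the first command above. It assumes the findings sit under a top-level "results" key with the fields "issue_severity", "filename", "line_number", "test_id" and "issue_text", which matches Bandit's JSON formatter as far as I know; check a report from your own Bandit version before relying on these names. The helper names are illustrative.

```python
import json
from collections import Counter
from pathlib import Path


def load_results(report_path):
    """Return the list of findings stored under the report's "results" key."""
    report = json.loads(Path(report_path).read_text(encoding="utf-8"))
    return report.get("results", [])


def count_by_severity(results):
    """Count findings per severity level (LOW, MEDIUM, HIGH)."""
    return Counter(item.get("issue_severity", "UNDEFINED") for item in results)


def format_finding(item):
    """Render one finding as 'file:line [test id] message'."""
    return "{}:{} [{}] {}".format(
        item.get("filename"),
        item.get("line_number"),
        item.get("test_id"),
        item.get("issue_text"),
    )


if __name__ == "__main__":
    report_file = Path("report.json")
    if report_file.exists():
        findings = load_results(report_file)
        print(dict(count_by_severity(findings)))
        for finding in findings:
            print(format_finding(finding))
```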
7. Why Bandit Matters in Data Science
As a data scientist, you often handle:
sensitive data (medical, financial, personal),
machine learning models that may go into production,
automated scripts for ETL or APIs.
Bandit helps you catch common security issues before you share or deploy your code, for example by:
flagging hardcoded credentials,
flagging weak or misused cryptographic functions,
scanning notebooks exported as .py files.
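Bandit reads Python source files, so a notebook has to be exported first. The standard route is jupyter nbconvert --to script; the sketch below shows the same idea using only the standard library, assuming the usual .ipynb layout (a top-level "cells" list whose entries carry "cell_type" and "source"). The function name notebook_to_script is illustrative, and the handling of IPython magics is deliberately crude.

```python
import json
from pathlib import Path


def notebook_to_script(notebook_path, script_path):
    """Write the code cells of a notebook to a .py file; return the cell count."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The source is stored either as one string or as a list of lines.
        text = source if isinstance(source, str) else "".join(source)
        lines = []
        for line in text.splitlines():
            # IPython magics and shell escapes are not valid Python: comment them out.
            if line.lstrip().startswith(("%", "!")):
                line = "# " + line
            lines.append(line)
        chunks.append("\n".join(lines))
    Path(script_path).write_text("\n\n".join(chunks) + "\n", encoding="utf-8")
    return len(chunks)
```

After the export, the resulting file can be scanned like any other script, for example with bandit analysis.py. Line numbers in the report then refer to the exported file, not to notebook cells.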
8. Integrating Bandit into a Pre-Commit Hook
To prevent committing insecure code, you can run Bandit automatically using pre-commit.
Install pre-commit:
uv add --dev pre-commit
Create a .pre-commit-config.yaml file at the root of your repo:
repos:
  - repo: https://github.com/PyCQA/bandit
    rev: 1.7.9  # use the latest stable version
    hooks:
      - id: bandit
        args: ["-r", "."]
Install the git hook:
pre-commit install
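Now Bandit runs automatically before every commit. pre-commit passes the staged Python files to the hook, but the -r . arguments in the configuration also make Bandit scan the whole repository; remove the args line if you only want the staged files checked. If Bandit finds issues, the commit is blocked until they are fixed.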
9. Integrating Bandit into CI/CD
GitHub Actions
Create .github/workflows/security.yml:
name: Security Scan
on:
  push:
    branches: [ main ]
  pull_request:
jobs:
  bandit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"
      - name: Install uv
        run: pip install uv
      - name: Install dependencies
        run: uv sync
      - name: Run Bandit
        # uv run executes Bandit from the environment created by uv sync
        run: uv run bandit -r . -f html -o bandit-report.html
      - name: Upload Bandit report
        # upload the report even when the Bandit step fails on findings
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: bandit-report
          path: bandit-report.html
GitLab CI
Create .gitlab-ci.yml:
stages:
  - security
bandit_scan:
  stage: security
  image: python:3.11
  script:
    - pip install uv
    - uv sync
    # uv run executes Bandit from the environment created by uv sync
    - uv run bandit -r . -f html -o bandit-report.html
  artifacts:
    paths:
      - bandit-report.html
    when: always
    expire_in: 1 week
This will generate a Bandit report as a downloadable artifact in GitLab.
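In either pipeline, a team may want the job to fail only for findings above a chosen severity. Bandit's own -l, -ll and -lll options cover the simple case; the sketch below shows the same gating logic in Python for pipelines that post-process a JSON report. It assumes a report produced with -f json whose findings sit under "results" with an "issue_severity" field. The threshold default and the function names are illustrative choices, not part of Bandit.

```python
import json
import sys
from pathlib import Path

# Ordering used to compare severity labels.
SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}


def blocking_findings(report, threshold="MEDIUM"):
    """Return the findings whose severity is at or above the threshold."""
    minimum = SEVERITY_RANK[threshold]
    return [
        item
        for item in report.get("results", [])
        if SEVERITY_RANK.get(str(item.get("issue_severity", "")).upper(), 0) >= minimum
    ]


def exit_code(report, threshold="MEDIUM"):
    """Return 1 when at least one finding should fail the job, else 0."""
    return 1 if blocking_findings(report, threshold) else 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python gate.py report.json  (gate.py is a hypothetical file name)
    loaded = json.loads(Path(sys.argv[1]).read_text(encoding="utf-8"))
    sys.exit(exit_code(loaded))
```

Keeping the threshold in one place makes it easy to start at HIGH on a legacy codebase and tighten it later.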
10. Best Practices with Bandit
Run Bandit early in development (shift-left security).
Include Bandit scans in pre-commit hooks.
Integrate Bandit into your CI/CD pipeline (GitHub or GitLab).
Review warnings carefully instead of ignoring them.
Store reports (HTML/JSON) as artifacts for easy review.
Conclusion
Bandit is a lightweight and powerful tool for Python security analysis.
Installing it with uv keeps dependencies clean and isolated.
Integrating Bandit with pre-commit helps keep flagged code out of git history.
Adding Bandit to CI/CD pipelines provides automated and repeatable security checks.