Introduction to Bandit for Data Science

1. What is Bandit?

Bandit is a static analysis tool for Python that automatically detects potential security issues. It scans your codebase for insecure patterns such as:

hardcoded passwords,
unsafe Python functions (eval, exec, …),
weak cryptographic practices,
possible code injection.

Its main goal is to help you find and fix security issues before your code reaches production.

2. Installation with `uv`

This guide uses uv (a modern, fast Python package manager) instead of pip. You can install Bandit as a development dependency:

uv add --dev bandit

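This records Bandit in the project's development dependencies, so it is available while you develop and is left out of installs that skip dev dependencies (for example, uv sync --no-dev). The commands in the following sections assume the project's virtual environment is activated; otherwise, prefix them with uv run (for example, uv run bandit -r .).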

3. First Security Scan

To analyze a single file:

bandit my_script.py

To analyze an entire project:

bandit -r .

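-r means recursive: Bandit scans all .py files in the directory and its subdirectories. This includes a virtual environment stored inside the project (such as .venv); you can skip it with the -x option, for example bandit -r . -x ./.venv.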

4. Practical Example

Let’s consider a file example.py:

import os

# Bad practice: storing a password in plain text
password = "1234"

# Bad practice: using exec()
code = "print('Hello World')"
exec(code)

Run Bandit:

bandit example.py

Expected output:

A low-severity warning about the hardcoded password string (B105).
A medium-severity warning about the use of exec (B102, potential code injection).
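One possible way to fix both findings is sketched below. It assumes the secret can be supplied through an environment variable (the name APP_PASSWORD is hypothetical) and that the dynamically executed string can be replaced by an ordinary function; neither assumption comes from the original example.

```python
import os


def get_password() -> str:
    """Read the password from the environment instead of the source code.

    APP_PASSWORD is a hypothetical variable name chosen for this sketch.
    """
    password = os.environ.get("APP_PASSWORD")
    if password is None:
        raise RuntimeError("APP_PASSWORD is not set")
    return password


def greet() -> str:
    """Plain function that replaces running a code string with exec()."""
    return "Hello World"


print(greet())
```

With no password literal and no exec call left in the file, the two warnings above should no longer be reported.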

5. Understanding the Bandit Report

Bandit outputs a report with several fields:

Severity: issue severity (LOW, MEDIUM, HIGH).
Confidence: detection confidence (LOW, MEDIUM, HIGH).
Issue: test ID, test name, and a short description of the problem.
Location: file and line of the issue.

Example:

Severity: MEDIUM
Confidence: HIGH
Issue: [B102:exec_used] Use of exec detected.
Location: example.py:8

6. Output Formats

By default, Bandit prints a plain-text report to the terminal. It can also write JSON reports (convenient for scripts and CI/CD pipelines) or HTML reports (convenient for human review):

bandit -r . -f json -o report.json
bandit -r . -f html -o report.html
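The JSON report is convenient to post-process. As a minimal sketch, the snippet below counts findings per severity level. It assumes the report has a top-level results list whose entries carry an issue_severity field, as in recent Bandit releases; the sample data is hand-written for illustration (not real Bandit output), and the function name count_by_severity is hypothetical.

```python
import json
from collections import Counter


def count_by_severity(report: dict) -> Counter:
    """Count findings per severity level in a parsed Bandit JSON report."""
    return Counter(item["issue_severity"] for item in report.get("results", []))


# Hand-written sample that mimics the assumed report shape; not real output.
sample_report = json.loads("""
{
  "results": [
    {"test_id": "B105", "issue_severity": "LOW",
     "filename": "example.py", "line_number": 4},
    {"test_id": "B102", "issue_severity": "MEDIUM",
     "filename": "example.py", "line_number": 8}
  ]
}
""")

counts = count_by_severity(sample_report)
print(dict(counts))
```

For a real report, load report.json with json.load and pass the resulting dictionary to the same function.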

7. Why Bandit Matters in Data Science

As a data scientist, you often handle:

sensitive data (medical, financial, personal),
machine learning models that may go into production,
automated scripts for ETL or APIs.

Bandit helps you catch common security issues before you share or deploy your code. Typical uses include:

spotting hardcoded credentials,
flagging weak or misused cryptographic functions,
scanning notebooks once they are exported to .py files (Bandit does not read .ipynb files directly).
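Notebooks have to be converted to plain Python before Bandit can scan them. The sketch below shows one way to do this with the standard library only. It assumes the notebook follows the nbformat 4 layout (a top-level cells list with cell_type and source fields); the function name notebook_to_script and the sample notebook are hypothetical.

```python
def notebook_to_script(notebook: dict) -> str:
    """Concatenate the code cells of a parsed .ipynb (nbformat 4) notebook."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The source is stored either as one string or as a list of lines.
        text = source if isinstance(source, str) else "".join(source)
        # Comment out IPython magics and shell escapes: they are not valid Python.
        lines = [
            "# " + line if line.lstrip().startswith(("%", "!")) else line
            for line in text.splitlines()
        ]
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"


# Hand-written sample notebook used only for illustration.
sample_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["%matplotlib inline\n", "x = 1"]},
        {"cell_type": "code", "source": "print(x)"},
    ]
}

print(notebook_to_script(sample_notebook))
```

In practice you would load the .ipynb file with json.load, write the returned text to a .py file, and run Bandit on that file. Exporting the notebook from Jupyter as a script achieves the same result.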

8. Integrating Bandit into a Pre-Commit Hook

To prevent committing insecure code, you can run Bandit automatically using pre-commit.

Install pre-commit:

uv add --dev pre-commit

Create a .pre-commit-config.yaml file at the root of your repo:

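repos:
  - repo: https://github.com/PyCQA/bandit
    rev: 1.7.9   # pin a released tag and update it as new versions are published
    hooks:
      # pre-commit passes the staged Python files to Bandit, so no path arguments are needed
      - id: bandit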

Install the git hook:

pre-commit install

Bandit now runs automatically on the staged Python files before every commit. If it reports issues, the commit is aborted until you fix them.

9. Integrating Bandit into CI/CD

GitHub Actions

Create .github/workflows/security.yml:

name: Security Scan

on:
  push:
    branches: [ main ]
  pull_request:

jobs:
  bandit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install uv
        run: pip install uv

      - name: Install dependencies
        run: uv sync

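      - name: Run Bandit
        # Run through uv so the Bandit installed by "uv sync" is used; skip the virtual environment
        run: uv run bandit -r . -x ./.venv -f html -o bandit-report.html

      - name: Upload Bandit report
        # Upload the report even when Bandit finds issues and fails the job
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: bandit-report
          path: bandit-report.html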

GitLab CI

Create .gitlab-ci.yml:

stages:
  - security

bandit_scan:
  stage: security
  image: python:3.11
  script:
    - pip install uv
    - uv sync
    - uv run bandit -r . -x ./.venv -f html -o bandit-report.html
  artifacts:
    paths:
      - bandit-report.html
    when: always
    expire_in: 1 week

The Bandit report is kept as a downloadable job artifact in GitLab for one week, even when the scan fails (when: always).

10. Best Practices with Bandit

Run Bandit early in development (shift-left security).
Include Bandit scans in pre-commit hooks.
Integrate Bandit into your CI/CD pipeline (GitHub or GitLab).
Review warnings carefully instead of ignoring them.
Store reports (HTML/JSON) as artifacts for easy review.

Conclusion

Bandit is a lightweight and powerful tool for Python security analysis.
Installing it with uv keeps dependencies clean and isolated.
Integrating Bandit with pre-commit helps keep insecure code out of your git history.
Adding Bandit to CI/CD pipelines provides automated and repeatable security checks.