Introduction to Bandit for Data Science

1. What is Bandit?

Bandit is a static analysis tool for Python that automatically detects potential security issues. It scans your codebase for insecure patterns such as:

  • hardcoded passwords,

  • unsafe Python functions (eval, exec, …),

  • weak cryptographic practices,

  • possible code injection.

Its main goal is to surface these problems before the code reaches production.
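To make "static analysis" concrete, here is a minimal sketch of the general idea: parse the source into a syntax tree and flag direct calls to eval or exec without ever running the code. This only illustrates the technique and is not how Bandit itself is implemented; the name find_risky_calls is hypothetical.

```python
import ast

RISKY_BUILTINS = {"eval", "exec"}


def find_risky_calls(source: str) -> list[tuple[str, int]]:
    """Return (function name, line number) for each direct eval/exec call."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Only direct calls by name are matched, e.g. exec(code)
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in RISKY_BUILTINS
        ):
            findings.append((node.func.id, node.lineno))
    # ast.walk does not guarantee source order, so sort by line number
    return sorted(findings, key=lambda item: item[1])


sample = "x = 1\nexec('print(x)')\ny = eval('x + 1')\n"
print(find_risky_calls(sample))  # [('exec', 2), ('eval', 3)]
```

The sample string is only parsed, never executed, which is what makes this kind of check safe to run on untrusted code.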

2. Installation with uv

This guide uses uv, a fast Python package and project manager, instead of pip. Install Bandit as a development dependency:

uv add --dev bandit

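This keeps Bandit available during development while leaving it out of production installs that skip the dev group (for example, uv sync --no-dev). The commands below assume the project's virtual environment is active; otherwise, prefix them with uv run (for example, uv run bandit -r .).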

3. First Security Scan

To analyze a single file:

bandit my_script.py

To analyze an entire project:

bandit -r .
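
  • -r means recursive: Bandit scans all .py files in the directory and its subdirectories. That includes a local .venv folder if one exists; exclude it with -x ./.venv.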

4. Practical Example

Let’s consider a file example.py:

import os

# Bad practice: storing a password in plain text
password = "1234"

# Bad practice: using exec()
code = "print('Hello World')"
exec(code)

Run Bandit:

bandit example.py

Expected output:

  • B105 (hardcoded_password_string): warning about the password assignment (hardcoded secret).

  • B102 (exec_used): warning about the exec call (potential code injection).
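One way to address both warnings, as a sketch: read the secret from the environment and call the code directly instead of going through exec. The environment variable name APP_PASSWORD and the helper names load_password and greet are hypothetical, chosen only for this illustration.

```python
import os


def load_password(env_var: str = "APP_PASSWORD") -> str:
    """Read the password from an environment variable (the name is illustrative)."""
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return value


def greet() -> str:
    """Ordinary function replacing exec("print('Hello World')")."""
    return "Hello World"


if __name__ == "__main__":
    print(greet())
```

The secret now lives outside the source file, and no string is ever executed as code.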

5. Understanding the Bandit Report

Bandit outputs a report with several fields:

  • Severity: issue severity (LOW, MEDIUM, HIGH).

  • Confidence: detection confidence (LOW, MEDIUM, HIGH).

  • Issue: description and recommendations.

  • Location: file and line of the issue.

Example:

Severity: MEDIUM
Confidence: HIGH
Issue: [B102:exec_used] Use of exec detected.
Location: example.py:8

6. Output Formats

By default, Bandit prints plain text. You can also generate JSON or HTML reports for CI/CD pipelines:

bandit -r . -f json -o report.json
bandit -r . -f html -o report.html
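The JSON report is convenient to post-process. The sketch below counts findings per severity; it assumes the report has a top-level "results" list whose entries carry an "issue_severity" field, and the inline sample data is made up for illustration. The function name count_by_severity is hypothetical.

```python
import json
from collections import Counter


def count_by_severity(report: dict) -> Counter:
    """Count findings per severity in a parsed Bandit JSON report."""
    return Counter(item["issue_severity"] for item in report.get("results", []))


# Inline sample shaped like a Bandit JSON report (values are illustrative)
sample_report = json.loads("""
{"results": [
  {"test_id": "B102", "issue_severity": "MEDIUM", "issue_confidence": "HIGH",
   "filename": "example.py", "line_number": 8},
  {"test_id": "B105", "issue_severity": "LOW", "issue_confidence": "MEDIUM",
   "filename": "example.py", "line_number": 4}
]}
""")
print(dict(count_by_severity(sample_report)))
```

For a real run, load report.json with json.load instead of the inline sample.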

7. Why Bandit Matters in Data Science

As a data scientist, you often handle:

  • sensitive data (medical, financial, personal),

  • machine learning models that may go into production,

  • automated scripts for ETL or APIs.

Bandit helps you catch common security issues before you share or deploy your code. For example, it can:

  • flag hardcoded credentials,

  • warn about weak or misused cryptographic functions,

  • scan notebooks once they are exported to .py files.
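Bandit reads Python source files, so a notebook has to be exported first. The usual route is jupyter nbconvert --to script; the sketch below shows the same idea with only the standard library, assuming the common .ipynb layout (a "cells" list with "cell_type" and "source" fields). The function name notebook_code is hypothetical, and the magic-line handling is deliberately naive.

```python
def notebook_code(notebook: dict) -> str:
    """Join the code cells of a parsed .ipynb, commenting out IPython magics."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines
        text = source if isinstance(source, str) else "".join(source)
        lines = []
        for line in text.splitlines():
            # %magic and !shell lines are not valid Python, so comment them out
            if line.lstrip().startswith(("%", "!")):
                line = "# " + line
            lines.append(line)
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"


sample = {"cells": [
    {"cell_type": "markdown", "source": ["# Title"]},
    {"cell_type": "code", "source": ["%matplotlib inline\n", "password = '1234'"]},
]}
print(notebook_code(sample))
```

Write the returned text to a .py file (after loading the notebook with json.load) and point Bandit at that file.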

8. Integrating Bandit into a Pre-Commit Hook

To prevent committing insecure code, you can run Bandit automatically using pre-commit.

  1. Install pre-commit:

uv add --dev pre-commit

  2. Create a .pre-commit-config.yaml file at the root of your repo:

repos:
  - repo: https://github.com/PyCQA/bandit
    rev: 1.7.9   # use the latest stable version
    hooks:
      - id: bandit
        args: ["-r", "."]

  3. Install the git hook:

pre-commit install

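Now Bandit runs automatically before every commit that touches Python files. With the args shown above it scans the whole repository, not only the staged files; remove the args line to limit the scan to the staged files that pre-commit passes in. If Bandit reports issues, the commit is blocked until they are fixed.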

9. Integrating Bandit into CI/CD

GitHub Actions

Create .github/workflows/security.yml:

name: Security Scan

on:
  push:
    branches: [ main ]
  pull_request:

jobs:
  bandit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install uv
        run: pip install uv

      - name: Install dependencies
        run: uv sync

      - name: Run Bandit
        # uv sync installs into .venv, so run through uv and skip that folder
        run: uv run bandit -r . -x ./.venv -f html -o bandit-report.html

      - name: Upload Bandit report
        # Bandit exits non-zero when it finds issues; upload the report anyway
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: bandit-report
          path: bandit-report.html

GitLab CI

Create .gitlab-ci.yml:

stages:
  - security

bandit_scan:
  stage: security
  image: python:3.11
  script:
    - pip install uv
    - uv sync
    - uv run bandit -r . -x ./.venv -f html -o bandit-report.html
  artifacts:
    paths:
      - bandit-report.html
    when: always
    expire_in: 1 week

This will generate a Bandit report as a downloadable artifact in GitLab.
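If the pipeline should fail only on the most serious findings, a small gate script can decide the exit code. The sketch below assumes a JSON report (produced with -f json) whose "results" entries carry an "issue_severity" field. The names SEVERITY_RANK and exit_code are hypothetical. Bandit's own -l, -ll and -lll flags offer built-in severity filtering and may be enough on their own.

```python
SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}


def exit_code(report: dict, threshold: str = "HIGH") -> int:
    """Return 1 if any finding is at or above the threshold severity, else 0."""
    minimum = SEVERITY_RANK[threshold]
    for item in report.get("results", []):
        # Unknown or missing severities rank as 0 and never block the build
        if SEVERITY_RANK.get(item.get("issue_severity", ""), 0) >= minimum:
            return 1
    return 0


# Illustrative report with a single MEDIUM finding
sample_report = {"results": [{"test_id": "B102", "issue_severity": "MEDIUM"}]}
print(exit_code(sample_report))            # 0: nothing at HIGH
print(exit_code(sample_report, "MEDIUM"))  # 1: MEDIUM meets the threshold
```

In a CI job, the script would load the report with json.load and pass the result to sys.exit.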

10. Best Practices with Bandit

  • Run Bandit early in development (shift-left security).

  • Include Bandit scans in pre-commit hooks.

  • Integrate Bandit into your CI/CD pipeline (GitHub or GitLab).

  • Review warnings carefully instead of ignoring them.

  • Store reports (HTML/JSON) as artifacts for easy review.

Conclusion

  • Bandit is a lightweight and powerful tool for Python security analysis.

  • Installing it as a uv development dependency keeps it out of production installs.

  • Integrating Bandit with pre-commit helps keep flagged code out of git history, although hooks can be skipped and Bandit only detects the patterns it knows about.

  • Adding Bandit to CI/CD pipelines provides automated and repeatable security checks.