CI/CD Linter and Policy Scanner DevSecOps

CI/CD is the backbone of modern software delivery, yet its configuration is too often treated as an afterthought when it comes to validation and security.

We lint our code.
We test our deployments.
But our GitHub Actions YAMLs? They're often pushed and merged with minimal checks.

This blog explores how I designed and built a modular, policy-driven CI/CD YAML validator using:

Python for custom linting
Open Policy Agent (OPA) and Rego for structured rule evaluation
GitHub Actions for automation and CI integration

💡 The Motivation

I've worked with teams that heavily rely on GitHub Actions, and I’ve repeatedly seen the same misconfigurations cause issues:

Using @latest or unpinned versions of actions
Omitting permissions: fields, so the GITHUB_TOKEN falls back to the repository's default scopes, which may be broad read/write access
Hardcoded secrets in environment variables
Using unsafe workflow commands like set-env, which GitHub deprecated in 2020 because of an injection vulnerability

These are small mistakes, but they can lead to serious problems — from leaking credentials to introducing backdoors. Code reviewers often miss them, especially when they're tucked away in YAML files.

So I asked myself:

“What if we could treat our CI/CD configuration like code — with linting, policy enforcement, and test coverage?”

That’s where this project began.

🎯 The Vision

Build a lightweight, extendable scanner that can:

✅ Parse and lint GitHub Actions YAML files
✅ Enforce best practices through policy-as-code
✅ Integrate directly into GitHub pull requests
✅ Be testable, reusable, and maintainable

🛠️ The Stack

Tool	Role
Python	Custom YAML parsing and static linting
OPA (Open Policy Agent)	Policy engine for security rules
Rego	DSL for writing structured policy logic
GitHub Actions	Runs the validator in CI pipelines
pytest	Python unit test framework
opa test	Policy test framework

🗂️ Folder Structure

Here's a simplified view of how the repo is organized:

ci-cd-linter-validator-scanner/
├── scanner/                     # Python CLI logic
│   ├── linter.py                # YAML anti-pattern checker
│   ├── opa_runner.py           # Invokes OPA
│   └── utils.py                # Common helpers
│
├── policies/                   # Rego rule sets
│   ├── base/                   # Best-practice policies
│   ├── strict/                 # Org-enforced rules
│   └── custom/                 # Add-your-own policies
│
├── examples/                   # Good and bad YAMLs
├── tests/                      # pytest + opa test files
├── .github/workflows/          # GitHub Actions for validation
├── action.yml                  # Reusable GitHub Action interface
└── README.md

🧪 Writing the Python Linter

The Python linter uses PyYAML to parse the workflow YAML and then checks for:

@latest or branch-based actions (uses: actions/checkout@master)
Missing permissions: at the job level
Insecure environment variable patterns
Use of deprecated GitHub Action commands

Example check for @latest:

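def find_issues(yaml_data):
    """Return one message per step that references an action by a mutable ref."""
    issues = []
    # `jobs` may be missing or empty in a malformed workflow, so fall back to {}.
    jobs = (yaml_data or {}).get("jobs") or {}
    for job_name, job in jobs.items():
        for step in job.get("steps") or []:
            uses = step.get("uses", "")
            # "@latest" and branch refs such as "@master" can change without notice.
            if uses.endswith(("@latest", "@master")):
                issues.append(f"Step '{uses}' in job '{job_name}' uses an unpinned action version.")
    return issues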
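
The other checks in the list above can follow the same shape. The block below is only a sketch of how they might look, not the project's actual code: the function names, the message wording, and the `SECRET_NAME` heuristic are all assumptions made for illustration.

```python
import re

# Workflow commands that GitHub deprecated in 2020.
DEPRECATED_COMMANDS = ("::set-env", "::add-path")
# Hypothetical heuristic: env var names that usually hold credentials.
SECRET_NAME = re.compile(r"SECRET|TOKEN|PASSWORD|API_?KEY", re.IGNORECASE)


def find_missing_permissions(workflow):
    """Flag jobs without `permissions:` when the workflow sets none either."""
    if "permissions" in workflow:
        return []  # a workflow-level block applies to every job
    return [
        f"Job '{name}' does not declare permissions."
        for name, job in (workflow.get("jobs") or {}).items()
        if "permissions" not in job
    ]


def find_deprecated_commands(workflow):
    """Flag `run:` scripts that contain a deprecated workflow command."""
    issues = []
    for name, job in (workflow.get("jobs") or {}).items():
        for step in job.get("steps") or []:
            script = step.get("run", "")
            for command in DEPRECATED_COMMANDS:
                if command in script:
                    issues.append(f"Job '{name}' uses deprecated command '{command}'.")
    return issues


def find_hardcoded_secrets(workflow):
    """Flag job-level env vars with secret-looking names and literal values."""
    issues = []
    for name, job in (workflow.get("jobs") or {}).items():
        for key, value in (job.get("env") or {}).items():
            # A value built from an expression such as ${{ secrets.X }} is fine.
            literal = isinstance(value, str) and "${{" not in value
            if literal and SECRET_NAME.search(key):
                issues.append(f"Job '{name}' hardcodes a value for env var '{key}'.")
    return issues
```

Each function takes the parsed workflow dictionary and returns a list of messages, so the results can simply be concatenated with the output of find_issues.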

🛡️ Defining Policy Rules in Rego

OPA policies let you formalize your org’s security posture. Each rule looks for a specific violation and adds a message to the deny set (deny[msg]).

Example Rego rule (unversioned actions):

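package ci_cdpipeline

# Flag any step whose `uses` reference has no "@ref" suffix at all.
# Local ("./...") references also lack "@" and would match this rule.
# Written in pre-1.0 Rego syntax; OPA 1.0+ expects `deny contains msg if { ... }`.
deny[msg] {
  some job
  step := input.jobs[job].steps[_]
  step.uses
  not contains(step.uses, "@")
  msg := sprintf("Action '%s' in job '%s' is not version pinned.", [step.uses, job])
}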

OPA is invoked from Python using subprocess and passed the YAML as input. The result? You get flexible policy enforcement layered over your linter.
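
As a rough illustration of that subprocess call, here is a minimal sketch. It assumes the `opa` binary is on the PATH and that the policies live in the `ci_cdpipeline` package shown above. The helper names (`build_opa_command`, `extract_denials`, `run_opa`) are hypothetical and not taken from the repo's opa_runner.py.

```python
import json
import subprocess


def build_opa_command(policy_dir, query="data.ci_cdpipeline.deny"):
    """Build an `opa eval` call that reads the input document from stdin."""
    return [
        "opa", "eval",
        "--format", "json",
        "--data", policy_dir,
        "--stdin-input",
        query,
    ]


def extract_denials(opa_output):
    """Collect deny messages from the JSON printed by `opa eval`."""
    document = json.loads(opa_output)
    messages = []
    # An undefined query produces no "result" key, hence the defaults.
    for result in document.get("result", []):
        for expression in result.get("expressions", []):
            value = expression.get("value")
            if isinstance(value, list):  # the deny set is serialized as an array
                messages.extend(value)
    return sorted(messages)


def run_opa(workflow, policy_dir):
    """Evaluate the policies against an already-parsed workflow dictionary."""
    completed = subprocess.run(
        build_opa_command(policy_dir),
        input=json.dumps(workflow),  # the parsed YAML is handed to OPA as JSON
        capture_output=True,
        text=True,
        check=True,  # raise if OPA itself fails, e.g. on a policy syntax error
    )
    return extract_denials(completed.stdout)
```

Keeping the command construction and the output parsing in separate functions means both can be unit tested without OPA installed. Only `run_opa` needs the real binary.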

🔁 Using It as a GitHub Action

A major goal was to make this reusable as a drop-in GitHub Action:

- uses: amankc-neo/ci-cd-linter-validator-scanner@v1
  with:
    target_file: ".github/workflows/deploy.yml"
    strict_mode: "true"

This works by referencing the action.yml definition:

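inputs:
  target_file:
    description: "Path to YAML to scan"
    required: true

runs:
  using: "composite"
  steps:
    # Composite `run` steps must declare a shell. The input is passed through an
    # environment variable so its value is never interpolated into the script.
    - run: python "$GITHUB_ACTION_PATH/scanner/linter.py" "$TARGET_FILE"
      shell: bash
      env:
        TARGET_FILE: ${{ inputs.target_file }}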
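
For the run step above to fail a pull request, scanner/linter.py has to behave like a CLI that exits non-zero when it finds something. A minimal sketch of such an entry point follows. The names `load_workflow` and `main`, the exit codes, and the output format are assumptions for illustration. PyYAML is imported lazily so the rest of the module can be exercised without it.

```python
import sys


def load_workflow(path):
    """Parse a workflow file with PyYAML's safe loader."""
    import yaml  # third-party: PyYAML

    with open(path, encoding="utf-8") as handle:
        return yaml.safe_load(handle) or {}


def main(argv, checks, loader=load_workflow):
    """Run every check on the workflow named in argv and return an exit code."""
    if len(argv) != 2:
        print(f"usage: {argv[0]} <workflow.yml>", file=sys.stderr)
        return 2
    workflow = loader(argv[1])
    # Each check is a callable that takes the parsed workflow and returns messages.
    issues = [issue for check in checks for issue in check(workflow)]
    for issue in issues:
        print(f"{argv[1]}: {issue}")
    return 1 if issues else 0


# Typical wiring at the bottom of the script (find_issues is the check shown earlier):
#     if __name__ == "__main__":
#         sys.exit(main(sys.argv, [find_issues]))
```

One thing to watch when parsing workflows this way: PyYAML follows YAML 1.1, so an unquoted on: key is loaded as the boolean True rather than the string "on".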

You can even integrate this into your organization’s CI pipeline templates or GitHub App logic later on.

🧪 Testing the Rules

Tests were essential for building trust in this tool:

✅ Python tests via pytest
✅ Rego rule tests via opa test
✅ CI validation with real bad/good YAMLs

Sample Python test:

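from scanner.linter import find_issues


def test_latest_tag_violation():
    # A single step that references an action by the mutable "latest" tag.
    data = {
        "jobs": {
            "build": {
                "steps": [{"uses": "docker/build-push-action@latest"}]
            }
        }
    }
    issues = find_issues(data)
    assert any("unpinned" in i for i in issues)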

Sample Rego policy test:

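package ci_cdpipeline

# `input` cannot be reassigned inside a rule body, so the test document is
# supplied with `with input as`. The action has no "@ref", which is what the
# deny rule checks for.
test_unpinned_action_violation {
  expected := "Action 'actions/setup-node' in job 'build' is not version pinned."
  deny[expected] with input as {
    "jobs": {
      "build": {
        "steps": [
          {"uses": "actions/setup-node"}
        ]
      }
    }
  }
}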

🔮 Future Roadmap

Here's where I see this project going:

🔄 scan_all flag to lint all workflows at once
🧩 Plugin model to support GitLab CI, Bitbucket Pipelines
📊 GitHub App or CLI tool for rich reporting
🔐 Secrets detection via entropy analysis
🎯 Rule categories: security, performance, maintainability
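
The entropy-based secrets detection on this list could start from something like the sketch below. It computes Shannon entropy over the characters of a value and flags long, high-entropy strings. The function names, the minimum length, and the 4.0-bit threshold are assumptions chosen for illustration and would need tuning against real workflows.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(value, min_length=20, threshold=4.0):
    """Heuristic: long strings with near-random character use are suspicious."""
    return len(value) >= min_length and shannon_entropy(value) >= threshold
```

Entropy alone produces false positives (hashes, UUIDs) and misses short or patterned secrets, so it would likely work best alongside name-based checks rather than as a replacement for them.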

📎 GitHub Repo

🔗 https://github.com/amankc-neo/ci-cd-linter-validator-scanner

💬 Final Thoughts

This project was a deep dive into DevSecOps, GitHub Actions internals, and policy-as-code tooling. It’s also a reminder that even in the CI layer, we need structure, repeatability, and enforcement.

Security and quality should never be an afterthought — especially not in the pipelines that ship your code.

Whether you're a solo developer or running a platform team, I hope this tool inspires you to bring the same level of rigor to your pipelines that you bring to your codebase.

If you’ve worked on anything similar, or have rules you’d love to see included — drop a comment or raise a PR.

Thanks for reading 🙏

How I Built a CI/CD Linter with Python, OPA, and Rego to Secure GitHub Workflows