Reclaiming CSPM: How I Learned to Stop Worrying and Query the Cloud

Matt Brown
13 min read

In Part 1, we kept things simple. We used Steampipe with AWS to ask basic questions like “Is this bucket public?” or “Does Matt have MFA?”, producing meaningful misconfig detections without resorting to CLI hackery.

Simple checks only get you so far. The real concern I see in cloud security isn't just whether a thing is misconfigured; it's what else that thing is connected to. It's about seeing the bigger picture. The basics are important, but the next level is both more important and more fun, especially DIY. For any given resource, you want to know:

  • Who owns it?
  • What policies are attached?
  • Is it exposed in more than one way?

This is where CSPM gets interesting and where Steampipe’s full power starts to shine.

In this post, we'll go beyond standalone checks and start building context by joining tables together. Then we'll move into Powerpipe, which unlocks even more coolness. Yes, coolness.

Let’s get to it. And before you ask, dashboards and reporting and those cool things are still to come.


What Makes These Queries “Advanced”?

Let’s be real: these questions aren't particularly difficult linguistically.

  • Are any Lambda functions using admin roles?

  • Which EC2 instances sit behind public load balancers?

These are reasonable questions you'd expect to ask when evaluating your cloud security posture. The problem is that answering them isn't straightforward, for a variety of reasons. In the examples below we'll hit one of them: AWS doesn't always expose these relationships cleanly.

You end up doing things like:

  • Joining across multiple services (Lambda → IAM → policy)

  • Unpacking nested JSON blobs (hello target_health_descriptions)

  • Filtering arrays inside your WHERE clause using jsonb sorcery (a bit of the fun of Steampipe)

This isn't just about SQL exploration. It's about creating structure from disorganization: building a poor man's non-graph knowledge graph of basic connections between IAM roles, policies, resources, and exposure paths.

The difference between basic and advanced here isn't difficulty; it's intent. These queries model relationships that matter from a security perspective. They help uncover weak spots that a flat inventory never will, and they're just way more fun.

And that's the kind of “CSPM” we're after once we move past the compliance-type checks (which are important!).

Disclaimer: Some of these queries were co-written by GPT, because past-me didn’t major in SQL and present-me is not equipped to major in anything beyond Python and Stack Overflow.


Taking Free to the Limits

We'll run through three advanced queries.

Example 1: Lambda Functions with Overly-Permissive IAM Roles

Let’s kick things off with a classic — runtime privilege risk.

Here’s the query:

select
  lf.name as lambda_name,
  r.name as role_name,
  r.attached_policy_arns
from
  aws_lambda_function lf
  join aws_iam_role r on lf.role = r.arn
where
  r.attached_policy_arns @> '["arn:aws:iam::aws:policy/AdministratorAccess"]'::jsonb;

You should see a result like:

+----------------+-----------------+-------------------------------------------------+
| lambda_name    | role_name       | attached_policy_arns                            |
+----------------+-----------------+-------------------------------------------------+
| OverPrivLambda | LambdaAdminRole | ["arn:aws:iam::aws:policy/AdministratorAccess"] |
+----------------+-----------------+-------------------------------------------------+
What this query is doing (in plain English):
  • Find every Lambda function in your AWS account

  • Find the IAM role it uses

  • Check whether that role has "arn:aws:iam::aws:policy/AdministratorAccess" attached, i.e., unrestricted access

  • If so, show: Lambda function name, Role name, Policy

This doesn't catch every risky role or wildcard-filled policy, but AdministratorAccess is basically "Action": "*".

Simple as you like it.
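If you want to cast a wider net than that one managed policy, here's a sketch (mine, not from an official mod) that flags Lambda roles attached to any policy allowing * on *. It leans on the policy_std column the Steampipe AWS plugin exposes for normalized policy documents:

-- Sketch: Lambda roles attached to ANY policy that allows "*" on "*",
-- not just AdministratorAccess. Assumes aws_iam_policy.policy_std, the
-- normalized policy JSON from the Steampipe AWS plugin.
select distinct
  lf.name as lambda_name,
  r.name as role_name,
  p.name as policy_name
from
  aws_lambda_function lf
  join aws_iam_role r on lf.role = r.arn
  cross join lateral jsonb_array_elements_text(r.attached_policy_arns) as pol(arn)
  join aws_iam_policy p on p.arn = pol.arn
  cross join lateral jsonb_array_elements(p.policy_std -> 'Statement') as s(stmt)
  cross join lateral jsonb_array_elements_text(s.stmt -> 'Action') as a(action)
  cross join lateral jsonb_array_elements_text(s.stmt -> 'Resource') as res(resource)
where
  s.stmt ->> 'Effect' = 'Allow'
  and a.action in ('*', '*:*')
  and res.resource = '*';

Same shape as before, just unpacking the policy document instead of matching a known ARN.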

Example 2: Public-Facing ALBs with Direct EC2 Targets

Let’s go one level deeper and ask: Which of my publicly exposed Application Load Balancers (ALBs) directly forward traffic to EC2 instances?

That’s not necessarily bad — but if those instances aren’t meant to be exposed, that’s an easy target.

Here’s the query:

select
  alb.name as alb_name,
  alb.dns_name,
  ec2.instance_id,
  ec2.public_ip_address
from
  aws_ec2_application_load_balancer alb
  join aws_ec2_target_group tg
    on tg.load_balancer_arns @> format('["%s"]', alb.arn)::jsonb
  cross join lateral jsonb_array_elements(tg.target_health_descriptions) as thd(target)
  join aws_ec2_instance ec2
    on ec2.instance_id = thd.target -> 'Target' ->> 'Id'
where
  alb.scheme = 'internet-facing';

You should see a result like:

+----------+------------+---------------+-------------------+
| alb_name | dns_name   | instance_id   | public_ip_address |
+----------+------------+---------------+-------------------+
| test-alb | <dns_name> | <instance_id> | <public_ip>       |
+----------+------------+---------------+-------------------+
What this query is doing (in plain English):
  • It finds any ALB that is internet-facing.

  • It looks for target groups that reference that ALB.

  • From there, it parses the target instance IDs listed in the target health descriptions.

  • Then it joins back to the EC2 instance metadata to fetch the instance_id and public_ip_address.

An internet-facing ALB can of course be problematic, especially if you have WAF requirements that aren't being met. We're not checking that here… but it'd be a smart next step.
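Another cheap next step, different from the WAF angle: check what those instances accept directly. Here's a sketch flagging instances whose security groups allow inbound traffic from anywhere. The aws_vpc_security_group_rule columns mirror the EC2 DescribeSecurityGroupRules API, but verify them against your plugin version:

-- Sketch: instances with a security group rule open to the world.
select
  ec2.instance_id,
  s.grp ->> 'GroupId' as group_id,
  r.ip_protocol,
  r.from_port,
  r.to_port
from
  aws_ec2_instance ec2
  -- security_groups is a jsonb array of {GroupId, GroupName} objects
  cross join lateral jsonb_array_elements(ec2.security_groups) as s(grp)
  join aws_vpc_security_group_rule r on r.group_id = s.grp ->> 'GroupId'
where
  not r.is_egress
  and (r.cidr_ipv4 = '0.0.0.0/0' or r.cidr_ipv6 = '::/0');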

Example 3: EBS Volumes Without Recent Snapshots

You can't restore what you never saved. Ugh, all those pictures? Okay, losing those could be a good thing or a bad thing. In AWS, though, it's an operational gap: forgetting to back up critical volumes. This query surfaces every EBS volume attached to an EC2 instance that hasn't had a snapshot taken in the last 30 days, or ever. That makes it very easy to test: just create an EC2 instance with an EBS volume and do nothing. Or attach two EBS volumes and snapshot one in the console to see the difference.

Here’s the query:

with latest_snapshots as (
  select
    volume_id,
    max(start_time) as last_snapshot
  from
    aws_ebs_snapshot
  group by
    volume_id
)
select
  i.instance_id,
  i.tags,
  v.volume_id,
  ls.last_snapshot
from
  aws_ebs_volume v
  join lateral jsonb_array_elements(v.attachments) as a(attachment) on true
  join aws_ec2_instance i on i.instance_id = a.attachment->>'InstanceId'
  left join latest_snapshots ls on v.volume_id = ls.volume_id
where
  ls.last_snapshot is null
  or ls.last_snapshot < current_date - interval '30 days'
order by
  ls.last_snapshot nulls first;

You should see a result like:

+---------------------+------------------------+-----------------------+---------------+
| instance_id         | tags                   | volume_id             | last_snapshot |
+---------------------+------------------------+-----------------------+---------------+
| i-11118578e57a79e24 | {"Name":"snapshot-no"} | vol-1111f5bdce1319421 | <null>        |
| i-11118578e57a79e24 | {"Name":"snapshot-no"} | vol-111188227733b4e41 | <null>        |
| i-11118578e57a79e24 | {"Name":"snapshot-no"} | vol-111198b4100e0cde6 | <null>        |
+---------------------+------------------------+-----------------------+---------------+
What this query is doing (in plain English):
  • It starts with all EBS volumes that are currently attached to EC2 instances.

  • It finds the most recent snapshot per volume, if one even exists.

  • It filters for volumes where the latest snapshot is older than 30 days (or missing entirely).

  • It shows one row per volume.

Snapshot hygiene might not matter for you, but if it does, this is one way to automate checking it.
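And if it only matters for some workloads, you can scope the check. A variation, assuming a hypothetical env=prod tagging convention (the tag key and value are mine; adjust to your scheme):

-- Same check, scoped to instances tagged env=prod.
-- The "env" tag is an assumption, not a convention from this post.
with latest_snapshots as (
  select volume_id, max(start_time) as last_snapshot
  from aws_ebs_snapshot
  group by volume_id
)
select
  i.instance_id,
  v.volume_id,
  ls.last_snapshot
from
  aws_ebs_volume v
  join lateral jsonb_array_elements(v.attachments) as a(attachment) on true
  join aws_ec2_instance i on i.instance_id = a.attachment ->> 'InstanceId'
  left join latest_snapshots ls on v.volume_id = ls.volume_id
where
  i.tags ->> 'env' = 'prod'
  and (ls.last_snapshot is null
       or ls.last_snapshot < current_date - interval '30 days');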


Using Steampipe Queries Effectively

So far, if you've followed along, you've been running everything as ad hoc queries. That's great for experimenting, but the queries we've built are too useful to leave in a scrollback buffer. Good luck digging through history in steampipe query.

To get real value from Steampipe, you need structure. Start by saving your queries as .sql files. It’s simple, but it’s the first step toward repeatability, sharing, and automation.

Let's say you saved the check for EBS volumes without recent snapshots as ebs_volumes_without_recent_snapshots.sql. Then running it is just:

steampipe query ebs_volumes_without_recent_snapshots.sql

All of the examples in this post could easily live in .sql files that are version-controlled, reviewed, and shared. That lets you write and refine your more advanced checks in a manner you're already familiar with.

A few advantages:

  • Track query changes over time

  • Share detections across teams or repos

  • Plug them into scheduled jobs
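To make that last point concrete, here's a sketch of a scheduled job. The paths and schedule are made up, but steampipe query does accept a file and an --output flag:

# Hypothetical cron entry: run the saved check nightly and keep a JSON artifact.
# Paths are placeholders; --output also supports csv and table.
0 2 * * * cd /opt/cspm-checks && steampipe query ebs_volumes_without_recent_snapshots.sql --output json > /var/log/cspm/ebs_snapshots_$(date +\%F).json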

OK, on to how we can really level up with Powerpipe.


Introducing Powerpipe: Dashboards, Benchmarks, and Custom Controls

Powerpipe makes it easy to turn your Steampipe setup into a full-fledged CSPM engine — complete with dashboards, compliance benchmarks, and custom controls.

Once Steampipe is configured, running Powerpipe mods is quite easy. The Powerpipe Hub has over 65 mods you can plug in, covering four major categories:

  • Compliance

  • Detections

  • Cost Management

  • Asset Inventory & Insights

Some categories (like Detections) use tools like Tailpipe to analyze VPC Flow Logs or CloudTrail events. That’s more cloud threat detection than CSPM — interesting stuff, but outside the scope of this post. For now, we’ll focus on Compliance in AWS.

The AWS Compliance mod gives you a rich set of benchmarks right out of the box — SOC 2, NIST CSF, CIS, and others — all powered by Steampipe queries under the hood. And it just takes a couple commands to get up and running.

But here’s what makes this powerful: you can create your own controls and dashboards using the same .sql queries we wrote earlier. That means we can take our “EBS volumes without recent snapshots” query and turn it into a proper control with scoring, summaries, and status views.
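To preview where we're headed, here's a sketch of what that control could look like in a mod. The control name, title, and severity are mine, not from an official mod; resource, status, and reason are the columns a Powerpipe control expects:

# Sketch of a custom control built from our earlier query.
# (Checks all volumes, attached or not, to keep it short.)
control "ebs_volume_recent_snapshot" {
  title    = "EBS volumes should have a snapshot from the last 30 days"
  severity = "medium"
  sql      = <<-EOQ
    with latest_snapshots as (
      select volume_id, max(start_time) as last_snapshot
      from aws_ebs_snapshot
      group by volume_id
    )
    select
      v.arn as resource,
      case
        when ls.last_snapshot is null
          or ls.last_snapshot < current_date - interval '30 days' then 'alarm'
        else 'ok'
      end as status,
      case
        when ls.last_snapshot is null then v.volume_id || ' has never been snapshotted.'
        when ls.last_snapshot < current_date - interval '30 days'
          then v.volume_id || ' last snapshot on ' || ls.last_snapshot::date || '.'
        else v.volume_id || ' snapshotted recently.'
      end as reason,
      v.region
    from
      aws_ebs_volume v
      left join latest_snapshots ls on v.volume_id = ls.volume_id;
  EOQ
}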


Setting Up Powerpipe

Dead simple setup.

Step 1: Install Powerpipe

As is usual, if you’re on macOS and use Homebrew:

brew install powerpipe

Well, that was exactly the same as Steampipe, just a new type of pipe...

If you're not using macOS, just follow the appropriate instructions.

Step 2: Add AWS Compliance Mod

With Powerpipe installed, you can now add the AWS Compliance mod, which pairs well with the AWS Steampipe queries we've been writing:

powerpipe mod init
powerpipe mod install github.com/turbot/steampipe-mod-aws-compliance

You should now see a mod.pp file. Mine looks like this; yours will differ if you've added other mods (or none at all):

matt.brown@matt steampipe % cat mod.pp
mod "local" {
  title = "steampipe"
  require {
    mod "github.com/turbot/steampipe-mod-aws-compliance" {
      version = "*"
    }
    mod "github.com/turbot/steampipe-mod-aws-well-architected" {
      version = "*"
    }
  }
}

Step 3: Spin Up the Local Server

It is very easy to run this locally. Just run the following:

powerpipe server

And you'll see something like this:

[ Wait    ] Starting WorkspaceEvents Server
[ Message ] WorkspaceEvents loaded
[ Message ] Initialization complete
[ Ready   ] Dashboard server started on 9033 and listening on local
[ Message ] Visit http://localhost:9033
[ Message ] Press Ctrl+C to exit

If you never ran the mod init, you'll see:

matt.brown@matt steampipe % powerpipe server
Error: This command requires a mod definition file (mod.pp) - could not find in the current directory tree.

You can either clone a mod repository or install a mod using powerpipe mod install and run this command from the cloned/installed mod directory.
Please refer to: https://powerpipe.io/docs/build#powerpipe-mods

Then just hit http://localhost:9033 and you should see the AWS dashboard.

Cool, you get CSPM out-of-the-box for AWS. Obviously you could do the same for other cloud providers.
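For example, the Hub has equivalent compliance mods for other providers. Assuming the matching Steampipe plugins are configured, these should drop in the same way:

powerpipe mod install github.com/turbot/steampipe-mod-azure-compliance
powerpipe mod install github.com/turbot/steampipe-mod-gcp-compliance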


Exploring the AWS Compliance Dashboard

With the Powerpipe server running and the AWS Compliance Mod loaded, you can now see how well your environment stacks up against various standards. One I like in particular is AWS Foundational Security Best Practices — it’s a solid baseline that blends well with both CIS and real-world operational needs.
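You don't even need the dashboard for this part; benchmarks run straight from the terminal too. A quick sketch (the benchmark name below is my best guess for the Foundational Security benchmark; powerpipe benchmark list shows what your mod version actually ships):

# List available benchmarks, then run one headless.
powerpipe benchmark list
powerpipe benchmark run aws_compliance.benchmark.foundational_security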

Here's an example control based on a previously ad hoc query: the check for unencrypted EBS volumes.

You can explore the structure of these controls in the steampipe-mod-aws-compliance GitHub repo. Each control maps to a specific query; after some digging, you can find both the EBS volume encryption control definition and the actual query behind it in that repo.

Running the underlying query yourself should give you the same result. It took me a while to dig this deep, but it's a nice setup.

select
  arn as resource,
  case
    when state != 'in-use' then 'skip'
    when encrypted then 'ok'
    else 'alarm'
  end as status,
  case
    when state != 'in-use' then volume_id || ' not attached.'
    when encrypted then volume_id || ' encrypted.'
    else volume_id || ' not encrypted.'
  end as reason,
  region,
  volume_id,
  state,
  encrypted,
  title
from
  aws_ebs_volume;

+-----------------------------------------------------------------+--------+-------------------------------------+-----------+-----------------------+-----------+-----------+-----------------------+
| resource                                                        | status | reason                              | region    | volume_id             | state     | encrypted | title                 |
+-----------------------------------------------------------------+--------+-------------------------------------+-----------+-----------------------+-----------+-----------+-----------------------+
| arn:aws:ec2:us-west-2:266735822823:volume/vol-061698b4100e0cde6 | skip   | vol-061698b4100e0cde6 not attached. | us-west-2 | vol-061698b4100e0cde6 | available | false     | vol-061698b4100e0cde6 |
| arn:aws:ec2:us-west-2:266735822823:volume/vol-06daf5bdce1319421 | skip   | vol-06daf5bdce1319421 not attached. | us-west-2 | vol-06daf5bdce1319421 | available | false     | vol-06daf5bdce1319421 |
| arn:aws:ec2:us-west-2:266735822823:volume/vol-0798d9a6b06fab049 | skip   | vol-0798d9a6b06fab049 not attached. | us-west-2 | vol-0798d9a6b06fab049 | available | false     | vol-0798d9a6b06fab049 |
+-----------------------------------------------------------------+--------+-------------------------------------+-----------+-----------------------+-----------+-----------+-----------------------+

One thing to note: there's no obvious way to tweak the default dashboards inside the Powerpipe UI. Your best bet for customization is to fork the mod and adjust the queries or control definitions directly; changing severities could be a useful exercise. Overall, though, I can't think of a strong reason to change the defaults. Just create a new dashboard and copy what you need!


Creating a Custom Dashboard

You can view the official instructions in the Powerpipe docs. Bear with me, this dashboard is quite horrendous. It has a single control for public S3 buckets.

Step 1: Create Your New Mod

Just jump into a new folder and run your usual init:

powerpipe mod init

Open the mod.pp file and add the query like so:

mod "local" {
  title = "custom_dash"
}

query "aws_ebs_not_encrypted" {
  sql = <<-EOQ
    select name from aws_s3_bucket where bucket_policy_is_public=true or block_public_acls=false;
    EOQ
  }

Step 2: Create Your Dashboard

Just create a dashboard file like customdash.pp in the same directory. Then fill it in as follows; it's an MVP that plays more like a bench player.

dashboard "custom_dash" {
  title = "Basically a Lame Custom Dashboard"

  card {
    sql = query.aws_s3_public_buckets.sql
  }
}
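If you want it slightly less lame: a card renders a single value, so a count card plus a table of the offending buckets reads better. A sketch (the v2 dashboard name and layout are mine):

# Sketch: a count card plus a full table of offending buckets.
dashboard "custom_dash_v2" {
  title = "Slightly Less Lame Custom Dashboard"

  card {
    width = 3
    sql   = "select count(*) as \"Public Buckets\" from aws_s3_bucket where bucket_policy_is_public = true or block_public_acls = false;"
  }

  table {
    query = query.aws_s3_public_buckets
  }
}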

Step 3: Fire Up the Server

As usual, run powerpipe server, making sure you're in the same folder:

powerpipe server

And you'll see your new dashboard at http://localhost:9033.

One thing that is cool: you get easy reloads, sorta like a nodemon-style server. Change the dashboard code, save, reload, and bam, your changes are live. I found this out while everything was broken, but eventually it all got ironed out.

Taking the query from Part 1 and turning it into a dashboard is quite an accomplishment, if you ask me. This was a lot of work, but it's cool to get this customization along with the out-of-the-box goodness.


Wrap-Up: Hella Cool! (Yes I did it)

That’s a wrap — for now — on Steampipe and Powerpipe.

Originally, I set out to write about just Steampipe. But the more I used it, the more it became clear: if you’re not using both Steampipe and Powerpipe, you’re only getting half the CSPM experience. So yeah, excuse the misleading intro.

This all started as a bit of a shot in the dark. I went looking for open source CSPM tools and stumbled onto Steampipe. And honestly? I think it’s one of the most genuinely useful open source tools out there. No “Free” plan that’s secretly a SaaS sales funnel. No sign-up form. Just raw access and real control.

It's intuitive to set up, surprisingly easy to use (because it's just SQL), and provides serious value from day one. Whether you're inventorying resources, checking for risk, or building out your own controls, it just works.

I’ll almost certainly be revisiting this — whether it’s digging into Turbot’s other tools like Tailpipe for threat detection, exploring how well this works for KSPM, or lining it up against other CSPM tooling in the wild.

The only real challenge I ran into was understanding what exactly all these pipes were. I don’t know much about steampipes or powerpipes. So I spent a good chunk of a road trip trying to map it to plumbing I actually understood, which, as you can imagine, goes in a lot of directions.

Here’s the best I’ve got: imagine a four-story triplex. You’ve got one main pipe feeding three units, each with its own water heater. That breaks out into hot and cold pipes that run up to all the apartments — plus some extra lines for hoses, outdoor faucets, and common areas. All those branch even further: showers, toilets, sinks, you name it.

Now try to debug why someone’s shower doesn’t get hot water.

That’s about what this feels like — except the plumbing is cloud APIs, the water pressure is AWS rate limits, and your wrench is a SQL query.

Nevermind.


Written by

Matt Brown

Working as a solutions architect while going deep on Kubernetes security — prevention-first thinking, open source tooling, and a daily rabbit hole of hands-on learning. I make the mistakes, then figure out how to fix them (eventually).