Reclaiming CSPM: How I Learned to Stop Worrying and Query the Cloud


In Part 1, we kept things simple. We used Steampipe with AWS to ask basic questions like “Is this bucket public?” or “Does Matt have MFA?”, producing meaningful misconfiguration detections without resorting to CLI hackery.
Simple checks only get you so far, though. The real concern I see in cloud security isn’t just whether a thing is misconfigured — it’s what else that thing is connected to. It is looking at the bigger picture. The basics are important, but the next level is both more important and more fun, especially DIY.
- Who owns it?
- What policies are attached?
- Is it exposed in more than one way?
This is where CSPM gets interesting and where Steampipe’s full power starts to shine.
In this post, we’ll go beyond standalone checks and start building context by joining tables together. Then we'll move into Powerpipe, which will let us get into more coolness. Yes, coolness.
Let’s get to it. And before you ask, dashboards and reporting and those cool things are still to come.
What Makes These Queries “Advanced”?
Let’s be real: these questions aren't particularly difficult linguistically.
Are any Lambda functions using admin roles?
Which EC2 instances sit behind public load balancers?
These are reasonable questions you’d expect to ask when evaluating your cloud security posture. The problem is that answering them isn’t straightforward, for a variety of reasons. In the examples below, we’ll hit one of the big ones: AWS doesn’t always expose these relationships cleanly.
You end up doing things like:
- Joining across multiple services (Lambda → IAM → policy)
- Unpacking nested JSON blobs (hello `target_health_descriptions`)
- Filtering arrays inside your WHERE clause using `jsonb` sorcery (a bit of the fun of Steampipe)
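That `jsonb` containment sorcery boils down to a simple idea: does this JSON array contain that value? A minimal Python sketch of what Postgres's `@>` operator checks, using made-up policy data:

```python
import json

# Hypothetical attached_policy_arns column value, as Steampipe would return it (a JSON array).
attached_policy_arns = json.loads(
    '["arn:aws:iam::aws:policy/AdministratorAccess",'
    ' "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"]'
)

ADMIN_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"

def contains_policy(arns: list, target: str) -> bool:
    # Rough equivalent of: attached_policy_arns @> '["...AdministratorAccess"]'::jsonb
    return target in arns

print(contains_policy(attached_policy_arns, ADMIN_ARN))  # True
```

Postgres's containment operator is more general (it works on objects and nested structures too), but for flat arrays of strings this is the whole trick.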
This isn’t just about SQL exploration. It’s about creating structure from disorganization — building a poor man's non-graph knowledge graph. We’re building basic connections between IAM roles, policies, resources, and exposure paths.
The difference between basic and advanced here isn’t difficulty — it’s intent. These queries model relationships that matter from a security perspective. They help uncover weak spots that a flat inventory never will and are just way more fun.
And that’s the kind of “CSPM” we’re after once we move past the compliance-type checks (which are important!).
Disclaimer: Some of these queries were co-written by GPT, because past-me didn’t major in SQL and present-me is not equipped to major in anything beyond Python and Stack Overflow.
Taking free to the limits
We'll run through three advanced queries.
Example 1: Lambda Functions with Overly-Permissive IAM Roles
Let’s kick things off with a classic — runtime privilege risk.
Here’s the query:
select
  lf.name as lambda_name,
  r.name as role_name,
  r.attached_policy_arns
from
  aws_lambda_function lf
  join aws_iam_role r on lf.role = r.arn
where
  r.attached_policy_arns @> '["arn:aws:iam::aws:policy/AdministratorAccess"]'::jsonb;
You should see a result like:
+----------------+-----------------+-------------------------------------------------+
| lambda_name | role_name | attached_policy_arns |
+----------------+-----------------+-------------------------------------------------+
| OverPrivLambda | LambdaAdminRole | ["arn:aws:iam::aws:policy/AdministratorAccess"] |
+----------------+-----------------+-------------------------------------------------+
What this query is doing (in plain English):
- Find every Lambda function in your AWS account
- Find the IAM role it uses
- Check whether that role has `arn:aws:iam::aws:policy/AdministratorAccess` attached (unrestricted access)
- If so, show the Lambda function name, role name, and attached policies
This doesn’t catch every risky role or wildcard-filled policy — but AdministratorAccess is basically `"Action": "*"`.
Simple as you like it.
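If you want to sanity-check the join logic outside SQL, here's a rough Python equivalent over hypothetical inventory data (all names and ARNs below are made up for illustration):

```python
# Hypothetical flattened inventory, mirroring the aws_lambda_function and aws_iam_role tables.
lambda_functions = [
    {"name": "OverPrivLambda", "role": "arn:aws:iam::123456789012:role/LambdaAdminRole"},
    {"name": "SafeLambda", "role": "arn:aws:iam::123456789012:role/ReadOnlyRole"},
]
iam_roles = [
    {"arn": "arn:aws:iam::123456789012:role/LambdaAdminRole",
     "name": "LambdaAdminRole",
     "attached_policy_arns": ["arn:aws:iam::aws:policy/AdministratorAccess"]},
    {"arn": "arn:aws:iam::123456789012:role/ReadOnlyRole",
     "name": "ReadOnlyRole",
     "attached_policy_arns": ["arn:aws:iam::aws:policy/ReadOnlyAccess"]},
]

ADMIN = "arn:aws:iam::aws:policy/AdministratorAccess"

def admin_lambdas(functions, roles):
    # Join each function to its role on the role ARN, then filter on attached policies,
    # just like the SQL join + @> containment check.
    roles_by_arn = {r["arn"]: r for r in roles}
    return [
        (f["name"], roles_by_arn[f["role"]]["name"])
        for f in functions
        if ADMIN in roles_by_arn.get(f["role"], {}).get("attached_policy_arns", [])
    ]

print(admin_lambdas(lambda_functions, iam_roles))  # [('OverPrivLambda', 'LambdaAdminRole')]
```

Same shape as the SQL: a dictionary lookup plays the part of the join, and the membership test plays the part of `@>`.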
Example 2: Public-Facing ALBs with Direct EC2 Targets
Let’s go one level deeper and ask: Which of my publicly exposed Application Load Balancers (ALBs) directly forward traffic to EC2 instances?
That’s not necessarily bad — but if those instances aren’t meant to be exposed, that’s an easy target.
Here’s the query:
select
  alb.name as alb_name,
  alb.dns_name,
  ec2.instance_id,
  ec2.public_ip_address
from
  aws_ec2_application_load_balancer alb
  join aws_ec2_target_group tg
    on tg.load_balancer_arns @> format('["%s"]', alb.arn)::jsonb
  join aws_ec2_instance ec2
    on ec2.instance_id = (
      select
        jsonb_array_elements(tg.target_health_descriptions)->'Target'->>'Id'
      limit 1
    )
where
  alb.scheme = 'internet-facing';
You should see a result like:
+----------+-------------------------------------------------+---------------------+-------------------+
| alb_name | dns_name | instance_id | public_ip_address |
+----------+-------------------------------------------------+---------------------+-------------------+
| test-alb | <dns_name> | <instance_id> | <public_ip> |
+----------+-------------------------------------------------+---------------------+-------------------+
What this query is doing (in plain English):
- It finds any ALB that is internet-facing.
- It looks for target groups that reference that ALB.
- From there, it parses the target instance IDs listed in the target health descriptions.
- Then it joins back to the EC2 instance metadata to fetch the `instance_id` and `public_ip_address`.
An internet-facing ALB can of course be problematic, especially if you have WAF requirements that aren't being met. We’re not checking that here… but it’d be a smart next step.
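The nested JSON unpacking is the tricky part of that query. Here's roughly what `jsonb_array_elements(...)->'Target'->>'Id'` is doing, sketched in Python over a made-up `target_health_descriptions` blob (shaped like the AWS API response, but with fake IDs):

```python
import json

# Hypothetical target_health_descriptions value for one target group.
target_health_descriptions = json.loads("""
[
  {"Target": {"Id": "i-0abc123def456", "Port": 80},
   "TargetHealth": {"State": "healthy"}},
  {"Target": {"Id": "i-0fed654cba321", "Port": 80},
   "TargetHealth": {"State": "unhealthy"}}
]
""")

# jsonb_array_elements expands the array into rows; ->'Target' descends into the
# nested object; ->>'Id' pulls the value out as text.
target_ids = [d["Target"]["Id"] for d in target_health_descriptions]
print(target_ids)  # ['i-0abc123def456', 'i-0fed654cba321']
```

Note the query above grabs only the first ID (`limit 1`); a fuller version would expand every element the way this list comprehension does.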
Example 3: EBS Volumes Without Recent Snapshots
You can't restore what you never saved. Ugh, all those pictures? Depending on the pictures, that could be a good thing or a bad thing. In AWS, however, it's an operational gap: forgetting to back up critical volumes. This query surfaces every EBS volume attached to an EC2 instance that hasn’t had a snapshot taken in the last 30 days — or ever. That makes it very easy to test: just create an EC2 instance with an EBS volume and do nothing. Or add two EBS volumes and snapshot one in the console to see both outcomes.
Here’s the query:
with latest_snapshots as (
  select
    volume_id,
    max(start_time) as last_snapshot
  from
    aws_ebs_snapshot
  group by
    volume_id
)
select
  i.instance_id,
  i.tags,
  v.volume_id,
  ls.last_snapshot
from
  aws_ebs_volume v
  join lateral jsonb_array_elements(v.attachments) as a(attachment) on true
  join aws_ec2_instance i on i.instance_id = a.attachment->>'InstanceId'
  left join latest_snapshots ls on v.volume_id = ls.volume_id
where
  ls.last_snapshot is null
  or ls.last_snapshot < current_date - interval '30 days'
order by
  ls.last_snapshot nulls first;
You should see a result like:
+---------------------+------------------------+-----------------------+---------------+
| instance_id | tags | volume_id | last_snapshot |
+---------------------+------------------------+-----------------------+---------------+
| i-11118578e57a79e24 | {"Name":"snapshot-no"} | vol-1111f5bdce1319421 | <null> |
| i-11118578e57a79e24 | {"Name":"snapshot-no"} | vol-111188227733b4e41 | <null> |
| i-11118578e57a79e24 | {"Name":"snapshot-no"} | vol-111198b4100e0cde6 | <null> |
+---------------------+------------------------+-----------------------+---------------+
What this query is doing (in plain English):
- It starts with all EBS volumes that are currently attached to EC2 instances.
- It finds the most recent snapshot per volume, if one even exists.
- It filters for volumes where the latest snapshot is older than 30 days (or missing entirely).
- It shows one row per volume.
Snapshot hygiene might not matter for you, but if it does, this is one way to automate checking it.
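The 30-day cutoff logic is easy to get backwards (is `null` stale? is "exactly 30 days" stale?). A small Python sketch of the same filter, with hypothetical snapshot timestamps:

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=30)

def needs_snapshot(last_snapshot, now=None):
    # Flag the volume when it has never been snapshotted (the LEFT JOIN's null case),
    # or when the latest snapshot is older than the cutoff.
    now = now or datetime.now(timezone.utc)
    return last_snapshot is None or last_snapshot < now - STALE_AFTER

now = datetime(2025, 6, 1, tzinfo=timezone.utc)
assert needs_snapshot(None, now)                                            # never snapshotted
assert needs_snapshot(datetime(2025, 4, 1, tzinfo=timezone.utc), now)       # ~61 days old
assert not needs_snapshot(datetime(2025, 5, 20, tzinfo=timezone.utc), now)  # 12 days old
```

The `left join` plus `is null` in the SQL is what keeps never-snapshotted volumes in the results; a plain inner join would silently drop them.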
Using Steampipe Queries Effectively
So far, if you’ve followed along, you’ve been running everything as ad hoc queries. That’s great for experimenting — but the queries we’ve built are too useful to leave in a scrollback buffer. Good luck running `history` in `steampipe query`.
To get real value from Steampipe, you need structure. Start by saving your queries as .sql files. It’s simple, but it’s the first step toward repeatability, sharing, and automation.
Let's say you saved the check for EBS Volumes Without Recent Snapshots in `ebs_volumes_without_recent_snapshots.sql`. Running it is just:
steampipe query ebs_volumes_without_recent_snapshots.sql
All of the examples in this post could easily live in `.sql` files that are version-controlled, reviewed, and shared. That lets you write and refine your more advanced checks in a manner you're already familiar with.
A few advantages:
- Track query changes over time
- Share detections across teams or repos
- Plug them into scheduled jobs
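As a sketch of that last point, here's one hypothetical way to wrap a saved query in a scheduled job: a small Python runner that shells out to `steampipe query` with JSON output. The file name is the one from above; the exact shape of the JSON Steampipe emits can vary by version, so treat this as a starting point:

```python
import json
import subprocess

def build_command(sql_file: str) -> list:
    # Ask steampipe for JSON so results are easy to ship to alerting or ticketing.
    return ["steampipe", "query", sql_file, "--output", "json"]

def run_check(sql_file: str):
    # Run the saved query and parse whatever JSON steampipe emits.
    result = subprocess.run(
        build_command(sql_file), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)

if __name__ == "__main__":
    # Hypothetical invocation of the saved check from earlier in the post.
    findings = run_check("ebs_volumes_without_recent_snapshots.sql")
    print(json.dumps(findings, indent=2))
```

Drop something like this into cron or a CI pipeline and your ad hoc query becomes a recurring detection.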
OK, on to how we can really level up with Powerpipe.
Introducing Powerpipe: Dashboards, Benchmarks, and Custom Controls
Powerpipe makes it easy to turn your Steampipe setup into a full-fledged CSPM engine — complete with dashboards, compliance benchmarks, and custom controls.
Once Steampipe is configured, running Powerpipe mods is quite easy. The Powerpipe Hub has over 65 mods you can plug in, covering four major categories:
Compliance
Detections
Cost Management
Asset Inventory & Insights
Some categories (like Detections) use tools like Tailpipe to analyze VPC Flow Logs or CloudTrail events. That’s more cloud threat detection than CSPM — interesting stuff, but outside the scope of this post. For now, we’ll focus on Compliance in AWS.
The AWS Compliance mod gives you a rich set of benchmarks right out of the box — SOC 2, NIST CSF, CIS, and others — all powered by Steampipe queries under the hood. And it just takes a couple commands to get up and running.
But here’s what makes this powerful: you can create your own controls and dashboards using the same .sql
queries we wrote earlier. That means we can take our “EBS volumes without recent snapshots” query and turn it into a proper control with scoring, summaries, and status views.
Setting Up Powerpipe
Dead simple setup.
Step 1: Install Powerpipe
As is usual, if you’re on macOS and use Homebrew:
brew install powerpipe
Well, that was exactly the same as `steampipe`, just a new type of pipe...
If you're not using macOS, just follow the appropriate install instructions.
Step 2: Add AWS Compliance Mod
With Powerpipe installed you can now add the AWS Compliance Mod, which will pair well with what we've been working on with AWS Steampipe queries:
powerpipe mod init
powerpipe mod install github.com/turbot/steampipe-mod-aws-compliance
You should now see a `mod.pp` file. (It will look different if you have added other mods, or if you have not installed any mod yet.)
matt.brown@matt steampipe % cat mod.pp
mod "local" {
  title = "steampipe"
  require {
    mod "github.com/turbot/steampipe-mod-aws-compliance" {
      version = "*"
    }
    mod "github.com/turbot/steampipe-mod-aws-well-architected" {
      version = "*"
    }
  }
}
Step 3: Spin Up the Local Server
It is very easy to run this locally. Just run the following:
powerpipe server
And you'll see something like this:
[ Wait ] Starting WorkspaceEvents Server
[ Message ] WorkspaceEvents loaded
[ Message ] Initialization complete
[ Ready ] Dashboard server started on 9033 and listening on local
[ Message ] Visit http://localhost:9033
[ Message ] Press Ctrl+C to exit
If you never ran the `mod init` step, you'll see:
matt.brown@matt steampipe % powerpipe server
Error: This command requires a mod definition file (mod.pp) - could not find in the current directory tree.
You can either clone a mod repository or install a mod using powerpipe mod install and run this command from the cloned/installed mod directory.
Please refer to: https://powerpipe.io/docs/build#powerpipe-mods
Then just hit the server and you should see the AWS dashboard:
Cool, you get CSPM out-of-the-box for AWS. Obviously you could do the same for other cloud providers.
Exploring the AWS Compliance Dashboard
With the Powerpipe server running and the AWS Compliance Mod loaded, you can now see how well your environment stacks up against various standards. One I like in particular is AWS Foundational Security Best Practices — it’s a solid baseline that blends well with both CIS and real-world operational needs.
Here’s an example dashboard view based on a previously ad hoc query. This one checks for unencrypted EBS volumes:
You can explore the structure of these controls in the steampipe-mod-aws-compliance GitHub repo. Each control maps to a specific query. After doing some digging I found you could see the structure of the EBS volume encryption control here. Then you can dive in and find the actual query here.
Tweaking the query to the following should give you the same result. It took me a while to get this deep, but it is a nice setup.
select
  arn as resource,
  case
    when state != 'in-use' then 'skip'
    when encrypted then 'ok'
    else 'alarm'
  end as status,
  case
    when state != 'in-use' then volume_id || ' not attached.'
    when encrypted then volume_id || ' encrypted.'
    else volume_id || ' not encrypted.'
  end as reason,
  region,
  volume_id,
  state,
  encrypted,
  title
from
  aws_ebs_volume;
+-----------------------------------------------------------------+--------+-------------------------------------+-----------+-----------------------+-----------+-----------+-----------------------+
| resource | status | reason | region | volume_id | state | encrypted | title |
+-----------------------------------------------------------------+--------+-------------------------------------+-----------+-----------------------+-----------+-----------+-----------------------+
| arn:aws:ec2:us-west-2:266735822823:volume/vol-061698b4100e0cde6 | skip | vol-061698b4100e0cde6 not attached. | us-west-2 | vol-061698b4100e0cde6 | available | false | vol-061698b4100e0cde6 |
| arn:aws:ec2:us-west-2:266735822823:volume/vol-06daf5bdce1319421 | skip | vol-06daf5bdce1319421 not attached. | us-west-2 | vol-06daf5bdce1319421 | available | false | vol-06daf5bdce1319421 |
| arn:aws:ec2:us-west-2:266735822823:volume/vol-0798d9a6b06fab049 | skip | vol-0798d9a6b06fab049 not attached. | us-west-2 | vol-0798d9a6b06fab049 | available | false | vol-0798d9a6b06fab049 |
+-----------------------------------------------------------------+--------+-------------------------------------+-----------+-----------------------+-----------+-----------+-----------------------+
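That ok/alarm/skip pattern is the heart of a Powerpipe control: every resource gets a status and a human-readable reason. A small Python sketch of the same status logic, using hypothetical volume records:

```python
def control_status(state: str, encrypted: bool) -> str:
    # Mirrors the CASE expression: unattached volumes are skipped,
    # attached-and-encrypted is ok, attached-and-unencrypted alarms.
    if state != "in-use":
        return "skip"
    return "ok" if encrypted else "alarm"

# Made-up volumes covering the three branches.
volumes = [
    {"volume_id": "vol-aaa", "state": "available", "encrypted": False},
    {"volume_id": "vol-bbb", "state": "in-use", "encrypted": True},
    {"volume_id": "vol-ccc", "state": "in-use", "encrypted": False},
]
for v in volumes:
    print(v["volume_id"], control_status(v["state"], v["encrypted"]))
```

Once you internalize this shape, writing your own controls is mostly deciding what counts as `alarm` versus `skip` for each resource type.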
One thing to note: there’s no obvious way to tweak the default dashboards inside the Powerpipe UI. Your best bet for customization is likely to fork the mod and adjust the queries or control formats directly; changing severities could be a useful exercise. Overall, though, I can't think of a strong reason to change the defaults. Instead, just create a new dashboard and maybe copy some pieces over!
Creating a Custom Dashboard
You can view instructions here. Bear with me, this dashboard is quite horrendous. This one has a single control for public S3 buckets.
Step 1: Create Your New Mod
Just jump into a new folder and run your usual init:
powerpipe mod init
Open the `mod.pp` file and add the query as follows:
mod "local" {
  title = "custom_dash"
}

query "aws_ebs_not_encrypted" {
  sql = <<-EOQ
    select name from aws_s3_bucket where bucket_policy_is_public = true or block_public_acls = false;
  EOQ
}
Step 2: Create Your Dashboard
Just create a dashboard file like `customdash.pp` in the same directory. Then fill it in as follows, for an MVP that is more like a bench player:
dashboard "custom_dash" {
  title = "Basically a Lame Custom Dashboard"

  card {
    sql = query.aws_ebs_not_encrypted.sql
  }
}
Step 3: Fire Up the Server
As usual, run your powerpipe server and make sure it is in the same folder:
powerpipe server
And you'll see:
One thing that is cool is you get easy reloads, sorta like a `nodemon`-style server. So change the dashboard stuff, save, reload, and bam, all good to go with your changes. I found this out when everything was wrong, but eventually it all got ironed out.
Taking the query from Part 1 and turning it into a dashboard is quite an accomplishment if you ask me. This was a lot of work, but it is cool to get this customization along with the out-of-the-box goodness.
Wrap-Up: Hella Cool! (Yes I did it)
That’s a wrap — for now — on Steampipe and Powerpipe.
Originally, I set out to write about just Steampipe. But the more I used it, the more it became clear: if you’re not using both Steampipe and Powerpipe, you’re only getting half the CSPM experience. So yeah, excuse the misleading intro.
This all started as a bit of a shot in the dark. I went looking for open source CSPM tools and stumbled onto Steampipe. And honestly? I think it’s one of the most genuinely useful open source tools out there. No “Free” plan that’s secretly a SaaS sales funnel. No sign-up form. Just raw access and real control.
It’s intuitive to set up, surprisingly easy to use (because it’s just SQL), and provides serious value from Day 1. Whether you're inventorying resources, checking for risk, or building out your own controls it just works.
I’ll almost certainly be revisiting this — whether it’s digging into Turbot’s other tools like Tailpipe for threat detection, exploring how well this works for KSPM, or lining it up against other CSPM tooling in the wild.
The only real challenge I ran into was understanding what exactly all these pipes were. I don’t know much about steampipes or powerpipes. So I spent a good chunk of a road trip trying to map it to plumbing I actually understood, which, as you can imagine, goes in a lot of directions.
Here’s the best I’ve got: imagine a four-story triplex. You’ve got one main pipe feeding three units, each with its own water heater. That breaks out into hot and cold pipes that run up to all the apartments — plus some extra lines for hoses, outdoor faucets, and common areas. All those branch even further: showers, toilets, sinks, you name it.
Now try to debug why someone’s shower doesn’t get hot water.
That’s about what this feels like — except the plumbing is cloud APIs, the water pressure is AWS rate limits, and your wrench is a SQL query.
Nevermind.
Written by

Matt Brown
Working as a solutions architect while going deep on Kubernetes security — prevention-first thinking, open source tooling, and a daily rabbit hole of hands-on learning. I make the mistakes, then figure out how to fix them (eventually).