GitOps for application authorization
Conclusion
Cerbos simplifies application authorization by enabling a GitOps-based workflow that's testable, versionable, and auditable. Developers and software engineers benefit from streamlined authorization logic, allowing them to focus on building great features. The stateless, scalable, and flexible architecture fits well with modern cloud-native applications.
Transcript
Taylor: Hi all. Welcome to Cloud Native Live today. I am your host, Taylor Thomas. This is a CNCF event where we let everybody dive straight into the different technologies that make up cloud native. As this is a CNCF platform and event, the CNCF code of conduct applies. So please be respectful of everyone, and keep your comments and everything that goes on during this conversation respectful.
So with that, I'm going to introduce Alex Olivier, who is from Cerbos and I'll let him take it from here.
Alex: Great. Thanks Taylor. Hi everyone. Hope you're having a good start, middle, or end of your day. Today we're talking about GitOps for application authorization. Just to make sure we're on the same page, when we talk about authorization, we're talking about the logic that defines whether a particular user can do a particular action inside of the system.
So I am the Chief Product Officer of Cerbos, which is an open source project for application authorization. I'm not going to dwell too much on this, as we're going to jump to code very quickly and talk about how we can bring a GitOps-style workflow to manage, scale, test, version, and ultimately deploy your authorization logic.
So firstly, a bit of background to make sure everyone can understand what we're talking about today: a simple application architecture. I have a couple of example API applications up and running, and in terms of how they ultimately fit in when using Cerbos for authorization, it looks something a bit like this.
So you have your end users, and they're making requests to your application. In this case, our application is running in a Kubernetes cluster. That request is coming in through an API gateway. We're using Envoy in this case. And then ultimately a particular request will end up inside of a particular application pod.
And that service will then be handling that request. In our example application, we're going to be looking at tracking expenses. Imagine you've just gone on a work trip, or maybe you've just come back from KubeCon, and you need to file your expenses to get things reimbursed.
So you're going to be submitting receipts and those kinds of things. This particular application is an expenses tracking service. As an end user, I'm making API calls to that system, and based on those calls, the system knows two things. Firstly, that call is going to be authenticated.
So the system knows who the user is. In this case, we're just going to be using an HTTP header, but you would plug in an IdP. If you're in the CNCF ecosystem, it might be, say, Keycloak. Then you'd have a token, and you pass it down. Your application then also knows what resource you're trying to interact with.
So in the case where you want to look up a particular expense you've submitted to find out its status, you'd go off to the database and query that particular record. The application now knows what particular resource you're trying to interact with. And the authorization logic in here is defined as checking whether a particular user, with a particular role or particular attributes, can do a particular action on a particular resource.
So traditionally, the application code might end up with a load of if statements, where you have: if the user role equals admin, then allow the action. If the user role equals manager, then only allow the action if that user is a manager of the same office. Or only allow the view action if this person is just a regular user, for example.
This is all the basic permissioning logic, which can be role based, so RBAC is the phrase you're probably used to, or attribute based, where you're checking individual values or individual bits of logic to determine whether an action should be allowed or not.
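For illustration only (this code is not shown in the talk), the hard-coded style described above might look something like the following Python sketch. The `User` and `Expense` types, their fields, and the `can_perform` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: str
    role: str      # "admin", "manager", or "user" (hypothetical role names)
    office: str


@dataclass
class Expense:
    id: str
    owner_id: str
    office: str


def can_perform(user: User, action: str, expense: Expense) -> bool:
    """Authorization logic hard-coded as if statements inside the application."""
    if user.role == "admin":
        # Admins may perform any action.
        return True
    if user.role == "manager":
        # Managers may act only on expenses from their own office.
        return user.office == expense.office
    if user.role == "user":
        # Regular users may only view.
        return action == "view"
    # Unknown roles are denied by default.
    return False
```

Every change to these rules means editing, rebuilding, and redeploying the application, which is the coupling the rest of the talk sets out to remove.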
So the bit that we're going to talk about today is how you can take that logic and encapsulate it as policy rather than code, and then take a very GitOps-based approach to version, test, and ultimately deploy changes to that logic. In an application that's hooked up to use Cerbos, the open source project that allows you to define this authorization logic as policy rather than code, at the point in your application code where you want to check permissions, rather than hard-coding that if/else logic, you make an API call out to a Cerbos policy decision point, which is this little Cerbos icon on here.
So, that policy decision point runs as a sidecar or just as another service inside of your cluster. And what's loaded into it are those policy definition files. We'll go through those policy files in a second, but they're basically your business logic defined as a manifest. They look very similar to CRDs, if you're in that kind of world.
But ultimately this is business logic defined as policy rather than application code, so it's decoupled. I'm now going to jump into our application to give you an idea of how this is all wired up. We have the setup I mentioned: a Kubernetes cluster, with Envoy acting as the gateway, and based on the path we're making requests to different services. We're going to look particularly at this expenses service, as I mentioned.
When it comes to checking permissions, it's going to check them against the policy decision point that's deployed as just another local cluster service inside of that namespace. So if we jump to the application code, we're going to look at a very simple request to keep things simple, which is just the GET request.
So we give it an ID of a particular resource, an expense in this case. My application code goes and grabs that particular record from the database. If it can't find that record, we return a 404. And if we can find it, we then make an API call out to Cerbos. This cerbos.isAllowed method here is just using the Cerbos JavaScript SDK.
The Cerbos policy decision point exposes a gRPC and REST API, and we have SDKs for pretty much every language now. We're passing it three bits of information. We're passing it the principal, so who's making the request. In this case, it's going to be me, and I'll show you some calls in a second.
We're passing the actual resource that we're trying to do a particular action against. So we're looking at an expense resource, it's got an ID, and we're also passing the attributes of that particular resource from the database in the call to Cerbos. And then the action is a read in this case; we just want to read it.
That request goes off to the decision point. The policies are then evaluated, and what comes back is a decision, which is just a Boolean: allow or deny. If it's not allowed, we return a 401 with an unauthorized error message; otherwise we say okay and actually return that record.
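The handler flow just described can be sketched in Python as follows. This is a minimal illustration, not the demo's code: the in-memory `EXPENSES` table, the `get_expense` function, and the `is_allowed` callable are assumptions standing in for the database, the route handler, and the call to the policy decision point. No real Cerbos SDK call is used here.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical in-memory stand-in for the expenses database.
EXPENSES: Dict[str, Dict[str, Any]] = {
    "2": {"id": "2", "ownerId": "sally", "region": "EMEA", "amount": 120},
}

# Signature of the (stubbed) decision call: principal, resource, action -> bool.
Decision = Callable[[Dict[str, Any], Dict[str, Any], str], bool]


def get_expense(expense_id: str, principal: Dict[str, Any],
                is_allowed: Decision) -> Tuple[int, Dict[str, Any]]:
    """Fetch the record, ask the decision point, then map the result to a status."""
    record = EXPENSES.get(expense_id)
    if record is None:
        return 404, {"error": "Expense not found"}

    # Everything the decision needs is sent with the request: who is asking,
    # what they are acting on (including its attributes), and the action.
    resource = {"kind": "expense", "id": record["id"], "attr": record}
    if not is_allowed(principal, resource, "read"):
        return 401, {"error": "Unauthorized"}

    return 200, record
```

Because the handler only asks a yes/no question, the rules behind `is_allowed` can change without touching this code.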
Okay. So to give you an example of what the actual objects are here: our system has a number of different users. We have a user with an ID of admin, who has the roles of user and admin. We've got some attributes: department, name, region. We've got Sally here, who's a user in the sales department in EMEA.
And we've got a couple of managers. We have a North America manager, so this person has the role of manager, and their region is set to North America. And we have an EMEA manager here, so this person has the role of user or manager, and their region is EMEA. These are the identities we're going to be interacting with inside of our system.
So our business logic states (you can imagine you've got this nice Jira ticket, let's say, or GitHub issue from the team that's designing the system), in our basic setup here: admins should be able to do all actions on the expense resource. Users should be allowed to create.
They should be able to read if they are the owner of the resource, but they should not be able to delete. And if the user has the role of manager, they should not be able to create, they should be able to read, and they should not be able to delete. So what does that actually look like in terms of policy?
So what we're looking at here is a Cerbos resource policy, but it's just a static YAML definition file, call it a manifest if you will, which defines that business logic. What we have implemented is that table I just showed you. We have an expense resource, and we have three different rules set up.
We have this rule that says if the user has the role of admin, then the create, read, and delete actions should be allowed. If the user has the role of user, they should be able to read only if this condition is true. Here we're actually interrogating the attributes of the resource, the owner ID attribute in this case, and the ID of the person making the request, the principal. If they match, the read should be allowed. And then finally we have this rule that says, okay, if the person has the role of manager, then they should be able to read. So it's a pretty basic definition, but the key thing here is that this is now a static definition of what the logic needs to be. Hopefully you can see where this is going in terms of a GitOps-style workflow. But for now, I'm going to actually test that this API does what we expect.
So I'm going to make some API calls to the service, which is now running on my local Minikube cluster. I make a request as an admin to expense ID number two, send that request, and the response comes back. I go and pretend to be a manager in EMEA, make the request, and the response comes back: it's allowed. If I go to North America, the request is allowed and it comes back.
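To make the shape of those three rules concrete, here is a small Python sketch that models them as data with a default-deny evaluator. It illustrates the idea of rules as a static definition; it is not the Cerbos policy format or engine, and the `RULES` structure, the `ownerId` attribute name, and the `is_allowed` function are assumptions made for this example.

```python
from typing import Any, Callable, Dict, List, Optional

Condition = Optional[Callable[[Dict[str, Any], Dict[str, Any]], bool]]

# The three rules described in the talk, expressed as plain data.
RULES: List[Dict[str, Any]] = [
    # Admins may create, read, and delete.
    {"roles": {"admin"}, "actions": {"create", "read", "delete"}, "condition": None},
    # Users may read only the expenses they own.
    {"roles": {"user"}, "actions": {"read"},
     "condition": lambda p, r: r["attr"].get("ownerId") == p["id"]},
    # Managers may read (no condition yet).
    {"roles": {"manager"}, "actions": {"read"}, "condition": None},
]


def is_allowed(principal: Dict[str, Any], resource: Dict[str, Any], action: str) -> bool:
    """Allow if any rule matches the role, the action, and its condition; else deny."""
    for rule in RULES:
        if action not in rule["actions"]:
            continue
        if not rule["roles"] & set(principal["roles"]):
            continue
        condition: Condition = rule["condition"]
        if condition is None or condition(principal, resource):
            return True
    return False
```

With this version a manager in any region can read any expense, which matches what the API calls above show before the region requirement is introduced.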
So if we look at how these policies are actually deployed, we currently have our Cerbos service running, and we have this folder called policies that we're basically just mounting into the container. If we look at our Dockerfile, in this case we're copying the actual raw files into the container, and then starting up our Cerbos instance with those policies baked in.
And this works if you want a very static deployment. But what happens when the requirements change? If we go back to our readme, let's pretend we have this new requirement: managers should only be allowed to read a particular expense in certain cases.
That API call to fetch a particular expense should only be allowed if the expense is in the same region as the manager. Going back to our managers, they have this attribute called region, EMEA here, and our expenses also have an attribute called region. So our business logic now needs to say that this request should only be allowed where the person making the request has the role of manager and the region values of both the expense and the principal are equal to each other.
Otherwise it should be denied. What we could do is go into our codebase and update these policy files. We'd then have to rebuild our app, redeploy the services, and go through testing and deployment, whatever your rollout plan is. That's quite involved, and you generally have to get the developers involved at that stage.
But with Cerbos, because this logic is now decoupled from your application, you get this other approach, which allows more of a GitOps-style workflow, specifically for your application policies. So what I've done is create a second repository called policies, and inside that repository, I've just copied over my policies as they were before and checked them into a GitHub repo.
So I now have this GitHub repo with my policies folder containing those policy definition files. What I want to do now is, rather than having the Cerbos instances running inside of my cluster just read the policies baked into the container, have them read from that repo.
We want to tell it to pull those policies down from that GitHub repo. In the Cerbos configuration, there's a section where you define the storage, and this is basically telling the decision point where to get the policies from. By default here, we're grabbing those policies just from the local path inside the container. But I'm now going to swap that out and tell it to go and fetch my policies from a particular Git repo. In this case, we're just pointing to a public GitHub repository; you can also point it to private repositories or internal Git servers, however you're running things. And then you define a few different things.
So we're saying, okay, here's the repo, and here's the branch we're interested in. We're going to have our quote-unquote production environment always pull from main, because that's what's in production. We're also telling it where to go and fetch those policy files inside of that repo, which subfolder. And then we also define what the update interval is. By defining this, when the Cerbos instance starts up, rather than trying to read the policies from disk, it's going to clone that repo down locally.
And then it's going to mount whatever policies are defined in the path that's provided, from the branch that's provided. So I'm going to save this. Behind the scenes, it's going to redeploy my Cerbos instance. Once it starts up, we'll actually see it in the logs.
I forgot to take out one more line. And there we go. So it's going to go and redeploy. It's going to start up the container, and we'll see it in the logs coming back from Cerbos once it spins up. Deployment's still pending. Ah, the joys of demos. The container should start up in a second now.
Let me just do a bit of a fresh startup here. There we go. Cool. So this has started up, and we can see in the logs that come from our Cerbos container that it's cloned the repo from GitHub, pulled it down, and then loaded in the policies. We can see here it says three executable policies. So now our deployment is decoupled from the actual files in our code repo, and it's pulling them down from our GitHub repo instead. And every time I check in a change to this repo, after 10 seconds the instance will pull down those changes.
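The poll-and-reload behaviour described above can be sketched generically in Python. This is an illustration of the pattern only, not how Cerbos implements its Git storage driver: `PolicyStore`, `fetch_revision`, `load_policies`, and the `FAKE_REPO` dictionary are all hypothetical names, and a dictionary stands in for the Git branch.

```python
import time
from typing import Callable, Dict, List, Optional


class PolicyStore:
    """Keeps policies in memory and reloads them when the source revision changes."""

    def __init__(self, fetch_revision: Callable[[], str],
                 load_policies: Callable[[str], List[str]]) -> None:
        self._fetch_revision = fetch_revision
        self._load_policies = load_policies
        self.revision: Optional[str] = None
        self.policies: List[str] = []

    def poll_once(self) -> bool:
        """Check the source once; reload and return True only if it changed."""
        latest = self._fetch_revision()
        if latest == self.revision:
            return False
        self.policies = self._load_policies(latest)
        self.revision = latest
        return True

    def run(self, interval_seconds: float, iterations: int) -> int:
        """Poll a fixed number of times, sleeping between checks; return reload count."""
        reloads = 0
        for _ in range(iterations):
            if self.poll_once():
                reloads += 1
            time.sleep(interval_seconds)
        return reloads


# Hypothetical stand-in for a Git branch: revision id -> policy file names.
FAKE_REPO: Dict[str, List[str]] = {"rev1": ["expense.yaml"]}
head = {"rev": "rev1"}

store = PolicyStore(lambda: head["rev"], lambda rev: FAKE_REPO[rev])
```

In the demo the interval is 10 seconds, so a merged change becomes active shortly after it lands on the tracked branch, with no container rebuild.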
So far, we've kind of got half of GitOps, I would say. We've got the Git part working, where our instance is now pulling those policies down from the GitHub repo and loading them in, and the application is serving based upon those policies. So now how do we get to the next piece, which is around testing and those sorts of things?
So as part of Cerbos, we have published GitHub Actions for Cerbos. So I have a typical GitHub Actions definition file workflow set up here and it has three steps in it. First off, we go and obviously check out the repo as you normally do. Then we're using the Cerbos setup action here to go and set up a Cerbos instance inside of our runner.
And then we're actually going to run the compile and test policy method.
So what this does: Cerbos itself provides a method to compile the policies that are on disk to make sure they're valid. We want to make sure that whatever policy changes we make are actually valid and aren't going to cause our production instance to run into any errors.
And it also supports the concept of tests, which is what we're going to look at next. So our GitHub pipeline is already set up in this project, and now let's go back and look at our requirements from before. We wanted to make this change which says that an expense should only be readable by managers from the same region that the expense was created in. With Cerbos you can define tests. So the first thing I do is create a new branch on my repo.
Let's call it add-manager-condition. This is specifically on the policy repository, the one that just contains the policy files for our service. Here I could now go and start implementing logic, but I'm going to take a test-driven development approach to this. One of the benefits of decoupling authorization using something like Cerbos, or one of the other authorization systems out there, is that it's no longer coupled with your application code, which means you can now test it in isolation.
So with Cerbos, we have a testing framework. I'm just going to create a test YAML file, and to save time, I'm going to copy over some boilerplate and then talk you through it. Cool. So here I've got the outline of my test suite. My test suite has a name, and then we define the fixtures for it. In this case, we're going to have three principals.
So there are basically three different simulated users for our test suite. We have admin with a role of admin. We have a North America manager, who has a role of manager with their region set to North America. And we have an EMEA manager with the role of manager and the region set to EMEA, funnily enough.
And then we also have our resources. In this case, we'll keep it simple. There are just two fixtures, two example resources that we're going to be using for our test suite here. Both of them are expenses: one's from EMEA, one's from North America, to match our business logic. The next step is to define our actual tests.
So we're going to have an expense read action test. It takes as inputs those three principals and those two resources that we defined before, and the action we're going to test here is read. And the way you then define your expected values is you state the expected outcomes.
I'm going to start off with a very basic one, which says, okay, if my principal is an admin, then for this particular resource the action should be allowed. If you remember our business logic, it says if you're an admin you should be able to do all actions. That's now in place, and I can do the same thing for the next case, where I'm still an admin and I'm going to check whether I have the read action on the other resource, the EMEA resource here. There's no change here yet.
These should all make sense. But as we start looking at the other ones, and I'll copy them over in a batch, we're now looking at the different managers. For our North America manager trying to do the read action on the North America region expense, the action is allowed, and it should be allowed. But our new business logic says that if they're a North America manager trying to access an EMEA expense, then that should be denied. And then the inverse.
So if it's an EMEA manager trying to access the EMEA expense, then it should be allowed. And if they're an EMEA manager trying to access the North America expense, then it should be denied. So now our tests are defined, and what we could actually do is run Cerbos locally to check that this is working as it should.
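The test matrix described above (three principals, two resources, one action, six expected outcomes) can be sketched as a table-driven test in Python. This illustrates the test-first idea only; it is not the Cerbos test-suite YAML format. The fixture names, the `region` attribute values, and both policy functions are assumptions made for this example.

```python
from typing import Any, Callable, Dict, List, Tuple

# Fixtures: three simulated principals and two example expenses.
PRINCIPALS: Dict[str, Dict[str, Any]] = {
    "admin": {"id": "admin", "roles": ["admin"], "attr": {}},
    "na_manager": {"id": "na_manager", "roles": ["manager"], "attr": {"region": "NA"}},
    "emea_manager": {"id": "emea_manager", "roles": ["manager"], "attr": {"region": "EMEA"}},
}
RESOURCES: Dict[str, Dict[str, Any]] = {
    "na_expense": {"kind": "expense", "id": "1", "attr": {"region": "NA"}},
    "emea_expense": {"kind": "expense", "id": "2", "attr": {"region": "EMEA"}},
}

# (principal, resource, action, expected to be allowed?)
EXPECTED: List[Tuple[str, str, str, bool]] = [
    ("admin", "na_expense", "read", True),
    ("admin", "emea_expense", "read", True),
    ("na_manager", "na_expense", "read", True),
    ("na_manager", "emea_expense", "read", False),
    ("emea_manager", "emea_expense", "read", True),
    ("emea_manager", "na_expense", "read", False),
]

Policy = Callable[[Dict[str, Any], Dict[str, Any], str], bool]


def policy_without_condition(p: Dict[str, Any], r: Dict[str, Any], action: str) -> bool:
    """Original rules: admins and managers may read any expense."""
    return action == "read" and bool({"admin", "manager"} & set(p["roles"]))


def policy_with_condition(p: Dict[str, Any], r: Dict[str, Any], action: str) -> bool:
    """New rules: managers may read only expenses in their own region."""
    if action != "read":
        return False
    if "admin" in p["roles"]:
        return True
    if "manager" in p["roles"]:
        return p["attr"].get("region") == r["attr"].get("region")
    return False


def run_suite(policy: Policy) -> List[Tuple[str, str, str, bool, bool]]:
    """Return every case whose actual decision differs from the expectation."""
    failures = []
    for principal, resource, action, expected in EXPECTED:
        actual = policy(PRINCIPALS[principal], RESOURCES[resource], action)
        if actual != expected:
            failures.append((principal, resource, action, expected, actual))
    return failures
```

Running `run_suite(policy_without_condition)` reports the two cross-region manager cases as failures, while `run_suite(policy_with_condition)` returns an empty list, which mirrors the red-then-green sequence in the pull request that follows.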
But what I'm going to do is push this up to GitHub and let our GitHub runner run it. So I'm just going to check in my change, "add test scenarios", commit that, and push it up. So now I've set up my tests how I would like, and I'm sure you're all one step ahead of me: this is going to lead to an issue.
So I'm going to create a pull request against main, my production branch, and say, okay, this logic is ready to go. "Add condition checks for manager region", create pull request. Obviously you'd write notes in the PR message so everyone knows what's going on, but this is just for this example.
We've kicked off that build, so now the Cerbos workflow is running. It clones down our repo, goes on to compile, and we've got an error. We can see here that tests have failed, and we have some test cases that aren't conforming to our expected scenarios. We can see that when the North America manager is trying to read the EMEA expense, we expect a deny, because that's what our new logic should be, but actually it's being allowed. So something's not right with our logic.
And in this case, it's because we forgot to actually put the condition inside of our expenses policy.
So coming back, we go to our expenses policy, and now we need to actually put a condition in here. Our condition is similar to the one above: we want to check particular attributes of the requests that come in. So for a manager to read, the effect should be allow for the manager role when the region of the resource is equal to the region of the user making the request.
This should satisfy our tests, and we can verify that by running everything locally. So I'm going to use the CLI tool to compile the policies, and it says, yep, that's all compiled and executed successfully.
So I can now check in that change. I'm going to say "add in condition for manager". That's now checked in and synced up to GitHub. Going back to this pull request, there's another commit. That commit is running through; CI is going to kick off. There we go. That's now all running, and unsurprisingly, this should pass, but you know, live demos and all that.
There we go. Tests executed successfully. We now have our policies meeting our tests, which meet our business rules. That pull request is now green. And at this point, I have a couple of different options. What you can do with Cerbos when you deploy the decision point is point it to different branches.
So let's say you have multiple environments: a production, a staging, and a dev environment, for example. You could have different branches for those in your Git repo. Then when you deploy policy decision points, your production ones would point to the main branch, your staging cluster would point to the staging branch, and your dev one could point to a dev branch.
So with this PR, we could merge it first into, say, a staging branch and then do a PR from staging to production, or the main branch, and those changes flow down through. In this case, we're just going to merge it straight into our main branch, our production, assuming we've got all the right reviews and sign-offs. So I merge this pull request, and that now merges back into the main branch. Cool, the pull request is all green.
You can see that our policy files are now in production. Going back to our application code, I'm sorry, our Cerbos instance, we can see here that the Cerbos instance we deployed, looking at that main branch, has detected a change in that repo. It's cloned it down on that 10-second interval we spoke about, pulled it in, compiled the policies, and loaded them into memory. And now, unsurprisingly, if I make a request as my North America manager to a North America expense, that's allowed, but if I pretend to be someone from EMEA and make that request, I get an unauthorized error.
So what we've just demonstrated here is, firstly, that we've changed the authorization logic of our application without having to touch the application code. There have been no code changes in the app. We haven't had to go into the individual services. We haven't had to take our business logic and convert it to JavaScript if it's Node.js, for example, or Go or Python or whatever your services are written in. We did that implementation once, to make the authorization checks against the decision point running alongside as another container in our cluster. And secondly, what we've demonstrated is a truly end-to-end GitOps-based workflow: we have our policies, and we wrote some tests against what our policies should do.
We then have a CI step that runs Cerbos as part of the CI pipeline. We run those tests, and only when those tests pass do we allow the merge to happen. And then our actual policy decision points running inside of our cluster are just looking out for updates to, in our case, the main branch, but it could be whatever branch you're using for that particular environment.
Those changes are pulled down into the container that's already running, and without us having to redeploy the application or roll the pods or anything like that, we've updated the authorization logic, again without having to touch our infrastructure or redeploy anything on the backend, on either the application code side of things or the policy decision point.
And that's what I really wanted to demonstrate today: how you can have a truly GitOps-based workflow for testing and deploying authorization policies for your application services using Cerbos.
Taylor: Awesome. Thanks Alex. For those watching the live stream, you're welcome to send in any comments or questions, and we'll try to get those displayed on here and answered. In the meantime, I'm curious, because this is the second stream I've done with you, but some people may not be as familiar with Cerbos. You had explained it in that chart at the beginning, but people like summaries for some reason, so what would you summarize as the high-level difference between what Cerbos does and the other types of authz projects that are out there right now?
Alex: Yeah, absolutely. So, with Cerbos, we're really focused on solving one use case really well, which is application-level permissions. There are great projects if you look at, say, the CNCF. There's Open Policy Agent, for example, which does great things, like Kubernetes cluster RBAC. OPA is great for doing network-type permissions as well, and it can do bits of the application side, but it's kind of a Swiss Army knife in terms of what it can do. With Cerbos, we came at it from the viewpoint of having had to build one of these systems many times in previous companies, where there is this distinct use case around end-user application permissions.
So say you're running a SaaS business: you've got different users and different roles, and different tenants need different definitions of those roles as well, because every business is kind of unique. We created a layer that is really solving that one use case, and we think we solve it in a very clean way.
And Cerbos in particular does it in a very stateless fashion. In everything we ran through today, these policy files are statically defined, and that's the only bit of data that's loaded into those policy decision points, these Cerbos instances. When the Cerbos instances are deployed, they start up, and as we saw, they pull down from the policy store, in this case a Git repo, and load the policies into memory.
That's the only time a Cerbos instance is doing anything involving files. When an application is making a request to check permissions, it's sending the whole payload, all the data Cerbos needs, at request time: the whole principal object and the whole resource object. What that means is that Cerbos itself doesn't need to query a database, doesn't need to hit disk, or anything like that.
It hasn't got some sort of third-party data store it depends on; it's making all the decisions in memory at request time, which means the response times you're getting are super fast, sub-millisecond, because there is just no disk or network I/O involved in coming up with a decision. And because there isn't that external store it relies upon, it's horizontally scalable, which fits really well into your Kubernetes-style workloads, or really any sort of serverless type environment as well, because you can just have Cerbos running as a sidecar inside of your pod.
So as your application scales, Cerbos scales with it. You're never going to have a bottleneck when it comes to authorization. Authorization at the application layer, where end users are interacting with it indirectly, needs to be snappy, because authorization is in the blocking path of every single request that comes in.
By stripping Cerbos down to really that core use case of application authorization, with this stateless architecture, you are getting authorization decisions in sub-millisecond timeframes, but you still get all the benefits of this versionable, testable, GitOps-style workflow for defining, testing, and auditing your authorization logic.
Taylor: Awesome. That's great. Now, you mentioned Kubernetes, obviously, and like I said, I've been on here with Alex before, so he knows this question was probably going to come. It's not very often in the CNCF space that you see people design things to work outside of Kubernetes, especially in this space.
Why did you make that decision, and what kind of use cases have you seen from people who are using it outside of a Kubernetes context?
Alex: Yeah, absolutely. So Kubernetes is obviously a very easy ecosystem for us to plug into, but we really want to enable authorization to be run essentially anywhere. Across our user base, the part we know of at least (it's an open source project, so there are definitely more people out there than we know of), we see a lot of Kubernetes, obviously, but we see all sorts of hybrid deployments.
We see on-prem, air-gapped, bare-metal machines where Cerbos is running as a binary. And we see examples of it running in things like Lambda functions, where you have very serverless type compute, or ECS, or Cloud Run, or Knative if you want to look at it that way. Cerbos is written in Go, so it's an extremely small binary.
We publish it as a container as well, so you can run it anywhere. We even have a user out there that ships physical ATM machines out to the world with Cerbos running on top of them to manage the authorization side of things. So those are the sort of backend use cases.
We also have a version of Cerbos that compiles down to WebAssembly. There's a whole class of authorization-type problems where it makes sense to do it at the edge or even on device. And by on device, we're talking about native apps or even the web browser. Your backend API is always going to enforce permissions at the API level, but if you know someone's role and you know what they're interacting with, there are certain types of things that you want to do very quickly and easily on device, such as showing or hiding menu buttons or navigation items, or disabling or enabling form buttons and form controls. You can do that still backed by the exact same Cerbos policies that you've written, versioned, and tested in your backend, via what we call our embedded policy decision point, which you can then stick in an edge function, load directly onto a client device, or use in wasmCloud or one of these other types of environments.
Taylor: Yeah. Awesome. Thank you. I haven't seen any questions come up yet. Let me double check from the community. So I think we can leave it there. I could probably keep coming up with questions, but we'll keep pestering you that way. Just so people know, and I'll make sure this is on screen for just a second, you have the Cerbos website, which is kind of the landing point for all this. It has links to the GitHub repository and all those details you've been showing, is that right?
Alex: Yeah, absolutely. Cerbos.dev is the website, and you can find the link straight to our GitHub repo from there as well. We're always open to conversations and contributions to our open source project. And you can also try our managed control plane, which sits on top and turns some of the CI workflow that you saw me set up using GitHub Actions into more of a managed service, through Cerbos Hub, our control plane for the open source project.
Taylor: Awesome. Now, one last question I sometimes like to ask here is: what are you most excited for in the coming months for the Cerbos project that might get other people excited?
Alex: Yeah, great question. So one of the capabilities that you gain by having authorization centralized like this is that you now have a centralized audit point and decision point.
For every decision that goes through the system, Cerbos can generate an audit log entry: at this time, this user tried to do this action on this resource, and it was allowed or denied by this particular policy. The open source project today already has that built in, and there are various destinations you can send that to.
You can send it to a file. You can send it to a Kafka topic. It's a pluggable backend. Actually, the Kafka integration was a community contribution, which we're very thankful for. And what we're now working on as part of the Cerbos Hub offering is bringing that into a UI for someone who isn't necessarily a developer, but maybe sits more on the security team or the product team inside of a business, and maybe doesn't live in code all day, but wants to be able to inspect and understand what's going on inside of the system in terms of a logging and auditing type system.
So we're really enhancing our logging capabilities, not just in the open source project, but also in our Cerbos commercial offering as well.
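As an illustration of the kind of record described above (who, what action, which resource, the outcome, and the policy that decided it), here is a small Python sketch that builds one audit entry as a JSON line. The field names and the `audit_entry` helper are hypothetical and do not reflect the actual Cerbos audit log schema.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def audit_entry(principal_id: str, action: str, resource_kind: str,
                resource_id: str, allowed: bool, policy: str,
                timestamp: Optional[datetime] = None) -> str:
    """Serialize one authorization decision as a single JSON line."""
    when = timestamp or datetime.now(timezone.utc)
    return json.dumps({
        "timestamp": when.isoformat(),
        "principal": principal_id,
        "action": action,
        "resource": f"{resource_kind}:{resource_id}",
        "effect": "ALLOW" if allowed else "DENY",
        "policy": policy,  # hypothetical identifier of the deciding policy
    }, sort_keys=True)
```

One line per decision makes it straightforward to append to a file or publish to a topic, the two destinations mentioned in the talk.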
Taylor: Awesome.
Well, thank you so much for coming today, Alex. This was a great hands-on demo. If anyone has any questions, please reach out to the Cerbos team through the various communication links and community guides on their website. Hope you all have a great day, and thanks everyone for attending.
Written by
Cerbos
Cerbos is a self-hosted, open source authorization layer that separates your authorization logic from your core application code. Cerbos makes authorization more scalable, more secure and easier to change as an application evolves.