Why the cloud-native architecture required a new policy language
I recently started a new series on the Open Policy Agent (OPA) blog on why Rego, OPA’s policy language, looks and behaves the way it does. The blog post dives into the core design principles for Rego, why they’re important, and how they’ve influenced the language. I hope it will help OPA users better understand the language, so they can more easily jump into creating policy of their own.
I also want to take a step back and talk about why we needed to invent OPA in the first place.
The database analogy
To do that, I’ll draw on the analogy of the database, which today is a mainstay in the application developer’s toolkit. When we think of building applications, we rarely wonder if we really need to include a database. Databases are critical, ubiquitous and well understood. That’s the world of today. But that wasn’t always the case.
At some point in time there simply were no databases. Any time a developer wrote an application (say, a financial app), they had to deal with data retrieval and storage themselves. They had to write code to put data on disk, and more code to fetch that data from disk and swap it into memory as needed. That code was required for the program to work at all, and the financial app with the most efficient data write/retrieve path could crunch through numbers faster than its competitors.
At a certain point (arguably 1960, but let’s not fight about that!) companies started building software that every other organization could use to write data to disk and read it back efficiently. No more hand-rolling that code for every team and every application on the planet. And because a couple of companies could focus exclusively on that data software, its features, scalability and performance were far superior to what any team would consider trying to build on their own. The database was born. By outsourcing the implementation of data storage and retrieval to the database, app developers had more time and mental bandwidth to focus on what their customers really cared about.
Authorization is ready for a revolution
OPA isn’t a database; it’s a policy and authorization engine. But the way application developers confront policy and authorization today is pretty much identical to the way they confronted data storage and retrieval before there were databases. Every application needs authorization, just like every application needs data storage and retrieval. Before OPA/Styra, just about every team was rolling its own authorization subsystem, just as every team once wrote its own data storage/retrieval subsystem. When app developers embrace OPA/Styra they get to focus on their financial trading app, their container management platform or whatever the software application is, instead of spending that time writing an authorization subsystem. Moreover, because OPA/Styra is purpose-built and has a company and an open source community contributing concentrated engineering effort, its feature set is far richer than what people would build themselves. Developers have less code to write and can expose richer features to their end-users, resulting in happier end-users and faster time-to-market.
And these aren’t the only benefits. Imagine a microservice architecture with many teams working independently to deliver software to end-users, and suppose those teams standardize on implementing authorization at the microservice level by running OPA as a sidecar. Security improves because each service takes responsibility for its own authorization, making it easier to move from environment to environment securely and to limit lateral movement if attackers compromise one of the microservices. Intrusion response can improve because a compromised service can be isolated at the API level by hotpatching its policy. Even intrusion detection can benefit, because uniform logging of unauthorized accesses leads to simpler SIEM rules.
Take this a step further and imagine using OPA as a standardized authorization language and platform throughout the enterprise. Compliance teams can now answer a question like “can any contractors read PII?” by analyzing the policy files for each application, all of which are written in the same language even though the applications themselves are wildly different and serve completely different purposes.
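To make the sidecar idea concrete, here is a minimal sketch of how a service might ask a local OPA for a decision over OPA’s REST Data API (POST /v1/data/<path> with the query wrapped in an "input" field). The policy path, the input fields and the assumption that the sidecar listens on localhost:8181 are illustrative choices, not something prescribed by this post.

```python
import json
import urllib.request

# Assumption: the OPA sidecar is reachable on localhost at OPA's default port.
OPA_URL = "http://localhost:8181"


def build_query(policy_path: str, input_doc: dict) -> urllib.request.Request:
    """Build a Data API request: POST /v1/data/<policy_path> with {"input": ...}."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        url=f"{OPA_URL}/v1/data/{policy_path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_allowed(policy_path: str, input_doc: dict) -> bool:
    """Send the query to the sidecar and interpret the decision."""
    request = build_query(policy_path, input_doc)
    with urllib.request.urlopen(request, timeout=2) as response:
        decision = json.load(response)
    # OPA omits "result" when the rule is undefined; treat that as a deny.
    return decision.get("result", False) is True


# Hypothetical policy path and input document, for illustration only.
example = build_query(
    "httpapi/authz/allow",
    {"method": "GET", "path": ["finance", "salary", "alice"], "user": "bob"},
)
```

Because every service asks the same kind of question in the same way, the decisions (and the denials) come back in one shape, which is what makes the uniform logging described above possible.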
Why a new language
I can almost hear you asking, “Did it really require a whole new language? Couldn’t you have solved authorization with Go, or Rust, or Python, or you-name-it?” Good question! We certainly tried.
In the space of languages, there are of course the general-purpose programming languages like Go, Rust, and Python, designed for software developers. But general-purpose languages are not always well suited to domain-specific problems, which is why new domain-specific languages and programming models keep emerging. For example, big-data problems gave rise to MapReduce, statistics to R, and ACID-quality data transactions to SQL. Each was designed to address requirements that arose from the specifics of its domain. Similarly, policy requirements are not well served by general-purpose programming languages, so policy needs its own language.
In fact, in the OPA blog series we discuss three clear requirements for the policy domain that heavily influenced the need for, and the design of, Rego:
1. Humans need to be able to write the policies they care about in a way that is also understandable to machines. In the real world, the policies people care about are often if-statements: you are prohibited from reading PII data if you are a contractor. This logic is sometimes quite sophisticated and can produce decisions that are more than simply allow/deny. Consequently, 99% of statements in Rego are if-statements that can be made human-readable and that allow the writer and reader to focus on the policy logic itself.
2. The policy language must be able to deal natively with the inherently hierarchical information that defines the cloud-native ecosystem. For example, datacenter regions are built out of availability zones, which include many clusters, running many applications built out of microservices, each implemented as multiple containers, all of which are running on virtual machines that abstract away physical servers. To write a policy naturally you may need to talk about the apps, the containers they comprise, and the VMs they are running on. Rego was designed for expressing policy over these hierarchical domains.
3. Policy is something that many stakeholders throughout the organization need to understand, e.g. developers, operations, security and compliance. This means the people writing policy shouldn’t have to complicate the logic just so that machines can evaluate it more quickly. Rego and OPA were designed so that the policy author handles clarity and correctness (for humans), and OPA handles the performance (for machines).
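The first two requirements can be pictured with a small sketch. It is written in plain Python rather than Rego, purely to show the shape of the problem: a policy expressed as an if-statement, and a policy that has to reach down through nested, hierarchical data. All field names, the sample data and the “trusted registry” rule are hypothetical.

```python
def deny_reasons(request: dict) -> list:
    """Policy as an if-statement: contractors may not read PII."""
    reasons = []
    user = request.get("user", {})
    resource = request.get("resource", {})
    if (
        request.get("action") == "read"
        and user.get("role") == "contractor"
        and "pii" in resource.get("labels", [])
    ):
        reasons.append("contractors are prohibited from reading PII data")
    return reasons


def untrusted_images(region: dict, trusted_prefix: str) -> list:
    """Walk region -> zones -> clusters -> apps -> containers."""
    violations = []
    for zone in region.get("zones", []):
        for cluster in zone.get("clusters", []):
            for app in cluster.get("apps", []):
                for container in app.get("containers", []):
                    if not container["image"].startswith(trusted_prefix):
                        violations.append((zone["name"], app["name"], container["image"]))
    return violations


# Hypothetical sample data.
request = {
    "action": "read",
    "user": {"name": "alice", "role": "contractor"},
    "resource": {"app": "billing", "labels": ["pii"]},
}
region = {
    "zones": [{
        "name": "zone-a",
        "clusters": [{
            "name": "prod-1",
            "apps": [{
                "name": "billing",
                "containers": [
                    {"image": "registry.example.com/billing:1.2"},
                    {"image": "docker.io/unknown/sidecar:latest"},
                ],
            }],
        }],
    }],
}

print(deny_reasons(request))
print(untrusted_images(region, "registry.example.com/"))
```

Even in this toy form, the nested loops needed to reach the containers crowd out the actual rule. That is the kind of noise a language with native support for hierarchical data is meant to remove.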
Overall, choosing the right tool for the job of expressing and enforcing policy is not just about the initial outlay of effort and time, but also the ongoing collaboration, manageability, storage and performance that come from a simple, efficient language.
The results are overwhelmingly positive
Open Policy Agent and Rego are now solving the very problems they were designed to solve and, thanks to the community, doing much more. We see new integrations across OPA’s many use cases, from in-app entitlements to microservice API authorization to infrastructure guardrails, all of which demonstrate the value of a standard, efficient policy language. We’re excited to help developers, operations, security, and compliance teams solve real problems, and get some mental bandwidth back to focus on the challenges they really want to focus on.
Building your first OPA policies or looking for advice on how to deploy OPA at scale? We’re here to help!