Preview: OPA Policy in YAML
Introduction
While OPA is the de facto standard for policy in the cloud-native ecosystem, using OPA requires learning its policy language, called Rego. There is no way to use OPA without Rego; the two go hand-in-hand. While Rego is nowhere near as complex as traditional programming languages like Java, C++, or Golang, it does take time to learn, as the continued popularity of our Rego Policy Authoring course attests.
While some people find Rego quite natural and pick it up easily, other people have asked if there is a way to avoid learning a new syntax. If you write OPA policies every day or even every week, a dedicated syntax is well worth the time to learn (as it makes policies more concise and uses idioms from popular programming languages), but if instead you’re writing policy once a month, you may want a way of writing policy that takes less study and that’s easier to get back up to speed with after a hiatus.
In this blog post we describe a vision for what we call a “YAML skin” on Rego (which we call RegoYAML). It lets you:
- Write your OPA policies in YAML (as well as Rego)
- Convert policies between YAML and Rego
- Write all the same policies as you could write in Rego
- Feed those policies into OPA, the same as you would with Rego
- Use the same tooling as you would for Rego (except for the syntax-highlighter of course)
As an example, the two policies that follow say the same thing, though one is written in Rego and one is written in YAML. While there are clear similarities, the key benefit of the YAML skin is that your brain (and your software) already knows how to parse the YAML version, so it’s not having to think about the punctuation, whitespace, or even the basic semantics of the file.
Rego Policy
package app.abac
allow if {
    user_is_employee
    action_is_read
}
allow if {
    user_is_employee
    user_is_senior
    action_is_update
}
user_is_employee if {
    data.atts[input.user].title == "employee"
}
user_is_senior if {
    data.atts[input.user].tenure > 8
}
action_is_read if {
    input.action == "read"
}
action_is_update if {
    input.action == "update"
}
YAML Policy
package: app.abac
rules:
- allow:
    if:
      and:
      - user_is_employee
      - action_is_read
- allow:
    if:
      and:
      - user_is_employee
      - user_is_senior
      - action_is_update
- user_is_employee:
    if:
      equal:
      - data.atts[input.user].title
      - employee
- user_is_senior:
    if: data.atts[input.user].tenure > 8
- action_is_read:
    if:
      equal:
      - input.action
      - read
- action_is_update:
    if:
      equal:
      - input.action
      - update
Furthermore, a YAML skin enables us to introduce NEW policy constructs that we see used frequently, thereby making common cases easy through new abstractions. We could also introduce those constructs natively into Rego, though some are more natural in a YAML syntax. The MATCH construct shown below is an example. It handles strict equality matching in deeply nested objects by simply including those objects directly into policy. As an added benefit, this idiom encourages people to use the fragment of Rego that OPA efficiently indexes.
Below is an example from Kubernetes, where in the YAML version you can see a Pod description written in the same format you would find in the Kubernetes docs. In contrast, the equivalent Rego uses expressions with dot-notation to describe constraints on paths through the Pod description, but the Pod description itself is not part of the policy.
Rego Policy fragment
has_privilege_escalation if {
    input.apiVersion == "v1"
    input.kind == "Pod"
    input.spec.containers[_].securityContext.allowPrivilegeEscalation
}
YAML Policy fragment with MATCH
- has_privilege_escalation:
    if:
      match:
        input:
          apiVersion: v1
          kind: Pod
          spec:
            containers:
            - securityContext:
                allowPrivilegeEscalation: true
Operationally, RegoYAML is a translator that turns Rego into YAML or YAML into Rego. RegoYAML is still a work in progress, but there is a prototype available – feedback is welcome through any of the following means:
- Styra slack
- OPA slack
- Email: tim@styra.com
Benefits
Below we show more examples of RegoYAML in action, but first we cover some of its benefits and drawbacks.
1. Democratization of policy
Not all policy stakeholders within an organization are comfortable with programming languages, but some of them are comfortable with configuration languages like YAML. With RegoYAML, non-experts read and write YAML, experts read and write Rego, and tooling translates between the two, thereby expanding the number of people who can engage directly with policy files.
Part of the reason YAML has been embraced by so many people for so many purposes is that once you learn it, your brain has an easier time making sense of anything written in it. You still need to understand the schema for those YAML files when you move from one project to the next, but there’s less work for your brain to do, and therefore it is less intimidating to learn something new or to switch contexts.
RegoYAML can also be valuable to programmers, who might spend 90% of their time in their favorite programming language and 10% of their time dealing with config in, say, YAML. If once a week or month they need to write policy, switching to YAML could make that context-switch less painful than switching to a new, custom-designed language.
2. Easier programmatic Rego generation and analysis
If you’re writing code to generate Rego (e.g. from a policy-authoring GUI), your programming language of choice already has libraries to read and write YAML. With RegoYAML that means you also have a library to read and write policy. (At some point, you’ll still need to treat pieces of the policy as opaque strings, but a bunch of small strings embedded into YAML is much easier to deal with than one large string that represents your entire policy.)
In contrast, while there are a growing number of libraries for Rego (Haskell, Java, Rust), the only official one is for Golang.
Writing Rego with Python
# The whole policy is a single opaque string
policy = """
package abc
allow { …
"""
print(policy)
Writing YAML with Python
import yaml
# The policy is ordinary data; "…" stands for the elided rule body
data = {
    "package": "abc",
    "rules": [
        {"allow": "…"},
    ],
}
print(yaml.dump(data))
Reading Rego with Python
s = ...  # placeholder: Rego string from file/http/…
# ?? (the Python standard library has no parser for Rego)
Reading YAML with Python
import yaml
s = ...  # placeholder: YAML string from file/http/…
y = yaml.safe_load(s)
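As a rough illustration of the analysis side, the sketch below works on the plain dictionaries and lists that `yaml.safe_load` returns and reports helper rules that are referenced but never defined. It assumes each rule is a one-key mapping whose value holds the `if` body, and that a bare identifier inside a body names a helper rule. Both are assumptions about the RegoYAML layout rather than a confirmed schema, and the function names are hypothetical.

```python
def defined_rules(policy):
    """Names of all rules defined in the policy."""
    return {name for rule in policy["rules"] for name in rule}


def referenced_helpers(body):
    """Bare identifiers reachable through lists and `and` blocks.

    Operands of other operators (e.g. `equal`) are skipped, because there a
    bare word such as `employee` is a string literal, not a rule reference.
    """
    if isinstance(body, str):
        return {body} if body.isidentifier() else set()
    if isinstance(body, list):
        return set().union(*(referenced_helpers(item) for item in body))
    if isinstance(body, dict) and "and" in body:
        return referenced_helpers(body["and"])
    return set()


def undefined_helpers(policy):
    """Helper rules that some rule body uses but no rule defines."""
    used = set()
    for rule in policy["rules"]:
        for spec in rule.values():
            used |= referenced_helpers(spec.get("if", []))
    return used - defined_rules(policy)


# The structure below is what loading a small RegoYAML file would produce
# under the assumed layout.
policy = {
    "package": "app.abac",
    "rules": [
        {"allow": {"if": {"and": ["user_is_employee", "action_is_read"]}}},
        {"user_is_employee": {
            "if": {"equal": ["data.atts[input.user].title", "employee"]}}},
    ],
}
print(undefined_helpers(policy))  # prints {'action_is_read'}
```

Doing the same check against Rego source would first require a Rego parser, which is the asymmetry this section describes.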
3. Simpler embedding of policy into configuration files
If you’re using (or building) tools that embed OPA, those configuration files are often YAML to begin with. RegoYAML makes it easier to embed OPA policies into those files.
For example, here is a fictitious configuration file for a gateway that dictates all traffic routed to /foo/bar should apply the embedded OPA rules to decide whether the traffic is allowed or not.
routes:
  /foo/bar:
    get:
      ratelimit: 100
      rules:
      - allow:
          if: user_is_employee
      - allow:
          if: user_is_admin
  /baz:
    post:
      ratelimit: 10
      rules:
      - allow:
          if: user_is_admin
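A tool that reads such a file could lift the embedded rules out and hand them to OPA as a standalone policy. The sketch below shows one way that might look once the configuration has been loaded into Python data. The `embedded_policy` helper and the `gateway.<route>.<method>` package naming are hypothetical choices for illustration, not part of any real gateway.

```python
def embedded_policy(config, route, method):
    """Wrap the rules embedded under one route/method as a standalone policy."""
    rules = config["routes"][route][method]["rules"]
    # Hypothetical naming scheme: "/foo/bar" + "get" -> "gateway.foo.bar.get"
    parts = [p for p in route.split("/") if p] + [method]
    return {"package": "gateway." + ".".join(parts), "rules": rules}


# The fictitious gateway configuration from above, as loaded data.
config = {
    "routes": {
        "/foo/bar": {
            "get": {
                "ratelimit": 100,
                "rules": [
                    {"allow": {"if": "user_is_employee"}},
                    {"allow": {"if": "user_is_admin"}},
                ],
            }
        },
        "/baz": {
            "post": {
                "ratelimit": 10,
                "rules": [{"allow": {"if": "user_is_admin"}}],
            }
        },
    }
}

policy = embedded_policy(config, "/foo/bar", "get")
print(policy["package"])  # prints gateway.foo.bar.get
```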
Drawbacks
1. YAML is more verbose than Rego
One of the core benefits of Rego is that it is a syntax purpose-built for expressing policy; YAML obviously was not built for expressing policy. In the end, this means policies in YAML can be more verbose – longer – than policies in Rego because everything we write must work around the YAML syntax instead of designing a new syntax.
Below you can see an example where the original policy is fairly short, but the RegoYAML representation is quite a bit taller. It is one of the canonical Gatekeeper examples and uses comprehensions, which have especially compact formatting.
Rego Policy fragment
package k8srequiredlabels
violation[{
    "msg": msg,
    "details": {
        "missing_labels": missing}}] {
    provided := {label |
        input.review.object.metadata.labels[label]}
    required := {label |
        label := input.parameters.labels[_]}
    missing := required - provided
    count(missing) > 0
    msg := sprintf("you must provide labels: %v", [missing])
}
YAML Policy fragment
package: k8srequiredlabels
rules:
- violation:
    contains:
      details:
        missing_labels: missing
      msg: msg
    if:
    - assign:
      - provided
      - setof:
          select: label
          if: input.review.object.metadata.labels[label]
    - assign:
      - required
      - setof:
          select: label
          if:
          - assign:
            - label
            - input.parameters.labels[_]
    - assign:
      - missing
      - minus:
        - required
        - provided
    - gt:
      - count:
        - missing
      - 0
    - assign:
      - msg
      - sprintf:
        - 'you must provide labels: %v'
        - - missing
2. Multiple syntaxes for Policy
Having both Rego and YAML means that there is no longer one file format for expressing policy. You may find policies written in YAML or in Rego within a single team or organization. And without easy access to the tools that translate from one to the other, you need to know both syntaxes. Moreover, helping people write or debug policies means you may need to be fluent in both syntaxes and the tools people use with them.
3. Translation tooling has limitations
Not all transformations are invertible (though some are). Translating Rego into YAML and then translating YAML back into Rego will not necessarily produce the original Rego file. If, for example, someone wrote a Rego file and checked it into git, and someone else checked it out, converted to YAML, made changes, and converted it back to Rego, the resulting git-diff could include spurious changes, which makes reviewing more difficult than it would otherwise be.
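A toy round trip makes the problem concrete. The sketch below is not the prototype's translator: it is a deliberately tiny stand-in that only understands rules of the form `name if { ... }`, and the function names are hypothetical. It shows that comments and layout do not survive the trip, even though the policy's meaning does.

```python
import re

# Toy grammar: `name if { expr ... }` with no nested braces.
RULE = re.compile(r"(\w+)\s+if\s*\{([^}]*)\}")


def rego_to_struct(text):
    """Parse toy Rego into the list-of-rules shape used for YAML."""
    no_comments = re.sub(r"#[^\n]*", "", text)  # comments are dropped here
    rules = []
    for name, body in RULE.findall(no_comments):
        exprs = [line.strip() for line in body.splitlines() if line.strip()]
        rules.append({name: {"if": exprs}})
    return rules


def struct_to_rego(rules):
    """Print the rules back out in one fixed, canonical layout."""
    chunks = []
    for rule in rules:
        (name, spec), = rule.items()
        body = "".join(f"    {expr}\n" for expr in spec["if"])
        chunks.append(f"{name} if {{\n{body}}}")
    return "\n\n".join(chunks) + "\n"


original = "allow if { user_is_employee  # staff only\n  action_is_read }\n"
round_tripped = struct_to_rego(rego_to_struct(original))

print(round_tripped == original)                                  # prints False
print(rego_to_struct(round_tripped) == rego_to_struct(original))  # prints True
```

The second comparison is the reassuring part: the structure is stable after one round trip, so the spurious diff appears once rather than on every conversion.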
Examples
Here we give a few examples, mainly breaking down the example in the introduction to highlight different aspects of RegoYAML.
Rules
The policy shown below, in both Rego and YAML, allows employees to read everything. Most people can guess what both versions are saying, though both would be easier to read if they included an explicit AND keyword within expressions.
Rego Policy fragment
package app.abac
allow if {
    user_is_employee
    action_is_read
}
YAML Policy fragment
package: app.abac
rules:
- allow:
    if:
    - user_is_employee
    - action_is_read
Helper rules
The prior rule depended on the concepts of user_is_employee and action_is_read. We often call these concepts helper rules in Rego; they enable policy authors to define their own abstractions to make reading/writing policy easier. Below we see how they are defined in Rego and in YAML; because these helpers have a single expression in the rule body, they are nearly identical syntactically.
Rego Policy fragment
user_is_employee if {
    data.atts[input.user].title == "employee"
}
action_is_read if {
    input.action == "read"
}
YAML Policy fragment
rules:
- user_is_employee:
    if:
      equal:
      - data.atts[input.user].title
      - employee
- action_is_read:
    if:
      equal:
      - input.action
      - read
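To make the correspondence between the two fragments concrete, here is a minimal sketch of the YAML-to-Rego direction for just the constructs seen so far: a list (or `and`) of expressions and the `equal` operator. It is not the prototype's implementation. It assumes each rule is a one-key mapping holding an `if` body, and it assumes that an operand starting with `input.` or `data.` is a reference while any other operand is a literal; the real translator may well decide that differently.

```python
import json


def render_operand(value):
    """References stay bare; everything else becomes a JSON literal.

    Assumption: strings starting with "input." or "data." are references.
    """
    if isinstance(value, str) and value.startswith(("input.", "data.")):
        return value
    return json.dumps(value)


def render_expr(expr):
    """Render one body expression as one line of Rego."""
    if isinstance(expr, str):
        return expr  # helper-rule name or an embedded infix expression
    (op, args), = expr.items()
    if op == "equal":
        left, right = args
        return f"{render_operand(left)} == {render_operand(right)}"
    raise ValueError(f"unsupported operator: {op}")


def render_rule(rule):
    """Render a one-key rule mapping as `name if { ... }`."""
    (name, spec), = rule.items()
    body = spec["if"]
    if isinstance(body, dict) and "and" in body:
        body = body["and"]
    exprs = body if isinstance(body, list) else [body]
    lines = [f"{name} if {{"] + [f"    {render_expr(e)}" for e in exprs] + ["}"]
    return "\n".join(lines)


def render_policy(policy):
    """Render a whole policy document as Rego text."""
    parts = [f"package {policy['package']}"]
    parts += [render_rule(rule) for rule in policy["rules"]]
    return "\n\n".join(parts) + "\n"


policy = {
    "package": "app.abac",
    "rules": [
        {"allow": {"if": ["user_is_employee", "action_is_read"]}},
        {"action_is_read": {"if": {"equal": ["input.action", "read"]}}},
    ],
}
print(render_policy(policy))
```

The reference-versus-literal guess in `render_operand` is the fragile part of this sketch, since nothing in the YAML itself distinguishes the string `read` from a rule named `read`.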
Infix expressions
One of the questions as we design RegoYAML is how far to push YAML into the expressions. We could choose to leave infix expressions like `input.action == "read"` encoded exactly as they are. Of course, the more of that we do, the closer the YAML version is to Rego, and the harder it is to produce software that reads and writes that policy, but also the more similar those expressions appear to the math expressions we learned as kids. Below you see the Rego and YAML are almost identical.
The RegoYAML prototype provides an option to translate Rego into YAML and embed infix expressions as, for example, `input.action == "read"`.
Rego Policy fragment
user_is_senior if {
    data.atts[input.user].tenure > 8
}
YAML Policy fragment
rules:
- user_is_senior:
    if: data.atts[input.user].tenure > 8
Function calls
Many programming languages contend with the trade-off between two ways of providing parameter values in a function call: by position and by name. For example:
startswith(str, "abc")
VERSUS
startswith(base=str, prefix="abc")
As shown below, YAML supports both ways of providing parameter values since it supports both arrays and objects.
In the RegoYAML prototype, there is an option to select how you want the translator to treat functions, but it could be enhanced to handle both automatically.
Rego Policy fragment
# function call by position only
safe_registry if {
    startswith(input.registry, "hooli.com/")
}
2 Different YAML Policy fragments
# function call by position (array)
- safe_registry:
    if:
      startswith:
      - input.registry
      - hooli.com/
# function call by name (object)
- safe_registry:
    if:
      startswith:
        base: input.registry
        prefix: hooli.com/
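Handling both forms automatically mostly comes down to knowing each function's parameter order. The sketch below normalizes either YAML form to the same positional Rego call. The `SIGNATURES` table is hypothetical: it reuses the `base` and `prefix` names from the example above, which are illustrative rather than taken from OPA's documentation, and the rule for telling references from literals is likewise an assumption.

```python
import json

# Hypothetical parameter names, taken from the example above.
SIGNATURES = {"startswith": ["base", "prefix"]}


def render_operand(value):
    """Assumption: "input."/"data." strings are references, the rest literals."""
    if isinstance(value, str) and value.startswith(("input.", "data.")):
        return value
    return json.dumps(value)


def render_call(expr):
    """Render {func: [args]} or {func: {name: arg}} as a positional Rego call."""
    (func, args), = expr.items()
    if isinstance(args, dict):
        # Named form: put the arguments into signature order.
        args = [args[param] for param in SIGNATURES[func]]
    return f"{func}({', '.join(render_operand(a) for a in args)})"


by_position = {"startswith": ["input.registry", "hooli.com/"]}
by_name = {"startswith": {"prefix": "hooli.com/", "base": "input.registry"}}

print(render_call(by_position))  # prints startswith(input.registry, "hooli.com/")
print(render_call(by_name))      # prints startswith(input.registry, "hooli.com/")
```

Because the named form is looked up by key, the order in which `base` and `prefix` appear in the YAML does not matter.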
Match idiom
So far we’ve seen YAML used as an alternative syntax to Rego, but there is one case where YAML makes expressing policy significantly easier than Rego because the object you are writing policy about is already written in YAML. We call these `Configuration authorization` or `Configuration guardrail` policies. Applying OPA to constrain Kubernetes and Terraform files is a perfect example.
In those cases, when writing policy with Rego, people often start with a configuration file sample and extract the dotted paths they want to constrain from that sample. In contrast, with RegoYAML, you can embed that sample YAML file into your policy and then write constraints over it. This is what Kyverno and Cuelang both do.
For example, below is a sample Kubernetes Pod in YAML, pulled straight out of the docs.
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo-2
spec:
  securityContext:
    runAsUser: 1000
  containers:
  - name: sec-ctx-demo-2
    image: gcr.io/google-samples/node-hello:1.0
    securityContext:
      runAsUser: 2000
      allowPrivilegeEscalation: false
If we want a rule identifying Pods that allow privilege escalation, the RegoYAML ‘match’ keyword lets you embed the (relevant parts of the) sample file directly into the RegoYAML, as shown below.
Looking at the two, the YAML version is closer to the configuration file that we know and so should be easier to understand at a glance.
Rego Policy fragment
has_privilege_escalation if {
    input.apiVersion == "v1"
    input.kind == "Pod"
    input.spec.containers[_].securityContext.allowPrivilegeEscalation
}
YAML Policy fragment
- has_privilege_escalation:
    if:
      match:
        input:
          apiVersion: v1
          kind: Pod
          spec:
            containers:
            - securityContext:
                allowPrivilegeEscalation: true
The RegoYAML prototype has an option that allows you to select whether to use the match operator or not. Note that if the YAML includes a ‘match’ operator and you disable that option, the translator will treat the ‘match’ expression as if it were just a regular JSON object.
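One plausible reading of what ‘match’ means can be sketched in a few lines. This is an interpretation for illustration, not the prototype's definition: a mapping in the pattern requires every listed key to be present and to match, a list in the pattern requires each pattern element to match at least one element of the input list (mirroring `[_]` in Rego), and scalars must be strictly equal. The `matches` function name is hypothetical.

```python
def matches(pattern, value):
    """Return True if `value` contains everything `pattern` describes."""
    if isinstance(pattern, dict):
        # Extra keys in the input are ignored; listed keys must match.
        return isinstance(value, dict) and all(
            key in value and matches(sub, value[key])
            for key, sub in pattern.items()
        )
    if isinstance(pattern, list):
        # Each pattern element must match some element of the input list.
        return isinstance(value, list) and all(
            any(matches(sub, item) for item in value) for sub in pattern
        )
    if isinstance(pattern, bool) or isinstance(value, bool):
        return pattern is value  # keep True distinct from 1
    return pattern == value


# The object under `match: input:` in the fragment above.
PATTERN = {
    "apiVersion": "v1",
    "kind": "Pod",
    "spec": {
        "containers": [
            {"securityContext": {"allowPrivilegeEscalation": True}}
        ]
    },
}


def sample_pod(allow_escalation):
    """The sample Pod from above, with a configurable escalation flag."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "security-context-demo-2"},
        "spec": {
            "securityContext": {"runAsUser": 1000},
            "containers": [{
                "name": "sec-ctx-demo-2",
                "image": "gcr.io/google-samples/node-hello:1.0",
                "securityContext": {
                    "runAsUser": 2000,
                    "allowPrivilegeEscalation": allow_escalation,
                },
            }],
        },
    }


print(matches(PATTERN, sample_pod(False)))  # prints False
print(matches(PATTERN, sample_pod(True)))   # prints True
```

Note one difference from the Rego fragment: the Rego version succeeds for any value of `allowPrivilegeEscalation` other than `false`, whereas strict matching as sketched here requires exactly `true`.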
Related Work
RegoYAML is not the first incarnation of a policy system using a structured data format like YAML. This section covers other such examples.
Public clouds. Public clouds support policy languages encoded in JSON. Below are examples from Azure, GCP, and AWS. One of the benefits of JSON is that it can be encoded directly in API calls, whereas YAML requires conversion to/from JSON. Creating a second skin for Rego using JSON would be an obvious avenue to pursue.
AzurePolicy: Apps require app gateway frontend
{
  "if": {
    "field": "type",
    "in": [
      "Microsoft.Web/sites"
    ]
  },
  "then": {
    "effect": "[parameters('effect')]",
    "details": {
      "type": "Microsoft.Network/applicationGateways",
      "existenceScope": "subscription",
      "existenceCondition": {
        "count": {
          "field": "Microsoft.Network/applicationGateways/backendAddressPools[*].backendAddresses[*]",
          "where": {
            "field": "Microsoft.Network/applicationGateways/backendAddressPools[*].backendAddresses[*].fqdn",
            "like": "[concat(field('name'),'.','*')]"
          }
        },
        "greaterOrEquals": 1
      }
    }
  }
}
AWS IAM Policy: Users can access EC2 from the Mumbai region
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [ "ec2:*" ],
      "Resource": "*",
      "Condition": {
        "StringEquals": { "aws:RequestedRegion": "ap-south-1" }
      }
    }
  ]
}
Google cloud policy: grant user permissions to deploy AppEngine for a while
{
  "bindings": [
    {
      "members": [
        "group:prod-dev@example.com",
        "serviceAccount:prod-dev-example@appspot.gserviceaccount.com"
      ],
      "role": "roles/appengine.deployer",
      "condition": {
        "title": "Expires_July_1_2024",
        "description": "Expires on July 1, 2024",
        "expression": "request.time < timestamp('2024-07-01T00:00:00.000Z')"
      }
    }
  ],
  "etag": "BwWKmjvelug=",
  "version": 3
}
Ansible. Ansible combines YAML and Python to make common cases easy with YAML and complex cases possible with Python. The RegoYAML approach has similar benefits in that people can write Rego in one file and RegoYAML in another file and have them call each other as if they were all written in the same language.
Ansible YAML Playbook
- hosts: localhost
  tasks:
  - name: Test that my hello_world module works
    hello_world:
    register: result
  - debug: var=result
Ansible Python Code for hello_world
#!/usr/bin/python
from ansible.module_utils.basic import AnsibleModule
def main():
    # A module with no arguments that always returns the same value
    module = AnsibleModule(argument_spec={})
    theReturnValue = {"hello": "world"}
    module.exit_json(changed=False, meta=theReturnValue)
if __name__ == '__main__':
    main()
XACML. XACML took the inverse of the approach described here: starting with a data-centric XML language and then adding ALFA, a more traditional PL-style language, later. These are shown vertically instead of side-by-side simply because of the width of XACML expressions.
XACML
<Policy PolicyId="urn:curtiss:ba:taa:taa-1.1" RuleCombiningAlgId="urn:oasis:names:tc:xacml:1.0:rule-combining-algorithm:deny-overrides">
<Description>Policy for Business Authorization category TAA-1.1</Description>
<Rule Effect="Permit">
<Description />
<Target>
<Actions>
<Action>
<ActionMatch MatchId="urn:oasis:names:tc:xacml:1.0:function:string-equal">
<ActionAttributeDesignator
AttributeId="urn:oasis:names:tc:xacml:1.0:action:action-id"
DataType="http://www.w3.org/2001/XMLSchema#string" />
<AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">Any</AttributeValue>
</ActionMatch>
</Action>
</Actions>
</Target>
<Condition FunctionId="urn:oasis:names:tc:xacml:1.0:function:and">
<Apply xsi:type="AtLeastMemberOf" functionId="urn:oasis:names:tc:xacml:1.0:function:string-at-least-one-member-of">
<Apply functionId="urn:oasis:names:tc:xacml:1.0:function:string-bag">
<AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">Curtiss</AttributeValue>
<AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">Packard</AttributeValue>
</Apply>
<AttributeDesignator AttributeId="http://schemas.tscp.org/2012-03/claims/OrganizationID" DataType="http://www.w3.org/2001/XMLSchema#string" />
</Apply>
<Apply xsi:type="AtLeastMemberOf" functionId="urn:oasis:names:tc:xacml:1.0:function:string-at-least-one-member-of">
<Apply functionId="urn:oasis:names:tc:xacml:1.0:function:string-bag">
<AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">US</AttributeValue>
<AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">GB</AttributeValue>
</Apply>
<AttributeDesignator AttributeId="http://schemas.tscp.org/2012-03/claims/Nationality" DataType="http://www.w3.org/2001/XMLSchema#string" />
</Apply>
<Apply xsi:type="AtLeastMemberOf" functionId="urn:oasis:names:tc:xacml:1.0:function:string-at-least-one-member-of">
<Apply functionId="urn:oasis:names:tc:xacml:1.0:function:string-bag">
<AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">DetailedDesign</AttributeValue>
<AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">Simulation</AttributeValue>
</Apply>
<AttributeDesignator AttributeId="http://schemas.tscp.org/2012-03/claims/Work-Effort" DataType="http://www.w3.org/2001/XMLSchema#string" />
</Apply>
<Apply xsi:type="AndFunction" functionId="urn:oasis:names:tc:xacml:1.0:function:and" />
</Condition>
</Rule>
</Policy>
ALFA (roughly, as finding a converter or examples was difficult)
policy taa-1.1 {
    rule {
        target clause Action == Any
        permit
        condition
            AtLeastMemberOf(OrganizationID, bag("Curtiss", "Packard")) and
            AtLeastMemberOf(Nationality, bag("US", "GB")) and
            AtLeastMemberOf(WorkEffort, bag("DetailedDesign", "Simulation"))
    }
}
Summary
You’ve made it to the end! Here’s a quick summary of the key takeaways.
- RegoYAML is a translation layer on top of Rego so that we can choose to write policy either in YAML or in Rego, while still using OPA for evaluation.
- The key benefits of RegoYAML are:
  - More people can read/write YAML than can read/write Rego.
  - Programming languages already have YAML parsers and printers, whereas you need a specialized library to read Rego.
  - RegoYAML can more easily be embedded into configuration files than a custom language because many modern configuration files are written in YAML.
- The key drawbacks of RegoYAML are:
  - YAML is more verbose and sensitive to white-space than Rego.
  - Having multiple formats for expressing policy leads to more work for people throughout the organization who are supporting multiple people with policy.
  - While tooling can help translate between the two syntaxes, the tooling has limitations. For example, not every translation is invertible, which can lead to unintended changes showing up in git-diffs.
- A prototype is available to try out. Feedback is welcome!
  - Styra slack
  - OPA slack
  - Email: tim@styra.com