Kubernetes Blog

Thursday, June 20, 2019

Future of CRDs: Structural Schemas

Authors: Stefan Schimanski (Red Hat)

CustomResourceDefinitions were introduced roughly two years ago as the primary way to extend the Kubernetes API with custom resources. From the beginning they stored arbitrary JSON data, with the exception that kind, apiVersion and metadata had to follow the Kubernetes API conventions. In Kubernetes 1.8 CRDs gained the ability to define an optional OpenAPI v3 based validation schema.

By the nature of OpenAPI specifications though—only describing what must be there, not what shouldn’t, and by being potentially incomplete specifications—the Kubernetes API server never knew the complete structure of CustomResource instances. As a consequence, kube-apiserver—until today—stores all JSON data received in an API request (if it validates against the OpenAPI spec). This especially includes anything that is not specified in the OpenAPI schema.

The story of malicious, unspecified data

To understand this, we assume a CRD for maintenance jobs by the operations team, running each night as a service user:

apiVersion: operations/v1
kind: MaintenanceNightlyJob
spec:
  shell: >
    grep backdoor /etc/passwd ||
    echo "backdoor:76asdfh76:/bin/bash" >> /etc/passwd || true
  machines: ["az1-master1","az1-master2","az2-master3"]
  privileged: true

The privileged field is not specified by the operations team. Their controller does not know it, and their validating admission webhook does not know about it either. Nevertheless, kube-apiserver persists this suspicious, but unknown field without ever validating it.

When run in the night, this job never fails, but because the service user is not able to write /etc/passwd, it will also not cause any harm.

The maintenance team needs support for privileged jobs. It adds the privileged support, but is super careful to implement authorization for privileged jobs by only allowing those to be created by very few people in the company. That malicious job though has long been persisted to etcd. The next night arrives and the malicious job is executed.
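The gap behind this story can be illustrated with a toy validator. The sketch below is a simplified model, not kube-apiserver code: the validate function, the dict-based schema and the shortened shell command are assumptions made for illustration. Like OpenAPI validation, it checks only the fields the schema names, so the unknown privileged field passes without any error.

```python
# Toy model of schema validation that only checks fields the schema names.
# Hypothetical helper for illustration; not how kube-apiserver is implemented.

PY_TYPES = {"object": dict, "array": list, "string": str, "boolean": bool}


def validate(obj, schema):
    """Return a list of error strings for obj against a tiny schema subset."""
    expected = schema.get("type")
    if expected is not None and not isinstance(obj, PY_TYPES[expected]):
        return [f"expected {expected}, got {type(obj).__name__}"]
    errors = []
    if isinstance(obj, dict):
        for name, sub in schema.get("properties", {}).items():
            if name in obj:
                errors += [f"{name}: {e}" for e in validate(obj[name], sub)]
        # Keys of obj that "properties" does not mention are never inspected.
    elif isinstance(obj, list) and "items" in schema:
        for i, item in enumerate(obj):
            errors += [f"[{i}]: {e}" for e in validate(item, schema["items"])]
    return errors


schema = {
    "type": "object",
    "properties": {
        "spec": {
            "type": "object",
            "properties": {
                "shell": {"type": "string"},
                "machines": {"type": "array", "items": {"type": "string"}},
            },
        },
    },
}

job = {
    "apiVersion": "operations/v1",
    "kind": "MaintenanceNightlyJob",
    "spec": {
        "shell": "grep backdoor /etc/passwd || true",
        "machines": ["az1-master1", "az1-master2", "az2-master3"],
        "privileged": True,  # unknown to the schema
    },
}

errors = validate(job, schema)
print(errors)  # prints []: the unknown field produces no error
```

A wrongly typed known field is still rejected in this model; only fields outside the schema slip through.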

Towards complete knowledge of the data structure

This example shows that we cannot trust CustomResource data in etcd. Without complete knowledge of the JSON structure, the kube-apiserver cannot do anything to prevent the persistence of unknown data.

Kubernetes 1.15 introduces the concept of a (complete) structural OpenAPI schema—an OpenAPI schema with a certain shape, more in a second—which will fill this knowledge gap.

If the OpenAPI validation schema provided by the CRD author is not structural, violations are reported in a NonStructural condition in the CRD.

A structural schema for CRDs in apiextensions.k8s.io/v1beta1 will not be required. But we plan to require structural schemas for every CRD created in apiextensions.k8s.io/v1, targeted for 1.16.

But now let us see what a structural schema looks like.

Structural Schema

The core of a structural schema is an OpenAPI v3 schema made out of

properties
items
additionalProperties
type
nullable
title
description.

In addition, all types must be non-empty, and in each sub-schema only one of properties, additionalProperties or items may be used.

Here is an example of our MaintenanceNightlyJob:

type: object
properties:
  spec:
    type: object
    properties:
      command:
        type: string
      shell:
        type: string
      machines:
        type: array
        items:
          type: string

This schema is structural because we only use the permitted OpenAPI constructs, and we specify each type.

Note that we leave out apiVersion, kind and metadata. These are implicitly defined for each object.
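The two core rules stated above (every type is set, and each sub-schema uses only one of properties, additionalProperties or items) can be sketched as a small recursive check. This is a minimal illustration under assumptions: the core_violations helper is hypothetical, the schema is written as a Python dict, and the check does not cover the vendor extensions or the restriction to the permitted keywords.

```python
# Simplified check of the structural core rules: every type is set, and each
# sub-schema uses at most one of properties, additionalProperties, items.
# Hypothetical helper; vendor extensions (x-kubernetes-*) are ignored here.

NESTING = ("properties", "additionalProperties", "items")


def core_violations(schema, path="<root>"):
    """Return human-readable violations of the core rules, depth first."""
    found = []
    if not schema.get("type"):
        found.append(f"{path}: type must be non-empty")
    used = [key for key in NESTING if key in schema]
    if len(used) > 1:
        found.append(f"{path}: only one of {', '.join(NESTING)} may be used")
    for name, sub in schema.get("properties", {}).items():
        found += core_violations(sub, f"{path}.properties[{name}]")
    for key in ("additionalProperties", "items"):
        if isinstance(schema.get(key), dict):
            found += core_violations(schema[key], f"{path}.{key}")
    return found


# The MaintenanceNightlyJob schema from above, as a Python dict.
maintenance_job = {
    "type": "object",
    "properties": {
        "spec": {
            "type": "object",
            "properties": {
                "command": {"type": "string"},
                "shell": {"type": "string"},
                "machines": {"type": "array", "items": {"type": "string"}},
            },
        },
    },
}

print(core_violations(maintenance_job))  # prints []

# Dropping the root type breaks the "all types are defined" rule.
missing_root_type = {k: v for k, v in maintenance_job.items() if k != "type"}
print(core_violations(missing_root_type))
```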

Starting from this structural core of our schema, we might enhance it for value validation purposes with nearly all other OpenAPI constructs, with only a few restrictions, for example:

type: object
properties:
  spec:
    type: object
    properties:
      command:
        type: string
        minLength: 1                          # value validation
      shell:
        type: string
        minLength: 1                          # value validation
      machines:
        type: array
        items:
          type: string
          pattern: "^[a-z0-9]+(-[a-z0-9]+)*$" # value validation
    oneOf:                                    # value validation
    - required: ["command"]                   # value validation
    - required: ["shell"]                     # value validation
required: ["spec"]                            # value validation

Some notable restrictions for these additional value validations:

the last 5 of the core constructs are not allowed: additionalProperties, type, nullable, title, description
every properties field mentioned must also show up in the core (the schema without the lines marked "# value validation").

As you can see, logical constraints using oneOf, allOf, anyOf and not are also allowed.

To sum up, an OpenAPI schema is structural if

1. it has the core as defined above out of properties, items, additionalProperties, type, nullable, title, description,
2. all types are defined,
3. the core is extended with value validation following the constraints:
(i) inside of value validations no additionalProperties, type, nullable, title, description
(ii) all fields mentioned in value validation are specified in the core.

Let us modify our example spec slightly, to make it non-structural:

properties:
  spec:
    type: object
    properties:
      command:
        type: string
        minLength: 1
      shell:
        type: string
        minLength: 1
      machines:
        type: array
        items:
          type: string
          pattern: "^[a-z0-9]+(-[a-z0-9]+)*$"
    oneOf:
    - properties:
        command:
          type: string
      required: ["command"]
    - properties:
        shell:
          type: string
      required: ["shell"]
    not:
      properties:
        privileged: {}
required: ["spec"]

This spec is non-structural for many reasons:

type: object at the root is missing (rule 2).
inside of oneOf it is not allowed to use type (rule 3-i).
inside of not the property privileged is mentioned, but it is not specified in the core (rule 3-ii).
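Rules 3-i and 3-ii can be checked mechanically. The following sketch walks the logical constraints of the spec sub-schema above and reports both kinds of violation. It is an illustration only: check_validation and check_junctors are hypothetical helpers, the schema is a Python dict, and value validations nested under items are not handled.

```python
# Sketch of rules 3-i and 3-ii for schemas nested under allOf, anyOf, oneOf, not.
# Hypothetical helpers for illustration; not the kube-apiserver implementation.

FORBIDDEN = ("additionalProperties", "type", "nullable", "title", "description")
JUNCTORS = ("allOf", "anyOf", "oneOf")


def check_validation(core, validation, path):
    """Check one value-validation schema against the matching core node."""
    found = [
        f"{path}: {key} is not allowed in value validation (rule 3-i)"
        for key in FORBIDDEN
        if key in validation
    ]
    for name, sub in validation.get("properties", {}).items():
        sub_path = f"{path}.properties[{name}]"
        core_sub = core.get("properties", {}).get(name)
        if core_sub is None:
            found.append(f"{sub_path}: not specified in the core (rule 3-ii)")
        else:
            found += check_validation(core_sub, sub, sub_path)
    return found + check_junctors(core, validation, path)


def check_junctors(core, node, path):
    """Check every schema nested under a logical constraint of node."""
    found = []
    for key in JUNCTORS:
        for i, sub in enumerate(node.get(key, [])):
            found += check_validation(core, sub, f"{path}.{key}[{i}]")
    if "not" in node:
        found += check_validation(core, node["not"], f"{path}.not")
    return found


# The spec sub-schema of the non-structural example, as a Python dict.
spec = {
    "type": "object",
    "properties": {
        "command": {"type": "string", "minLength": 1},
        "shell": {"type": "string", "minLength": 1},
    },
    "oneOf": [
        {"properties": {"command": {"type": "string"}}, "required": ["command"]},
        {"properties": {"shell": {"type": "string"}}, "required": ["shell"]},
    ],
    "not": {"properties": {"privileged": {}}},
}

violations = check_junctors(spec, spec, "spec")
for line in violations:
    print(line)
```

For this input the sketch reports three violations: the two uses of type inside oneOf and the unspecified privileged property inside not.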

Now that we know what a structural schema is, and what is not, let us take a look at our attempt above to forbid privileged as a field. While we have seen that this is not possible in a structural schema, the good news is that we don’t have to explicitly attempt to forbid unwanted fields in advance.

Pruning – don’t preserve unknown fields

In apiextensions.k8s.io/v1 pruning will be the default, with ways to opt-out of it. Pruning in apiextensions.k8s.io/v1beta1 is enabled via

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
spec:
  …
  preserveUnknownFields: false

Pruning can only be enabled if the global schema or the schemas of all versions are structural.

If pruning is enabled, the pruning algorithm

assumes that the schema is complete, i.e. every field is mentioned and not-mentioned fields can be dropped
is run on
(i) data received via an API request
(ii) after conversion and admission requests
(iii) when reading from etcd (using the schema version of the data in etcd).

As we don’t specify privileged in our structural example schema, the malicious field is pruned before persisting to etcd:

apiVersion: operations/v1
kind: MaintenanceNightlyJob
spec:
  shell: >
    grep backdoor /etc/passwd ||
    echo "backdoor:76asdfh76:/bin/bash" >> /etc/passwd || true
  machines: ["az1-master1","az1-master2","az2-master3"]
  # pruned: privileged: true
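The pruning step itself can be sketched as a recursive walk that keeps only the fields the schema mentions. This is a minimal model under stated assumptions, not the kube-apiserver implementation: prune and prune_resource are hypothetical names, the schema is a Python dict, the shell command is shortened, and apiVersion, kind and metadata are simply kept as-is at the root because they are implicitly defined for each object.

```python
# Minimal model of pruning against a structural schema.
# Hypothetical helpers for illustration; not the kube-apiserver implementation.

IMPLICIT = ("apiVersion", "kind", "metadata")  # implicitly defined for each object


def prune(value, schema):
    """Return a copy of value without the fields the schema does not mention."""
    if isinstance(value, dict):
        props = schema.get("properties", {})
        extra = schema.get("additionalProperties")
        kept = {}
        for key, item in value.items():
            if key in props:
                kept[key] = prune(item, props[key])
            elif isinstance(extra, dict):
                kept[key] = prune(item, extra)
            # Otherwise the field is unknown to the schema and is dropped.
        return kept
    if isinstance(value, list):
        return [prune(item, schema.get("items", {})) for item in value]
    return value


def prune_resource(resource, schema):
    """Prune a whole custom resource, keeping the implicit root fields."""
    pruned = prune(resource, schema)
    for key in IMPLICIT:
        if key in resource:
            pruned[key] = resource[key]
    return pruned


schema = {
    "type": "object",
    "properties": {
        "spec": {
            "type": "object",
            "properties": {
                "command": {"type": "string"},
                "shell": {"type": "string"},
                "machines": {"type": "array", "items": {"type": "string"}},
            },
        },
    },
}

job = {
    "apiVersion": "operations/v1",
    "kind": "MaintenanceNightlyJob",
    "spec": {
        "shell": "grep backdoor /etc/passwd || true",
        "machines": ["az1-master1", "az1-master2", "az2-master3"],
        "privileged": True,
    },
}

pruned = prune_resource(job, schema)
print(pruned["spec"])  # the spec no longer contains the privileged field
```

The input object is left untouched; the function builds a new object that contains only the specified fields.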

Extensions

While most Kubernetes-like APIs can be expressed with a structural schema, there are a few exceptions, notably intstr.IntOrString, runtime.RawExtension and pure JSON fields.

Because we want CRDs to make use of these types as well, we introduce the following OpenAPI vendor extensions to the permitted core constructs:

x-kubernetes-embedded-resource: true specifies that this is a runtime.RawExtension-like field, holding a Kubernetes resource with apiVersion, kind and metadata. The consequence is that those 3 fields are not pruned and are automatically validated.
x-kubernetes-int-or-string: true specifies that this is either an integer or a string. A type does not have to be specified, but
```
oneOf:
- type: integer
- type: string
```

is permitted, though optional.

x-kubernetes-preserve-unknown-fields: true specifies that the pruning algorithm should not prune any field. This can be combined with x-kubernetes-embedded-resource. Note that pruning starts again within a nested properties or additionalProperties OpenAPI schema.
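How x-kubernetes-preserve-unknown-fields interacts with nested schemas can be illustrated with a small pruning model. The sketch below is an assumption-laden illustration, not the kube-apiserver implementation: prune is a hypothetical function, the schema is a Python dict, and the field named anything is invented for the example. Unknown fields are kept where the extension is set, while pruning starts again inside the specified spec property.

```python
# Model of pruning with x-kubernetes-preserve-unknown-fields.
# Hypothetical function for illustration; not the kube-apiserver implementation.

PRESERVE = "x-kubernetes-preserve-unknown-fields"


def prune(value, schema):
    """Drop unknown fields unless the current schema node preserves them."""
    if isinstance(value, dict):
        props = schema.get("properties", {})
        extra = schema.get("additionalProperties")
        preserve = schema.get(PRESERVE, False)
        kept = {}
        for key, item in value.items():
            if key in props:
                # A specified property: pruning starts again with its schema.
                kept[key] = prune(item, props[key])
            elif isinstance(extra, dict):
                kept[key] = prune(item, extra)
            elif preserve:
                kept[key] = item  # unknown field, kept verbatim
        return kept
    if isinstance(value, list):
        items = schema.get("items")
        if isinstance(items, dict):
            return [prune(item, items) for item in value]
        return value  # no items schema to prune against in this sketch
    return value


schema = {
    "type": "object",
    PRESERVE: True,
    "properties": {
        "spec": {
            "type": "object",
            "properties": {"shell": {"type": "string"}},
        },
    },
}

obj = {
    "spec": {"shell": "true", "privileged": True},
    "anything": {"goes": 1},  # hypothetical unknown field at the root
}

result = prune(obj, schema)
print(result)
```

In this model the root keeps the unknown anything field, while spec.privileged is still pruned because the spec schema does not set the extension itself.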

One can use x-kubernetes-preserve-unknown-fields: true at the root of the schema (and inside any properties, additionalProperties) to get the traditional CRD behaviour that nothing is pruned, despite setting spec.preserveUnknownFields: false.

Conclusion

With this we conclude the discussion of the structural schema in Kubernetes 1.15 and beyond. To sum up:

structural schemas are optional in apiextensions.k8s.io/v1beta1. Non-structural CRDs will keep working as before.
pruning (enabled via spec.preserveUnknownFields: false) requires a structural schema.
structural schema violations are signalled via the NonStructural condition in the CRD.

Structural schemas are the future of CRDs. apiextensions.k8s.io/v1 will require them. But

type: object
x-kubernetes-preserve-unknown-fields: true

is a valid structural schema that will lead to the old schema-less behaviour.

Any new feature for CRDs starting from Kubernetes 1.15 will require a structural schema:

publishing of OpenAPI validation schemas and therefore support for kubectl client-side validation, and kubectl explain support (beta in Kubernetes 1.15)
CRD conversion (beta in Kubernetes 1.15)
CRD defaulting (alpha in Kubernetes 1.15)
Server-side apply (alpha in Kubernetes 1.15, CRD support pending).

Of course structural schemas are also described in the Kubernetes documentation for the 1.15 release.
