This is the first post in the series of posts about writing and publishing a Kubernetes Operator. This series will discuss the following topics:

  • how to write an Operator,
  • run it on a Kubernetes cluster,
  • create a bundle and package it into an index image,
  • create a CatalogSource resource using this index image,
  • and finally create an Operand out of the Operator installed through the CatalogSource.

This particular post covers first two, while the next post wil discuss remaining topcis.

Tooling used by me for this blog series:

  • OS: Fedora 37
  • Kubernetes: minikube v1.29.0 with Docker provider
  • OLM specific tooling:
    • operator-sdk version 1.27.0
    • opm version 1.26.4

The Operator we are going to create here isn’t a unique idea of mine. It’s based on this post by Ishan Khare, but is created using operator-sdk instead of kubebuilder. Let’s get started.

Note
I’m writing this blog series as a part of learning Kubernetes Operators and Operator Framework myself. There might be mistakes in the post. Please share any feedback/suggestions via Twitter.

Initialize the project Link to heading

Create a directory and initialize the project in it:

$ mkdir at-operator
$ cd at-operator
$ operator-sdk init --domain example.com --repo github.com/dharmit/at-operator
$ operator-sdk create api --group at --version v1alpha1 --kind At --resource --controller

$ ls
api  bin  config  controllers  hack  Dockerfile  go.mod  go.sum  main.go  Makefile  PROJECT  README.md

What does the “At Operator” do? Link to heading

Similar to the original implementation, the At Operator here runs a specific command at the given time. To do this, it creates a Kubernetes Pod in which it runs the command. Nothing fancy here. :)

Code for the Operator Link to heading

Code is divided into two main parts:

  1. API - api directory. This contains the Go structs that define an At resource.
  2. Controllers - controllers directory. This contains the reconciliation logic.

API Link to heading

It mainly defines the At struct and the structs for its fields defining the spec and status. Of main interest here is the AtSpec struct which contains the Schedule, which is UTC time, and Command to be executed at the specified schedule. Below is the code for the file api/v1alpha1/at_types.go:

As mentioned in the operator-sdk documentation, run make generate after modifying the *_types.go file. This will update the api/v1alpha1/zz_generate_deepcopy.go file to ensure our API’s Go type definitions implement the runtime.Object interface that all Kind types must implement.

$ make generate

Controller Link to heading

The controller contains the reconciliation logic. It is the heart of an Operator as it is the business logic of the system which logic goes into the Reconcile function.

The Reconcile function for our Operator updates the Phase of a newly created Operand to PENDING. Next it evaluates the difference between the current time and the time mentioned in the .spec.schedule of the Operand. If this difference in time is greater than 0, it requeues the Operand to run it after the diff amount of time. On the other hand, if the difference is less than 0, it runs the command mentioned in .spec.command of our Operand by creating a Pod using the busybox image.

Below is the code for controllers/at_controller.go:

Notice that we have added Owns(&corev1.Pod{}) in the SetupWithManager function. This ensures that the Pod created by our Operator is owned by the At instance that created it. As a result, when we do kubectl delete at sample-at, Kubernetes garbage collector deletes the Pod as well.

Next run the command make manifests which generates the CRD manifests under config/crd/bases directory.

$ make manifests

Build the container images Link to heading

In the Makefile, set the desired value for IMAGE_TAG_BASE and use it to set the value of IMG:

# use the container registry and namespace you have access to
IMAGE_TAG_BASE ?= quay.io/dharmit/at-operator
IMG ?= ${IMAGE_TAG_BASE}:${VERSION}

Now build the container image and push it. Make sure to login to the container registry first:

$ docker login -u $QUAY_USERNAME -p $QUAY_PASSWORD quay.io
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
WARNING! Your password will be stored unencrypted in /home/dshah/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

$ make docker-build docker-push

That builds and pushes the container image for at-operator. With the configurations shown here, it builds and pushes quay.io/dharmit/at-operator:v0.0.1 to the Red Hat Quay registry.

Run the Operator as Deployment on a cluster Link to heading

Using make deploy will create a new namespace on the cluster and start a Deployment for our Operator there.

$ make deploy
$ kubectl get ns
NAME                 STATUS   AGE
at-operator-system   Active   4s    <------ newly created by "make deploy"
default              Active   139m
kube-node-lease      Active   139m
kube-public          Active   139m
kube-system          Active   139m

$ kubectl get deploy -n at-operator-system
NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
at-operator-controller-manager   1/1     1            1           2m21s

It also creates our CRD on the cluster:

$ kubectl get crds
NAME                 CREATED AT
ats.at.example.com   2023-03-17T10:27:43Z

Create an At with below spec. Modify the schedule to the date and time when you are trying this out, and note that time should be in UTC (run date -u on the CLI) because that’s the default timezone used by a Kubernetes cluster:

$ cat <<EOF | kubectl create -f -
apiVersion: at.example.com/v1alpha1
kind: At
metadata:
  name: sample-at
spec:
  schedule: "2023-03-19T08:00:00Z"
  command: "echo hello world"
EOF

The logs of the Pod created for our Operator’s Deployment look something like below:

$ kubectl logs at-operator-controller-manager-5b4549c455-82n4r -f
1.6792134703300617e+09	INFO	controller-runtime.metrics	Metrics server is starting to listen	{"addr": "127.0.0.1:8080"}
1.6792134703302734e+09	INFO	setup	starting manager
1.6792134703304198e+09	INFO	Starting server	{"kind": "health probe", "addr": "[::]:8081"}
1.6792134703304203e+09	INFO	Starting server	{"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:8080"}
I0319 08:11:10.330433       1 leaderelection.go:248] attempting to acquire leader lease at-operator-system/6cceeaca.example.com...
I0319 08:11:10.337896       1 leaderelection.go:258] successfully acquired lease at-operator-system/6cceeaca.example.com
1.679213470337917e+09	DEBUG	events	at-operator-controller-manager-5b4549c455-82n4r_4d3b6c39-b01d-45e5-9cf0-3bae78c29a76 became leader	{"type": "Normal", "object": {"kind":"Lease","namespace":"at-operator-system","name":"6cceeaca.example.com","uid":"b0042073-01e8-44fc-89b8-c22a728375ac","apiVersion":"coordination.k8s.io/v1","resourceVersion":"28109"}, "reason": "LeaderElection"}
1.6792134703379743e+09	INFO	Starting EventSource	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "source": "kind source: *v1alpha1.At"}
1.6792134703380027e+09	INFO	Starting EventSource	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "source": "kind source: *v1.Pod"}
1.679213470338007e+09	INFO	Starting Controller	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At"}
1.6792134704389532e+09	INFO	Starting workers	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "worker count": 1}
1.6792134704390779e+09	INFO	==== Reconciling at ====	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "08c339f8-4149-425c-b153-c87087c65c31"}
1.6792134704391074e+09	INFO	Phase: PENDING	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "08c339f8-4149-425c-b153-c87087c65c31"}
1.679213470439118e+09	INFO	Schedule parsing done	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "08c339f8-4149-425c-b153-c87087c65c31", "Result": "19.560889822s"}
1.679213490000635e+09	INFO	==== Reconciling at ====	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "f9879ed4-4210-4bd2-9619-caa15028daec"}
1.6792134900006685e+09	INFO	Phase: PENDING	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "f9879ed4-4210-4bd2-9619-caa15028daec"}
1.6792134900006785e+09	INFO	Schedule parsing done	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "f9879ed4-4210-4bd2-9619-caa15028daec", "Result": "-670.194µs"}
1.6792134900006816e+09	INFO	Time to execute	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "f9879ed4-4210-4bd2-9619-caa15028daec", "Ready to execute": "echo hello world"}
1.6792134900130224e+09	INFO	==== Reconciling at ====	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "9cfd8930-d726-404e-baec-c2f879e423eb"}
1.6792134900130394e+09	INFO	Phase: RUNNING	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "9cfd8930-d726-404e-baec-c2f879e423eb"}
1.6792134900174663e+09	INFO	Pod created successfully	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "9cfd8930-d726-404e-baec-c2f879e423eb", "name": "sample-at"}
1.6792134900175796e+09	INFO	==== Reconciling at ====	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "8bd0631a-fb83-4a74-939d-c4c1c6d44a5c"}
1.6792134900176141e+09	INFO	Phase: RUNNING	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "8bd0631a-fb83-4a74-939d-c4c1c6d44a5c"}
1.679213490024618e+09	INFO	==== Reconciling at ====	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "b4a787db-c39d-4921-93b1-781db45f24b6"}
1.6792134900246475e+09	INFO	Phase: RUNNING	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "b4a787db-c39d-4921-93b1-781db45f24b6"}
1.679213490029606e+09	INFO	==== Reconciling at ====	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "72b1330f-c22c-4e10-986f-4781add65817"}
1.6792134900296335e+09	INFO	Phase: RUNNING	{"controller": "at", "controllerGroup": "at.example.com", "controllerKind": "At", "At": {"name":"sample-at","namespace":"at-operator-system"}, "namespace": "at-operator-system", "name": "sample-at", "reconcileID": "72b1330f-c22c-4e10-986f-4781add65817"}

First instance of reconciliation evaluates the time and finds that the At should be run about 20 seconds later and requeues it to run at that time. When it’s time, it runs the command mentioned in our manifest (echo hello world in this case). You should see a Pod with Completed status like below:

$ kubectl get pods sample-at
NAME                                              READY   STATUS      RESTARTS   AGE
sample-at                                         0/1     Completed   0          18s

$ kubectl logs sample-at
hello world

That’s it! Link to heading

That’s it in this part. In the next part of the series, we will see how to pack our At Operator into a bundle image, pack the bundle into an index image, and finally run things through Operator Lifecycle Manager like you do for the real Operators available via OperatorHub.