Kubernetes Operators for Custom Resources: A Practical Guide

Learn how to build Kubernetes Operators that manage custom resources. From CRDs to Operator SDK, this guide covers reconciliation loops, status...

Yash Pritwani

1 October 2025, 16 min read

What Are Kubernetes Operators?

Kubernetes Operators extend the Kubernetes API to manage complex, stateful applications using custom resources. Instead of writing shell scripts or manual runbooks, you encode operational knowledge into code that Kubernetes runs continuously.

An Operator watches for changes to Custom Resources (CRs) and reconciles the actual state of your cluster with the desired state you declared. Think of it as a robot SRE that never sleeps.

The Operator Pattern

The Operator pattern consists of three pieces:

Custom Resource Definition (CRD): Extends the Kubernetes API with your own resource types
Custom Resource (CR): An instance of the resource type a CRD defines; it declares the desired state
Controller: Code that watches CRs and reconciles actual state to match desired state

User creates CR → Controller watches → Reconcile loop runs → Actual state matches desired state
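The flow above can be modelled without any Kubernetes libraries. What follows is a minimal sketch, not controller-runtime code: `Desired`, `Cluster`, and `reconcile` are hypothetical stand-ins for a CR's spec, the observed cluster state, and one pass of a controller.

```go
package main

import "fmt"

// Desired is a stand-in for a custom resource's spec (hypothetical type).
type Desired struct {
	Replicas int
}

// Cluster is a stand-in for the observed cluster state (hypothetical type).
type Cluster struct {
	Replicas int
}

// reconcile moves the cluster one step toward the desired state and
// reports whether anything changed. On a converged cluster it is a no-op.
func reconcile(want Desired, have *Cluster) bool {
	switch {
	case have.Replicas < want.Replicas:
		have.Replicas++
		return true
	case have.Replicas > want.Replicas:
		have.Replicas--
		return true
	default:
		return false
	}
}

func main() {
	want := Desired{Replicas: 3}
	have := &Cluster{Replicas: 0}
	steps := 0
	for reconcile(want, have) {
		steps++
	}
	fmt.Printf("converged to %d replicas after %d steps\n", have.Replicas, steps)
}
```

The point of the sketch is that the loop compares state rather than reacting to individual events, so running it again after convergence changes nothing.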

Building a CRD

Let us build an Operator that manages PostgreSQL databases. First, define the CRD:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.techsaas.cloud
spec:
  group: techsaas.cloud
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine:
                  type: string
                  enum: ["postgresql", "mysql", "mongodb"]
                version:
                  type: string
                storage:
                  type: string
                replicas:
                  type: integer
                  minimum: 1
                  maximum: 5
            status:
              type: object
              properties:
                phase:
                  type: string
                connectionString:
                  type: string
                readyReplicas:
                  type: integer
      subresources:
        status: {}
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
    shortNames:
      - db

Now users can create databases declaratively:

apiVersion: techsaas.cloud/v1alpha1
kind: Database
metadata:
  name: my-app-db
  namespace: production
spec:
  engine: postgresql
  version: "16"
  storage: 10Gi
  replicas: 2
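On the Go side, the spec of such a resource is usually mirrored by a struct. The sketch below is an illustration under assumptions: `DatabaseSpec` and `validateSpec` are hypothetical names, it uses only the standard library, and it re-checks by hand the two constraints the CRD schema declares (the engine enum and the 1 to 5 replica range). In a real cluster the API server enforces the schema itself.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// DatabaseSpec mirrors the spec fields of the CRD schema (hypothetical type).
type DatabaseSpec struct {
	Engine   string `json:"engine"`
	Version  string `json:"version"`
	Storage  string `json:"storage"`
	Replicas int    `json:"replicas"`
}

// validateSpec re-checks the constraints the CRD schema declares:
// the engine enum and the 1..5 replica range.
func validateSpec(s DatabaseSpec) error {
	switch s.Engine {
	case "postgresql", "mysql", "mongodb":
		// supported engine
	default:
		return fmt.Errorf("unsupported engine %q", s.Engine)
	}
	if s.Replicas < 1 || s.Replicas > 5 {
		return errors.New("replicas must be between 1 and 5")
	}
	return nil
}

func main() {
	// JSON equivalent of the spec in the manifest above.
	raw := []byte(`{"engine":"postgresql","version":"16","storage":"10Gi","replicas":2}`)
	var spec DatabaseSpec
	if err := json.Unmarshal(raw, &spec); err != nil {
		panic(err)
	}
	if err := validateSpec(spec); err != nil {
		fmt.Println("invalid spec:", err)
		return
	}
	fmt.Println("valid spec for engine", spec.Engine)
}
```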

The Reconciliation Loop

The heart of every Operator is the reconcile function. Using the Operator SDK with Go:

import (
    "context"
    "fmt"
    "time"

    appsv1 "k8s.io/api/apps/v1"
    apierrors "k8s.io/apimachinery/pkg/api/errors"
    "k8s.io/apimachinery/pkg/types"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/log"

    // Path assumes the --repo value used in the scaffolding section.
    v1alpha1 "github.com/techsaas/db-operator/api/v1alpha1"
)

func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    logger := log.FromContext(ctx)

    // 1. Fetch the Database CR
    var database v1alpha1.Database
    if err := r.Get(ctx, req.NamespacedName, &database); err != nil {
        if apierrors.IsNotFound(err) {
            // The CR no longer exists, so there is nothing to reconcile
            return ctrl.Result{}, nil
        }
        return ctrl.Result{}, err
    }

    // 2. Check if StatefulSet exists
    var sts appsv1.StatefulSet
    err := r.Get(ctx, types.NamespacedName{
        Name:      database.Name + "-db",
        Namespace: database.Namespace,
    }, &sts)

    if apierrors.IsNotFound(err) {
        // 3. Create StatefulSet if it does not exist
        sts = r.buildStatefulSet(&database)
        if err := r.Create(ctx, &sts); err != nil {
            return ctrl.Result{}, err
        }
        logger.Info("Created StatefulSet", "name", sts.Name)
    } else if err != nil {
        // Any other error (timeout, RBAC, ...) must not be ignored
        return ctrl.Result{}, err
    }

    // 4. Create Service for the database
    if err := r.ensureService(ctx, &database); err != nil {
        return ctrl.Result{}, err
    }

    // 5. Create Secret with connection string
    if err := r.ensureSecret(ctx, &database); err != nil {
        return ctrl.Result{}, err
    }

    // 6. Update status
    // "user:pass" is a placeholder: status is readable by anyone who can
    // read the CR, so real credentials belong in the Secret from step 5.
    database.Status.Phase = "Running"
    database.Status.ReadyReplicas = sts.Status.ReadyReplicas
    database.Status.ConnectionString = fmt.Sprintf(
        "postgresql://user:pass@%s-db.%s.svc:5432/app",
        database.Name, database.Namespace,
    )
    if err := r.Status().Update(ctx, &database); err != nil {
        return ctrl.Result{}, err
    }

    // 7. Requeue after 30 seconds for health check
    return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
}
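The function above reports the phase as "Running" as soon as the child objects exist, whether or not any replica is ready. One possible refinement, sketched here with the standard library only, is to derive the phase from the replica counts. `phaseFor` and the phase names "Pending" and "Degraded" are hypothetical and not part of the original operator.

```go
package main

import "fmt"

// phaseFor derives a status phase from replica counts.
// "Pending" and "Degraded" are hypothetical phase names.
func phaseFor(ready, desired int32) string {
	switch {
	case ready == 0:
		return "Pending" // nothing is serving yet
	case ready < desired:
		return "Degraded" // serving, but below the requested count
	default:
		return "Running"
	}
}

func main() {
	for _, ready := range []int32{0, 1, 2} {
		fmt.Printf("%d/2 ready: %s\n", ready, phaseFor(ready, 2))
	}
}
```

In the reconcile function this would replace the constant, for example by passing `sts.Status.ReadyReplicas` and the replica count from the spec.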

Key Reconciliation Patterns

Idempotency

Your reconcile function will be called many times. Every operation must be idempotent:

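// BAD: Calls Create on every reconcile
func (r *Reconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    // With a fixed name this fails with AlreadyExists from the second
    // call onward; with generateName it piles up duplicates
    return ctrl.Result{}, r.Create(ctx, newService())
}

// GOOD: Check-then-create pattern
func (r *Reconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    var svc corev1.Service
    // Assumes the Service shares the CR's name and namespace
    err := r.Get(ctx, req.NamespacedName, &svc)
    if apierrors.IsNotFound(err) {
        // Only creates if missing
        return ctrl.Result{}, r.Create(ctx, newService())
    }
    // nil if the Service already exists, otherwise the lookup error
    return ctrl.Result{}, err
}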
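Check-then-create covers the "missing" case; a fuller ensure step also repairs objects that have drifted from the desired value. The sketch below models that with an in-memory map in place of the API server. `Service`, `Store`, and `ensureService` are hypothetical stand-ins, not Kubernetes or controller-runtime types.

```go
package main

import "fmt"

// Service is a stand-in for a Kubernetes Service (hypothetical type).
type Service struct {
	Name string
	Port int
}

// Store is an in-memory stand-in for the API server, keyed by name.
type Store struct {
	objects map[string]Service
	writes  int // counts creates and updates, for illustration
}

func NewStore() *Store {
	return &Store{objects: map[string]Service{}}
}

// ensureService creates the service if it is missing, updates it if it
// has drifted from the desired value, and otherwise does nothing.
func (s *Store) ensureService(want Service) string {
	have, ok := s.objects[want.Name]
	switch {
	case !ok:
		s.objects[want.Name] = want
		s.writes++
		return "created"
	case have != want:
		s.objects[want.Name] = want
		s.writes++
		return "updated"
	default:
		return "unchanged"
	}
}

func main() {
	store := NewStore()
	want := Service{Name: "my-app-db", Port: 5432}

	fmt.Println(store.ensureService(want)) // first pass creates
	fmt.Println(store.ensureService(want)) // second pass is a no-op

	// Simulate someone editing the object by hand.
	store.objects["my-app-db"] = Service{Name: "my-app-db", Port: 3306}
	fmt.Println(store.ensureService(want)) // drift is repaired
}
```

However many times the function runs, the store ends up with exactly one object in the desired shape, which is the property the reconcile loop relies on.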

Owner References

Set owner references so child resources are garbage-collected when the parent CR is deleted:

// Returns an error; call it before r.Create and handle the result
err := ctrl.SetControllerReference(&database, &statefulSet, r.Scheme)
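To see why the reference matters, the cascade can be modelled in a few lines. This is a simplified sketch, assuming a flat in-memory map of objects: `Object` and `deleteCascade` are hypothetical, and the real Kubernetes garbage collector runs asynchronously in the control plane rather than as a recursive call.

```go
package main

import "fmt"

// Object is a stand-in for any Kubernetes object (hypothetical type).
// OwnerUID plays the role of metadata.ownerReferences.
type Object struct {
	Kind     string
	OwnerUID string
}

// deleteCascade removes the object with the given UID and, recursively,
// every object that names it as owner. It returns how many were removed.
func deleteCascade(objs map[string]Object, uid string) int {
	if _, ok := objs[uid]; !ok {
		return 0
	}
	delete(objs, uid)
	removed := 1
	for id, o := range objs {
		if o.OwnerUID == uid {
			removed += deleteCascade(objs, id)
		}
	}
	return removed
}

func main() {
	objs := map[string]Object{
		"uid-db":  {Kind: "Database"},
		"uid-sts": {Kind: "StatefulSet", OwnerUID: "uid-db"},
		"uid-pod": {Kind: "Pod", OwnerUID: "uid-sts"},
		"uid-cm":  {Kind: "ConfigMap"}, // no owner, so it survives
	}
	removed := deleteCascade(objs, "uid-db")
	fmt.Println("removed:", removed, "remaining:", len(objs))
}
```

An object without an owner reference, like the ConfigMap here, is left behind when the parent goes away, which is exactly the leak the owner reference prevents.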

Status Subresource

Always report status back to the user. With printer columns defined on the CRD (additionalPrinterColumns), kubectl get databases shows the state at a glance:

NAME	ENGINE	REPLICAS	PHASE	AGE
my-app-db	postgresql	2/2	Running	5m
analytics	mongodb	1/1	Running	2d

Operator SDK vs Kubebuilder vs KOPF

Feature	Operator SDK (Go)	Kubebuilder	KOPF (Python)
Language	Go	Go	Python
Scaffolding	Yes (full)	Yes (full)	Minimal
Helm/Ansible	Yes	No	No
Maturity	Production	Production	Mature
Learning curve	Steep	Steep	Moderate
Performance	Excellent	Excellent	Good
Best for	Complex operators	K8s-native	Quick prototyping

For production operators managing critical resources, Go with Operator SDK or Kubebuilder is a common choice; both build on the same controller-runtime library. For internal tools and prototypes, KOPF with Python gets you running faster.

Scaffolding with Operator SDK

# Initialize project
operator-sdk init --domain techsaas.cloud --repo github.com/techsaas/db-operator

# Create API and controller
# (group and domain combine into the API group db.techsaas.cloud)
operator-sdk create api --group db --version v1alpha1 --kind Database --resource --controller

# Generate CRD manifests
make manifests

# Build and push
make docker-build docker-push IMG=registry.techsaas.cloud/db-operator:v1

# Deploy to cluster
make deploy IMG=registry.techsaas.cloud/db-operator:v1

When To Build an Operator

Build an Operator when:

You manage stateful applications that need lifecycle automation
You have day-2 operations (backups, scaling, upgrades) that are currently manual
Multiple teams need self-service access to provision resources
You want to encode operational knowledge into version-controlled code

Do not build an Operator when:

A Helm chart with values is sufficient
The application is stateless and simple
You are the only operator and kubectl commands suffice

Production Considerations

RBAC: Grant the operator only the permissions it needs, bound to a dedicated ServiceAccount.
Leader election: For HA, run several replicas but let only one controller instance reconcile at a time.
Metrics: Expose Prometheus metrics for reconcile duration, errors, and queue depth.
Finalizers: Use finalizers for cleanup tasks that must complete before the CR is removed, such as deleting external resources.
Webhook validation: Add admission webhooks to validate CRs on create and update, for rules the CRD schema cannot express.
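Of the items above, finalizers are the one with the least obvious control flow. The sketch below models it with the standard library only: `Resource`, `reconcileFinalizer`, and the finalizer name `techsaas.cloud/cleanup` are hypothetical, and `Deleting` stands in for a set `metadata.deletionTimestamp`. In a controller-runtime project the `controllerutil` finalizer helpers would normally do the list handling.

```go
package main

import (
	"errors"
	"fmt"
)

// cleanupFinalizer is a hypothetical finalizer name.
const cleanupFinalizer = "techsaas.cloud/cleanup"

var errCleanupFailed = errors.New("cleanup failed")

// Resource is a stand-in for a CR's metadata (hypothetical type).
type Resource struct {
	Finalizers []string
	Deleting   bool // stands in for a set metadata.deletionTimestamp
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

func without(list []string, s string) []string {
	out := make([]string, 0, len(list))
	for _, v := range list {
		if v != s {
			out = append(out, v)
		}
	}
	return out
}

// reconcileFinalizer reports true once a resource that is being deleted
// has no finalizers left, which is when it may actually be removed.
func reconcileFinalizer(r *Resource, cleanup func() error) (bool, error) {
	if !r.Deleting {
		// Live object: make sure our finalizer is registered.
		if !contains(r.Finalizers, cleanupFinalizer) {
			r.Finalizers = append(r.Finalizers, cleanupFinalizer)
		}
		return false, nil
	}
	if contains(r.Finalizers, cleanupFinalizer) {
		if err := cleanup(); err != nil {
			return false, err // finalizer stays, so cleanup is retried
		}
		r.Finalizers = without(r.Finalizers, cleanupFinalizer)
	}
	return len(r.Finalizers) == 0, nil
}

func main() {
	r := &Resource{}
	reconcileFinalizer(r, nil) // live object: finalizer added, cleanup not called

	r.Deleting = true
	_, err := reconcileFinalizer(r, func() error { return errCleanupFailed })
	fmt.Println("after failed cleanup:", err, r.Finalizers)

	done, _ := reconcileFinalizer(r, func() error { return nil })
	fmt.Println("after successful cleanup:", done, r.Finalizers)
}
```

A failed cleanup leaves the finalizer in place, so the object stays visible and the next reconcile retries instead of silently leaking the external resource.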

At TechSaaS, we build custom operators for clients who need platform-level automation. Whether it is database provisioning, certificate management, or environment cloning, operators turn manual toil into declarative infrastructure.
