Upgrade
This guide covers upgrading the Memcached Operator to a new version, including CRD updates, operator rollout, verification, and rollback procedures.
Pre-Upgrade Checklist
Before starting an upgrade, complete the following steps:
1. Back up existing Memcached custom resources
Export all Memcached CRs across all namespaces:
kubectl get memcached -A -o yaml > memcached-backup.yamlThis backup allows you to restore your Memcached definitions if the upgrade introduces breaking CRD schema changes.
2. Check the current operator version
kubectl get deployment -n memcached-operator-system memcached-operator-controller-manager \
-o jsonpath='{.spec.template.spec.containers[0].image}'3. Review release notes
Check the release notes for the target version at https://github.com/c5c3/memcached-operator/releases. Pay attention to:
- Breaking changes in the CRD schema (field removals, renames, type changes)
- New required fields or changed default values
- Changes to RBAC permissions
- Changes to webhook behavior
- Minimum Kubernetes version requirements
4. Check CRD schema changes
Compare the CRD between your current version and the target version:
# View the currently installed CRD schema
kubectl get crd memcacheds.memcached.c5c3.io -o yaml > crd-current.yaml
# Compare with the new version (from the release or repository)
diff crd-current.yaml <(kustomize build config/crd)5. Verify cluster health
Confirm that the cluster and existing Memcached instances are healthy before proceeding:
# Check all Memcached CRs
kubectl get memcached -A
# Check operator pod
kubectl get pods -n memcached-operator-system
# Verify no pending issues in operator logs
kubectl logs -n memcached-operator-system deployment/memcached-operator-controller-manager --tail=50All Memcached CRs should show Available: True in their status conditions before upgrading.
CRD Update
Update the CRD before upgrading the operator. The new operator version may depend on CRD schema changes (new fields, updated validation rules, or changed defaults).
Apply the updated CRD
From the repository at the target version:
make installOr apply a specific release artifact:
kubectl apply -f https://github.com/c5c3/memcached-operator/releases/download/<new-version>/memcached.c5c3.io_memcacheds.yamlCRD compatibility notes
- The operator now serves
v1beta1as the storage version andv1alpha1as a deprecated spoke version. Both have identical schemas — automatic conversion is handled by the API server. See the migration guide for details. - CRD updates within a given API version are generally backwards-compatible. New optional fields are added with defaults, and existing fields retain their behavior.
- Field removals or renames between versions require careful handling. If a field is removed, existing CRs that specify that field will continue to work (the field is ignored), but you should update your CRs to remove deprecated fields.
- New validation rules may reject existing CRs on the next update. If an existing CR violates a new validation constraint, you must update the CR to conform before applying changes to it.
Verify CRD update
kubectl get crd memcacheds.memcached.c5c3.io -o jsonpath='{.metadata.resourceVersion}'Confirm the resource version has changed, indicating the CRD was updated.
Operator Update
After updating the CRD, deploy the new operator version:
make deploy IMG=ghcr.io/c5c3/memcached-operator:<new-version>Replace <new-version> with the target version tag (e.g., v0.2.0).
This command:
- Updates the image reference in the Kustomize overlay.
- Rebuilds and applies all manifests from
config/default/, including any updated RBAC rules, webhook configurations, and cert-manager resources. - Triggers a rolling update of the controller manager Deployment.
Monitor the rollout
kubectl rollout status deployment/memcached-operator-controller-manager \
-n memcached-operator-systemWait until the rollout completes:
deployment "memcached-operator-controller-manager" successfully rolled outSingle-file upgrade
If you use the single-file installation method, apply the manifest directly from the GitHub Release:
kubectl apply -f https://github.com/c5c3/memcached-operator/releases/download/<new-version>/install.yamlOr generate it locally:
make build-installer IMG=ghcr.io/c5c3/memcached-operator:<new-version>
kubectl apply -f dist/install.yamlPost-Upgrade Verification
After the new operator version is running, verify that everything is working correctly.
Check the operator pod
kubectl get pods -n memcached-operator-systemConfirm the pod is Running with READY 1/1 and has RESTARTS 0.
Verify the operator version
kubectl get deployment -n memcached-operator-system memcached-operator-controller-manager \
-o jsonpath='{.spec.template.spec.containers[0].image}'Confirm it shows the new image tag.
Check operator logs
kubectl logs -n memcached-operator-system deployment/memcached-operator-controller-manager --tail=100Look for:
- Successful startup messages
- No error-level log entries
- Reconciliation activity resuming for existing Memcached CRs
Verify all Memcached instances are reconciled
kubectl get memcached -AAll instances should show the expected READY count. Inspect individual instances for healthy status conditions:
kubectl get memcached <name> -n <namespace> -o jsonpath='{.status.conditions}' | jq .Confirm:
AvailableisTrueDegradedisFalseProgressingisFalse
Verify managed resources
Check that all managed resources are present and healthy:
# Deployments
kubectl get deployments -l app.kubernetes.io/managed-by=memcached-operator -A
# Services
kubectl get services -l app.kubernetes.io/managed-by=memcached-operator -A
# PodDisruptionBudgets (if HA is enabled)
kubectl get pdb -l app.kubernetes.io/managed-by=memcached-operator -A
# ServiceMonitors (if monitoring is enabled)
kubectl get servicemonitor -l app.kubernetes.io/managed-by=memcached-operator -ACheck webhooks
kubectl get mutatingwebhookconfigurations,validatingwebhookconfigurations | grep memcachedVerify the webhook configurations are present and that creating or updating a Memcached CR succeeds without admission errors.
Rollback
If the upgrade causes issues, roll back to the previous operator version.
Redeploy the previous version
make deploy IMG=ghcr.io/c5c3/memcached-operator:<previous-version>Replace <previous-version> with the version you were running before the upgrade.
Restore CRDs if needed
If the CRD was updated and the rollback requires the old schema:
# From the previous version's branch or release:
git checkout <previous-version>
make installOr apply the CRD from the previous release:
kubectl apply -f https://github.com/c5c3/memcached-operator/releases/download/<previous-version>/memcached.c5c3.io_memcacheds.yamlRestore Memcached CRs if needed
If CRs were modified or deleted during the upgrade, restore them from backup:
kubectl apply -f memcached-backup.yamlVerify the rollback
Follow the same post-upgrade verification steps to confirm the previous version is running correctly and all Memcached instances are healthy.
Version Compatibility Matrix
| Operator Version | Kubernetes Version | CRD API Versions | cert-manager Version | Notes |
|---|---|---|---|---|
| v0.1.0 | v1.28 -- v1.32 | v1alpha1 | v1.12+ | Initial release |
| v0.2.0 | v1.28 -- v1.32 | v1beta1 (storage), v1alpha1 (deprecated) | v1.12+ | v1beta1 storage version; see migration guide |
Check the release notes for each version to confirm compatibility with your Kubernetes cluster version. The operator is tested against the Kubernetes versions listed in this matrix.
Common Upgrade Issues
Webhook certificate renewal
Symptom: After upgrading, creating or updating Memcached CRs fails with TLS errors such as x509: certificate signed by unknown authority or webhook connection refused.
Cause: cert-manager may need to reissue the webhook serving certificate after the upgrade. This typically resolves itself within a few minutes as cert-manager detects the new webhook Service and renews the certificate.
Resolution:
# Check the certificate status
kubectl get certificate -n memcached-operator-system
# Check cert-manager logs if the certificate is not ready
kubectl logs -n cert-manager deployment/cert-manager --tail=50
# Force certificate renewal if needed
kubectl delete secret -n memcached-operator-system memcached-operator-webhook-server-certDeleting the Secret triggers cert-manager to re-create it with a fresh certificate.
CRD validation changes rejecting existing CRs
Symptom: Existing Memcached CRs cannot be updated. kubectl apply returns validation errors for fields that were previously accepted.
Cause: The new CRD version introduced stricter validation rules (tighter ranges, new required fields, changed patterns).
Resolution:
- Identify the validation error from the
kubectl applyoutput. - Update the affected CRs to conform to the new validation rules.
- If the change is unintentional, file an issue and roll back to the previous CRD version.
Note: Existing CRs that are already stored in etcd are not re-validated until they are next updated. They continue to function, but you cannot modify them until they conform to the new schema.
RBAC permission changes
Symptom: The operator logs show Forbidden errors after upgrading, and reconciliation stops for some or all resources.
Cause: The new operator version requires additional RBAC permissions that were not present in the previous version.
Resolution:
Ensure you ran make deploy (not just updated the Deployment image), as make deploy applies the full Kustomize overlay including updated RBAC manifests. If you upgraded only the container image, apply the RBAC changes separately:
kustomize build config/rbac | kubectl apply -f -Operator pod crash-looping after upgrade
Symptom: The new operator pod enters CrashLoopBackOff immediately after deployment.
Cause: Common causes include missing CRDs, missing RBAC permissions, or incompatible Kubernetes API versions.
Resolution:
# Check the pod logs for the specific error
kubectl logs -n memcached-operator-system deployment/memcached-operator-controller-manager
# Ensure CRDs are installed for the new version
make install
# Ensure RBAC is up to date
kustomize build config/default | kubectl apply -f -Reconciliation lag after upgrade
Symptom: After upgrading, existing Memcached instances take several minutes to reconcile.
Cause: This is expected behavior. The new controller manager starts with empty caches and must re-list and re-watch all relevant resources. The reconciliation queue processes all existing Memcached CRs, which takes time proportional to the number of instances.
Resolution: No action needed. Monitor the operator logs and wait for all instances to be reconciled. The operator requeues reconciliation after 10 seconds for instances that are not yet ready, and after 60 seconds for healthy instances.