Key Rotation and Secret Lifecycle

This guide covers rotation procedures for all cryptographic material and credentials used by CloudTaser. Regular key rotation is required by PCI DSS 3.5.1 and DORA, and it supports general defense-in-depth.


Vault Transit KEK Rotation

The CloudTaser S3 proxy uses the Vault Transit engine for envelope encryption. Each S3 object is encrypted with a unique data encryption key (DEK), which is wrapped by a key encryption key (KEK) managed in Vault Transit. The KEK never leaves the EU-hosted vault.

Rotate the KEK

vault write -f transit/keys/cloudtaser/rotate

After rotation:

  • New encryptions use the latest key version automatically
  • Old data remains decryptable -- Vault retains all previous key versions by default
  • No re-encryption of existing S3 objects is required (the wrapped DEK metadata in each object references the key version used)
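One way to confirm that a rotation took effect is to compare the key's latest_version before and after the call. The following is a minimal sketch, assuming the vault CLI is already authenticated; rotate_kek is a hypothetical helper name, not part of CloudTaser.

```shell
# Hypothetical helper: rotate a Transit key and confirm the version advanced.
# Assumes VAULT_ADDR and VAULT_TOKEN are already set for the vault CLI.
rotate_kek() {
  key="${1:-cloudtaser}"
  before=$(vault read -field=latest_version "transit/keys/$key") || return 1
  vault write -f "transit/keys/$key/rotate" >/dev/null || return 1
  after=$(vault read -field=latest_version "transit/keys/$key") || return 1
  if [ "$after" -gt "$before" ]; then
    echo "$key: rotated v$before -> v$after"
  else
    echo "$key: version did not advance (still v$after)" >&2
    return 1
  fi
}
```

Calling rotate_kek cloudtaser prints the old and new version numbers, which are also the values to record for audit purposes.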

Set Minimum Decryption Version

To enforce that old key versions can no longer decrypt data (after re-encrypting all objects with the new key), set the minimum decryption version:

vault write transit/keys/cloudtaser/config min_decryption_version=N

Where N is the oldest key version you want to keep active. Any wrapped DEK using a version below N will fail to unwrap, making the corresponding S3 objects permanently unreadable unless the minimum version is lowered again.
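Because a wrong value of N can make objects unreadable, it may help to wrap the command in a small guard. This sketch assumes an authenticated vault CLI; set_min_decryption_version is a hypothetical helper, and the checks it performs are illustrative rather than anything CloudTaser ships.

```shell
# Hypothetical guard: refuse a minimum above the latest key version, and
# make it visible when the minimum is being lowered rather than raised.
set_min_decryption_version() {
  key="$1"; new_min="$2"
  latest=$(vault read -field=latest_version "transit/keys/$key") || return 1
  current=$(vault read -field=min_decryption_version "transit/keys/$key") || return 1
  if [ "$new_min" -gt "$latest" ]; then
    echo "refusing: v$new_min is newer than latest version v$latest" >&2
    return 1
  fi
  if [ "$new_min" -lt "$current" ]; then
    echo "note: lowering minimum from v$current to v$new_min" >&2
  fi
  vault write "transit/keys/$key/config" min_decryption_version="$new_min"
}
```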

Verify before advancing min_decryption_version

After rotation, monitor for decryption failures in application logs before advancing min_decryption_version. Advancing prematurely will make objects encrypted with old key versions permanently unreadable.

Recommended practice:

  • Rotate the Transit KEK at least quarterly
  • After rotation, monitor for decryption failures before advancing min_decryption_version
  • For regulated workloads (PCI DSS, DORA), document each rotation with date, operator, and key version number
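The documentation requirement for regulated workloads can be met with something as simple as an append-only log line per rotation. A minimal sketch, assuming an authenticated vault CLI; record_rotation, the log path, and the line format are all hypothetical choices for illustration.

```shell
# Hypothetical helper: append one audit line per rotation
# (UTC timestamp, operator, key name, latest key version).
record_rotation() {
  log_file="$1"; operator="$2"; key="$3"
  version=$(vault read -field=latest_version "transit/keys/$key") || return 1
  printf '%s operator=%s key=%s version=%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$operator" "$key" "$version" >> "$log_file"
}

# Example:
#   record_rotation /var/log/kek-rotations.log alice cloudtaser
```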

Vault Token Lifecycle

CloudTaser components authenticate to Vault using Kubernetes auth and manage tokens with different lifecycles.

Wrapper Token Renewal

The wrapper runs as PID 1 in each injected pod. It authenticates to Vault on startup using the pod's Kubernetes service account token and receives a Vault token. The wrapper automatically renews this token before it expires. If renewal fails, the wrapper logs an error and, depending on the rotation strategy annotation, can trigger a pod restart.
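The renew-before-expiry pattern can be illustrated with the vault CLI. This is only a sketch of the idea, not the wrapper's implementation: renew_if_needed is a hypothetical helper, the 600-second threshold is an arbitrary assumption, and it assumes that vault token lookup -field=ttl prints the remaining lifetime in seconds.

```shell
# Hypothetical sketch of renew-before-expiry for the current token.
# The default threshold (600 s) is an assumption, not the wrapper's value.
renew_if_needed() {
  threshold="${1:-600}"
  # Assumed to print the remaining TTL as a plain number of seconds.
  ttl=$(vault token lookup -field=ttl) || return 1
  if [ "$ttl" -lt "$threshold" ]; then
    vault token renew >/dev/null && echo "renewed (ttl was ${ttl}s)"
  else
    echo "ok (${ttl}s remaining)"
  fi
}
```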

Operator Child Tokens

The operator creates short-lived child tokens for specific operations (e.g., token review during Kubernetes auth configuration). These tokens inherit the parent token's policies and have a short TTL.

Token TTL Configuration

Configure token TTLs in the Vault Kubernetes auth role:

vault write auth/kubernetes/role/cloudtaser \
  bound_service_account_names=default \
  bound_service_account_namespaces=cloudtaser-system \
  policies=cloudtaser \
  ttl=1h \
  max_ttl=24h

Parameter  Description                                                              Recommended
ttl        Initial token lifetime; the wrapper renews before expiry                 1h
max_ttl    Absolute maximum lifetime; after this, the wrapper must re-authenticate  24h

Short TTL, frequent renewal

A short ttl (1 hour) with automatic renewal limits the window of exposure if a token is compromised. The max_ttl (24 hours) forces daily re-authentication via Kubernetes auth, ensuring that revoked ServiceAccounts lose access within a day.


Operator TLS Certificate Rotation

The operator generates a self-signed CA and server certificate for the mutating webhook. These certificates are stored in a Kubernetes Secret (cloudtaser-operator-certs) and the CA bundle is injected into the MutatingWebhookConfiguration.

Certificate Validity

Certificate         Default Validity
CA certificate      10 years
Server certificate  365 days
Client certificate  365 days

Automatic Rotation

The operator checks certificate expiry every 24 hours. When a certificate is within 30 days of expiry:

  1. New server and client certificates are generated using the existing CA
  2. The Kubernetes Secret cloudtaser-operator-certs is updated
  3. The MutatingWebhookConfiguration CA bundle is patched
  4. The webhook server reloads the new certificate

No manual intervention or pod restart is required for automatic rotation.
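To check from the outside whether a certificate has entered the 30-day window, openssl's -checkend option can be used. The sketch below is an assumption-laden illustration: cert_needs_rotation is a hypothetical helper, and the example in the comments assumes the operator stores the server certificate under the tls.crt key of the Secret.

```shell
# Hypothetical check mirroring the 30-day window described above.
# Returns 0 when the PEM certificate in $1 expires within $2 days (default 30).
cert_needs_rotation() {
  pem_file="$1"; days="${2:-30}"
  # openssl -checkend exits 0 when the certificate is still valid after N seconds.
  if openssl x509 -in "$pem_file" -noout -checkend $((days * 86400)) >/dev/null; then
    return 1
  fi
  return 0
}

# Example (the tls.crt key name is an assumption):
#   kubectl get secret cloudtaser-operator-certs -n cloudtaser-system \
#     -o jsonpath='{.data.tls\.crt}' | base64 -d > /tmp/webhook.crt
#   cert_needs_rotation /tmp/webhook.crt && echo "rotation due"
```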

Manual Rotation

To force certificate regeneration:

# Delete the certificate Secret
kubectl delete secret cloudtaser-operator-certs -n cloudtaser-system

# Restart the operator to regenerate certificates
kubectl rollout restart deployment/cloudtaser-operator -n cloudtaser-system

The operator will:

  1. Detect the missing Secret on startup
  2. Generate a new CA and server certificate
  3. Create the Secret with the new certificates
  4. Patch the MutatingWebhookConfiguration with the new CA bundle
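The two commands and the follow-up verification can be combined into one function that waits for the rollout and confirms the Secret exists again. This is a sketch under assumptions: regenerate_operator_certs is a hypothetical helper, and the 120-second timeout is an arbitrary choice.

```shell
# Hypothetical wrapper around the manual rotation steps that also
# waits for the rollout to finish and checks the Secret was recreated.
regenerate_operator_certs() {
  ns="${1:-cloudtaser-system}"
  # --ignore-not-found keeps the call idempotent if the Secret is already gone.
  kubectl delete secret cloudtaser-operator-certs -n "$ns" --ignore-not-found || return 1
  kubectl rollout restart deployment/cloudtaser-operator -n "$ns" || return 1
  kubectl rollout status deployment/cloudtaser-operator -n "$ns" --timeout=120s || return 1
  # Fails if the operator has not recreated the Secret.
  kubectl get secret cloudtaser-operator-certs -n "$ns" -o name
}
```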

Providing Your Own Certificates

For production environments with an existing PKI, you can provide your own certificates via Helm values:

operator:
  webhook:
    certSecret: my-webhook-certs

The Secret must contain tls.crt and tls.key entries. In this case, the operator does not manage certificate rotation -- you are responsible for rotating the Secret and restarting the operator.
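A Secret with the expected tls.crt and tls.key entries can be produced with kubectl create secret tls. The sketch below renders the Secret client-side and applies it, so the same call works for the initial creation and for later rotations. rotate_webhook_secret is a hypothetical helper, and it assumes the Secret belongs in the operator's namespace.

```shell
# Hypothetical helper: create or replace the webhook TLS Secret, then
# restart the operator so it picks up the new certificate.
# Assumes the Secret lives in the operator's namespace.
rotate_webhook_secret() {
  cert="$1"; key="$2"; name="${3:-my-webhook-certs}"; ns="${4:-cloudtaser-system}"
  # Render the manifest without contacting the cluster, then apply it.
  kubectl create secret tls "$name" --cert="$cert" --key="$key" -n "$ns" \
    --dry-run=client -o yaml | kubectl apply -f - || return 1
  kubectl rollout restart deployment/cloudtaser-operator -n "$ns"
}

# Example:
#   rotate_webhook_secret ./webhook.crt ./webhook.key
```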


Emergency Procedures

Emergency procedures are disruptive

The following procedures are for incident response only. They will disrupt running workloads. Use them only when a key or credential is compromised.

Vault Token Revocation

Revoke a specific token (e.g., a compromised wrapper token):

# Revoke by accessor (does not require the token itself)
vault token revoke -accessor <accessor>

# Revoke a token directly
vault token revoke <token>

# Self-revoke (from the compromised context)
vault token revoke -self

After revocation:

  • The affected wrapper immediately loses access to Vault
  • The wrapper will log authentication errors and (depending on configuration) the pod may restart
  • Other pods and tokens are unaffected
  • The pod will re-authenticate via Kubernetes auth on restart
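When only an accessor is known, it can be useful to confirm where the token came from before revoking it. A minimal sketch, assuming an authenticated vault CLI with permission to look up accessors; revoke_by_accessor is a hypothetical helper.

```shell
# Hypothetical helper: show the auth path a token was issued from,
# then revoke it by accessor.
# Accessors can be enumerated with: vault list auth/token/accessors
revoke_by_accessor() {
  accessor="$1"
  path=$(vault token lookup -accessor -field=path "$accessor") || return 1
  echo "revoking token issued via $path"
  vault token revoke -accessor "$accessor"
}
```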

Transit Key Deletion (Irreversible)

This action is irreversible

Deleting a Transit key permanently destroys all key versions. All data encrypted with this key becomes permanently unrecoverable.

# First, allow deletion (disabled by default)
vault write transit/keys/cloudtaser/config deletion_allowed=true

# Delete the key (IRREVERSIBLE)
vault delete transit/keys/cloudtaser

After deletion:

  • All S3 objects encrypted with this key are permanently unreadable
  • New encryption operations will fail until a new key is created
  • Create a new key with the same name to resume operations: vault write -f transit/keys/cloudtaser

Only use this as a last resort when you need to guarantee that encrypted data can never be decrypted (e.g., data breach containment where the attacker may have copies of encrypted objects and wrapped DEKs).
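Given that the deletion cannot be undone, one option is to put the two commands behind an explicit confirmation. The following sketch requires the key name to be passed twice; delete_transit_key is a hypothetical helper and the confirmation scheme is only an illustration.

```shell
# Hypothetical guard: require the key name to be repeated before the
# irreversible delete. Nothing is sent to Vault unless both values match.
delete_transit_key() {
  key="$1"; confirm="$2"
  if [ -z "$key" ] || [ "$key" != "$confirm" ]; then
    echo "refusing: pass the key name twice to confirm deletion" >&2
    return 1
  fi
  vault write "transit/keys/$key/config" deletion_allowed=true || return 1
  vault delete "transit/keys/$key"
}
```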

Operator Certificate Regeneration

If the operator's TLS certificates are compromised:

# Delete the compromised certificates
kubectl delete secret cloudtaser-operator-certs -n cloudtaser-system

# Restart the operator to regenerate
kubectl rollout restart deployment/cloudtaser-operator -n cloudtaser-system

# Verify the webhook configuration contains a CA bundle
kubectl get mutatingwebhookconfiguration cloudtaser-operator-webhook -o yaml | grep caBundle

The operator will generate a new CA and server certificate. All webhook traffic will use the new certificates after the rollout completes.

Full Emergency Lockdown

In a severe compromise, revoke all CloudTaser access:

# 1. Revoke all tokens issued through the Kubernetes auth method
vault token revoke -mode=path auth/kubernetes

# 2. Disable the Kubernetes auth method entirely
vault auth disable kubernetes

# 3. Regenerate operator certificates
kubectl delete secret cloudtaser-operator-certs -n cloudtaser-system
kubectl rollout restart deployment/cloudtaser-operator -n cloudtaser-system

After lockdown:

  • All wrappers lose Vault access immediately
  • No new pods can authenticate
  • Applications continue running but cannot fetch new secrets or renew leases
  • Re-enable Kubernetes auth and reconfigure roles to restore access

Applications continue running during lockdown

Existing processes retain their in-memory secrets. They will continue to function until they need to fetch new secrets or renew leases. A lockdown does not terminate running applications.
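Restoring access after a lockdown means re-enabling the auth method, pointing it at the cluster, and recreating the role. The sketch below is a starting point under assumptions: restore_kubernetes_auth is a hypothetical helper, the API server URL is a placeholder you must supply, and your environment may need additional auth/kubernetes/config parameters (for example a CA certificate) that are not shown here.

```shell
# Hypothetical recovery sketch: re-enable Kubernetes auth and recreate the role.
restore_kubernetes_auth() {
  k8s_host="$1"   # cluster API server URL (placeholder, supply your own)
  [ -n "$k8s_host" ] || { echo "usage: restore_kubernetes_auth <api-server-url>" >&2; return 1; }
  vault auth enable kubernetes || return 1
  vault write auth/kubernetes/config kubernetes_host="$k8s_host" || return 1
  # Same role definition as in "Token TTL Configuration".
  vault write auth/kubernetes/role/cloudtaser \
    bound_service_account_names=default \
    bound_service_account_namespaces=cloudtaser-system \
    policies=cloudtaser ttl=1h max_ttl=24h
}
```

Running pods re-authenticate on their next restart once the role exists again.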