Configure Envoy Ingress

Overview

Envoy Gateway implements the Kubernetes Gateway API and replaces ingress-nginx as the ingress controller for Kloudfuse. This guide covers new cluster installation, zero-downtime migration from ingress-nginx, rollback procedures, and post-uninstall cleanup.

Prerequisites

Requirements

  • Kloudfuse version 4.0.0 or later — Envoy Gateway is not compatible with earlier versions

  • Kubernetes cluster version 1.27 or later

  • kubectl configured with cluster access

  • Helm 3.x installed
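A quick preflight sketch along these lines can confirm the requirements above; it assumes kubectl and helm are on your PATH, and k8s_version_ok is a hypothetical helper name encoding the 1.27 minimum:

```shell
# Hypothetical helper: returns 0 when a "major.minor[.patch]" version string
# meets the 1.27 minimum from the prerequisites list.
k8s_version_ok() {
  major="${1%%.*}"
  rest="${1#*.}"
  minor="${rest%%.*}"
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 27 ]; }
}

# Client-side checks (no cluster access needed).
kubectl version --client 2>/dev/null || true
helm version --short 2>/dev/null || true

# Example checks against the minimum.
k8s_version_ok "1.27" && echo "1.27 meets the minimum"
k8s_version_ok "1.26" || echo "1.26 is below the minimum"
```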

Install Envoy CRDs

Install the CRDs before running helm install or helm upgrade. This uses the upstream Envoy Gateway CRDs chart:

helm template eg-crds oci://docker.io/envoyproxy/gateway-crds-helm \
  --set 'crds.gatewayAPI.enabled=true' \
  --set 'crds.envoyGateway.enabled=true' \
  --version v0.0.0-latest \
  | kubectl apply --server-side --force-conflicts -f -
bash
This command is idempotent — safe to run on clusters that already have the CRDs installed. The --server-side flag avoids annotation size limits on large CRDs.
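To confirm the CRDs landed, a sketch like this lists the installed CRDs belonging to the two gateway API groups; is_gateway_crd is a hypothetical helper name, while the group names come from the upstream Gateway API and Envoy Gateway projects:

```shell
# Hypothetical helper: returns 0 when a CRD name belongs to one of the two
# gateway API groups installed by the CRDs chart.
is_gateway_crd() {
  case "$1" in
    *.gateway.networking.k8s.io|*.gateway.envoyproxy.io) return 0 ;;
    *) return 1 ;;
  esac
}

# Print only the gateway-related CRDs on the cluster.
for crd in $(kubectl get crd -o name 2>/dev/null | sed 's|.*/||' || true); do
  is_gateway_crd "$crd" && echo "$crd"
done
```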

Install or Upgrade cert-manager

cert-manager v1.14+ with Gateway API support is required for automatic TLS certificate issuance.

cert-manager is cluster-wide but runs in a single namespace. If cert-manager is already installed on your cluster, upgrade the existing installation in its current namespace rather than installing a second instance.
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --install cert-manager jetstack/cert-manager \
  --namespace <cert-manager-namespace> \ (1)
  --version v1.17.1 \
  --set crds.enabled=true \
  --set config.enableGatewayAPI=true
bash
1 Use the namespace where cert-manager is already installed (e.g., cert-manager), not the Kloudfuse namespace. Run kubectl get pods -A | grep cert-manager to find it.
The config.enableGatewayAPI=true flag is required. Without it, cert-manager only watches Ingress resources and will not create certificates for Gateway resources.

Add the following TLS section to your custom_values.yaml to enable automatic certificate issuance:

tls:
  enabled: true
  host: <your-hostname>
  email: <your-email>
  clusterIssuer: <namespace>-letsencrypt-prod
  createClusterIssuer: true
yaml
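For instance, a filled-in version of this section might look like the following; the hostname, email, and the kfuse prefix on the issuer name are hypothetical placeholders following the <namespace>-letsencrypt-prod pattern above:

```yaml
tls:
  enabled: true
  host: observe.example.com          # hypothetical hostname
  email: ops@example.com             # hypothetical contact email
  clusterIssuer: kfuse-letsencrypt-prod  # <namespace>-letsencrypt-prod for the kfuse namespace
  createClusterIssuer: true
```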

New Install

For new clusters where ingress-nginx has never been installed, enable envoy-gateway and disable ingress-nginx:

envoy-gateway:
  enabled: true
  installGatewayRoutes: true
  envoyService:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: nlb
      service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: 'true'
      service.beta.kubernetes.io/aws-load-balancer-eip-allocations: <YOUR_EIP_ALLOC_IDS>
    patch:
      externalTrafficPolicy: Local
    external:
      enabled: true
    internal:
      enabled: true
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-internal: "true"

ingress-nginx:
  enabled: false
  installIngressRules: false
yaml
TLS settings do not change. The existing tls.enabled, tls.host, and tls.clusterIssuer work with envoy-gateway. To disable the HTTP listener on port 80 (HTTPS only), set envoy-gateway.enableHttp to false in your values file.
envoy-gateway:
  enableHttp: false
yaml

Upgrade from Nginx to Envoy

This procedure migrates an existing Kloudfuse installation from ingress-nginx to envoy-gateway without recreating load balancers or changing IPs or DNS, resulting in zero downtime.

The Nginx LoadBalancer Service is preserved throughout the migration. Instead of deleting it and creating a new Envoy LB, the Nginx service’s selector is repointed to Envoy proxy pods. The same NLB/LB keeps the same IPs and DNS.

Migration steps at a glance:

  • D0 (current): no value changes; traffic served by Nginx. Starting state.

  • Step 1 (prepare): enable Envoy and envoyMigration.enabled; traffic still served by Nginx. Envoy starts alongside Nginx, the Nginx LB gets the resource-policy: keep annotation, and Envoy creates ClusterIP services (no new LB).

  • Step 2 (switch): set envoyMigration.external/internal: true; traffic served by Envoy. The Nginx LB selector switches to Envoy pods, targetPorts change to 10080/10443, and traffic flows through Envoy via the same LB.

  • Step 3 (cleanup): set ingress-nginx.enabled: false; traffic served by Envoy. The Nginx controller is removed; the Nginx LB Service is kept (resource-policy annotation).
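The Step 2 switch can be pictured as a patch to the existing Nginx LoadBalancer Service, roughly like this sketch (the chart applies the change for you; the selector labels shown are assumptions for illustration):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kfuse-ingress-nginx-controller
  annotations:
    helm.sh/resource-policy: keep   # added in Step 1 so Helm never deletes the LB
spec:
  type: LoadBalancer                # unchanged: same NLB/LB, same IPs and DNS
  selector:
    app.kubernetes.io/managed-by: envoy-gateway   # repointed to Envoy proxy pods (label assumed)
  ports:
    - name: http
      port: 80
      targetPort: 10080             # Envoy proxy listener (was 80 on Nginx)
    - name: https
      port: 443
      targetPort: 10443             # Envoy proxy listener (was 443 on Nginx)
```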

Step 1: Enable Envoy

  1. Add the envoy-gateway section and the migration flag to your custom_values.yaml:

    global:
      envoyMigration:
        enabled: true
        external: false
        internal: false
    
    envoy-gateway:
      enabled: true
      installGatewayRoutes: true
      envoyService:
        external:
          enabled: true
        # Enable if you have an internal LB (1)
        # internal:
        #   enabled: true
    yaml
    1 Enable if you have an internal LB
  2. Run helm upgrade:

    helm upgrade --install kfuse oci://us-east1-docker.pkg.dev/mvp-demo-301906/kfuse-helm/kfuse \
      -n kfuse \
      --version <VERSION> \ (1)
      -f custom_values.yaml
    1 Replace <VERSION> with a valid Kloudfuse release value. See Release Notes for the latest release.

After completion, verify:

# Envoy pods should be running
kubectl get pods -n <namespace> | grep envoy

# Gateway should be Programmed: False
kubectl get gateway -n <namespace>

# Nginx should still be serving (302 redirect to login)
curl -sk https://your-hostname/ -o /dev/null -w "%{http_code}"

# Nginx LB should have keep annotation
kubectl get svc kfuse-ingress-nginx-controller -n <namespace> \
  -o jsonpath='{.metadata.annotations.helm\.sh/resource-policy}'
bash

Step 2: Switch Traffic to Envoy

  1. Change global.envoyMigration.external to true in your custom_values.yaml. If you have an internal LB, also set internal: true:

    global:
      envoyMigration:
        enabled: true
        external: true
        # Set to true if you have an internal Nginx LB to switch (1)
        internal: false
    yaml
    1 If you have an internal Nginx LB to switch, change to true
  2. Run helm upgrade:

    helm upgrade --install kfuse oci://us-east1-docker.pkg.dev/mvp-demo-301906/kfuse-helm/kfuse \
      -n kfuse \
      --version <VERSION> \ (1)
      -f custom_values.yaml
    1 Replace <VERSION> with a valid Kloudfuse release value. See Release Notes for the latest release.

After completion, verify:

# Selector should point to envoy, with targetPorts: 10080 10443
export NAMESPACE="<namespace>"
kubectl get svc kfuse-ingress-nginx-controller -n $NAMESPACE \
  -o jsonpath='selector: {.spec.selector}{"\n"}targetPorts: {.spec.ports[*].targetPort}{"\n"}'

# Should return 401 (Envoy auth) instead of 302 (Nginx redirect)
curl -sk https://your-hostname/ -o /dev/null -w "%{http_code}"
bash
AWS NLB health checks may take 15-30 seconds to converge after the targetPort change. This is transient.
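While the NLB converges, a sketch along these lines can poll until Envoy answers; the 302-versus-401 distinction comes from the verification commands above, your-hostname is a placeholder, and classify_ingress is a hypothetical helper:

```shell
# Hypothetical helper: classify which ingress answers based on the HTTP status
# of "/": Nginx redirects to the login page (302), Envoy returns an auth
# challenge (401).
classify_ingress() {
  case "$1" in
    302) echo nginx ;;
    401) echo envoy ;;
    *)   echo unknown ;;
  esac
}

HOST="your-hostname"   # placeholder: substitute your real hostname
for attempt in 1 2 3 4; do
  code=$(curl -sk "https://$HOST/" -o /dev/null -w '%{http_code}' || true)
  who=$(classify_ingress "$code")
  echo "attempt $attempt: HTTP $code ($who)"
  [ "$who" = "envoy" ] && break
  sleep 5
done
```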

Step 3: Disable Nginx

  1. Set ingress-nginx.enabled to false in your custom_values.yaml:

    ingress-nginx:
      enabled: false
      installIngressRules: false
    yaml
  2. Run helm upgrade:

    helm upgrade --install kfuse oci://us-east1-docker.pkg.dev/mvp-demo-301906/kfuse-helm/kfuse \
      -n kfuse \
      --version <VERSION> \ (1)
      -f custom_values.yaml
    1 Replace <VERSION> with a valid Kloudfuse release value. See Release Notes for the latest release.

The Nginx controller pod is removed. The Nginx LB Service is not deleted because of the resource-policy: keep annotation — it continues routing traffic to Envoy pods.

Step 4: Verification

export NAMESPACE="<namespace>"

# Nginx controller pod should be gone; Envoy pods should be running
kubectl get pods -n $NAMESPACE | grep -E "nginx|envoy"

# Nginx LB service should still exist; selector and targetPorts should point to Envoy
kubectl get svc kfuse-ingress-nginx-controller -n $NAMESPACE \
  -o jsonpath='selector: {.spec.selector}{"\n"}targetPorts: {.spec.ports[*].targetPort}{"\n"}'

# Gateway should show Programmed: False
kubectl get gateway -n $NAMESPACE

# All HTTPRoutes should show Accepted and Resolved
kubectl get httproute -n $NAMESPACE

# HTTPS should return 401 (Envoy auth challenge) — not 302 (Nginx redirect)
curl -sk https://your-hostname/ -o /dev/null -w "%{http_code}\n"
bash

Rollback to Nginx

To revert from Envoy back to Nginx:

  1. Remove the helm.sh/resource-policy annotation from the Nginx LoadBalancer Services:

    export NAMESPACE="<namespace>"
    kubectl annotate svc kfuse-ingress-nginx-controller \
      helm.sh/resource-policy- -n $NAMESPACE
    kubectl annotate svc kfuse-ingress-nginx-controller-internal \
      helm.sh/resource-policy- -n $NAMESPACE  # if internal LB exists
    bash
  2. Update the custom_values.yaml to disable Envoy and re-enable Nginx:

    global:
      envoyMigration:
        enabled: false
        external: false
        internal: false
    
    envoy-gateway:
      enabled: false
      installGatewayRoutes: false
    
    ingress-nginx:
      enabled: true
      installIngressRules: true
    yaml
  3. Run helm upgrade:

    helm upgrade --install kfuse oci://us-east1-docker.pkg.dev/mvp-demo-301906/kfuse-helm/kfuse \
      -n kfuse \
      --version <VERSION> \ (1)
      -f custom_values.yaml
    1 Replace <VERSION> with a valid Kloudfuse release value. See Release Notes for the latest release.
  4. Clean up orphaned Envoy resources:

    export NAMESPACE="<namespace>"
    
    # Delete controller-managed proxy Deployments and Services
    kubectl delete deploy -l app.kubernetes.io/managed-by=envoy-gateway -n $NAMESPACE
    kubectl delete svc -l app.kubernetes.io/managed-by=envoy-gateway -n $NAMESPACE
    
    # Remove GatewayClass finalizers and delete them
    for gc in $(kubectl get gatewayclass -o name | grep "$NAMESPACE"); do
      kubectl patch "$gc" --type=merge -p '{"metadata":{"finalizers":[]}}'
      kubectl delete "$gc"
    done
    bash
  5. Verify the rollback:

    export NAMESPACE="<namespace>"
    
    # Nginx controller pod should be running; no Envoy proxy pods should remain
    kubectl get pods -n $NAMESPACE | grep -E "nginx|envoy"
    
    # Nginx LB selector should point back to Nginx pods; targetPorts should be 80/443
    kubectl get svc kfuse-ingress-nginx-controller -n $NAMESPACE \
      -o jsonpath='selector: {.spec.selector}{"\n"}targetPorts: {.spec.ports[*].targetPort}{"\n"}'
    
    # HTTPS should return 302 (Nginx login redirect) — not 401
    curl -sk https://your-hostname/ -o /dev/null -w "%{http_code}\n"
    
    # Confirm no orphaned Envoy resources remain
    kubectl get all -l app.kubernetes.io/managed-by=envoy-gateway -n $NAMESPACE
    kubectl get gatewayclass | grep $NAMESPACE
    bash

Uninstall

When running helm delete (or helm uninstall), the envoy-gateway controller is removed but the resources it created at runtime are not deleted by Helm. These orphaned resources must be cleaned up manually.

The orphaned resources are the Envoy Deployments, Services, and ConfigMaps created at runtime by the envoy-gateway controller (not by Helm templates). Since Helm did not create them, it does not track or delete them.

Cleanup Steps

After helm delete kfuse -n <namespace>:

# 1. Delete controller-managed proxy Deployments and Services
export NAMESPACE="<namespace>"
kubectl delete deploy -l app.kubernetes.io/managed-by=envoy-gateway -n $NAMESPACE
kubectl delete svc -l app.kubernetes.io/managed-by=envoy-gateway -n $NAMESPACE
kubectl delete configmap -l app.kubernetes.io/managed-by=envoy-gateway -n $NAMESPACE

# 2. Remove GatewayClass finalizers and delete them (cluster-scoped)
for gc in $(kubectl get gatewayclass -o name | grep "$NAMESPACE"); do
  kubectl patch "$gc" --type=merge -p '{"metadata":{"finalizers":[]}}'
  kubectl delete "$gc"
done

# 3. Verify no orphaned resources remain
kubectl get all -l app.kubernetes.io/managed-by=envoy-gateway -n $NAMESPACE
kubectl get gatewayclass | grep $NAMESPACE
bash
Shared clusters — Do NOT delete Gateway API or Envoy Gateway CRDs on clusters where other namespaces also use envoy-gateway. CRDs are cluster-scoped; deleting them removes all Gateway, HTTPRoute, and SecurityPolicy resources across all namespaces. Only delete the GatewayClass resources for your own namespace (step 2 above).

If a CRD gets stuck in a terminating state, remove the finalizer and reinstall:

kubectl patch crd <crd-name> \
  -p '{"metadata":{"finalizers":[]}}' --type=merge
bash

Then re-run the CRD install command from Prerequisites.

If using the Nginx LB service repointing (migration), the Nginx LB service will also survive helm delete due to the helm.sh/resource-policy: keep annotation. Delete it manually if no longer needed: kubectl delete svc kfuse-ingress-nginx-controller -n <namespace>

Troubleshooting

Gateway Shows Programmed: False

If you migrated from Nginx, the Gateway resource reports Programmed: False. Because the existing Nginx LB is reused instead of a new one being created, the Gateway object never has an address assigned to it directly. This is expected and harmless.

Failed to Install CRDs

Error Message:

Error: failed to install CRD crds/gatewayapi-crds.yaml: 10 errors occurred:
        * customresourcedefinitions.apiextensions.k8s.io "gatewayclasses.gateway.networking.k8s.io" is forbidden: ValidatingAdmissionPolicy 'safe-upgrades.gateway.networking.k8s.io' with binding 'safe-upgrades.gateway.networking.k8s.io' denied request: Installing CRDs with version before v1.5.0 is prohibited by default. Uninstall ValidatingAdmissionPolicy safe-upgrades.gateway.networking.k8s.io to install older versions.

There are two potential causes:

  • The cluster already has Gateway API CRDs at v1.5.0 or later, and the version you tried to install resolves to something older. The safe-upgrades ValidatingAdmissionPolicy blocks this accidental downgrade by design.

  • A previous kfuse installation was deleted from the cluster without running the Cleanup Steps. The leftover envoy-gateway resources conflict with the new install and prevent the CRDs from being updated.
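To see which Gateway API version the cluster already has, you can read the bundle-version annotation that the upstream Gateway API CRDs carry; the version_ge helper below is a hypothetical sketch:

```shell
# Read the Gateway API bundle version from an installed CRD (the annotation is
# set by the upstream Gateway API project).
installed=$(kubectl get crd gatewayclasses.gateway.networking.k8s.io \
  -o jsonpath='{.metadata.annotations.gateway\.networking\.k8s\.io/bundle-version}' \
  2>/dev/null || true)
echo "installed Gateway API bundle: ${installed:-none}"

# Hypothetical helper: returns 0 when version $1 is at least version $2
# (relies on sort -V, available in GNU coreutils).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | tail -n1)" = "$1" ]
}

if [ -n "$installed" ] && ! version_ge "${installed#v}" "1.5.0"; then
  echo "cluster has pre-v1.5.0 CRDs; install a newer bundle instead of downgrading"
fi
```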

AWS NLB Hairpin

On AWS, cert-manager’s HTTP-01 self-check may fail during initial certificate issuance because pods cannot reach the external NLB from inside the cluster. This only affects the first-time cert setup — not ongoing operations. Not applicable to GCP.

If the certificate takes too long to issue, add a hostAliases entry to cert-manager that maps your domain to the envoy ClusterIP inside the cluster:

export KFUSE_NAMESPACE="kfuse"
export CERT_MANAGER_NAMESPACE="cert-manager"  # the namespace where cert-manager is installed
export HOSTNAME="observe.example.com"

ENVOY_IP=$(kubectl get svc envoy -n $KFUSE_NAMESPACE \
  -o jsonpath='{.spec.clusterIP}')

helm upgrade cert-manager jetstack/cert-manager \
  --namespace $CERT_MANAGER_NAMESPACE \
  --reuse-values \
  --set "hostAliases[0].ip=$ENVOY_IP" \
  --set "hostAliases[0].hostnames[0]=$HOSTNAME"
bash
If the envoy service is recreated (new ClusterIP), update the hostAlias with the new IP.

Multi-Tenant Clusters

When multiple Kloudfuse instances share the same cluster, each namespace must scope its envoy-gateway controller:

envoy-gateway:
  config:
    envoyGateway:
      provider:
        kubernetes:
          watch:
            type: Namespaces
            namespaces:
              - <your-namespace>
yaml

The chart automatically uses a namespace-specific controllerName to prevent cross-namespace interference.