So, last week our Kibana dashboard kept crashing due to random 502 errors. The errors did not happen when using kubectl port-forward, so I assumed something was wrong with the load balancer config, but couldn’t find anything to tweak. After a response from support, the most likely cause is that Kibana’s keepalive timeout default is lower than the GCP load balancer’s, as described here: https://blog.percy.io/tuning-nginx-behind-google-cloud-platform-http-s-load-balancer-305982ddb340
The setup is:
GCP Load balancer -> Load Balancer Ingress (https via gke-managed-certs) -> Kube service -> Kube deployment
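For reference, the HTTPS part of that chain looks roughly like this (names, domain, and backend are examples, not the actual manifests from this setup):
apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: kibana-cert
spec:
  domains:
    - kibana.example.com
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kibana
  annotations:
    kubernetes.io/ingress.class: "gce"                   # provisions the external GCP HTTP(S) load balancer
    networking.gke.io/managed-certificates: kibana-cert  # from gke-managed-certs
spec:
  defaultBackend:
    service:
      name: kibana
      port:
        number: 80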
After giving up on changing the keepalive settings for Kibana, which isn’t supported, I decided to throw an nginx proxy in front of Kibana as a sidecar container proxying requests. To insert the sidecar I added another container to the deployment:
- name: kibana-sidecar-container-nginx
  # Simple sidecar: nginx-keepalive
  image: naturalcycles/gcp-iap-proxy:test-keepalive
  ports:
    - containerPort: 80
  env:
    - name: UPSTREAM_URL
      value: "http://127.0.0.1:5601" # This points at the main kibana container
This seems to have resolved all of the 502s 🎉 In many cases you can probably just give the existing deployment better keepalive settings, but this is a simple-to-verify, generic quick fix that’s reasonably stable.
To point the service at nginx instead, I first tested with kubectl port-forward to the pod, and once that worked I simply changed the service to target pod port 80 instead of 5601.
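The resulting Service looks roughly like this (name and selector are assumed; the only relevant change is targetPort):
apiVersion: v1
kind: Service
metadata:
  name: kibana
spec:
  selector:
    app: kibana
  ports:
    - name: http
      port: 80
      targetPort: 80   # was 5601; now hits the nginx sidecar instead of Kibana directly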
Going deeper into the nginx container, it’s a very minimal nginx reverse proxy…
nginx.conf:
events {
}

http {
  # Settings to play nice with the GCP load balancer, which keeps backend
  # connections alive for up to 600s: keep ours open longer than that
  keepalive_timeout 650;
  keepalive_requests 10000;

  server {
    listen 80;
    listen [::]:80;

    location / {
      proxy_pass $UPSTREAM_URL; # substituted by envsubst at startup to keep the image generic
    }
  }
}
And finally the Dockerfile that includes the config and substitutes the env vars:
FROM nginx:alpine
COPY nginx.conf /etc/nginx/nginx.conf.template
# Substitute $UPSTREAM_URL at startup (writing back to the same file we read from
# would truncate it before envsubst reads it), then run nginx in the foreground
CMD envsubst '$UPSTREAM_URL' < /etc/nginx/nginx.conf.template > /etc/nginx/nginx.conf && nginx -g 'daemon off;'
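To sanity-check the image outside the cluster, a quick local run along these lines should do (image tag and upstream are just examples; host.docker.internal assumes Docker Desktop):
docker build -t nginx-keepalive .
docker run --rm -p 8080:80 -e UPSTREAM_URL=http://host.docker.internal:5601 nginx-keepalive
curl -i http://localhost:8080/   # should come back from Kibana via the proxy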