In a previous article, I covered how to deploy Cloudflare Zero Trust in Kubernetes (Using Cloudflare Zero Trust to access your private resources).
In this article, we’ll cover how to set up Prometheus metrics and some monitors so we are alerted when our cloudflared agent is unhealthy or experiencing issues.
By default, the cloudflared agent exposes a Prometheus metrics endpoint on a randomised port when launched. You can see this in the pod’s logs when it starts up:
2023-10-10T11:25:06Z INF Starting metrics server on 127.0.0.1:46255/metrics
2023-10-10T21:12:46Z INF Starting metrics server on 127.0.0.1:42605/metrics
Because of this, we can’t simply expose a fixed port from our deployment via a service. We first need to pin the metrics endpoint to a specific port so that we can configure Prometheus to scrape it.
Setting up the agent metrics
From the Cloudflare documentation for the agent, we can see that we can specify the address of the metrics endpoint (in the format <ip>:<port>) via an environment variable.
To do this, we’ll add a new env value to our deployment that contains our runner pod as follows:
- name: cloudflared
- name: TUNNEL_METRICS
Feel free to use a different port; I used the last port the pod picked when it launched.
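For reference, here is roughly where that env entry sits in the deployment. Treat it as a minimal sketch rather than the exact manifest: the image, the 0.0.0.0 bind address, the port name and port 42605 (the last port in the log output above) are all assumptions. I bind to 0.0.0.0 here because an endpoint listening only on 127.0.0.1 isn’t reachable from outside the pod, so Prometheus wouldn’t be able to scrape it.

```yaml
# Fragment of the Deployment's pod template (spec.template.spec).
# Image, bind address, port name and port number are assumptions; adjust to your setup.
containers:
  - name: cloudflared
    image: cloudflare/cloudflared:latest  # assumed image
    env:
      - name: TUNNEL_METRICS
        # Format is <ip>:<port>; 0.0.0.0 makes the endpoint reachable from outside the pod
        value: "0.0.0.0:42605"
    ports:
      - name: metrics          # hypothetical port name
        containerPort: 42605   # must match the port in TUNNEL_METRICS
        protocol: TCP
```

Once the pod restarts, the "Starting metrics server" log line should show the address you configured instead of a random port.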
The next thing we’ll want to do is create a ClusterIP service that will expose the port to the Kubernetes cluster so it can be scraped by Prometheus.
Create a new file in your repository named service.yaml with the following contents:
- protocol: TCP
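A complete service.yaml built around that ports entry could look like the sketch below. The service name, the app: cloudflared label and selector, the port name and port 42605 are assumptions on my part, so swap in the labels your deployment actually uses and the port you set in TUNNEL_METRICS.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: cloudflared-metrics    # hypothetical name
  labels:
    app: cloudflared           # assumed label
spec:
  type: ClusterIP              # only reachable from inside the cluster
  selector:
    app: cloudflared           # assumed; must match the labels on the cloudflared pods
  ports:
    - name: metrics            # hypothetical port name
      protocol: TCP
      port: 42605              # port exposed by the service
      targetPort: 42605        # port the agent listens on (from TUNNEL_METRICS)
```

Keeping port and targetPort the same is just a convenience; the service port can be anything as long as targetPort matches the agent’s metrics port.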