Monitoring Cloudflared with Prometheus

tjtharrison
Nov 6, 2023

In a previous article, I covered how to deploy Cloudflare Zero Trust in Kubernetes (Using Cloudflare Zero Trust to access your private resources).

In this article, we’re going to cover how to set up Prometheus metrics and some monitors so we can be alerted if our Cloudflared agent is unhealthy or experiencing issues.


Default behaviour

By default, the cloudflared agent exposes a Prometheus metrics endpoint on a randomised port when it launches. You can see this in the pod’s logs at startup.

First pod:

2023-10-10T11:25:06Z INF Starting metrics server on 127.0.0.1:46255/metrics

Second pod:

2023-10-10T21:12:46Z INF Starting metrics server on 127.0.0.1:42605/metrics

Because of this, we can’t simply expose a specific port from our deployment via a service; we first need to pin the metrics endpoint to a fixed port so that we can configure Prometheus to scrape it.
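Before pinning the port, you can confirm which randomised port a running pod picked by parsing its logs. A minimal shell sketch — the deployment name and namespace here are assumptions based on this setup:

```shell
# Show the startup line announcing the metrics server
# (assumes a deployment named cloudflared in the cloudflare namespace).
kubectl logs -n cloudflare deploy/cloudflared | grep "Starting metrics server"

# Extract just the port from a log line like the ones above:
echo "2023-10-10T11:25:06Z INF Starting metrics server on 127.0.0.1:46255/metrics" \
  | sed -E 's#.*on [0-9.]+:([0-9]+)/metrics#\1#'
# → 46255
```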

Setting up the agent metrics

From the Cloudflare documentation for the agent, we can see that we can configure the metrics endpoint (in the format <ip>:<port>) via the TUNNEL_METRICS environment variable.

To do this, we’ll add a new env value to our deployment that contains our runner pod as follows:

spec:
  template:
    spec:
      containers:
        - name: cloudflared
          env:
            - name: TUNNEL_METRICS
              value: "0.0.0.0:49500"

Feel free to use a different port; I reused the last port that the pod picked when it launched.
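While editing the deployment, you may also want to declare the port on the container itself. This is optional and an assumption on my part, but naming the port documents it and lets a Service reference it by name. A sketch:

```yaml
spec:
  template:
    spec:
      containers:
        - name: cloudflared
          env:
            - name: TUNNEL_METRICS
              value: "0.0.0.0:49500"
          ports:
            # Declaring the containerPort is informational in Kubernetes,
            # but the name ("metrics") can be used as a targetPort later.
            - name: metrics
              containerPort: 49500
```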

The next thing we’ll want to do is create a ClusterIP service that will expose the port to the Kubernetes cluster so it can be scraped by Prometheus.

Create a new file in your repository named service.yaml with the following contents:

apiVersion: v1
kind: Service
metadata:
  name: cloudflared-metrics
  namespace: cloudflare
spec:
  selector:
    app: cloudflared
  ports:
    - protocol: TCP
      port: 49500
      targetPort: 49500
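With the Service in place, Prometheus needs a scrape job pointing at it. A minimal sketch using a static scrape config — the job name and scrape interval are my assumptions, and if you run the Prometheus Operator you would likely use a ServiceMonitor instead:

```yaml
# prometheus.yml fragment — scrape the cloudflared metrics Service
# via its in-cluster DNS name (<service>.<namespace>.svc).
scrape_configs:
  - job_name: cloudflared
    scrape_interval: 30s
    static_configs:
      - targets:
          - cloudflared-metrics.cloudflare.svc:49500
```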

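To actually be alerted on agent health, we can layer alerting rules on top of those metrics. A sketch of two starter rules — the job label matches the scrape job assumed above, and cloudflared_tunnel_ha_connections is the gauge the agent exposes for active tunnel connections (check your own /metrics output for the exact names available in your version):

```yaml
# Prometheus alerting rules fragment.
groups:
  - name: cloudflared
    rules:
      # Fires when Prometheus can no longer scrape the agent at all.
      - alert: CloudflaredScrapeDown
        expr: up{job="cloudflared"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: cloudflared metrics endpoint is down
      # Fires when the agent reports no active connections to Cloudflare.
      - alert: CloudflaredNoTunnelConnections
        expr: cloudflared_tunnel_ha_connections < 1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: cloudflared has no active tunnel connections
```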