How We Built a Real-Time Kubernetes UI Using Informers

When we started building Podscape, the first version polled the Kubernetes API every three seconds. It worked. It was also wrong — stale state between polls, unnecessary load on the API server, and the UI felt sluggish compared to what engineers expected from a native tool.

The fix was informers. This is what we learned building a production Kubernetes UI on top of them.

The Problem With Polling

Every kubectl get pods --watch you've ever run is doing something smarter than your typical dashboard. It's not refreshing. It's watching.

Polling has three real problems for a UI:

Staleness. A pod crashes at t=0. Your poll fires at t=2.9s. The user is looking at stale state for nearly three seconds. In an incident, that's three seconds of confusion.

API server load. Multiply your poll interval by your user count and your resource types. A small dashboard hitting pods, deployments, services, events, and nodes every three seconds across ten clusters is making thousands of requests per minute to API servers that have real work to do.

Noise. Polling gives you the current state, not what changed. You either diff everything client-side or you miss transitions entirely — a pod that crashed and restarted between two polls looks healthy.

Informers solve all three.

What Informers Actually Are

An informer in client-go is a combination of three things:

A List call — fetch the current state of all resources on startup
A Watch stream — keep an HTTP/2 connection open and receive events as resources change
A local in-memory cache — so your application reads from local state, not the API server

The key insight: after the initial list, your application never hits the API server again for reads. It reads from a local cache that's kept current by the watch stream. This is how kubectl get with --watch is so fast — and it's how we built Podscape.


go
1import (
2    "k8s.io/client-go/informers"
3    "k8s.io/client-go/tools/cache"
4)
5
6factory := informers.NewSharedInformerFactory(clientset, 0)
7podInformer := factory.Core().V1().Pods()
8
9podInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
10    AddFunc: func(obj interface{}) {
11        pod := obj.(*v1.Pod)
12        broadcastPodEvent("ADDED", pod)
13    },
14    UpdateFunc: func(oldObj, newObj interface{}) {
15        pod := newObj.(*v1.Pod)
16        broadcastPodEvent("MODIFIED", pod)
17    },
18    DeleteFunc: func(obj interface{}) {
19        pod := obj.(*v1.Pod)
20        broadcastPodEvent("DELETED", pod)
21    },
22})
23
24stopCh := make(chan struct{})
25factory.Start(stopCh)
26factory.WaitForCacheSync(stopCh)

factory.WaitForCacheSync is the part everyone forgets. Without it, your handlers can fire before the initial list is complete and you'll serve partial state.

SharedInformer vs. Individual Informers

The Shared in SharedInformerFactory matters. If you naively create one informer per controller per resource type, each one opens its own watch connection. For a UI watching pods, deployments, replicasets, services, events, nodes, ingresses, and configmaps — that's eight watch streams per cluster, per user.

SharedInformerFactory ensures a single watch stream per resource type regardless of how many parts of your application are listening. Multiple event handlers share one stream.


go
// Both of these share a single watch connection for pods
podInformer1 := factory.Core().V1().Pods().Informer()
podInformer2 := factory.Core().V1().Pods().Informer() // same underlying informer

In Podscape, we register multiple handlers on the same informer — one to update the in-memory cache, one to push events to connected frontend clients via IPC.

Bridging to the Frontend

Informers run in your backend process. Getting events to the UI is a separate problem. In Podscape (an Electron app), we use Electron's IPC to push events from the Go backend to the renderer process:


go
1// Go backend sends events over a WebSocket to the Electron main process
2func broadcastPodEvent(eventType string, pod *v1.Pod) {
3    event := PodEvent{
4        Type:      eventType,
5        Name:      pod.Name,
6        Namespace: pod.Namespace,
7        Phase:     string(pod.Status.Phase),
8        Ready:     isPodReady(pod),
9        Restarts:  getRestartCount(pod),
10        Node:      pod.Spec.NodeName,
11        Age:       pod.CreationTimestamp.Time,
12    }
13    eventBus.Publish("pod.changed", event)
14}

The frontend subscribes to pod.changed and updates React state. The result is a UI that reflects reality within milliseconds of a change happening in the cluster — without a single poll.

The Resync Interval

The 0 in NewSharedInformerFactory(clientset, 0) is the resync period. Setting it to zero disables periodic resync. A non-zero value (e.g., 30 * time.Second) triggers UpdateFunc for every object in the cache at that interval, even if nothing changed.

We initially set it to 30 seconds thinking it was a safety net. It generated thousands of spurious update events that hammered our event bus and caused the UI to re-render constantly. Set it to zero unless you have a specific reason not to.


go
// Resync every 10 minutes as a safety net — not every 30 seconds
factory := informers.NewSharedInformerFactory(clientset, 10*time.Minute)

If you need a resync, 10 minutes is a more reasonable floor for a UI.

Namespace Scoping

By default, SharedInformerFactory watches all namespaces. For large clusters this means your informer caches every pod across every namespace. On a cluster with thousands of pods across 50 namespaces, this is a lot of memory and a lot of events.

Scope to the namespace the user is currently viewing:


go
factory := informers.NewSharedInformerFactoryWithOptions(
    clientset,
    10*time.Minute,
    informers.WithNamespace("production"),
)

In Podscape we maintain a factory per (cluster, namespace) pair and tear them down when the user switches context. This keeps memory bounded and avoids processing events the user doesn't care about.

What Goes Wrong in Production

Watch connection drops silently. Kubernetes API servers close watch connections after a timeout (typically around 5 minutes). client-go handles reconnection automatically, but it re-lists first. If your list takes 10 seconds on a large cluster, your UI goes stale during that window. We surfaced a "reconnecting..." indicator in Podscape rather than silently serving potentially stale state.

ResourceVersion tracking breaks on API server restart. The watch stream relies on ResourceVersion to resume from where it left off. If the API server restarts and prunes its event history, the client gets a 410 Gone and must restart from a fresh list. client-go handles this, but you'll see a spike in memory allocation as it re-lists. Watch your memory metrics after cluster upgrades.

DeleteFunc receives a tombstone, not the object. When an informer misses a delete event (because it was reconnecting), it delivers a cache.DeletedFinalStateUnknown tombstone instead of the actual object. Type-asserting directly to *v1.Pod panics:


go
1// Wrong — panics on tombstone
2DeleteFunc: func(obj interface{}) {
3    pod := obj.(*v1.Pod) // panic: interface conversion fails
4}
5
6// Right — handle tombstone
7DeleteFunc: func(obj interface{}) {
8    pod, ok := obj.(*v1.Pod)
9    if !ok {
10        tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
11        if !ok {
12            return
13        }
14        pod, ok = tombstone.Obj.(*v1.Pod)
15        if !ok {
16            return
17        }
18    }
19    broadcastPodEvent("DELETED", pod)
20}

This is documented in client-go but easy to miss. It will eventually panic in production.

Events resource is noisy. Kubernetes Events are a high-volume resource. A single pod crash can generate dozens of events in seconds. We filtered events to Warning type and deduplicated by reason + involvedObject.name before broadcasting to the frontend. Without this, the UI update queue backs up during incidents — exactly when you need it to be fast.

Multi-cluster connection management. Informers don't know about cluster identity. When you're watching three clusters simultaneously, your event handlers need to tag events with the cluster they came from. We learned this the hard way after shipping a version where pod events from cluster B were updating the pod list for cluster A.

Key Takeaways

Use WaitForCacheSync before serving any state — partial cache is worse than no cache
Set resync to zero or a long interval (10+ minutes) — periodic resync is not a free safety net
Always handle the tombstone case in DeleteFunc — it will panic in production eventually
Scope informers to the namespace the user is watching — all-namespace watching doesn't scale
Tag every event with cluster identity before it enters your event bus if you're multi-cluster

How We Built a Real-Time Kubernetes UI Using Informers

The Problem With Polling

What Informers Actually Are

SharedInformer vs. Individual Informers

Bridging to the Frontend

The Resync Interval

Namespace Scoping

What Goes Wrong in Production

Key Takeaways

Related Topics

Read Next

Introducing Podscape: The Premium Kubernetes IDE for Human Beings

The Kubernetes Periodic Table: Every Essential Tool Category Explained

Service Mesh Showdown: Istio vs Linkerd in 2026