Kubernetes Logging with Fluent Bit and Grafana Loki
Kubernetes generates three kinds of logs: container stdout/stderr, Kubernetes API audit logs, and system component logs. Getting them to a centralised, queryable store requires a collection agent on every node and a storage backend designed for log aggregation. Fluent Bit and Grafana Loki are the lightweight, cloud-native answer.

Elasticsearch was the default log store for Kubernetes for years — powerful, but operationally expensive. Loki (Grafana Labs, CNCF) takes a different approach: it indexes only log labels (not the full log content), dramatically reducing storage and compute costs at the price of slower full-text search. For most Kubernetes use cases — finding logs for a specific pod, filtering by namespace, correlating logs with traces — Loki's label-based model is the right trade-off.
Fluent Bit (CNCF) is the collection agent: a small DaemonSet that reads container log files from each node and forwards them to Loki (or to Elasticsearch, CloudWatch, S3, or any other output). It comes from the same Fluent project as Fluentd but is far lighter: a single small C binary that typically runs in roughly 50MB of memory on a busy node, a fraction of what a comparable Fluentd deployment needs.
Kubernetes Log Architecture
Container stdout/stderr is written to /var/log/pods/<namespace>_<pod>_<uid>/<container>/ on each node. The kubelet creates symlinks at /var/log/containers/<pod>_<namespace>_<container>-<containerID>.log pointing to these files. Fluent Bit reads the symlinks at /var/log/containers/*.log as its input source.
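You can check this layout directly on a node. A minimal sketch using an ephemeral debug container (kubectl debug mounts the node's root filesystem at /host; <node-name> is a placeholder):
# Open a shell on a node (any node running workloads)
kubectl debug node/<node-name> -it --image=busybox -- sh

# Inside the debug pod, the node filesystem is mounted under /host
ls -l /host/var/log/containers/ | head    # kubelet-created symlinks: <pod>_<namespace>_<container>-<id>.log
ls /host/var/log/pods/ | head             # the per-pod directories the symlinks point to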
Container (stdout/stderr)
↓
Node filesystem: /var/log/containers/*.log
↓
Fluent Bit (DaemonSet, one per node)
↓ tail input, kubernetes filter (enrich with pod metadata)
Loki (stores with labels: namespace, pod, container, node, app)
↓
Grafana (LogQL queries, dashboards, alerting)
Each log line is enriched by Fluent Bit's kubernetes filter with pod metadata from the Kubernetes API: pod name, namespace, container name, pod labels, and node name. This enrichment is what makes Kubernetes log exploration ergonomic — you can filter by app=payments-api without any log-format knowledge.
Installation
Loki (Distributed Mode via Helm)
For production, use Loki in distributed mode (separate read/write/backend microservices) rather than the monolithic single binary:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

helm install loki grafana/loki \
  --namespace monitoring \
  --create-namespace \
  --values loki-values.yaml

# loki-values.yaml — distributed mode with S3 backend
loki:
  auth_enabled: false   # Disable multi-tenancy for single-cluster deployments;
                        # enable and use the X-Scope-OrgID header for multi-tenant setups

  storage:
    type: s3
    s3:
      region: us-east-1
      bucketnames: my-org-loki-logs
      s3ForcePathStyle: false   # Use virtual-hosted-style (path style is deprecated)

  limits_config:
    retention_period: 744h   # 31 days (deletion also requires retention_enabled: true in the compactor config)
    max_streams_per_user: 100000
    ingestion_rate_mb: 50
    ingestion_burst_size_mb: 100

deploymentMode: Distributed

ingester:
  replicas: 3

querier:
  replicas: 2

queryFrontend:
  replicas: 2

distributor:
  replicas: 2

compactor:
  replicas: 1
  persistence:
    enabled: true
    storageClass: gp3
    size: 20Gi

# ServiceAccount with IRSA for S3 access
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789:role/loki-s3-role

For development or small clusters, the loki-stack chart installs a single-binary Loki alongside Promtail (simpler but not production-grade):
# promtail.enabled=false because Fluent Bit handles collection in this guide
helm install loki-stack grafana/loki-stack \
  --namespace monitoring \
  --set promtail.enabled=false \
  --set loki.persistence.enabled=true \
  --set loki.persistence.storageClassName=gp3 \
  --set loki.persistence.size=50Gi
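Whichever install path you choose, confirm Loki is healthy before pointing Fluent Bit at it. A quick check, assuming the release names above (the loki-gateway Service name and labels come from current grafana/loki chart defaults and may differ in older chart versions):
# All Loki pods should reach Running/Ready
kubectl get pods -n monitoring -l app.kubernetes.io/name=loki

# The gateway Service is the endpoint Fluent Bit pushes to (loki-gateway.monitoring.svc.cluster.local:80)
kubectl get svc loki-gateway -n monitoring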
Fluent Bit
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
helm install fluent-bit fluent/fluent-bit \
--namespace monitoring \
  --values fluent-bit-values.yaml
Fluent Bit Configuration
# fluent-bit-values.yaml
config:
  inputs: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        Exclude_Path      /var/log/containers/*_kube-system_*.log   # Exclude kube-system (tune as needed)
        Refresh_Interval  5
        Mem_Buf_Limit     50MB
        Skip_Long_Lines   On
        Tag               kube.*
        multiline.parser  cri,docker    # Support both CRI (containerd) and Docker log formats
        storage.type      filesystem    # Use the disk buffer configured in [SERVICE] below

  filters: |
    # Enrich log records with Kubernetes metadata (pod name, namespace, labels, etc.)
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.var.log.containers.
        Merge_Log           On              # Merge JSON-formatted log messages into structured fields
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On              # Read parser annotation from pod spec
        K8S-Logging.Exclude On              # Allow pods to exclude themselves from logging

    # Drop health check and readiness probe logs (high volume, low value)
    [FILTER]
        Name     grep
        Match    kube.*
        Exclude  log .*kube-probe.*
        Exclude  log .*health.*

    # Add cluster name label for multi-cluster Loki tenancy
    [FILTER]
        Name     record_modifier
        Match    kube.*
        Record   cluster production-us-east-1

  outputs: |
    [OUTPUT]
        Name         loki
        Match        kube.*
        Host         loki-gateway.monitoring.svc.cluster.local
        Port         80
        Labels       job=fluentbit, cluster=$cluster, namespace=$kubernetes.namespace_name, pod=$kubernetes.pod_name, container=$kubernetes.container_name, node=$kubernetes.host, app=$kubernetes.labels.app
        Remove_keys  kubernetes,stream,time   # Remove fields already carried as Loki labels
        Line_Format  json
        Retry_Limit  False                    # Retry indefinitely; with disk buffering this minimises log loss

  service: |
    [SERVICE]
        Flush                 1
        Daemon                Off
        Log_Level             warn
        Parsers_File          /fluent-bit/etc/parsers.conf
        HTTP_Server           On       # Prometheus metrics at :2020
        HTTP_Listen           0.0.0.0
        HTTP_Port             2020
        Health_Check          On       # Exposes /api/v1/health
        storage.path          /var/log/flb-storage/   # Disk buffering for backpressure
        storage.sync          normal
        storage.checksum      Off
        storage.max_chunks_up 128

# DaemonSet — one Fluent Bit pod per node
daemonSetVolumes:
  - name: varlog
    hostPath:
      path: /var/log
  - name: flb-storage
    hostPath:
      path: /var/log/flb-storage

daemonSetVolumeMounts:
  - name: varlog
    mountPath: /var/log
  - name: flb-storage
    mountPath: /var/log/flb-storage

resources:
  requests:
    cpu: 50m
    memory: 64Mi
  limits:
    cpu: 200m
    memory: 256Mi

tolerations:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
  - operator: Exists   # Run on all nodes, including tainted ones
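Once the chart is installed, confirm there is a healthy collector on every node and that records are flowing to the Loki output. A sketch assuming the fluent-bit release name and the HTTP server and health check enabled in the SERVICE section above:
# One Fluent Bit pod per node, all Ready
kubectl get daemonset fluent-bit -n monitoring
kubectl get pods -n monitoring -l app.kubernetes.io/name=fluent-bit -o wide

# Health and Prometheus metrics from one collector
kubectl port-forward -n monitoring ds/fluent-bit 2020:2020 &
curl -s http://localhost:2020/api/v1/health
curl -s http://localhost:2020/api/v1/metrics/prometheus | grep -E 'fluentbit_output_(proc_records|errors|retries)_total'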
Loki Label Design
Loki is not Elasticsearch — don't try to index every log field as a label. Labels are used to partition log streams; high-cardinality labels (pod name, trace ID, request ID) create millions of streams and degrade performance.
Good labels (low cardinality, always useful for filtering):
- namespace — filter by team or environment
- app or service — filter by application
- cluster — for multi-cluster setups
- container — distinguish sidecar logs
Bad labels (high cardinality — keep these in the log line, not in labels; see the query sketch below):
- pod — hundreds of pod names per application (the config above includes it for convenience; drop it at scale)
- trace_id — every request is unique
- request_id
- user_id
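In practice that means you narrow a query with low-cardinality labels first and then line-filter for the high-cardinality value. A sketch with logcli pointed at Loki (the trace ID is illustrative):
# Find one trace's log lines without a trace_id label:
# select the stream by namespace/app, then filter the line content
logcli query --since=1h '{namespace="production", app="payments-api"} |= "trace_id=4bf92f3577b34da6"'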
For structured log fields you want to search (but not label), parse them at query time with LogQL's logfmt or json stages; Loki 3.x's experimental bloom filters can further accelerate these searches over unindexed content.
LogQL Querying
LogQL is Loki's query language. It combines label selectors (like PromQL) with log filtering operations:
# All logs from the payments-api app in the production namespace
{namespace="production", app="payments-api"}

# Filter by log level (string match)
{namespace="production", app="payments-api"} |= "ERROR"

# Structured log parsing (logfmt)
{namespace="production", app="payments-api"}
  | logfmt
  | level="error"
  | duration > 1s

# JSON log parsing
{namespace="production", app="payments-api"}
  | json
  | status_code >= 500

# Count errors per minute (metric query)
sum(rate({namespace="production"} |= "ERROR" [1m])) by (app)

# Recent error logs for a specific pod (use the limit parameter in the API or Grafana's line limit)
{pod="payments-api-xxxxx-yyyyy"} |= "ERROR"
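These queries run unchanged in Grafana's Explore view; for ad-hoc use from a terminal, logcli (shipped alongside Loki) speaks the same LogQL. A minimal sketch, assuming the loki-gateway Service from the install section:
# Point logcli at Loki through a local port-forward
kubectl port-forward -n monitoring svc/loki-gateway 3100:80 &
export LOKI_ADDR=http://localhost:3100

# Log query and metric query from the CLI
logcli query --since=1h --limit=100 '{namespace="production", app="payments-api"} |= "ERROR"'
logcli query --since=1h 'sum(rate({namespace="production"} |= "ERROR" [5m])) by (app)'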
Alerting on Logs
Warning: The AlertingRule CRD requires the Loki Operator (a separate Kubernetes operator). It is NOT part of the standard grafana/loki Helm chart installed earlier in this guide. For Helm-based Loki deployments, configure ruler alerting via the ruler: section in your loki-values.yaml instead.
Loki supports alerting rules similar to Prometheus:
# Loki rule group — alert on application errors
# The Loki Operator uses loki.grafana.com/v1 AlertingRule (not PrometheusRule)
apiVersion: loki.grafana.com/v1
kind: AlertingRule
metadata:
  name: application-log-alerts
  namespace: monitoring
spec:
  groups:
    - name: application-errors
      interval: 1m
      rules:
        - alert: HighErrorRate
          expr: |
            sum(rate({namespace="production"} |= "ERROR" [5m])) by (app) > 1
          for: 2m
          labels:
            severity: warning
          annotations:
            summary: "High error log rate for {{ $labels.app }}"

        - alert: OOMKillDetected
          expr: |
            sum(count_over_time({namespace="production"} |= "OOMKilled" [5m])) by (pod) > 0
          for: 0m
          labels:
            severity: critical
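If you do run the Loki Operator, the rule is applied like any other manifest and validated by the operator; the resource name below assumes the loki.grafana.com/v1 CRD shown above:
kubectl apply -f application-log-alerts.yaml
kubectl get alertingrules.loki.grafana.com -n monitoring
kubectl describe alertingrule application-log-alerts -n monitoring   # check conditions for validation errors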
Frequently Asked Questions
Fluent Bit vs Promtail — which should I use?
Promtail is Loki-native (same team, tight integration, automatic pod discovery), but it's Loki-only — it can't forward to Elasticsearch or other backends. Fluent Bit supports multiple outputs and has lower memory usage. For Loki-only deployments, either works well. For multi-destination logging (Loki for recent logs, S3 for long-term archive, CloudWatch for AWS-native tooling), Fluent Bit's multi-output support is preferable.
How do I parse multi-line logs (stack traces, JSON objects)?
# Fluent Bit multiline parsing for Java stack traces
[INPUT]
Name tail
Path /var/log/containers/*_production_*.log
Tag kube.*
  multiline.parser java,cri
For custom multiline patterns, define a custom parser:
[MULTILINE_PARSER]
name custom-go-panic
type regex
flush_timeout 1000
rule "start_state" "/(goroutine \d+)/gm" "go_state"
rule "go_state" "/^(\s+)/gm" "go_state"How do I control log volume costs?
- Exclude noisy logs at collection time — use the Fluent Bit grep filter to drop health checks, probe logs, and debug-level logs in production
- Aggregate rather than stream — for high-volume structured logs, use Fluent Bit's rewrite_tag and throttle filters
- Reduce retention — Loki supports per-stream retention (shorter for debug, longer for errors)
- Use S3 instead of block storage — Loki's S3 backend is significantly cheaper than SSD-backed block storage for log data
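Before tuning any of these, measure where the volume actually comes from. One way to do it, assuming the labels configured earlier in this guide (bytes_over_time is standard LogQL; the cluster value is the one set by the record_modifier filter):
# Bytes ingested per app over the last hour; the noisiest streams are the first
# candidates for grep filters or shorter retention
logcli query --since=1h 'sum by (app) (bytes_over_time({cluster="production-us-east-1"}[1h]))'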
For OpenTelemetry-based trace correlation with logs, see OpenTelemetry Instrumentation Guide. For the Prometheus-based metrics side of the observability stack, see SLOs, Error Budgets, and Burn Rate Alerts.
Setting up a production logging stack for a Kubernetes cluster? Talk to us at Coding Protocols — we help platform teams design log collection pipelines that balance observability with cost and operational simplicity.


