Logs Management on K8S Cluster using Loki, Fluentbit & Grafana

Unnati Gupta

When you're running applications on Kubernetes, logs start piling up fast—really fast. And if you're working with multiple services or teams, figuring out how to manage all those logs without burning a hole in your cloud bill becomes a real challenge.

In this blog, I’ll share how I set up a simple but powerful logging pipeline using Fluent Bit, Grafana, Loki, and AWS S3. This setup helped me get real-time visibility into what's happening in my cluster, while also storing logs long-term in a way that’s easy on the wallet.

Grafana Loki is an open-source log aggregation system designed for large-scale log management. It is widely adopted in organizations' observability stacks.

Fluent Bit is a lightweight and high-performance log forwarder designed for cloud-native environments like Kubernetes. It collects logs from containers, processes them, and routes them to destinations like Loki, Elasticsearch, or AWS S3. Its small footprint and flexible configuration make it ideal for efficient log management at scale.

Amazon S3 (Simple Storage Service) is a scalable object storage service offered by AWS, designed for storing and retrieving any amount of data from anywhere on the web. It provides secure, durable, and highly available storage for a wide range of use cases, including backups, data lakes, websites, and big data analytics. S3 integrates with other AWS services, offers fine-grained access controls, and supports lifecycle management and versioning.
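One cost note before we start: Loki's compactor (configured later in values.yaml) handles deleting expired chunks itself, so avoid S3-side expiry rules that race with it. A lifecycle rule that aborts stale multipart uploads, however, is safe and trims hidden storage costs. A minimal sketch, assuming a placeholder bucket name:

```shell
# Abort multipart uploads that never completed; these otherwise accumulate
# invisible storage charges. Bucket name is a placeholder.
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-loki-chunks \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "abort-stale-multipart",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}
    }]
  }'
```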

Loki Setup:

To set up Loki on the K8s cluster, I am using the official Helm chart.

Step 1: Add the Grafana repo on your system to download the Helm chart.

Note: You can install Grafana together with Loki, or set them up separately. In my case, Grafana is set up separately.

helm repo add grafana https://grafana.github.io/helm-charts

Step 2: After adding the repo, run the command below to fetch the latest chart versions.

helm repo update

Step 3: Export the default values.yaml so you can customize it for your cluster.

helm show values grafana/loki > values.yaml

Step 4: Customize values.yaml. Below is the configuration I used:

loki:
  podAnnotations:
    kyverno.io/inject-cacerts: enabled

  podSecurityContext:
    runAsNonRoot: true
    runAsGroup: 101
    runAsUser: 101
    fsGroup: 101
    seccompProfile:
      type: RuntimeDefault

  auth_enabled: true

  analytics:
    reporting_enabled: false

  compactor:
    retention_enabled: true
    delete_request_store: s3
    retention_delete_delay: 2h

  memberlist:
    service:
      enabled: true

  compactorReplicas: 1

  limits_config:
    retention_period: 720h
    max_streams_per_user: 100000

  frontend:
    max_outstanding_per_tenant: 4096

  commonConfig:
    ring:
      kvstore:
        store: memberlist

  storage:
    filesystem: null
    s3:
      # Static keys are shown for simplicity; prefer IAM roles (IRSA) in production
      endpoint: s3.ap-south-1.amazonaws.com
      accessKeyId: "ACCESS-KEY"
      secretAccessKey: "SECRET-ACCESS-KEY"
      s3ForcePathStyle: false
      region: ap-south-1
    bucketNames:
      chunks: <bucket-name>
      ruler: <bucket-name>
      admin: <bucket-name>

  schemaConfig:
    configs:
      - from: "2025-01-01"
        index:
          period: 24h
          prefix: loki_index_
        object_store: s3
        schema: v12
        store: boltdb-shipper
      - from: "2025-03-03"
        index:
          period: 24h
          prefix: loki_index_
        object_store: s3
        schema: v13
        store: tsdb
  storage_config:
    aws:
      s3: <endpoint>
      region: <region>

  query_scheduler:
    max_outstanding_requests_per_tenant: 32768

  querier:
    max_concurrent: 16
    multi_tenant_queries_enabled: true

test:
  enabled: false

lokiCanary:
  enabled: false

gateway:
  enabled: true
  replicas: 1
  resources:
    requests:
      memory: 25Mi
      cpu: 5m
  podSecurityContext:
    fsGroup: 101
    runAsGroup: 101
    runAsNonRoot: true
    runAsUser: 101
    seccompProfile:
      type: RuntimeDefault

write:
  replicas: 2
  resources:
    limits:
      memory: 1500Mi
      cpu: 300m
    requests:
      memory: 1500Mi
      cpu: 50m
  persistence:
    size: 3Gi
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app.kubernetes.io/component: write
                app.kubernetes.io/instance: loki
                app.kubernetes.io/name: loki
            topologyKey: kubernetes.io/hostname
  extraArgs:
    - "-config.expand-env=true"

read:
  replicas: 2
  resources:
    limits:
      memory: 700Mi
      cpu: 400m
    requests:
      memory: 256Mi
      cpu: 100m
  persistence:
    size: 5Gi
  extraArgs:
    - "-config.expand-env=true"

backend:
  replicas: 2
  resources:
    limits:
      memory: 1300Mi
      cpu: 600m
    requests:
      memory: 592Mi
      cpu: 50m
  persistence:
    size: 7Gi
    enableStatefulSetAutoDeletePVC: true
  extraArgs:
    - "-config.expand-env=true"


sidecar:
  securityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop:
        - ALL
    readOnlyRootFilesystem: true

memcached:
  podSecurityContext:
    runAsNonRoot: true
    runAsGroup: 101
    runAsUser: 101
    seccompProfile:
      type: RuntimeDefault

resultsCache:
  enabled: true
  resources:
    limits:
      memory: 1331Mi
    requests:
      cpu: 20m
      memory: 1331Mi
  defaultValidity: 12h
  allocatedMemory: 1024

chunksCache:
  enabled: true
  resources:
    limits:
      memory: 2662Mi
    requests:
      cpu: 20m
      memory: 2662Mi
  defaultValidity: 0s
  allocatedMemory: 2048

rbac:
  namespaced: true

Step 5: Deploy Loki

Once values.yaml is ready, deploy it to the cluster:

helm install loki grafana/loki -f values.yaml -n monitoring

Step 6: Make sure all pods are running.
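You can check this with kubectl. In the simple scalable mode used above you should see write, read, backend, gateway, and cache pods (names below assume the release name "loki" and the "monitoring" namespace, and component names may vary slightly between chart versions):

```shell
# List Loki pods and wait until all of them are Ready
kubectl get pods -n monitoring -l app.kubernetes.io/instance=loki

# Optionally block until the write path has rolled out
kubectl rollout status statefulset/loki-write -n monitoring
```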

After setting up Loki, set up Fluent-bit to collect logs from the cluster and send them to Loki.

Fluent-bit Setup:

Fluent-bit is deployed as a DaemonSet, so a pod runs on every node and collects logs from the entire cluster.

Step 1: You can use the daemonset.yaml file below:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: monitoring
  labels:
    app: fluent-bit
    version: v1
spec:
  selector:
    matchLabels:
      app: fluent-bit
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: fluent-bit
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - name: fluent-bit
        image: <fluentbit-latest-image>
        volumeMounts:
        - name: fluentbitconfigvol
          mountPath: /etc/fluent-bit/conf/
        # /var/log must be mounted so the tail input can read the node's log files
        - name: varlog
          mountPath: /var/log
          readOnly: true
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        command: ["/fluent-bit/bin/fluent-bit"]
        args: ["-c", "/etc/fluent-bit/conf/fluent-bit.conf"]
      serviceAccountName: fluent-bit
      volumes:
      - name: fluentbitconfigvol
        configMap:
          name: fluent-bit
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
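Note that the DaemonSet references a fluent-bit service account that isn't created anywhere in this post, and the kubernetes filter used later needs permission to read pod metadata from the API server. A minimal sketch of the RBAC objects (names are assumptions chosen to match the DaemonSet):

```
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-read
rules:
- apiGroups: [""]
  resources: ["pods", "namespaces"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-read
subjects:
- kind: ServiceAccount
  name: fluent-bit
  namespace: monitoring
```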

Step 2: Set up the configmap.yaml file to provide the configuration for Fluent-bit to connect to Loki:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit
  namespace: monitoring
  labels:
    app: fluent-bit
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush        1
        # trace is useful while debugging; info is a saner default in production
        Log_Level    info
        Parsers_File parsers.conf
    @INCLUDE input-logs.conf
    @INCLUDE output-loki.conf
    @INCLUDE filters.conf

  parsers.conf: |
    [PARSER]
        Name        logs
        Format      json
        Time_Key    requestReceivedTimestamp
        Time_Format %Y-%m-%dT%H:%M:%S.%LZ
        Time_Keep   On


  input-logs.conf: |
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/*/*.log
        Mem_Buf_Limit     10MB
        Skip_Long_Lines   Off
        Refresh_Interval  5
        Parser            logs

  output-loki.conf: |
    [OUTPUT]
        Name                loki
        Match               *
        Host                loki-gateway.monitoring.svc.cluster.local
        Port                80
        tls                 Off
        tls.verify          Off
        Labels              job=fluentbit
        # auth_enabled: true on the Loki side means every push needs a tenant;
        # this value is sent as the X-Scope-OrgID header
        tenant_id           fluentbit

    [OUTPUT]
        Name    stdout
        Match   *
        Format  json

  filters.conf: |
    [FILTER]
        Name                kubernetes
        Match               *
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.var.log.containers.
        Merge_Log           On
        Keep_Log            Off
        Annotations         On
        Labels              On

Step 3: Apply the files and deploy Fluent Bit as a DaemonSet across the cluster.

kubectl apply -f configmap.yaml -f daemonset.yaml -n monitoring

Step 4: Make sure all pods are running and connecting to Loki.

So, Fluent-bit is set up successfully on the cluster. You can check the pod logs to verify connectivity with Loki.

kubectl logs <fluentbit-pod-name> -n monitoring
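To verify that logs are actually reaching Loki (not just that Fluent Bit is running), you can query the gateway directly. Since auth_enabled: true is set, every request needs an X-Scope-OrgID header; the tenant name below is an assumption and must match whatever tenant Fluent Bit pushes as:

```shell
# Forward the Loki gateway to a local port
kubectl port-forward svc/loki-gateway -n monitoring 3100:80 &

# Ask Loki which values exist for the "job" label
curl -s -H "X-Scope-OrgID: fluentbit" \
  "http://localhost:3100/loki/api/v1/label/job/values"
# A healthy pipeline should include "fluentbit" in the returned values
```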

Grafana Setup:

To set up Grafana on the cluster, I am using a Helm chart:

Step 1: Add the Grafana repo on your system to download the Helm chart.

Note: You can install Grafana with Loki or set them up separately. In my case, I have set up Grafana separately.

helm repo add grafana https://grafana.github.io/helm-charts

Step 2: After adding the repo, run the command below to fetch the latest chart versions.

helm repo update

Step 3: You can use a custom values.yaml like the one below:

replicaCount: 1

image:
  repository: grafana/grafana
  tag: latest
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 80

ingress:
  enabled: true
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
  hosts:
    - host: grafana.yourdomain.com
      paths:
        - /
  tls: []

persistence:
  enabled: true
  storageClassName: gp2 # Use your storage class
  accessModes:
    - ReadWriteOnce
  size: 10Gi

adminUser: admin
adminPassword: admin123 # change this before exposing Grafana

datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Loki
        type: loki
        access: proxy
        url: http://loki-gateway.monitoring.svc.cluster.local:80
        isDefault: true

dashboardProviders:
  dashboardproviders.yaml:
    apiVersion: 1
    providers:
      - name: 'default'
        orgId: 1
        folder: ''
        type: file
        disableDeletion: false
        editable: true
        options:
          path: /var/lib/grafana/dashboards/default

dashboards:
  default:
    kubernetes-logs:
      gnetId: 15172
      revision: 1
      datasource: Loki

resources:
  limits:
    cpu: 200m
    memory: 256Mi
  requests:
    cpu: 100m
    memory: 128Mi

nodeSelector: {}

tolerations: []

affinity: {}

Step 4: Use the command below to deploy Grafana on the cluster:

helm install grafana grafana/grafana -f values.yaml -n monitoring

Step 5: Make sure the Grafana pod is in the Running state.

Step 6: To access the Grafana GUI, use the port-forward method. The service listens on port 80 (per the values above), so forward it to a local port:

kubectl port-forward svc/grafana -n monitoring 3000:80

Then open http://localhost:3000 in your browser.

Step 7: Enter the credentials and log in to the GUI.

Step 8: Now, since we have already configured the DataSource in values.yaml, simply check in the GUI to see if it's working.

Go to the Hamburger menu » Click on Data Source » Loki should be listed there.

Step 9: Click on Explore and check that the logs are listed.
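In Explore, pick the Loki data source and query the label that the Fluent Bit output attaches to every line. For example:

```
{job="fluentbit"}
```

You can chain line filters for quick searches, e.g. {job="fluentbit"} |= "error".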

Conclusion:

Thank you for taking the time to read this blog; I hope it helps you set up log management in your cluster. If you have any suggestions or improvements, feel free to connect with me on LinkedIn (Unnati Gupta). Happy Learning 💥🙌!!
