Deploying and Autoscaling Kubernetes with Knative
Series: Kubernetes Services Deployment, Autoscaling and Monitoring, Part 1
At NearForm we are always learning and investigating new ways to work with the open source projects we use in our client work. Open source projects are constantly being upgraded and improved, with new features and platforms appearing all the time, so it is crucial for our developers to stay abreast of the latest trends, technologies and best practices.
Our DevOps community recently researched Kubernetes services deployment, autoscaling and monitoring, and we are sharing the results of their investigation in this series:
- Deploying and Autoscaling Kubernetes with Knative
- Autoscaling Kubernetes with Keda
- Monitoring and Tracing Kubernetes with Otel
Kubernetes Overview
Kubernetes (K8s) is an open source platform for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery. Kubernetes builds upon 15 years of experience of running production workloads at Google, combined with best-of-breed ideas and practices from the community.
Kubernetes provides a powerful API that enables third-party tools to provision, secure, connect and manage workloads on any cloud or infrastructure platform. It also includes features that help you automate control plane operations such as rollout cadence, versioning and rollback.
Kubernetes is ideal for organizations that want to run containerized applications at scale.
Many organizations now run entire business functions in containers instead of traditional applications. This shift requires changes in how IT operates: from managing virtual machines to managing container orchestrators like Kubernetes. Business leaders are demanding more agility from their IT departments to support fast-moving projects with new technologies like microservices architecture, serverless computing and cloud native platforms such as OpenShift Container Platform or Azure Kubernetes Service (AKS).
What is Knative?
Knative is a platform that enables serverless, cloud native applications to run inside Kubernetes clusters. To do that, Knative provides tools that make application management, builds and deployment as easy as possible, so developers can focus on code without needing to worry about setting up complex infrastructure.
It was originally created by Google with contributions from several different companies until it became an open source project hosted by the Cloud Native Computing Foundation (CNCF).
Exploring Knative
Knative offers two main solutions for serverless Kubernetes-based applications:
- Knative Serving: enables serverless workloads for applications inside Kubernetes
- Knative Eventing: enables you to use an event-driven architecture with your serverless application
This article will focus only on the Knative Serving solution.
Installation
First of all, follow the installation steps described in the Knative documentation (https://knative.dev/docs/install/) and choose the installation option that best suits your needs. Be aware that, if you are installing into a pre-existing Kubernetes cluster, Knative needs a networking layer (such as Kourier, Istio or Contour) in order to function properly. If you don't have one configured, you will need to install one; the documentation lists the supported options. A minimal install might look like the sketch below.
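As a rough sketch, a YAML-based installation of Knative Serving with the Kourier networking layer looks like the commands below. The `<version>` placeholder stands for a Knative release tag; check the install documentation for the exact manifests your version needs.

```bash
# Install the Knative Serving CRDs and core components
# (replace <version> with a release tag such as knative-v1.x.y)
kubectl apply -f https://github.com/knative/serving/releases/download/<version>/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/<version>/serving-core.yaml

# Install Kourier as the networking layer and tell Knative to use it
kubectl apply -f https://github.com/knative/net-kourier/releases/download/<version>/kourier.yaml
kubectl patch configmap/config-network \
  --namespace knative-serving \
  --type merge \
  --patch '{"data":{"ingress-class":"kourier.ingress.networking.knative.dev"}}'
```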
Knative Serving
Knative Serving is the solution used to enable serverless workloads. In order for the solution to be able to define and control how serverless workloads behave and manage the underlying Kubernetes objects, Knative defines a set of Kubernetes Custom Resource Definitions (CRDs).
The Custom Resource Definitions are:
- Services: Manages the entire lifecycle of your workload and is the main resource, since it also controls the creation of the other resources and ensures that they are working properly
- Routes: Maps a network endpoint to one or more revisions
- Configurations: Maintains the desired state of your deployment by creating a new revision whenever the configuration changes
- Revisions: Point-in-time snapshots of the code and the configuration
You can find more detailed information regarding the CRDs managed by Knative on its documentation: https://knative.dev/docs/serving/
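To see how these resources fit together, here is a minimal Service sketch based on the hello-world sample from the Knative docs (the image and environment variable come from that sample):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
      - image: gcr.io/knative-samples/helloworld-go # sample image from the Knative docs
        env:
        - name: TARGET
          value: "World"
```

Applying this single resource creates the matching Configuration, the first Revision and a Route; each later change to the template produces a new Revision, and by default the Route sends traffic to the latest ready one.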
Benefits of Knative
Knative comes with a collection of solutions that aim to give you a more robust way to manage your cloud native applications out of the box, while reducing the complexity of doing so. Below we list a few benefits of using Knative in your infrastructure.
Autoscaling
Knative provides autoscaling to the K8s pods managed by the Knative Services (CRD).
Knative implements an autoscaling solution called Knative Pod Autoscaler (KPA) that you can use with your applications, providing the features below:
- Scale-To-Zero: Knative uses the Knative Pod Autoscaler (KPA) by default. With KPA you can scale your application to zero pods if it is not receiving any traffic.
- Concurrency: You can use the concurrency configuration to determine how many simultaneous connections your pods can process at any given time. If the number of requests exceeds the threshold for each pod, Knative will scale up the number of pods.
- Requests Per Second: You can also use Knative to define how many requests per second each pod can handle. If the number of requests per second exceeds the threshold, Knative will scale up the number of pods.
You can also use the Horizontal Pod Autoscaler (HPA) with Knative, but HPA and KPA cannot be used together for the same service. (HPA support is not installed by Knative; if you want to use it, you need to install it separately.) KPA is used by default, but you can control which type of autoscaler to use through annotations in the service definition.
For HPA:
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: sample-knative-svc
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
```
For KPA:
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: sample-knative-svc
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"
```
Using a similar approach, you can define the type of metric you want to use to autoscale your service and also determine the target to be reached in order to trigger it.
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: sample-knative-svc
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"
        autoscaling.knative.dev/metric: "concurrency"
        autoscaling.knative.dev/target: "50"
```
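For example, the same service can scale on requests per second instead of concurrency; the target of 150 below is an illustrative value, not a recommendation:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: sample-knative-svc
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"
        autoscaling.knative.dev/metric: "rps"
        autoscaling.knative.dev/target: "150"
```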
Take a look at the autoscaling section of the Knative documentation (https://knative.dev/docs/serving/autoscaling/) to learn more about autoscaling in Knative.
Traffic Management
With this feature, you can manage how traffic is routed to different revisions of your configuration by making only a few changes in a YAML file. Thanks to this, you can use features that would be hard to manage with plain Kubernetes objects, like:
Blue/Green Deployments:
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-service
  namespace: default
spec:
  ...
  traffic:
  - percent: 0
    revisionName: example-service-v1
    tag: staging
  - percent: 40
    revisionName: example-service-v2
  - percent: 60
    revisionName: example-service-v3
```
You can use the following kn CLI command to split traffic between revisions:
```bash
kn service update <service-name> --traffic <revision-name>=<percent>
```
- `<service-name>` is the name of the Knative Service whose traffic routing you are configuring.
- `<revision-name>` is the name of the revision that you want to receive a percentage of traffic.
- `<percent>` is the percentage of traffic that you want to send to the revision specified by `<revision-name>`.
Example:
```bash
kn service update example-service --traffic example-service-v1=75 --traffic example-service-v2=25
```
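To confirm the resulting split, you can inspect the service with the kn CLI, which reports how traffic is distributed across revisions:

```bash
kn service describe example-service
```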
Alternatively, you can use the traffic section to perform canary deployments with YAML configuration files, as in the example below:
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-service
  namespace: default
spec:
  ...
  traffic:
  - tag: current
    revisionName: example-service-v1
    percent: 75
  - tag: candidate
    revisionName: example-service-v2
    percent: 25
```
You can gradually update the `percent` values by applying the YAML file changes with the `kubectl apply` command.
With this approach you can shift traffic between revisions to perform canary and Blue/Green deployments.
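For instance, a possible intermediate canary step before full promotion could move the split to 50/50 (the percentages are illustrative):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-service
  namespace: default
spec:
  ...
  traffic:
  - tag: current
    revisionName: example-service-v1
    percent: 50
  - tag: candidate
    revisionName: example-service-v2
    percent: 50
```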
You can also achieve the same behavior by managing another Knative resource, the Route, which avoids changing the Knative Service YAML file.
Route example:
```yaml
apiVersion: serving.knative.dev/v1
kind: Route
metadata:
  name: example-service-route
  namespace: default
spec:
  traffic:
  - revisionName: example-service-v1
    percent: 100 # All traffic goes to this revision
```
In this example, `example-service-route` is the name you choose for your route, and `example-service-v1` is the name of the initial revision it points to.
Example:
```yaml
apiVersion: serving.knative.dev/v1
kind: Route
metadata:
  name: example-service-route
  namespace: default
spec:
  traffic:
  - tag: blue
    revisionName: example-service-v1
    percent: 75
  - tag: green
    revisionName: example-service-v2
    percent: 25
```
Once the candidate revision is validated, you can shift all the traffic to it and complete a Blue/Green deployment:
```yaml
apiVersion: serving.knative.dev/v1
kind: Route
metadata:
  name: example-service-route
  namespace: default
spec:
  traffic:
  - tag: blue
    revisionName: example-service-v1
    percent: 0
  - tag: green
    revisionName: example-service-v2
    percent: 100
```
Simpler Configuration
Last but not least, Knative enables you to provision a fairly complex setup for your application with only a few lines of configuration.
In this first example we define all the needed components to deploy a simple demonstration app in Kubernetes.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: sample-app
  name: sample-app
  namespace: sample-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
      - image: <image>
        name: sample-app-container
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: sample-app-svc
spec:
  selector:
    app: sample-app
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-sample-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 50
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-sample-app
spec:
  rules:
  - host: sampleapp.foo.org
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: sample-app-svc
            port:
              number: 8080 # matches the Service port above
  ingressClassName: nginx
```
As you can see, all the components need to be declared explicitly. As you add more configuration you may need to split this into multiple files, and as you add more services you will need to create and manage files for each of them as well.
In this second example, we are using Knative to achieve the same result as above.
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: sample-knative-svc
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "1"
        autoscaling.knative.dev/max-scale: "5"
        autoscaling.knative.dev/initial-scale: "1"
        autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
        autoscaling.knative.dev/metric: "cpu"
        autoscaling.knative.dev/target: "50"
    spec:
      containers:
      - image: <image>
        ports:
        - containerPort: 8080
```
As you can see, in only a few lines all of these resources can be created. This is possible because Knative defines CRDs whose controllers deploy and manage the underlying resources for you.
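You can verify this by listing the generated resources after applying a Knative Service (`ksvc` is the short name registered by the Knative CRDs):

```bash
kubectl get ksvc,configuration,revision,route
```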
Metrics
Knative supports different popular tools for collecting metrics:
- Grafana dashboards are available for metrics collected directly with Prometheus.
- You can also set up the OpenTelemetry Collector to receive metrics from Knative components and distribute them to other metrics providers that support OpenTelemetry.
Note that you can't use the OpenTelemetry Collector and Prometheus at the same time; the default metrics backend is Prometheus. See "Understanding the Collector" at knative.dev for more info.
Knative comes with pre-configured monitoring components.
In an environment with Prometheus and Grafana, the metrics can be exported to Prometheus and presented in a Grafana Dashboard.
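As a sketch of how the backend is selected: Knative reads the `config-observability` ConfigMap in the `knative-serving` namespace, so exporting to Prometheus amounts to setting the metrics backend keys. The key names below follow the Knative observability docs, but confirm them for your Knative version.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-observability
  namespace: knative-serving
data:
  # Send Knative component metrics and per-request metrics to Prometheus
  metrics.backend-destination: prometheus
  metrics.request-metrics-backend-destination: prometheus
```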
Conclusion
If you want to use serverless applications but don't know how to manage them properly without adding complexity to your setup, Knative could be the answer. It provides solutions that help you deploy and manage these applications and improve your deployments without much effort.