AWS Karpenter Hands-on
Abstract
Karpenter - Just-in-time Nodes for Any Kubernetes Cluster. It is a node-based autoscaler (adding or removing nodes as needed) that goes beyond the cluster autoscaler.
What does this post provide?
- Hands-on installation of Karpenter on an EKS cluster
- Granting Karpenter permission to create AWS resources through IAM roles for service accounts (IRSA)
- Creating a sample Karpenter provisioner to test Karpenter scaling nodes out and down
- Finally, what Karpenter improves compared with the cluster autoscaler
Pre-requisites
- EKS cluster
- IAM instance profile, which will be assigned to the EKS nodes and helps them join the EKS cluster
- OIDC provider
Create karpenter service account
The best practice for granting AWS permissions to a Kubernetes service is to use an IAM role for the service account (IRSA) instead of the instance profile for EKS pods.
If you have already set up OIDC using an IAM identity provider, you can create the IAM role for the karpenter service account manually or using CDK. The role needs permissions on EC2 actions only.
kapenter-sa.ts
Then generate service-account yaml based on the output IAM role ARN
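A minimal sketch of what `karpenter-sa.yaml` could look like; the account ID and role name in the ARN are placeholders for the role created above:

```yaml
# karpenter-sa.yaml - service account annotated with the IRSA role ARN.
# The account ID and role name below are placeholders; use the role ARN
# output by your stack.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: karpenter-controller
  namespace: karpenter
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/karpenter-controller-dev
```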
Create the service account by:

```
$ kubectl apply -f karpenter-sa.yaml
```

and then check the result:

```
$ kf get sa -n karpenter
NAME                   SECRETS   AGE
karpenter-controller   1         17m
```
Install karpenter using helm chart
Use `karpenter-values.yaml` to disable creating a new serviceAccount and point the chart to the one created above. Replace `clusterName` and `clusterEndpoint` with your EKS cluster's values.
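A minimal sketch of what `karpenter-values.yaml` might look like, assuming the chart's standard `serviceAccount` keys; the cluster name and endpoint below are placeholders:

```yaml
# karpenter-values.yaml - reuse the IRSA-backed service account created above
# instead of letting the chart create a new one. clusterName/clusterEndpoint
# are placeholders for your EKS cluster.
serviceAccount:
  create: false
  name: karpenter-controller
clusterName: eks-dev
clusterEndpoint: https://XXXXXXXXXXXX.gr7.ap-northeast-2.eks.amazonaws.com
```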
Then install Karpenter with the Helm chart:

```
helm repo add karpenter https://charts.karpenter.sh
helm repo update
helm upgrade --install karpenter karpenter/karpenter --namespace karpenter \
  --version 0.8.1 \
  --values yaml/karpenter-values.yaml \
  --wait
```
Check that the karpenter pod is created; it includes `controller` and `webhook` containers. Describe the pod to ensure it is assigned the correct serviceAccount with the proper IAM role.
```
$ kf get pod -n karpenter
NAME                         READY   STATUS    RESTARTS   AGE
karpenter-67f957c8c4-t752q   2/2     Running   0          2m56s
```
- Note that Karpenter provides autoscaling of nodes for our services, but it still needs a node to run itself: run the Karpenter controller on EKS Fargate or on a worker node that belongs to a node group.
Karpenter provisioner
Provisioner - Provides the options the Karpenter controller uses to create the expected nodes, such as instance profile, AMI family (e.g. Bottlerocket), instance type, security group, subnet, tags, capacity type (e.g. spot), etc.
Sample provisioner options (a sketch of the full manifest follows this list):
- `instanceProfile`: as described above, this profile just needs enough permissions to join the EKS cluster, e.g. `eks-node-role-dev` with the policies `AmazonEKS_CNI_Policy`, `AmazonEKSWorkerNodePolicy`, `AmazonEC2ContainerRegistryReadOnly`, and `AmazonSSMManagedInstanceCore`
- `amiFamily`: use Bottlerocket as the AMI family
- `subnetSelector`: specify the tag of the subnets we want to host the nodes; we should use the private subnets of the EKS VPC, so ensure the private subnets carry a tag such as `EksCluster/EKSVpc/PrivateSubnet*`
- `securityGroupSelector`: specify the tag of the security group we want to attach to the nodes
- Node requirements: `spot` instances, small sizes as we are just testing
- `taints`: taint the nodes if we want to control which services are assigned to which nodes and separate resources
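Putting those options together, a minimal sketch of `dev-provisioner.yaml`; the selector tags, taint key/value, and cluster name are assumptions based on the setup above, so adjust them to your environment:

```yaml
# dev-provisioner.yaml - sample Provisioner for the karpenter.sh/v1alpha5 API
# shipped with chart 0.8.x. Selector tags, taint, and cluster name below are
# illustrative assumptions.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: dev
spec:
  provider:
    instanceProfile: eks-node-role-dev         # only needs permissions to join the cluster
    amiFamily: Bottlerocket
    subnetSelector:
      Name: EksCluster/EKSVpc/PrivateSubnet*   # tag on the private subnets
    securityGroupSelector:
      kubernetes.io/cluster/eks-dev: '*'       # tag on the node security group
  requirements:
    - key: karpenter.sh/capacity-type          # spot capacity for cheap test nodes
      operator: In
      values: ["spot"]
    # optionally constrain sizes too, e.g. with node.kubernetes.io/instance-type
  taints:
    - key: dedicated                           # pods need a matching toleration
      value: dev
      effect: NoSchedule
  ttlSecondsAfterEmpty: 30                     # scale empty nodes down after 30s
```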
Apply the yaml file by `kf apply -f dev-provisioner.yaml` and then check the result:

```
$ kf get provisioners.karpenter.sh
NAME   AGE
dev    1m
```
Test karpenter scaling out nodes
Test the karpenter provisioner we just created by applying a deployment. First, check that there is no provisioned node yet:

```
$ kf get node -l karpenter.sh/provisioner-name=dev
No resources found
```
Apply the deployment `test-deployment.yaml`; note that the deployment must have `tolerations` matching the `taints` if we specify them in the provisioner.

```
$ kf apply -f yaml/test-deployment.yaml
deployment.apps/inflate created
$ kf get pod
NAME                      READY   STATUS    RESTARTS   AGE
inflate-5db4558c8-rtd8g   1/1     Running   0          85s
inflate-5db4558c8-x8sbn   1/1     Running   0          85s
$ kf get node -l karpenter.sh/provisioner-name=dev
NAME                                            STATUS   ROLES    AGE   VERSION
ip-10-0-147-5.ap-northeast-2.compute.internal   Ready    <none>   77s   v1.22.6-eks-b18cdc9
```
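For reference, a sketch of what `test-deployment.yaml` might contain; the pause image and CPU request mirror Karpenter's usual "inflate" example, and the toleration uses the same assumed taint as the provisioner sketch above:

```yaml
# test-deployment.yaml - dummy workload used to trigger a scale-out.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 2
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      tolerations:                 # must match the provisioner's taints
        - key: dedicated
          value: dev
          effect: NoSchedule
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
          resources:
            requests:
              cpu: 1               # large enough that pods don't fit on existing nodes
```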
We can check the log of the Karpenter controller to see how it works:

```
2022-04-10T10:20:56.558Z INFO controller.provisioning Batched 2 pods in 1.029435847s {"commit": "78d3031", "provisioner": "dev"}
2022-04-10T10:20:56.800Z DEBUG controller.provisioning Discovered subnets: [subnet-048931802b9fe4d68 (ap-northeast-2a) subnet-03112351e0b6460ab (ap-northeast-2c) subnet-089fe79a0c5c5247d (ap-northeast-2b)] {"commit": "78d3031", "provisioner": "dev"}
2022-04-10T10:20:56.903Z INFO controller.provisioning Computed packing of 1 node(s) for 2 pod(s) with instance type option(s) [t3.xlarge] {"commit": "78d3031", "provisioner": "dev"}
2022-04-10T10:20:57.057Z DEBUG controller.provisioning Discovered security groups: [sg-074f63f39cbb17d7f sg-0de828621f69e459f] {"commit": "78d3031", "provisioner": "dev"}
2022-04-10T10:20:57.059Z DEBUG controller.provisioning Discovered kubernetes version 1.22 {"commit": "78d3031", "provisioner": "dev"}
2022-04-10T10:20:57.155Z DEBUG controller.provisioning Discovered ami-09fd6fd7a1729123a for query /aws/service/bottlerocket/aws-k8s-1.22/x86_64/latest/image_id {"commit": "78d3031", "provisioner": "dev"}
2022-04-10T10:20:57.327Z DEBUG controller.provisioning Created launch template, Karpenter-eks-dev-1642383919802956422 {"commit": "78d3031", "provisioner": "dev"}
2022-04-10T10:20:58.997Z DEBUG controller.provisioning Discovered 257 EC2 instance types {"commit": "78d3031", "provisioner": "dev"}
2022-04-10T10:20:59.083Z DEBUG controller.provisioning Discovered EC2 instance types zonal offerings {"commit": "78d3031", "provisioner": "dev"}
2022-04-10T10:20:59.495Z INFO controller.provisioning Launched instance: i-0c33d36e900713716, hostname: ip-10-0-147-5.ap-northeast-2.compute.internal, type: t3.xlarge, zone: ap-northeast-2b, capacityType: spot {"commit": "78d3031", "provisioner": "dev"}
2022-04-10T10:20:59.518Z INFO controller.provisioning Bound 2 pod(s) to node ip-10-0-147-5.ap-northeast-2.compute.internal {"commit": "78d3031", "provisioner": "dev"}
2022-04-10T10:20:59.518Z INFO controller.provisioning Waiting for unschedulable pods {"commit": "78d3031", "provisioner": "dev"}
```
Go to the AWS console to check the node.
Karpenter Deprovisioning
How Karpenter nodes are deprovisioned (a sketch of the protection mechanisms follows this list):
- Node empty: a node with no (non-daemonset) pods will be deleted after `ttlSecondsAfterEmpty`
- Node expired: based on `ttlSecondsUntilExpired`, the node is deleted regardless of its workload. This option handles node upgrades to the latest AMI; we can protect the application's availability by using disruption budgets
- Disruption budgets: `PodDisruptionBudget` limits how many pods of an application can be disrupted at once
- By adding a `karpenter.sh/do-not-evict` annotation to a pod, you instruct Karpenter to preserve the node until the pod is terminated or the `do-not-evict` annotation is removed
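A minimal sketch of both protections; the names, labels, and image are illustrative assumptions:

```yaml
# A PodDisruptionBudget keeps a minimum number of pods running while
# Karpenter drains nodes (e.g. on ttlSecondsUntilExpired). The app label
# is an assumption for illustration.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: inflate-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: inflate
---
# Alternatively, annotate a pod so Karpenter preserves its node until the
# pod terminates or the annotation is removed.
apiVersion: v1
kind: Pod
metadata:
  name: critical-job    # hypothetical pod name
  annotations:
    karpenter.sh/do-not-evict: "true"
spec:
  containers:
    - name: main
      image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
```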
In our provisioner we specified `ttlSecondsAfterEmpty: 30`, so let's delete `test-deployment.yaml` and check that Karpenter scales the node down:

```
$ kf delete -f yaml/test-deployment.yaml
deployment.apps "inflate" deleted
$ kf get pod
No resources found in default namespace.
$ kf get node -l karpenter.sh/provisioner-name=dev
No resources found
```
See the log for more detail:

```
2022-04-10T10:33:31.785Z INFO controller.node Added TTL to empty node {"commit": "78d3031", "node": "ip-10-0-147-5.ap-northeast-2.compute.internal"}
2022-04-10T10:34:01.000Z INFO controller.node Triggering termination after 30s for empty node {"commit": "78d3031", "node": "ip-10-0-147-5.ap-northeast-2.compute.internal"}
2022-04-10T10:34:01.031Z INFO controller.termination Cordoned node {"commit": "78d3031", "node": "ip-10-0-147-5.ap-northeast-2.compute.internal"}
2022-04-10T10:34:01.289Z INFO controller.termination Deleted node {"commit": "78d3031", "node": "ip-10-0-147-5.ap-northeast-2.compute.internal"}
```
Troubleshooting
Errors we might get and how to troubleshoot them.

UnauthorizedOperation

I got the following error due to a wrong setup of the `ServiceAccount`:

```
2022-03-15T14:59:22.984Z ERROR controller.provisioning Could not launch node, launching instances, getting launch template configs, getting launch templates, creating launch template, UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: u3BZrZUSziC6V7RjV0uPSiwFLaqJMFHwZIuPvJPp-p7sYQMP0EUtzbldDO7xSZD5XrdTLdFz_zxjzNhOPtnmenOgb-sEDxiMuClKprCuB4Gl2QxOJz97UYEFx5FB1Mks0qC-aoPrnMnDzKjdMlNdBsnj81UuFn9tnYBo3pIZG7A_Mwk7g6VotuOLDZl3sjZi15dyWQ6Roe1Htf_8uT2Wzw9n2WTF815us6SEA_3yh71xu4dEuPoT2CECj-RdbSgpcH6ZUi0UKDK3SRNvwLlibg8-dmvQriGC5eXRNRhQuMridNGIV4XmH1j8BN887nmgl023iZQ7XzV3Ge_5NY-MI0-Wsyk1gL57tm87n3CM23vGhTe1GIp2LvZMolVdE4IOgDn00DydMfI9VxD1KqJLNFuxwhzT0pv7JUYKR_i-H8ibmRgNAc-NXwDQBoEqk0OuSiTYBzYDsKyCgrtV4jwPHgOe9sC93GUegNJBrErc0sXSSxSR4pTqYu-Fw8CAPLKFzwph30rChTxlaONWfi-LyDpkpbV0i30soH-_vqJ244k49bs
```
The authorization failure message in the error is encoded, so we need to decode it to get more info using `decode-authorization-message`:

```
~ $ aws sts decode-authorization-message --encoded-message u3BZrZUSziC6V7RjV0uPSiwFLaqJMFHwZIuPvJPp-p7sYQMP0EUtzbldDO7xSZD5XrdTLdFz_zxjzNhOPtnmenOgb-sEDxiMuClKprCuB4Gl2QxOJz97UYEFx5FB1Mks0qC-aoPrnMnDzKjdMlNdBsnj81UuFn9tnYBo3pIG7A_Mwk7g6VotuOLDZl3sjZi15dyWQ6Roe1Htf_8uT2Wzw9n2WTF815us6SEA_3yh71xu4dEuPoT2CECj-RdbSgpcH6ZUi0UKDK3SRNvwLlibg8-dmvQriGC5eXRNRhQuMridNGIV4XmH1j8BN887nmgl023iZQ7XzV3Ge_5Y-MI0-Wsyk1gL57tm87n3CM23vGhTe1GIp2LvZMolVdE4IOgDn00DydMfI9VxD1KqJLNFuxwhzT0pv7JUYKR_i-H8ibmRgNAc-NXwDQBoEqk0OuSiTYBzYDsKyCgrtV4jwPHgOe9sC93GUegNJBrErc0sXSSxSR4pTqYu-FwCAPLKFzwph30rChTxlaONWfi-LyDpkpbV0i30soH-_vqJ244k49bs
{
    "DecodedMessage": "{\"allowed\":false,\"explicitDeny\":false,\"matchedStatements\":{\"items\":[]},\"failures\":{\"items\":[]},\"context\":{\"principal\":{\"id\":\"AROAZUFR7JW2VHL4MT3MS:i-0fa071d72e68e7a33\",\"arn\":\"arn:aws:sts::123456789012:assumed-role/eks-node-role-dev/i-0fa071d72e68e7a33\"},\"action\":\"ec2:CreateLaunchTemplate\",\"resource\":\"arn:aws:ec2:ap-northeast-2:123456789012:launch-template/*\",\"conditions\":{\"items\":[{\"key\":\"aws:Region\",\"values\":{\"items\":[{\"value\":\"ap-northeast-2\"}]}},{\"key\":\"aws:ID\",\"values\":{\"items\":[{\"value\":\"*\"}]}},{\"key\":\"aws:Service\",\"values\":{\"items\":[{\"value\":\"ec2\"}]}},{\"key\":\"aws:Resource\",\"values\":{\"items\":[{\"value\":\"launch-template/*\"}]}},{\"key\":\"aws:Type\",\"values\":{\"items\":[{\"value\":\"launch-template\"}]}},{\"key\":\"aws:Account\",\"values\":{\"items\":[{\"value\":\"123456789012\"}]}},{\"key\":\"aws:ARN\",\"values\":{\"items\":[{\"value\":\"arn:aws:ec2:ap-northeast-2:123456789012:launch-template/*\"}]}}]}}}"
}
```

The decoded message shows the call was made with the node role (`eks-node-role-dev`, via the instance profile) rather than the Karpenter IRSA role, and that `ec2:CreateLaunchTemplate` was not allowed.
Invalid instance profile

Error:

```
2022-03-24T17:16:36.430Z ERROR controller.provisioning Could not launch node, launching instances, with fleet error(s), InvalidParameterValue: Value (eks-node-role-dev) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name {"commit": "78d3031", "provisioner": "dev"}
```
The reason in my case is that the IAM role attached to the EC2 nodes was created by CDK, so its instance profile got a generated name, e.g. `eks-febfdbb4-6ab2-a865-acf8-2884edf78fdc`, not the role name. To fix this issue we can create a new instance profile named after the role and add the role to it:

```
$ aws iam create-instance-profile --instance-profile-name eks-node-role-dev
$ aws iam add-role-to-instance-profile --instance-profile-name eks-node-role-dev --role-name eks-node-role-dev
```
Karpenter improvements
- Designed to handle the full flexibility of the cloud: Karpenter has the ability to efficiently address the full range of instance types available through AWS. Cluster autoscaler was not originally built with the flexibility to handle hundreds of instance types, zones, and purchase options.
- Group-less node provisioning: Karpenter manages each instance directly, without using additional orchestration mechanisms like node groups. This enables it to retry in milliseconds instead of minutes when capacity is unavailable. It also allows Karpenter to leverage diverse instance types, availability zones, and purchase options without the creation of hundreds of node groups.
- Scheduling enforcement: Cluster autoscaler doesn't bind pods to the nodes it creates. Instead, it relies on the kube-scheduler to make the same scheduling decision after the node has come online. A node that Karpenter launches has its pods bound immediately. The kubelet doesn't have to wait for the scheduler or for the node to become ready; it can start preparing the container runtime immediately, including pre-pulling the image. This can shave seconds off of node startup latency.
Conclusion
At the time of writing this post, Karpenter is still adding and improving many features; check more here.
In practice, we can create multiple Karpenter provisioners for the following use cases:
- When different teams share a cluster and need to run their workloads on different worker nodes, with specific resource requests or a specific AMI such as Bottlerocket
- When creating multiple vclusters and we would like to separate resources between the vclusters

In general, using multiple provisioners makes sure that the most appropriate assets are available to each team.