First of all, why do we need this? I'm building a feature that lets users download a large amount of data as an Excel file. We already have a back-office server that handles Sidekiq background jobs, but we cannot simply push a flood of heavy jobs onto the same instance. We could launch one more instance (horizontal scaling), but that would waste money because users don't use this feature regularly. Use Lambda? I don't think it is suitable this time, because the complex business logic already lives in our Ruby on Rails code base. So we have the criteria below:
- Reuse the Rails source code.
- We need extra resources, but only for a short period of time. Something that works like AWS Lambda would be great.
We came to the conclusion to use an AWS EKS Job: the Rails source code is containerized and shipped to the EKS cluster as a Kubernetes Job. It cannot be a CronJob, though, because the run time depends on the users: we need to launch the job when a user clicks the download button. The job is deleted after it succeeds, so we save resources as well as cost.
Prepare the ConfigMap for the Namespace
By default, EKS creates a ConfigMap named aws-auth in the kube-system namespace, which maps IAM roles to Kubernetes usernames and groups. We get this ConfigMap with the command below.
kubectl get configmap -n kube-system aws-auth -o yaml > aws-auth.yaml
There are two validation steps when a Step Function calls EKS: the request first authenticates against AWS IAM, and is then authorized inside the cluster. Reading the aws-auth ConfigMap, we see the username is “system:node:{{SessionName}}”.
apiVersion: v1
data:
  mapRoles: |
    - groups:
        - system:bootstrappers
        - system:nodes
        - system:node-proxier
      rolearn: arn:aws:iam::xxx:role/eksctl-xxx-fargate-xxx-xxx
      username: system:node:{{SessionName}}
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {}
  creationTimestamp: "xxx-xx-xxT15:22:58Z"
  name: aws-auth
  namespace: kube-system
  resourceVersion: "xxxxxx"
  uid: xx-xxx-xxx-xxx-xxxx
Role and Permissions for the Step Function
We select the rolearn above for the Step Function, since this role already has permissions on the cluster. In addition, the role must also be granted permissions inside the cluster via Kubernetes Role-Based Access Control (RBAC). We create the run_job Role below in the kube-system namespace.
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: kube-system
  name: run_job
rules:
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
Now we have a user and a role, and we want to bind the role to the user. However, the username is “system:node:{{SessionName}}”, and the session name changes regularly, so we bind the role to a Group instead.
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: bind-run-job
  namespace: kube-system
subjects:
  - kind: Group
    name: system:nodes
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: run_job
  apiGroup: rbac.authorization.k8s.io
If we look at the ConfigMap again, we can see that the Fargate role and its username are in the system:nodes group. We bind the RBAC Role to this group, so it applies to all roles and users in the group.
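With the Role and RoleBinding saved to files, applying and verifying them could look like the sketch below (the file names and the test session name are our own choices, not part of the setup above):

kubectl apply -f run-job-role.yaml
kubectl apply -f bind-run-job.yaml
# Verify that a member of system:nodes may create jobs in kube-system
kubectl auth can-i create jobs -n kube-system \
  --as system:node:test --as-group system:nodes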
Finally, we create the Step Function.
{
  "Comment": "A description of my state machine",
  "StartAt": "EKS RunJob",
  "States": {
    "EKS RunJob": {
      "Type": "Task",
      "Resource": "arn:aws:states:::eks:runJob.sync",
      "Parameters": {
        "ClusterName": "XXX",
        "Namespace": "kube-system",
        "CertificateAuthority": "==",
        "Endpoint": "https://XXX.gr7.ap-southeast-x.eks.amazonaws.com",
        "Job": {
          "apiVersion": "batch/v1",
          "kind": "Job",
          "metadata": {
            "name": "hello",
            "namespace": "kube-system"
          },
          "spec": {
            "template": {
              "spec": {
                "containers": [
                  {
                    "name": "hello",
                    "image": "XXX.dkr.ecr.ap-southeast-x.amazonaws.com/XXX:XXX-dirty",
                    "command": [
                      "/bin/sh",
                      "-c",
                      "date; rake greeting"
                    ]
                  }
                ],
                "restartPolicy": "OnFailure"
              }
            }
          }
        }
      },
      "End": true
    }
  }
}
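The job has to start when a user clicks the download button, so the Rails back office needs to start an execution of this state machine. Below is a minimal sketch using the aws-sdk-states gem; the controller, the environment variable name, and the input shape are illustrative assumptions, not part of the setup above.

# Gemfile: gem "aws-sdk-states"
require "aws-sdk-states"

class ExportsController < ApplicationController
  # Hypothetical environment variable holding the state machine ARN
  STATE_MACHINE_ARN = ENV.fetch("EXPORT_STATE_MACHINE_ARN")

  def create
    client = Aws::States::Client.new(region: "ap-southeast-1")
    client.start_execution(
      state_machine_arn: STATE_MACHINE_ARN,
      # Execution names must be unique, so include the user id and a timestamp
      name: "export-#{current_user.id}-#{Time.now.to_i}",
      input: { user_id: current_user.id }.to_json
    )
    head :accepted
  end
end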
In the instructions above we used the kube-system namespace. If the Fargate profile of that namespace does not have a NAT Gateway, we can use another suitable namespace instead. In that case we need to check the aws-auth ConfigMap for the new namespace; if it is missing, we should fetch the ConfigMap from kube-system, change the namespace, and apply it to the new one.
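A minimal sketch of that procedure, assuming the new namespace is called exports (the name is our own placeholder):

kubectl get configmap -n kube-system aws-auth -o yaml > aws-auth.yaml
# Edit aws-auth.yaml: set metadata.namespace to exports and remove the
# server-generated fields (resourceVersion, uid, creationTimestamp)
kubectl apply -n exports -f aws-auth.yaml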
Furthermore, we may change backoffLimit to a suitable number if the image is big and the pod needs more time to provision.
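For illustration, this is where backoffLimit sits in the Job spec (the value 10 and the image tag are placeholders):

apiVersion: batch/v1
kind: Job
metadata:
  name: fun-job
spec:
  backoffLimit: 10      # retry the pod up to 10 times before marking the Job failed
  template:
    spec:
      containers:
        - name: fun-job
          image: XXX.dkr.ecr.ap-southeast-x.amazonaws.com/XXX:XXX
      restartPolicy: OnFailure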
Pod’s Resources
In case the tasks require more resources, we request the minimum resources and set limits. The limits key is mandatory; without it we would receive the error “your pod's cpu/memory requirements exceed the max fargate configuration”. EKS scales the pod vertically up to the limits.
{
  "containers": [
    {
      "name": "fun-job",
      "image": "999999.dkr.ecr.ap-southeast-1.amazonaws.com/xxx:job",
      "command.$": "$.Command",
      "resources": {
        "requests": {
          "memory": "1Gi",
          "cpu": "250m"
        },
        "limits": {
          "memory": "4Gi",
          "cpu": "2000m"
        }
      }
    }
  ]
}
Note that the cpu values only accept CPU units such as millicores; we cannot set 2Gi. A value like 2000m is accepted and corresponds to two vCPUs (Fargate rounds the pod up to the nearest supported CPU/memory configuration).
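Also note that "command.$": "$.Command" above takes the container command from the state machine's execution input instead of hardcoding it, so the caller decides which task runs. The execution input could look like this (the rake task name is illustrative):

{
  "Command": ["/bin/sh", "-c", "rake export:excel"]
}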
Auto Clean Job
We want to clean up the jobs after they finish, to save cost. Kubernetes provides a TTL mechanism for finished Jobs:
https://kubernetes.io/docs/concepts/workloads/controllers/job/#ttl-mechanism-for-finished-jobs
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl
spec:
  ttlSecondsAfterFinished: 100
  template:
    spec:
      containers:
        - name: pi
          image: perl:5.34.0
          command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
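To watch the cleanup happen (assuming the manifest is saved as pi-with-ttl.yaml):

kubectl apply -f pi-with-ttl.yaml
kubectl get jobs --watch    # pi-with-ttl disappears about 100 seconds after it completes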
Workflow
[Figure: overall workflow of the Step Function launching the EKS Job.]
Reference:
- We followed this instruction: https://aws.amazon.com/blogs/containers/introducing-aws-step-functions-integration-with-amazon-eks/
- Step Functions sample projects: https://us-west-2.console.aws.amazon.com/states/home?region=us-west-2#/sampleProjects