
Running KIAM on EKS

EKS is Amazon’s managed Kubernetes offering. Running Kubernetes on AWS is great! We can dynamically deploy all of the workloads that we want on Kubernetes and still use managed services like RDS or ElastiCache. Best of both worlds!

Naturally, in order for our pods running on Kubernetes to interact with AWS services, we want to assign those pods AWS IAM roles. For this purpose, uSwitch has created a tool called KIAM, which uses an agent/server model to intercept calls to the AWS API from individual pods, request credentials from AWS STS on behalf of the pod making the call, and pass those credentials back to the pod, allowing it to assume an IAM role.

This is a tutorial on configuring KIAM and using it on EKS.

KIAM server requirements:

The KIAM server is the process that requests STS credentials on behalf of the pods. KIAM is designed to run on your Kubernetes master nodes. EKS manages the Kubernetes master processes for you, which is wonderful and cost effective, but it prevents users from deploying their own pods to master nodes. We therefore recommend standing up one or more dedicated instances specifically to host the KIAM server pods.

Requirements for the EC2 instance autoscaling group

The goal here is to create a set of dedicated EC2 instances for the purpose of hosting the KIAM server processes.

  • Bootstrap arguments to label and taint the instances. The code below can be added to the instances' user data:

    /etc/eks/bootstrap.sh ${EKSClusterName} ${BootstrapArguments} \
      --kubelet-extra-args '--node-labels=kiam-server=true --register-with-taints=kiam-server=false:NoExecute'

After adding this code to the instances' user data, the steps below will ensure that the instances can host the KIAM server process.
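For illustration, the user data above could be wired into a Terraform launch configuration for the autoscaling group. This is a sketch only: the AMI ID, instance type, and cluster name are placeholders, and it reuses the server_node instance profile defined later in this post.

```hcl
# Hypothetical launch configuration for the KIAM server autoscaling group.
# The AMI ID, instance type, and cluster name are placeholders for your environment.
resource "aws_launch_configuration" "kiam_server" {
  name_prefix          = "kiam-server-"
  image_id             = "<EKS_OPTIMIZED_AMI_ID>"
  instance_type        = "t3.medium"
  iam_instance_profile = "${aws_iam_instance_profile.server_node.name}"

  # Bootstrap script labels and taints the node so only KIAM server pods land on it
  user_data = <<EOF
#!/bin/bash
/etc/eks/bootstrap.sh <EKS_CLUSTER_NAME> \
  --kubelet-extra-args '--node-labels=kiam-server=true --register-with-taints=kiam-server=false:NoExecute'
EOF
}
```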

  • The security group must allow access on port 443 from all nodes in the EKS cluster.
  • Create an IAM role and instance profile for the KIAM server nodes. Below is Terraform code that creates an appropriate role, policy, and instance profile for the instances.
  • The role must have a trust relationship with the EC2 service. In the code below this is created by the assume_role_policy document, which allows EC2 instances to assume the role.
  • The role policy allows the sts:AssumeRole action on the kiam_server role (created later in this post).

# IAM Role
resource "aws_iam_role" "server_node" {
 name        = "server_node"
 description = "Role for the Kiam Server instance profile"

 assume_role_policy = <<EOF
{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Effect": "Allow",
     "Principal": { "Service": "ec2.amazonaws.com"},
     "Action": "sts:AssumeRole"
   }
 ]
}
EOF
}

# IAM Role Policy
resource "aws_iam_role_policy" "server_node" {
 name = "server_node"
 role = "${aws_iam_role.server_node.name}"

 policy = <<EOF
{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Effect": "Allow",
     "Action": [
       "sts:AssumeRole"
     ],
     "Resource": "${aws_iam_role.kiam_server.arn}"
   }
 ]
}
EOF
}

# IAM Instance Profile
resource "aws_iam_instance_profile" "server_node" {
 name = "server_node"
 role = "${aws_iam_role.server_node.name}"
}
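The port 443 requirement above can be sketched as a Terraform security group rule. Note that both security group resource names here are assumptions for your cluster setup:

```hcl
# Hypothetical ingress rule: allow the EKS worker nodes to reach the
# KIAM server nodes on port 443. Both security group references are placeholders.
resource "aws_security_group_rule" "kiam_server_ingress" {
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = "${aws_security_group.kiam_server_nodes.id}"
  source_security_group_id = "${aws_security_group.eks_worker_nodes.id}"
}
```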

Finally, update the EKS aws-auth ConfigMap to allow the KIAM server node role to join the cluster. This is required for any EC2 instances that you want to add to your EKS cluster. Below is a Kubernetes config file that adds the role we created above to the aws-auth ConfigMap in Kubernetes.

---
apiVersion: v1
kind: ConfigMap
metadata:
 name: aws-auth
 namespace: kube-system
data:
 mapRoles: |
   # This is the role created when the EKS cluster is created.
   - rolearn: arn:aws:iam::<ACCOUNT_NUMBER>:role/eksctl-user-test-20190715-nodegro-NodeInstanceRole-LV49BLRVFQT3
      username: system:node:{{EC2PrivateDNSName}}
     groups:
       - system:bootstrappers
       - system:nodes
    # The following section is added to allow your KIAM server nodes to join the EKS cluster
    - rolearn: arn:aws:iam::<ACCOUNT_NUMBER>:role/server_node
      username: system:node:{{EC2PrivateDNSName}}
     groups:
       - system:bootstrappers
       - system:nodes

Create KIAM Server DaemonSet

The KIAM server DaemonSet is the set of gatekeeper pods that request STS credentials on behalf of the worker pods. Below is a config file that creates an appropriate DaemonSet, but first there are a few important prerequisites and points to review.

The KIAM server pods also require an AWS IAM role that they use to assume other roles on behalf of individual worker pods. Below is Terraform code that creates an appropriate role for the KIAM server; as written, it allows the KIAM server to assume any role in your AWS account.

# IAM Role
resource "aws_iam_role" "kiam_server" {
 name        = "kiam-server"
 description = "Role the Kiam Server process assumes"

 assume_role_policy = <<EOF
{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Sid": "",
     "Effect": "Allow",
     "Principal": {
       "Service": "ec2.amazonaws.com"
     },
     "Action": "sts:AssumeRole"
   },
   {
     "Sid": "",
     "Effect": "Allow",
     "Principal": {
       "AWS": "${aws_iam_role.server_node.arn}"
     },
     "Action": "sts:AssumeRole"
   }
 ]
}
EOF
}

# IAM Policy
resource "aws_iam_policy" "kiam_server_policy" {
 name = "kiam_server_policy"
 description = "Policy for the Kiam Server process"

 policy = <<EOF
{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Effect": "Allow",
     "Action": [
       "sts:AssumeRole"
     ],
     "Resource": "*"
   }
 ]
}
EOF
}

# IAM Policy Attachment
resource "aws_iam_policy_attachment" "server_policy_attach" {
 name       = "kiam-server-attachment"
 roles      = ["${aws_iam_role.kiam_server.name}"]
 policy_arn = "${aws_iam_policy.kiam_server_policy.arn}"
}
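Granting sts:AssumeRole on Resource "*" is convenient but broad. If your pod roles share a name prefix, the policy can be scoped more tightly; the kiam- prefix below is an assumption, and you would attach this policy in place of kiam_server_policy:

```hcl
# Hypothetical scoped alternative: the KIAM server may only assume roles
# whose names match the kiam- prefix (adjust the pattern to your naming scheme).
resource "aws_iam_policy" "kiam_server_policy_scoped" {
  name        = "kiam_server_policy_scoped"
  description = "Scoped policy for the Kiam Server process"

  policy = <<EOF
{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Effect": "Allow",
     "Action": ["sts:AssumeRole"],
     "Resource": "arn:aws:iam::<ACCOUNT_NUMBER>:role/kiam-*"
   }
 ]
}
EOF
}
```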

Before creating the KIAM server DaemonSet, create the server RBAC cluster roles and bindings. The official project provides a config file that creates the appropriate ServiceAccount, ClusterRole, and ClusterRoleBinding.

Below is a configuration file that creates an appropriate DaemonSet. There are several important points to check to be sure that your KIAM server DaemonSet will work on your infrastructure:

  • Update the SSL certs volume to match the relevant OS distribution.
  • The nodeSelector must match the kiam-server=true label applied in the bootstrap arguments.
  • The toleration allows the pods to bypass the node taint.
  • Default logging is not very verbose, so we have raised the log level to debug to increase verbosity.
  • Create the kiam-server-tls secret: generate a CA and certificates for the server, then create the secret as documented in docs/TLS.md.
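As a sketch of the secret step, once the CA and server certificates are generated per docs/TLS.md, the kiam-server-tls secret could look like the following. The base64 values are placeholders for your generated files, and the agent's kiam-agent-tls secret is analogous:

```yaml
# Hypothetical Secret holding the KIAM server TLS material; replace the
# placeholder values with the base64-encoded contents of your generated files.
apiVersion: v1
kind: Secret
metadata:
  name: kiam-server-tls
  namespace: kube-system
data:
  server.pem: <BASE64_SERVER_CERT>
  server-key.pem: <BASE64_SERVER_KEY>
  ca.pem: <BASE64_CA_CERT>
```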

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
 namespace: kube-system
 name: kiam-server
spec:
 selector:
   matchLabels:
     app: kiam
     role: server
 template:
   metadata:
     annotations:
       prometheus.io/scrape: "true"
       prometheus.io/port: "9620"
     labels:
       app: kiam
       role: server
   spec:
     serviceAccountName: kiam-server
     nodeSelector:
       kiam-server: "true"
     tolerations:
     - key: kiam-server
       value: "false"
       effect: NoExecute
       operator: Equal
     volumes:
       - name: ssl-certs
         hostPath:
           # for AWS linux or RHEL distros
           path: /etc/pki/ca-trust/extracted/pem/
           # debian or ubuntu distros
           # path: /etc/ssl/certs
           # path: /usr/share/ca-certificates
       - name: tls
         secret:
           secretName: kiam-server-tls
     containers:
       - name: kiam
         image: quay.io/uswitch/kiam:v3.2 # USE A TAGGED RELEASE IN PRODUCTION
         imagePullPolicy: Always
         command:
           - /kiam
         args:
           - server
           - --json-log
           - --level=debug
           - --bind=0.0.0.0:443
           - --cert=/etc/kiam/tls/server.pem
           - --key=/etc/kiam/tls/server-key.pem
           - --ca=/etc/kiam/tls/ca.pem
           - --role-base-arn-autodetect
           - --assume-role-arn=arn:aws:iam::<ACCOUNT_NUMBER>:role/kiam-server
           - --sync=1m
           - --prometheus-listen-addr=0.0.0.0:9620
           - --prometheus-sync-interval=5s
         volumeMounts:
           - mountPath: /etc/ssl/certs
             name: ssl-certs
           - mountPath: /etc/kiam/tls
             name: tls
         livenessProbe:
           exec:
             command:
             - /kiam
             - health
             - --cert=/etc/kiam/tls/server.pem
             - --key=/etc/kiam/tls/server-key.pem
             - --ca=/etc/kiam/tls/ca.pem
             - --server-address=127.0.0.1:443
             - --gateway-timeout-creation=1s
             - --timeout=5s
           initialDelaySeconds: 10
           periodSeconds: 10
           timeoutSeconds: 10
         readinessProbe:
           exec:
             command:
             - /kiam
             - health
             - --cert=/etc/kiam/tls/server.pem
             - --key=/etc/kiam/tls/server-key.pem
             - --ca=/etc/kiam/tls/ca.pem
             - --server-address=127.0.0.1:443
             - --gateway-timeout-creation=1s
             - --timeout=5s
           initialDelaySeconds: 3
           periodSeconds: 10
           timeoutSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
 name: kiam-server
 namespace: kube-system
spec:
 clusterIP: None
 selector:
   app: kiam
   role: server
 ports:
 - name: grpclb
   port: 443
   targetPort: 443
   protocol: TCP

Create the KIAM agent DaemonSet

The KIAM agent DaemonSet deploys the pods that run on every other Kubernetes worker node; they intercept calls to the AWS metadata API and route them through to the KIAM server.

The two DaemonSets require separate sets of certificates to ensure that the agents can communicate securely with the server.

  • Create the kiam-agent-tls secret.
  • Create certificates for the agent.
  • Create the secret as documented in docs/TLS.md.

Below is a configuration file for the KIAM agent DaemonSet. Be sure that the following sections match your specific infrastructure:

  • No specific nodeSelector or toleration is necessary; if you add any for other reasons, just make sure the agents do not run on the KIAM server nodes.
  • Update the SSL certs volume to match the relevant OS distribution.
  • The AWS_METADATA_SERVICE_* environment variables tune the SDK's metadata timeouts and retries so credential requests are more resilient.
  • The host-interface flag is set to !eth0, which is appropriate for the AWS VPC CNI.

apiVersion: apps/v1
kind: DaemonSet
metadata:
 namespace: kube-system
 name: kiam-agent
spec:
 selector:
   matchLabels:
     app: kiam
     role: agent
 template:
   metadata:
     annotations:
       prometheus.io/scrape: "true"
       prometheus.io/port: "9620"
     labels:
       app: kiam
       role: agent
   spec:
     hostNetwork: true
     dnsPolicy: ClusterFirstWithHostNet
     nodeSelector: {}
     volumes:
       - name: ssl-certs
         hostPath:
           # for AWS linux or RHEL distros
           path: /etc/pki/ca-trust/extracted/pem/
           # debian or ubuntu distros
           # path: /etc/ssl/certs
           # path: /usr/share/ca-certificates
       - name: tls
         secret:
           secretName: kiam-agent-tls
       - name: xtables
         hostPath:
           path: /run/xtables.lock
           type: FileOrCreate
     containers:
       - name: kiam
         securityContext:
           capabilities:
             add: ["NET_ADMIN"]
         image: quay.io/uswitch/kiam:v3.2 # USE A TAGGED RELEASE IN PRODUCTION
         imagePullPolicy: Always
         command:
           - /kiam
         args:
           - agent
           - --iptables
           - --host-interface=!eth0
           - --json-log
           - --port=8181
           - --cert=/etc/kiam/tls/agent.pem
           - --key=/etc/kiam/tls/agent-key.pem
           - --ca=/etc/kiam/tls/ca.pem
           - --server-address=kiam-server:443
           - --prometheus-listen-addr=0.0.0.0:9620
           - --prometheus-sync-interval=5s
           - --gateway-timeout-creation=1s
         env:
           - name: HOST_IP
             valueFrom:
               fieldRef:
                 fieldPath: status.podIP
           - name: AWS_METADATA_SERVICE_TIMEOUT
             value: "5"
           - name: AWS_METADATA_SERVICE_NUM_ATTEMPTS
             value: "5"
         volumeMounts:
           - mountPath: /etc/ssl/certs
             name: ssl-certs
           - mountPath: /etc/kiam/tls
             name: tls
           - mountPath: /var/run/xtables.lock
             name: xtables
         livenessProbe:
           httpGet:
             path: /ping
             port: 8181
           initialDelaySeconds: 3
           periodSeconds: 3

Create the pod

Finally, we are ready to deploy a pod that uses an AWS IAM role.

Below is our final piece of Terraform code, creating a role and policy that allow your worker pod to interact with an S3 bucket. Once again, the trust relationship defined in the assume_role_policy is important.

  • Keep in mind, the pod IAM role must contain a trust relationship with the kiam-server role.
  • The pod policy is the place to finally grant specific permissions to the process or service acting in AWS.

# IAM Role
resource "aws_iam_role" "pod_role" {
name        = "kiam-pod"
description = "Role the Kiam Pod process assumes"

assume_role_policy = <<EOF
{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Sid": "",
     "Effect": "Allow",
     "Principal": {
       "Service": "ec2.amazonaws.com"
     },
     "Action": "sts:AssumeRole"
   },
   {
     "Sid": "",
     "Effect": "Allow",
     "Principal": {
       "AWS": "${aws_iam_role.kiam_server.arn}"
     },
     "Action": "sts:AssumeRole"
   }
 ]
}
EOF
}

# IAM Policy
resource "aws_iam_policy" "pod_policy" {
name = "kiam_pod_policy"
description = "Policy for the Kiam application pod process"

policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
 {
   "Effect": "Allow",
   "Action": [
     "s3:*"
   ],
   "Resource": ["arn:aws:s3:::mission-k8s-test1",
     "arn:aws:s3:::mission-k8s-test1/*"
     ]
 }
]
}
EOF
}

# IAM Policy Attachment
resource "aws_iam_policy_attachment" "pod_policy_attach" {
name       = "kiam-pod-attachment"
roles      = ["${aws_iam_role.pod_role.name}"]
policy_arn = "${aws_iam_policy.pod_policy.arn}"
}

The _namespace_ of your worker pods must contain an annotation for KIAM that specifies which roles are allowed in that namespace. Below is a namespace configuration file that allows pods to assume any role. It’s important to note that the role matching is a regular expression.

apiVersion: v1
kind: Namespace
metadata:
 name: application1
 annotations:
   iam.amazonaws.com/permitted: ".*"
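Because the annotation value is a regular expression, it can also be scoped more tightly. For example, this hypothetical variant permits only roles whose names begin with kiam-:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: application1
  annotations:
    # Only roles matching this pattern may be assumed by pods in this namespace
    iam.amazonaws.com/permitted: "kiam-.*"
```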

Finally, the pod spec must contain an annotation with the specific IAM role ARN. Below is a sample deployment that assumes a role simply named kiam-pod:

---
apiVersion: apps/v1
kind: Deployment
metadata:
 name: app1
 labels:
   app: app1
 namespace: application1
spec:
 replicas: 1
 selector:
   matchLabels:
     app: app1
 template:
   metadata:
     labels:
       app: app1
     annotations:
       iam.amazonaws.com/role: arn:aws:iam::<ACCOUNT_NUMBER>:role/kiam-pod
   spec:
     containers:
     - name: example-pod
       image: bambooengineering/ubuntu-awscli:1.16.169
       command: [ "/bin/bash", "-c", "--" ]
        # Sleep forever to allow a user to exec into the container and test IAM permissions
       args: [ "while true; do sleep 30; done;" ]

If everything is configured correctly (no small if), the pod should be able to perform all of the actions granted to the kiam-pod role. The setup is a bit complicated, but the end result is a much stronger security posture that fits the model of Kubernetes deployments. Hooray!

* Since this post was written, AWS has added support to assign IAM Permissions to Kubernetes Service Accounts. Learn more here: https://aws.amazon.com/about-aws/whats-new/2019/09/amazon-eks-adds-support-to-assign-iam-permissions-to-kubernetes-service-accounts/


Author Spotlight:

Luke Reimer
