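With the bar for exploitation as low as it is today, those of us who build enterprise security (the bricklayers) should look beyond deploying a container-level runtime detection platform: enabling audit logging at the k8s api-server layer is also a cheap and effective way to detect intrusions.
Auditing and analyzing api-server logs makes it possible to detect attacker reconnaissance, as well as persistence techniques such as deploying a k8s CronJob backdoor or abusing RBAC for privilege escalation, and to raise alerts promptly.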
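- What happened?
- When did it happen?
- Who triggered it?
- Why did it happen?
- Where was it observed?
- Where was it triggered from?
- What consequences will it have?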
Audit log example (image from reference [5])

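1. When api-server is started from the command line, add the following flags
--audit-policy-file=/etc/kubernetes/audit/audit-default-policy.yaml # audit policy file
--audit-log-path=/data/log/audit/audit.log # audit log file written by kube-apiserver; here the logs are collected from a file on disk
--audit-log-maxbackup=10 # maximum number of rotated audit log files to keep
--audit-log-format=json # log format
--audit-log-maxage=10 # maximum number of days to keep old audit log files
--audit-log-maxsize=500 # maximum size of the audit log file in MB before it is rotated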
2. When the cluster is started with kubeadm
Edit the api-server manifest /etc/kubernetes/manifests/kube-apiserver.yaml and add the following
--audit-policy-file=/etc/kubernetes/audit/audit-default-policy.yaml
--audit-log-path=/data/log/audit/audit.log
--audit-log-maxbackup=10
--audit-log-format=json
--audit-log-maxage=10
--audit-log-maxsize=500
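With --audit-log-format=json, the api-server writes one JSON event per line to the file named by --audit-log-path. A minimal sketch of reading such a file is shown here; the function name `iter_audit_events` is hypothetical, and silently skipping malformed lines is an assumption about how a collector might want to behave.

```python
import json
from typing import Iterable, Iterator


def iter_audit_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per audit event, skipping blank or malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            # Assumption: a line cut off by log rotation is dropped, not fatal.
            continue
        # Audit events carry kind "Event"; anything else is ignored.
        if isinstance(event, dict) and event.get("kind") == "Event":
            yield event
```

Any iterable of strings works as input, so an open file handle on /data/log/audit/audit.log can be passed in directly.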
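2. Attack Behavior Detection
For those of us guarding security on the enterprise side, how to detect an attacker's intrusion effectively and promptly is a question that cannot be avoided. This article observes the attack behavior of CDK and lists some common intrusion detection rules from the perspective of k8s audit logging.
At present, the overall pipeline of our deployed solution for log audit analysis is: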



{ "kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "Metadata", "stage": "ResponseComplete", "requestURI": "/", "verb": "get", "user": { "username": "system:anonymous", "groups": [ "system:unauthenticated" ] }, "sourceIPs": [ "172.18.0.2" ], "userAgent": "Go-http-client/1.1", "responseStatus": { "metadata": {}, "status": "Failure", "reason": "Forbidden", "code": 403 }, "requestReceivedTimestamp": "2023-02-02T08:29:12.189459Z", "stageTimestamp": "2023-02-02T08:29:12.189553Z", "annotations": { "authorization.k8s.io/decision": "forbid", "authorization.k8s.io/reason": "" } }
2.2.2 Enumerating namespaces
{ "kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "Metadata", "auditID": "41770e7a-1827-4a14-860f-a812d3db1647", "stage": "ResponseComplete", "requestURI": "/api/v1/namespaces", "verb": "list", "user": { "username": "system:serviceaccount:test:default", "uid": "63b8dd88-88dd-4426-bdd1-7966906dc0d5", "groups": [ "system:serviceaccounts", "system:serviceaccounts:test", "system:authenticated" ] }, "sourceIPs": [ "172.18.0.2" ], "userAgent": "Go-http-client/1.1", "objectRef": { "resource": "namespaces", "apiVersion": "v1" }, "responseStatus": { "metadata": {}, "status": "Failure", "reason": "Forbidden", "code": 403 }, "requestReceivedTimestamp": "2023-02-02T08:29:12.205422Z", "stageTimestamp": "2023-02-02T08:29:12.205485Z", "annotations": { "authorization.k8s.io/decision": "forbid", "authorization.k8s.io/reason": "" } }
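2.2.3 Probing accessible APIs
1. userAgent: Go-http-client/1.1 # userAgent sent by CDK (Go's default HTTP client string); the primary signature here
2. responseStatus.code: 403 # status code returned by api-server when the default serviceaccount has no permission
3. requestURI: / # access to the root path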
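The three signatures above can be combined into one predicate over a parsed audit event. This is only a sketch: the function name `is_root_probe` is hypothetical, and because Go-http-client/1.1 is also what any unmodified Go HTTP client sends, a real rule would probably need an allowlist of known in-cluster clients.

```python
def is_root_probe(event: dict) -> bool:
    """Return True when an event matches all three probing signatures."""
    status = event.get("responseStatus") or {}
    return (
        event.get("userAgent") == "Go-http-client/1.1"
        and status.get("code") == 403
        and event.get("requestURI") == "/"
    )
```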
2.3 Exploitation

When a pod with special privileges such as privileged, SYS_ADMIN, host network, or host IPC is created, its audit log entry looks like this.
{ "kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "RequestResponse", "stage": "ResponseComplete", "requestURI": "/api/v1/namespaces/testpods/pods", "verb": "create", "user": { "username": "kubernetes-admin", "groups": [ "system:masters", "system:authenticated" ] }, "sourceIPs": [ "172.18.0.1" ], "userAgent": "kubectl1.16.15/v1.16.15 (darwin/amd64) kubernetes/2adc8d7", "objectRef": { "resource": "pods", "namespace": "testpods", "name": "testpod", "apiVersion": "v1" }, "responseStatus": { "metadata": {}, "code": 201 }, "responseObject": { "kind": "Pod", "apiVersion": "v1", "metadata": { "name": "testpod", "namespace": "testpods", "selfLink": "/api/v1/namespaces/testpods/pods/testpod", "uid": "e717d204-7e6d-4608-998b-648a8667e8e1", "resourceVersion": "13517", "creationTimestamp": "2023-02-02T09:51:10Z", "labels": { "creator": "zhiye", "team": "teamf" } }, "spec": { "volumes": [ { "name": "rootfs", "hostPath": { "path": "/", "type": "" } } ], "containers": [ { "name": "trpe", "image": "alpine", "command": [ "/bin/sh", "-c", "tail -f /dev/null" ], "resources": {}, "volumeMounts": [ { "name": "default-token-mm6s8", "readOnly": true, "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount" } ], "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "imagePullPolicy": "Always", "securityContext": { "capabilities": { "add": [ "SYS_ADMIN" ] }, "privileged": true } } ], "restartPolicy": "Always", "terminationGracePeriodSeconds": 30, "dnsPolicy": "ClusterFirst", "serviceAccountName": "default", "serviceAccount": "default", "hostNetwork": true, "hostPID": true, "hostIPC": true, "securityContext": {}, "schedulerName": "default-scheduler", "tolerations": [ { "key": "node.kubernetes.io/not-ready", "operator": "Exists", "effect": "NoExecute", "tolerationSeconds": 300 }, { "key": "node.kubernetes.io/unreachable", "operator": "Exists", "effect": "NoExecute", "tolerationSeconds": 300 } ], "priority": 0, "enableServiceLinks": true }, 
"status": { "phase": "Pending", "qosClass": "BestEffort" } }, "requestReceivedTimestamp": "2023-02-02T09:51:10.632436Z", "stageTimestamp": "2023-02-02T09:51:10.660958Z", "annotations": { "authorization.k8s.io/decision": "allow", "authorization.k8s.io/reason": "" } }
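As the log shows, for escapes caused by misconfiguration we can build alert rules on the following log fields.
responseObject.spec.volumes # detect sensitive volumes, e.g. whether docker.sock is mounted
responseObject.spec.containers.volumeMounts # detect sensitive mounts, e.g. whether docker.sock is mounted
responseObject.spec.containers.securityContext.capabilities.add # whether SYS_ADMIN is granted (a field nested this deep really deserves a complaint)
responseObject.spec.containers.securityContext.privileged # whether the container is privileged
responseObject.spec.hostNetwork # whether the host network is used
responseObject.spec.hostPID # whether the host PID namespace is used
responseObject.spec.hostIPC # whether the host IPC namespace (shared memory) is used
responseObject.spec.serviceAccount # whether a special serviceaccount is used; the default is default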
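The field list above can be sketched as a checker that walks the nested responseObject of a pod-creation event. The function name `risky_pod_findings` and the contents of `SENSITIVE_HOST_PATHS` are assumptions for illustration; the field paths are the ones listed above, and the event must have been logged at the RequestResponse level for responseObject to be present.

```python
from typing import List

# Assumed examples of host paths worth alerting on; tune for your environment.
SENSITIVE_HOST_PATHS = ("/", "/var/run/docker.sock", "/run/docker.sock")


def risky_pod_findings(event: dict) -> List[str]:
    """List the risky settings found in a pod-creation audit event."""
    obj = event.get("responseObject") or {}
    if event.get("verb") != "create" or obj.get("kind") != "Pod":
        return []
    spec = obj.get("spec") or {}
    findings = []
    # Host namespace sharing flags.
    for flag in ("hostNetwork", "hostPID", "hostIPC"):
        if spec.get(flag) is True:
            findings.append(flag)
    # hostPath volumes pointing at sensitive locations.
    for vol in spec.get("volumes") or []:
        path = (vol.get("hostPath") or {}).get("path")
        if path in SENSITIVE_HOST_PATHS:
            findings.append("hostPath:" + path)
    # Per-container security context.
    for ctr in spec.get("containers") or []:
        sc = ctr.get("securityContext") or {}
        name = str(ctr.get("name"))
        if sc.get("privileged") is True:
            findings.append("privileged:" + name)
        for cap in (sc.get("capabilities") or {}).get("add") or []:
            if cap == "SYS_ADMIN":
                findings.append("cap:SYS_ADMIN:" + name)
    return findings
```

An empty list means none of the checked settings were present; anything else can be attached to the alert as context.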
Some of the main signatures are listed below
requestURI: /api/v1/secrets
requestURI: /api/v1/configmaps
requestURI: /apis/policy/v1beta1/podsecuritypolicies
userAgent: Go-http-client/1.1
user.username: "system:anonymous"
responseStatus.code: 403
"verb": "list"
2.3.4 Privilege escalation
The log is as follows
{ "kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "RequestResponse", "auditID": "bfc643d6-8337-434e-9dec-ba41dd36bfa7", "stage": "ResponseComplete", "requestURI": "/api/v1/namespaces/kube-system/pods", "verb": "create", "user": { "username": "system:serviceaccount:test:default", "uid": "63b8dd88-88dd-4426-bdd1-7966906dc0d5", "groups": [ "system:serviceaccounts", "system:serviceaccounts:test", "system:authenticated" ] }, "sourceIPs": [ "172.18.0.3" ], "userAgent": "Go-http-client/1.1", "objectRef": { "resource": "pods", "namespace": "kube-system", "apiVersion": "v1" }, "responseStatus": { "metadata": {}, "status": "Failure", "reason": "Forbidden", "code": 403 }, "responseObject": { "kind": "Status", "apiVersion": "v1", "metadata": {}, "status": "Failure", "message": "pods is forbidden: User \"system:serviceaccount:test:default\" cannot create resource \"pods\" in API group \"\" in the namespace \"kube-system\"", "reason": "Forbidden", "details": { "kind": "pods" }, "code": 403 }, "requestReceivedTimestamp": "2023-02-03T09:56:24.061825Z", "stageTimestamp": "2023-02-03T09:56:24.061888Z", "annotations": { "authorization.k8s.io/decision": "forbid", "authorization.k8s.io/reason": "" } }
The following signatures can be derived from the above:
When the serviceaccount inside the pod has no permission
requestURI: /api/v1/namespaces/kube-system/pods
userAgent: Go-http-client/1.1
responseStatus.code: 403
When the serviceaccount inside the pod has permission
requestURI: /api/v1/namespaces/kube-system/pods
responseObject.metadata.selfLink: /api/v1/namespaces/kube-system/pods/cdk-rbac-bypass-create-pod
responseObject.spec.containers.args: *cat /run/secrets/kubernetes.io/serviceaccount/token*
verb: create
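The two cases above can be told apart with a small classifier. This is a sketch under assumptions: the function name and the two return labels are hypothetical, and checking `command` in addition to `args` for the token path goes slightly beyond the signature listed above.

```python
from typing import Optional

TOKEN_PATH = "/run/secrets/kubernetes.io/serviceaccount/token"
KUBE_SYSTEM_PODS = "/api/v1/namespaces/kube-system/pods"


def classify_kube_system_pod_create(event: dict) -> Optional[str]:
    """Classify a pod-creation request against kube-system, or return None."""
    if event.get("verb") != "create" or event.get("requestURI") != KUBE_SYSTEM_PODS:
        return None
    code = (event.get("responseStatus") or {}).get("code")
    # Case 1: the serviceaccount has no permission and the request is refused.
    if code == 403 and event.get("userAgent") == "Go-http-client/1.1":
        return "denied-attempt"
    # Case 2: the pod was created and one of its containers reads the token.
    spec = (event.get("responseObject") or {}).get("spec") or {}
    for ctr in spec.get("containers") or []:
        # Assumption: also inspect "command", not only "args".
        text = " ".join((ctr.get("command") or []) + (ctr.get("args") or []))
        if TOKEN_PATH in text:
            return "token-read-pod-created"
    return None
```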
2.3.5 Persistence
The source code defines it as follows (reference [6]):

objectRef.name: cdk-backdoor-daemonset
objectRef.namespace: kube-system
responseObject.metadata.selfLink: /apis/apps/v1/namespaces/kube-system/daemonsets/cdk-backdoor-daemonset
responseObject.spec.template.spec.volumes.hostPath.path: /
responseObject.spec.template.spec.containers.name: cdk-backdoor-pod
responseObject.spec.template.spec.containers.securityContext[capabilities|privileged]: as shown in the figure above
responseObject.spec.template.spec.[hostNetwork|hostPID]: true
Deploying a K8S CronJob
The CDK source code defines it as follows (reference [7]):

requestURI: /apis/batch/v1beta1/namespaces/kube-system/cronjobs
verb: create
objectRef.name: cdk-backdoor-cronjob
responseObject.metadata.name: cdk-backdoor-cronjob
responseObject.metadata.selfLink: /apis/batch/v1beta1/namespaces/kube-system/cronjobs/cdk-backdoor-cronjob
responseObject.spec.jobTemplate.spec.template.spec.containers.name: cdk-backdoor-cronjob-container
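The DaemonSet and CronJob signatures above both rely on CDK's default object names, so they can be sketched as one name-based rule. The function name `is_cdk_backdoor_create` is hypothetical. The name is looked up in both objectRef and responseObject.metadata on the assumption that objectRef.name may be missing for some create requests.

```python
# Default object names taken from the signatures listed above, keyed by resource.
BACKDOOR_NAMES = {
    "daemonsets": "cdk-backdoor-daemonset",
    "cronjobs": "cdk-backdoor-cronjob",
}


def is_cdk_backdoor_create(event: dict) -> bool:
    """Return True for a create request that uses a CDK default backdoor name."""
    if event.get("verb") != "create":
        return False
    ref = event.get("objectRef") or {}
    meta = (event.get("responseObject") or {}).get("metadata") or {}
    expected = BACKDOOR_NAMES.get(ref.get("resource"))
    if expected is None:
        return False
    # Either location of the name is accepted.
    return expected in (ref.get("name"), meta.get("name"))
```

Since the names are trivially changed by an attacker, a rule like this is best paired with the spec-level checks (hostPath, privileged, hostNetwork, hostPID).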
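Deploying a shadow k8s api-server
When the pod has sufficient privileges, persistence is maintained by creating a shadow api-server; see reference [4] for details
If the tool has not been customized, the following fields can be detected through k8s log auditing
objectRef.name: *-shadow-*
responseObject.metadata.labels.component: kube-apiservershadow
responseObject.spec.containers.command: "--secure-port=9444"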
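A sketch of matching the three fields above, using a shell-style wildcard for the name pattern. The function name `shadow_apiserver_hits` is hypothetical; it returns which signatures matched so that the alert threshold (one hit or several) can be decided elsewhere.

```python
from fnmatch import fnmatchcase
from typing import List


def shadow_apiserver_hits(event: dict) -> List[str]:
    """Return the shadow api-server signatures that an audit event matches."""
    ref = event.get("objectRef") or {}
    obj = event.get("responseObject") or {}
    labels = (obj.get("metadata") or {}).get("labels") or {}
    containers = (obj.get("spec") or {}).get("containers") or []
    hits = []
    # objectRef.name: *-shadow-*
    if fnmatchcase(ref.get("name") or "", "*-shadow-*"):
        hits.append("name")
    # responseObject.metadata.labels.component
    if labels.get("component") == "kube-apiservershadow":
        hits.append("label")
    # responseObject.spec.containers.command contains the shadow port flag
    if any(
        "--secure-port=9444" in arg
        for ctr in containers
        for arg in (ctr.get("command") or [])
    ):
        hits.append("secure-port")
    return hits
```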
2.3.6 Summary
Running ./cdk kcurl default get 'https://10.96.0.1:443/api/v1/nodes' produces the following log:
{ "kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "Metadata", "auditID": "418cafa5-2c1e-4fbf-b086-3d68e321d2bb", "stage": "ResponseComplete", "requestURI": "/api/v1/nodes", "verb": "list", "user": { "username": "system:serviceaccount:test:default", "uid": "63b8dd88-88dd-4426-bdd1-7966906dc0d5", "groups": [ "system:serviceaccounts", "system:serviceaccounts:test", "system:authenticated" ] }, "sourceIPs": [ "172.18.0.3" ], "userAgent": "Go-http-client/1.1", "objectRef": { "resource": "nodes", "apiVersion": "v1" }, "responseStatus": { "metadata": {}, "code": 200 }, "requestReceivedTimestamp": "2023-02-06T09:51:51.790260Z", "stageTimestamp": "2023-02-06T09:51:51.791075Z", "annotations": { "authorization.k8s.io/decision": "allow", "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"defaultadmin\" of ClusterRole \"cluster-admin\" to ServiceAccount \"default/test\"" } }
In the log, annotations.authorization.k8s.io/reason gives the reason the request was allowed. We can build alert rules on the following fields:
When the serviceaccount has permission:
annotations.authorization.k8s.io/reason
annotations.authorization.k8s.io/decision
userAgent: Go-http-client/1.1
responseStatus.code: 200
When it has no permission:
userAgent: Go-http-client/1.1
responseStatus.code: 403
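The fields above can be sketched as a rule that separates the allowed case from the denied one and carries the authorization reason into the alert. The function name and the "high"/"medium" severity labels are assumptions for illustration, not part of the audit log format.

```python
from typing import Optional


def classify_go_client_request(event: dict) -> Optional[dict]:
    """Build an alert dict for Go-http-client requests, or return None."""
    if event.get("userAgent") != "Go-http-client/1.1":
        return None
    code = (event.get("responseStatus") or {}).get("code")
    ann = event.get("annotations") or {}
    decision = ann.get("authorization.k8s.io/decision")
    reason = ann.get("authorization.k8s.io/reason", "")
    user = (event.get("user") or {}).get("username")
    if code == 200 and decision == "allow":
        # The serviceaccount had permission; the reason names the RBAC binding.
        return {"severity": "high", "user": user, "reason": reason}
    if code == 403:
        # The serviceaccount had no permission; still a probing attempt.
        return {"severity": "medium", "user": user, "reason": reason}
    return None
```

For the allowed case, the reason string shows which ClusterRoleBinding granted access, which is usually the first thing to review.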
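2.4.2 Summary
API (Aggregation API) SSRF vulnerability.
An APIService can forward client requests to an arbitrary URL, which means that credentials carried in a client's request may end up being sent to a third party.
Audit logs can be used to monitor the responseStatus.code field and determine whether a redirect occurred, by checking for the following values:
responseStatus.code: 302
responseStatus.code: 301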
3.2 Creating a pod from a non-compliant image
verb: create
level: RequestResponse
responseObject.kind: Pod
requestObject.spec.containers.image: image registry address
3.3 Command execution in a pod
objectRef.subresource: exec
objectRef.subresource: attach
userAgent
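A sketch of extracting exec and attach activity from an audit event. The function name `pod_exec_info` is hypothetical. It also reads the executed command from the `command` query parameters of requestURI, which is how exec requests normally carry it; that extra field goes beyond the three fields listed above.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def pod_exec_info(event: dict) -> Optional[dict]:
    """Summarize a pod exec/attach audit event, or return None."""
    ref = event.get("objectRef") or {}
    if ref.get("resource") != "pods":
        return None
    if ref.get("subresource") not in ("exec", "attach"):
        return None
    # Repeated "command" parameters hold the argv of the executed command.
    query = parse_qs(urlsplit(event.get("requestURI") or "").query)
    return {
        "user": (event.get("user") or {}).get("username"),
        "namespace": ref.get("namespace"),
        "pod": ref.get("name"),
        "subresource": ref.get("subresource"),
        "command": query.get("command", []),
        "userAgent": event.get("userAgent"),
    }
```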
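4. Pitfalls encountered in practice
The solution is to skip deep parsing for objects whose fields are inconsistent. (Alternatively, use a storage backend such as HDFS and parse the fields at query time.)
2. The log volume is too large, causing high disk read/write I/O on the api-server
Keep tuning the log rules in audit.yaml, and do not log nodes/status, pods/status, coordination.k8s.io/leases, and similar resources.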
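One way to read "no deep parsing" is to keep the top-level audit fields structured and store the nested payloads as opaque JSON strings. The sketch below assumes the inconsistent objects are requestObject and responseObject, whose shape varies by resource type; the function name and the `OPAQUE_FIELDS` tuple are hypothetical.

```python
import json

# Assumption: these are the payloads whose fields differ between resource types.
OPAQUE_FIELDS = ("requestObject", "responseObject")


def flatten_for_indexing(event: dict) -> dict:
    """Return a copy of the event with nested payloads stored as JSON strings."""
    flat = dict(event)  # shallow copy; the input event is left unchanged
    for field in OPAQUE_FIELDS:
        if field in flat:
            flat[field] = json.dumps(flat[field], sort_keys=True, separators=(",", ":"))
    return flat
```

The stringified payload can still be searched by keyword, and parsed back with `json.loads` when a rule needs a specific nested field.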
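Of course, since CDK is an open-source tool, all of these signatures can be evaded by replacing keywords. The author therefore believes the real effort belongs in day-to-day work: strengthen k8s baseline controls, for example by avoiding over-privileged serviceaccounts and restricting permitted docker images through admission policies, and deploy a container runtime intrusion detection platform. Only when security capabilities cover every stage can the cluster stay secure and stable.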
[2] https://www.cdxy.me/?p=839
[3] https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/
[4] https://discuss.kubernetes.io/t/security-advisory-cve-2022-3172-aggregated-api-server-can-cause-clients-to-be-redirected-ssrf/21322
[5] https://github.com/tencentyun/qcloud-documents/blob/master/product/%E5%AD%98%E5%82%A8%E4%B8%8ECDN/%E6%97%A5%E5%BF%97%E6%9C%8D%E5%8A%A1/%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5/TKE%20%E5%AE%A1%E8%AE%A1%E6%97%A5%E5%BF%97%E5%88%86%E6%9E%90.md
[6] https://github.com/cdk-team/CDK/blob/main/pkg/exploit/k8s_backdoor_daemonset.go#LL35-L87C2
[7] https://github.com/cdk-team/CDK/blob/main/pkg/exploit/k8s_cronjob.go#LL34-L59C2