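Preface:
On a freshly installed Kubernetes cluster, audit logging for kube-apiserver is usually not enabled.
Audit logging is a common hardening measure for kube-apiserver: every request is audited, which helps secure the cluster. Audit logs also help with troubleshooting, because each request is recorded. If a malformed request makes the API return a 5xx error, we can pull the error straight out of the audit log and hand it to the developers to help them locate the problem.
One caveat deserves special attention: audit logging increases the memory consumption of the API server, because some context required for auditing is stored for each request. How much memory is consumed depends on the audit logging configuration, so an inappropriate policy that records everything can waste a lot of memory.
Audit records begin their lifecycle inside kube-apiserver. Each request generates an audit event at each stage of its execution; these events are pre-processed according to a policy and written to a backend. The policy determines what is recorded, and the backend persists the records. The current backends are log files and webhooks.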
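I,
How do we enable kube-apiserver audit logging correctly?
First, we need an audit policy file. It is written in YAML and defines which requests are audited and how much of each request is captured.
The audit feature is also described in detail in the official documentation: Auditing | Kubernetes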
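When is an event recorded?
Each request can be recorded with an associated stage. The defined stages are:
RequestReceived
- The stage for events generated as soon as the audit handler receives the request, before it is delegated down the handler chain.
ResponseStarted
- Generated once the response headers are sent, but before the response body is sent. Only long-running requests (for example watch) generate this stage.
ResponseComplete
- Generated when the response body has been completed and no more bytes will be sent.
Panic
- Generated when a panic occurs.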
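What is recorded?
The audit policy defines rules about which events should be recorded and what data they should include. The audit policy object structure is defined in the audit.k8s.io API group. When an event is processed, it is compared against the list of rules in order. The first matching rule sets the audit level of the event. The defined audit levels are:
None
- Events that match this rule are not logged.
Metadata
- Log request metadata (requesting user, timestamp, resource, verb, and so on), but not the request or response body.
Request
- Log event metadata and the request body, but not the response body. This does not apply to non-resource requests.
RequestResponse
- Log event metadata plus the request and response bodies. This does not apply to non-resource requests.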
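The first-match behaviour described above can be illustrated with a small Python sketch. This is a simplified model written for this article, not the kube-apiserver implementation: the Rule and Request classes and the audit_level function are hypothetical names, resources are matched by plain name only (the real policy also matches API groups, user groups, and non-resource URLs), and an empty field is treated as "match anything".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rule:
    """Simplified audit rule; an empty list means 'match anything'."""
    level: str
    users: List[str] = field(default_factory=list)
    verbs: List[str] = field(default_factory=list)
    resources: List[str] = field(default_factory=list)
    namespaces: List[str] = field(default_factory=list)


@dataclass
class Request:
    """The request attributes this sketch looks at."""
    user: str
    verb: str
    resource: str
    namespace: str = ""


def _field_matches(allowed: List[str], value: str) -> bool:
    # Values inside one field are alternatives (OR).
    return not allowed or value in allowed


def audit_level(rules: List[Rule], req: Request) -> str:
    """Return the level of the first rule whose fields all match (AND)."""
    for rule in rules:
        if (
            _field_matches(rule.users, req.user)
            and _field_matches(rule.verbs, req.verb)
            and _field_matches(rule.resources, req.resource)
            and _field_matches(rule.namespaces, req.namespace)
        ):
            return rule.level
    # No rule matched: the request is not logged.
    return "None"


# A cut-down rule list in the spirit of the policy in this article.
RULES = [
    Rule("None", users=["system:kube-proxy"], verbs=["watch"],
         resources=["endpoints", "services"]),
    Rule("Metadata", resources=["secrets", "configmaps"]),
    Rule("Request", verbs=["get", "list", "watch"]),
    Rule("RequestResponse"),
]

admin_list = Request("kubernetes-admin", "list", "configmaps", "kube-system")
print(audit_level(RULES, admin_list))  # Metadata
```

Because evaluation stops at the first match, moving the catch-all rule to the top of RULES would give every request that rule's level, which is why rule order matters so much in a real policy file.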
Below is a fairly standard audit policy definition:
apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for any request in the RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # The following requests were manually identified as high-volume and low-risk, so drop them.
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
      - group: "" # core
        resources: ["endpoints", "services"]
  - level: None
    users: ["system:unsecured"]
    namespaces: ["kube-system"]
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["configmaps"]
  - level: None
    users: ["kubelet"] # legacy kubelet identity
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["nodes"]
  - level: None
    userGroups: ["system:nodes"]
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["nodes"]
  - level: None
    users:
      - system:kube-controller-manager
      - system:kube-scheduler
      - system:serviceaccount:kube-system:endpoint-controller
    verbs: ["get", "update"]
    namespaces: ["kube-system"]
    resources:
      - group: "" # core
        resources: ["endpoints"]
  - level: None
    users: ["system:apiserver"]
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["namespaces"]
  # Don't log these read-only URLs.
  - level: None
    nonResourceURLs:
      - /healthz*
      - /version
      - /swagger*
  # Don't log events requests.
  - level: None
    resources:
      - group: "" # core
        resources: ["events"]
  # Secrets, ConfigMaps, and TokenReviews can contain sensitive and binary data,
  # so only log them at the Metadata level.
  - level: Metadata
    resources:
      - group: "" # core
        resources: ["secrets", "configmaps"]
      - group: authentication.k8s.io
        resources: ["tokenreviews"]
  # Read requests: log the request but not the response body.
  - level: Request
    verbs: ["get", "list", "watch"]
    resources:
      - group: "" # core
      - group: "admissionregistration.k8s.io"
      - group: "apps"
      - group: "authentication.k8s.io"
      - group: "authorization.k8s.io"
      - group: "autoscaling"
      - group: "batch"
      - group: "certificates.k8s.io"
      - group: "extensions"
      - group: "networking.k8s.io"
      - group: "policy"
      - group: "rbac.authorization.k8s.io"
      - group: "settings.k8s.io"
      - group: "storage.k8s.io"
  # Default level for known APIs.
  - level: RequestResponse
    resources:
      - group: "" # core
      - group: "admissionregistration.k8s.io"
      - group: "apps"
      - group: "authentication.k8s.io"
      - group: "authorization.k8s.io"
      - group: "autoscaling"
      - group: "batch"
      - group: "certificates.k8s.io"
      - group: "extensions"
      - group: "networking.k8s.io"
      - group: "policy"
      - group: "rbac.authorization.k8s.io"
      - group: "settings.k8s.io"
      - group: "storage.k8s.io"
      - group: "autoscaling.alibabacloud.com"
  # Default level for all other requests.
  - level: Metadata
Assuming the policy file is stored at /etc/kubernetes/logpolicy/sample-policy.yaml, audit logging now has to be enabled in the kube-apiserver manifest. The lines marked "# add" are the new ones:
cat /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 192.168.123.11:6443
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=192.168.123.11
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --audit-policy-file=/etc/kubernetes/logpolicy/sample-policy.yaml # add: absolute path to the audit policy file
    - --audit-log-path=/var/log/kubernetes/audit-logs/audit.log # add: absolute path to the audit log
    - --audit-log-maxsize=7 # add: in MB; once the log exceeds 7 MB it is rotated and a new file is started
    - --audit-log-maxbackup=2 # add: keep at most two rotated 7 MB log files
    - --enable-bootstrap-token-auth=true
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --secure-port=6443
    - --service-account-issuer=https://kubernetes.default.svc.cluster.local
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    image: registry.aliyuncs.com/google_containers/kube-apiserver:v1.23.15
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 192.168.123.11
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-apiserver
    readinessProbe:
      failureThreshold: 3
      httpGet:
        host: 192.168.123.11
        path: /readyz
        port: 6443
        scheme: HTTPS
      periodSeconds: 1
      timeoutSeconds: 15
    resources:
      requests:
        cpu: 250m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 192.168.123.11
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/kubernetes/logpolicy/sample-policy.yaml # add
      name: audit-policy # add
      readOnly: true # add
    - mountPath: /var/log/kubernetes/audit-logs # add
      name: audit-logs # add
      readOnly: false # add: must not be true, otherwise kube-apiserver cannot start
    - mountPath: /etc/pki
      name: etc-pki
      readOnly: true
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/pki
      type: DirectoryOrCreate
    name: etc-pki
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath: # add
      path: /etc/kubernetes/logpolicy/sample-policy.yaml # add
      type: File # add: use type File here
    name: audit-policy # add
  - hostPath: # add
      path: /var/log/kubernetes/audit-logs # add: no need to create this path by hand, it is created automatically
      type: DirectoryOrCreate # add: use type DirectoryOrCreate here
    name: audit-logs # add
status: {}
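To get a feel for how much disk the rotation flags allow, the arithmetic can be sketched in Python. This is a rough estimate under stated assumptions: rotated files are kept uncompressed, the active file grows to about the configured maximum before it is rotated, and both values are positive. The function name audit_log_disk_budget_mb is hypothetical.

```python
def audit_log_disk_budget_mb(maxsize_mb: int, maxbackup: int) -> int:
    """Approximate upper bound on audit log disk usage in MB.

    Counts the active log file plus the retained rotated files.
    Only positive values are handled in this sketch.
    """
    if maxsize_mb < 1 or maxbackup < 1:
        raise ValueError("this sketch only covers positive values")
    return maxsize_mb * (maxbackup + 1)


# Values from the manifest above: --audit-log-maxsize=7, --audit-log-maxbackup=2
print(audit_log_disk_budget_mb(7, 2))  # 21
```

With the settings used here, the audit log directory should stay at roughly 21 MB: two rotated files plus the file currently being written.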
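II,
Verifying the audit policy
The following command generates the relevant audit events:
kubectl get configmaps -n kube-system
Searching the log with grep and jq turns up no matching entries:
cat audit.log |grep configmaps|grep get |jq .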
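Why is there nothing to find? The policy above does contain this rule:
  - level: None
    users: ["system:unsecured"]
    namespaces: ["kube-system"]
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["configmaps"]
However, that rule only covers get requests made by the user system:unsecured. The more direct reason is that kubectl get configmaps without a resource name reads the whole collection, which the API server audits with the verb list rather than get, so grepping for get matches nothing. Searching for list finds the events: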
[root@k8s-master audit-logs]# cat audit.log |grep configmaps|grep list |jq
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "39290452-13a5-4547-a1f3-8bd214ad88b4",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/configmaps?limit=500",
  "verb": "list",
  "user": {
    "username": "kubernetes-admin",
    "groups": [
      "system:masters",
      "system:authenticated"
    ]
  },
  "sourceIPs": [
    "192.168.123.11"
  ],
  "userAgent": "kubectl/v1.23.15 (linux/amd64) kubernetes/b84cb8a",
  "objectRef": {
    "resource": "configmaps",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "metadata": {},
    "code": 200
  },
  "requestReceivedTimestamp": "2023-01-08T15:13:38.086276Z",
  "stageTimestamp": "2023-01-08T15:13:38.092581Z",
  "annotations": {
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": ""
  }
}
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "ae5f5d3a-9f26-46be-bfa8-f0889a0321c4",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/configmaps?limit=500",
  "verb": "list",
  "user": {
    "username": "kubernetes-admin",
    "groups": [
      "system:masters",
      "system:authenticated"
    ]
  },
  "sourceIPs": [
    "192.168.123.11"
  ],
  "userAgent": "kubectl/v1.23.15 (linux/amd64) kubernetes/b84cb8a",
  "objectRef": {
    "resource": "configmaps",
    "namespace": "default",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "metadata": {},
    "code": 200
  },
  "requestReceivedTimestamp": "2023-01-08T15:14:01.133134Z",
  "stageTimestamp": "2023-01-08T15:14:01.137090Z",
  "annotations": {
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": ""
  }
}
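Chained grep calls match any substring on the line, so a more precise alternative is to parse each event and compare the actual fields. The Python sketch below assumes the log backend writes one JSON event per line (the default JSON format); the function name filter_audit_events is hypothetical, and SAMPLE_LINES holds trimmed, illustrative events rather than real log output.

```python
import json
from typing import Iterable, Iterator


def filter_audit_events(lines: Iterable[str], verb: str, resource: str) -> Iterator[dict]:
    """Yield audit events whose verb and objectRef.resource match exactly."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not complete JSON
        object_ref = event.get("objectRef") or {}
        if event.get("verb") == verb and object_ref.get("resource") == resource:
            yield event


# Trimmed, illustrative events (not real log output).
SAMPLE_LINES = [
    '{"verb": "list", "requestURI": "/api/v1/configmaps?limit=500", '
    '"objectRef": {"resource": "configmaps", "apiVersion": "v1"}}',
    '{"verb": "get", "requestURI": "/readyz"}',
    'not json',
]

for event in filter_audit_events(SAMPLE_LINES, "list", "configmaps"):
    print(event["requestURI"])
```

Against a real cluster, the same function can be fed an open file handle for /var/log/kubernetes/audit-logs/audit.log instead of SAMPLE_LINES.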
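III,
Audit backends
Audit backends persist audit events to external storage. Out of the box, kube-apiserver provides two backends:
- Log backend, which writes events to the filesystem
- Webhook backend, which sends events to an external HTTP API
This article uses the Log backend, which only writes events to the filesystem. From there, the log file can be collected with Filebeat to build a complete logging pipeline.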
Audit policy summary:
1. Rules work as a whitelist: a request is logged only when the first rule that matches it has a level other than None. (Verified: with a catch-all None rule in front of a Metadata rule, nothing is logged; with the Metadata rule removed and only the None rules kept, nothing is logged either; with the Metadata rule in front of the None rules, logs are written.)
For example, this policy writes logs:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata
    resources:
      - group: "" # core
        resources: ["secrets", "configmaps"]
      - group: authentication.k8s.io
        resources: ["tokenreviews"]
  - level: None
  - level: None
    userGroups: ["system:kube-controller-manager"]
    nonResourceURLs:
      - "/api*" # wildcard match
      - "/version"
  - level: None
    resources:
      - group: "" # core
        resources: ["events"]
After swapping the positions of the Metadata and None rules, nothing is logged:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: None
  - level: None
    userGroups: ["system:kube-controller-manager"]
    nonResourceURLs:
      - "/api*" # wildcard match
      - "/version"
  - level: None
    resources:
      - group: "" # core
        resources: ["events"]
  - level: Metadata
    resources:
      - group: "" # core
        resources: ["secrets", "configmaps"]
      - group: authentication.k8s.io
        resources: ["tokenreviews"]
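2. In each rule, the leading field (level) is the result, like the output of a SELECT, and the fields after it are the conditions, like a WHERE clause. (Verified: look at the logged output.)
3. Within a rule, the conditions are ANDed: a request must satisfy all of them to be logged at that rule's level. The values listed inside a single field, such as several verbs, are alternatives.
4. rules is an array, and earlier entries take priority: a request is checked against the policy file from the top and gets the level of the first rule it matches.
5. omitStages can be set globally for the policy and also on individual rules. (Verification omitted.)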
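The interaction between a policy-wide omitStages and a per-rule omitStages can be sketched as a set union, which is how the audit.k8s.io API documentation describes it. The helper below is a hypothetical illustration, not kube-apiserver code; it returns the stages that remain allowed, and a given request may still skip some of them (ResponseStarted, for instance, only occurs for long-running requests).

```python
from typing import Iterable, List

# The four stages defined for audit events, in order.
STAGES = ("RequestReceived", "ResponseStarted", "ResponseComplete", "Panic")


def allowed_stages(policy_omit: Iterable[str], rule_omit: Iterable[str] = ()) -> List[str]:
    """Return the stages left after omitting the union of both lists."""
    omitted = set(policy_omit) | set(rule_omit)
    return [stage for stage in STAGES if stage not in omitted]


# Policy-wide omitStages as in the sample policy of this article.
print(allowed_stages(["RequestReceived"]))
# A matching rule that additionally omits ResponseStarted.
print(allowed_stages(["RequestReceived"], ["ResponseStarted"]))
```

Setting omitStages on a rule therefore only ever removes more stages for the requests that rule matches; it cannot bring back a stage that the policy omits globally.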