Note

Do not reproduce without permission.



Kubernetes Healthcheck

Pod health can be checked with two kinds of probes: LivenessProbe and ReadinessProbe.

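  • LivenessProbe: determines whether the container is alive. If the LivenessProbe finds the container unhealthy, the kubelet kills the container and then handles it according to the container's restart policy. If a container does not define a LivenessProbe, the kubelet treats the probe result as always Success.
  • ReadinessProbe: determines whether the container has finished starting (is in the ready state) and can accept requests. If the ReadinessProbe fails, the Pod is marked as not ready and is removed from the endpoints of the Services that select it, so it stops receiving traffic. The container is not restarted.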
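
The difference between the two probes is easiest to see when both are declared on one container. The following is only a minimal sketch: the Pod name `probe-demo`, the `nginx:1.25` image, and the probed path `/` are assumptions chosen for illustration and do not come from this page.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo            # hypothetical name
  labels:
    app: probe-demo
spec:
  containers:
  - name: web
    image: nginx:1.25         # assumed image; any HTTP server listening on port 80 would do
    ports:
    - containerPort: 80
    livenessProbe:            # on failure the kubelet kills and restarts the container
      httpGet:
        path: /
        port: 80
    readinessProbe:           # on failure the Pod is marked not ready and leaves the Service endpoints
      httpGet:
        path: /
        port: 80
```

With a manifest like this, `kubectl get pod probe-demo` should show READY `0/1` while the readiness probe is failing, whereas the RESTARTS counter only grows when the liveness probe fails.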

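1) ExecAction: runs a command inside the container. If the command exits with return code 0, the container is considered healthy. In the example below, the first probe runs 15 seconds after the container starts and each probe times out after 1 second. A timeout counts as a failed probe, and the container is restarted once the failure threshold is reached.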

Code Block
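kind: ...
spec:
  containers:
  - name: ..
    livenessProbe:
      exec:
        # healthy while /tmp/health exists (cat exits with 0)
        command:
        - cat
        - /tmp/health
      initialDelaySeconds: 15   # wait 15 s after container start before the first probe
      timeoutSeconds: 1         # a probe that takes longer than 1 s counts as failed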
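
To watch an exec probe fail, the fragment above can be placed in a complete Pod whose container deletes the probed file after a while. This is a sketch under stated assumptions: the Pod name `liveness-exec-demo`, the `busybox:1.36` image, and the 30-second lifetime of `/tmp/health` are all made up for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec-demo    # hypothetical name
spec:
  containers:
  - name: demo
    image: busybox:1.36       # assumed image
    args:
    - /bin/sh
    - -c
    # create the file, keep it for 30 s, then remove it so the probe starts failing
    - touch /tmp/health; sleep 30; rm -f /tmp/health; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/health
      initialDelaySeconds: 15
      timeoutSeconds: 1
```

Because `periodSeconds` and `failureThreshold` are left at their defaults (10 seconds and 3), the restart should happen roughly a minute after the container starts. `kubectl describe pod liveness-exec-demo` lists the failed probes in its events.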

2) TCPSocketAction: performs a TCP check against the container's IP address and port. If a TCP connection can be established, the container is considered healthy.

Code Block
...
spec:
  containers:
  - name: 
  ...
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 30
      timeoutSeconds: 1

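3) HTTPGetAction: sends an HTTP GET request to the container's IP address, port, and path. If the response status code is greater than or equal to 200 and less than 400, the container is considered healthy. In the example below, the request goes to port 80 with path /_status/healthz. The target host is the Pod IP unless `host` is set.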

Code Block
...
spec:
  containers:
  - ports:
  ...
    livenessProbe:
      httpGet:
        path: /_status/healthz
        port: 80
      initialDelaySeconds: 30
      timeoutSeconds: 1

The fields have the following meanings:

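  • initialDelaySeconds: how long to wait before the first health check, in seconds, counted from the moment the container has started.

  • periodSeconds: how often the check runs. Defaults to 10 seconds; the minimum is 1 second.

  • timeoutSeconds: how long to wait for a response after the check sends its request. Defaults to 1 second; the minimum is 1 second. A timeout counts as a failed check. Once failureThreshold consecutive liveness checks have failed, the kubelet considers the container unable to serve and restarts it.

  • successThreshold: the number of consecutive successes required for the check to be considered successful again after a failure. Defaults to 1.

  • failureThreshold: the number of consecutive failures required for the check to be considered failed after a success. Defaults to 3.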

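  • Attributes of httpGet

    • host: host name or IP to connect to (defaults to the Pod IP)

    • scheme: connection scheme, HTTP or HTTPS. Defaults to HTTP

    • path: request path

    • httpHeaders: custom request headers

    • port: request port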
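
The fields above can be combined in a single probe. The fragment below is a sketch of a container-level `livenessProbe` with every tuning field written out explicitly. The header name `X-Probe-Source` and its value are hypothetical; the path and port reuse the HTTP example from earlier on this page.

```yaml
livenessProbe:
  httpGet:
    scheme: HTTP                # HTTP is the default; HTTPS is the other option
    path: /_status/healthz
    port: 80
    httpHeaders:
    - name: X-Probe-Source      # hypothetical header, shown only to illustrate the syntax
      value: kubelet
  initialDelaySeconds: 30       # first check 30 s after the container starts
  periodSeconds: 10             # then one check every 10 s
  timeoutSeconds: 1             # each check may take at most 1 s
  successThreshold: 1           # must be 1 for liveness probes
  failureThreshold: 3           # 3 consecutive failures trigger a restart
```

With these values, a container that stops responding is restarted after about failureThreshold × periodSeconds = 3 × 10 = 30 seconds of consecutive failures.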


The field definitions reported by kubectl are as follows:

Code Block
$ kubectl explain pod.spec.containers.livenessProbe
KIND:     Pod
VERSION:  v1

RESOURCE: livenessProbe <Object>

DESCRIPTION:
     Periodic probe of container liveness. Container will be restarted if the
     probe fails. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

     Probe describes a health check to be performed against a container to
     determine whether it is alive or ready to receive traffic.

FIELDS:
   exec	<Object>
     One and only one of the following should be specified. Exec specifies the
     action to take.

   failureThreshold	<integer>
     Minimum consecutive failures for the probe to be considered failed after
     having succeeded. Defaults to 3. Minimum value is 1.

   httpGet	<Object>
     HTTPGet specifies the http request to perform.

   initialDelaySeconds	<integer>
     Number of seconds after the container has started before liveness probes
     are initiated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

   periodSeconds	<integer>
     How often (in seconds) to perform the probe. Default to 10 seconds. Minimum
     value is 1.

   successThreshold	<integer>
     Minimum consecutive successes for the probe to be considered successful
     after having failed. Defaults to 1. Must be 1 for liveness. Minimum value
     is 1.

   tcpSocket	<Object>
     TCPSocket specifies an action involving a TCP port. TCP hooks not yet
     supported

   timeoutSeconds	<integer>
     Number of seconds after which the probe times out. Defaults to 1 second.
     Minimum value is 1. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

startupProbe

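startupProbe is a newer addition. It mainly addresses containers being restarted by the livenessProbe when the service takes a long time to start or has not yet started properly. While a startupProbe is defined, the liveness and readiness probes do not run until it has succeeded.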

Code Block
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox-lifecycles-nginx-sleep
spec:
  replicas: 2
  selector:
    matchLabels:
      app: busybox-lifecycles-nginx-sleep
      env-o: nginx-sleep
  template:
    metadata:
      labels:
        app: busybox-lifecycles-nginx-sleep
        env-o: nginx-sleep
      annotations:
        consul.hashicorp.com/connect-inject: "true"
    spec:
      terminationGracePeriodSeconds: 120
      containers:
      - image: slzcc/terminal-ctl:ubuntu-20.04
        imagePullPolicy: Always
        command:
          - nginx
          - -g
          - daemon off;
        name: busybox
        lifecycle:
          preStop:
            exec:
              # each argument is its own list item; a single string would be treated as the executable name
              command:
              - /bin/sh
              - -c
              - sleep 120
        livenessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - nc -z -v -n 127.0.0.1 80
          failureThreshold: 10
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - nc -z -v -n 127.0.0.1 80
          initialDelaySeconds: 5
          periodSeconds: 10
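        startupProbe:
          httpGet:
            path: /
            port: 801              # must be a port the container serves HTTP on; the other probes here check 80
          failureThreshold: 303    # startup budget: 303 * 10 s = 3030 s before the container is restarted
          periodSeconds: 10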
      restartPolicy: Always

---
apiVersion: v1
kind: Service
metadata:
  name: busybox-lifecycles-nginx-sleep
  labels:
    app: busybox-lifecycles-nginx-sleep
spec:
  ports:
   - name: http
     port: 80
     targetPort: 80
     protocol: TCP
  selector:
    app: busybox-lifecycles-nginx-sleep
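
A common way to size a startupProbe is to pick the longest startup time the service may need and derive failureThreshold from it. The container-level fragment below is a sketch that assumes a service answering HTTP on port 80 and a worst-case startup of about 5 minutes. Both figures are assumptions for illustration and are not taken from the manifest above.

```yaml
startupProbe:
  httpGet:
    path: /
    port: 80
  periodSeconds: 10
  failureThreshold: 30     # startup budget: 30 * 10 s = 300 s
livenessProbe:             # starts only after the startupProbe has succeeded
  httpGet:
    path: /
    port: 80
  periodSeconds: 10
  failureThreshold: 3      # once started, restart after about 30 s of consecutive failures
```

Splitting the two probes this way keeps the liveness check strict during normal operation while still leaving a slow-starting service enough time to come up.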