
K8S: Understanding StatefulSet - Deploying Stateful Applications




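What Are Stateful and Stateless Services?

Stateful service:
A stateful service maintains and tracks user state or session information while handling requests. State persists across requests, and later requests depend on it. Such a service typically keeps user data in memory, a database, or other persistent storage, and uses that state to process subsequent requests. It usually needs session management and state synchronization so that user state is handled correctly.

Stateless service:
A stateless service keeps no user state or session information while handling requests. Each request is treated as an independent operation: the service neither stores nor depends on the state of earlier requests, and it produces its output from the current request's input alone. Stateless services are simpler and easier to scale because there is no user state to track, and any request can be load-balanced to any instance in the cluster.

Summary:

A stateful service maintains and uses user state or session information, and needs state synchronization and session management.
A stateless service maintains no user state or session information; every request is independent.

Put even more simply, a stateful service has to solve the problem of keeping user data in sync across service instances.
A stateless service does not, so it can add instances freely as QPS grows, whereas a stateful service cannot.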




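Can a K8S Deployment Only Be Used for Stateless Services?

A Deployment in Kubernetes is mainly used to manage updates and scaling of stateless applications, but it can also manage stateful ones. A Deployment is a Kubernetes resource object that describes an application's rollout declaratively, and its controller makes sure the desired number of Pod replicas is running in the cluster.

The Deployment controller's main features are rolling updates and rollbacks. Stateless applications are usually deployed and updated with a Deployment because their instances can be replaced smoothly without affecting any application state.

A stateful application can nevertheless be deployed with a Deployment, particularly when it needs horizontal scaling and rolling updates. With a suitable volume and PersistentVolumeClaim configured, its data survives Pod replacement. Note, however, that every replica of a Deployment references the same PersistentVolumeClaim, so the replicas share one volume rather than each getting its own.

For stateful applications Kubernetes also provides the StatefulSet controller, built specifically to deploy and manage them. A StatefulSet gives each Pod a unique identifier and a stable network identity. It suits applications that need ordered deployment, per-Pod persistent storage, and stable network identities.

In short, a Deployment is the usual choice for stateless applications and can also manage stateful ones, while a StatefulSet offers stateful applications additional features and guarantees.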




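StatefulSet Is the Better Fit for Deploying Stateful Services

The reasons are as follows:

  1. Stable network identifiers: a StatefulSet assigns each Pod a unique identifier built from an ordinal, for example myapp-0, myapp-1, and so on. These identifiers stay the same when a Pod is recreated, so a stateful application keeps a stable network identity. This matters for applications that rely on fixed identifiers, such as primary/replica roles in a database cluster or node IDs in a distributed system.

  2. Ordered deployment and scaling: a StatefulSet deploys and scales in order. When it scales up, Kubernetes creates the new Pods one at a time in ordinal order, and by default each Pod must be Running and Ready before the next one is created. This matters for some stateful applications, for example a database cluster with primary/replica replication, where the primary must start before the replicas.

  3. Persistent storage: a StatefulSet can be combined with PersistentVolumes and PersistentVolumeClaims so that a stateful application keeps its data when a Pod is recreated. Each Pod can be bound to its own persistent volume, which keeps the data durable and reliable.

  4. Management of stateful applications: a StatefulSet provides features for managing stateful applications. For example, the rollingUpdate strategy controls how a rolling update proceeds and keeps it stable. A StatefulSet also supports ordered deletion, as well as service discovery and DNS resolution for its Pods.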
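Because the ordinal is part of the Pod name, an application can derive its own role from its hostname, which for a StatefulSet Pod is the Pod name. The following is a minimal sketch of that idea, not something the StatefulSet does for you. The helper names `pod_ordinal` and `pod_role` and the rule "ordinal 0 is the primary" are assumptions made for illustration only.

```shell
# Extract the ordinal from a StatefulSet Pod name such as "myapp-2".
# (hypothetical helper for this sketch)
pod_ordinal() {
  printf '%s\n' "${1##*-}"
}

# Assumed convention for this sketch only: ordinal 0 is the primary,
# every other ordinal is a replica.
pod_role() {
  if [ "$(pod_ordinal "$1")" -eq 0 ]; then
    echo primary
  else
    echo replica
  fi
}

pod_role myapp-0   # prints "primary"
pod_role myapp-1   # prints "replica"
```

Inside a running Pod the same functions could be called as `pod_role "$(hostname)"`.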




A StatefulSet Deployment Example: Creation

First write a YAML file,
stateful-nginx-without-pvc.yaml:

---
apiVersion: v1 # api version
kind: Service # type of this resource e.g. Pod/Deployment ..
metadata:
  name: nginx-stateful-service # name of the service
  labels:
    app: nginx-stateful-service
spec:
  ports: 
  - port: 80 # port of the service, used to access the service
    name: web-port
  clusterIP: None # headless Service: no cluster IP is allocated, DNS returns the Pod IPs directly
  selector: # label of the Pod that the Service is selecting
    app: nginx-stateful # a Service selector lists labels directly, without matchLabels
---


apiVersion: apps/v1
kind: StatefulSet # it's for a stateful application, it's a controller
metadata:
  name: nginx-statefulset # name of the statefulset
  labels:
    app: nginx-stateful
spec: # detail description
  serviceName: "nginx-stateful-service" # name of the headless Service that governs the Pods' DNS records,
                                         # must be the same as the service name defined above
  replicas: 3 # desired replica count
  selector: # label of the Pod that the StatefulSet is managing
    matchLabels:
      app: nginx-stateful
  template: # Pod template
    metadata:
      labels:
        app: nginx-stateful
    spec:
      containers:
      - name: nginx-container
        image: nginx:1.25.4 # image of the container
        ports: # the ports of the container and they will be exposed
        - containerPort: 80 # the port used by the container service
          name: web-port

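The YAML file above creates two resources:
one StatefulSet
one Service

There is no PVC or data volume, however, so strictly speaking the service deployed above is still stateless. This part focuses only on creating the StatefulSet, so the PVC is left out for now.

As for why a Service is still needed:

the key is the serviceName field below

  serviceName: "nginx-stateful-service" # name of the headless Service that governs the Pods' DNS records,
                                         # must be the same as the service name defined above

The serviceName field gives every Pod in the StatefulSet a stable network identity. When the StatefulSet creates Pods, each Pod's name ends with its ordinal index, and each Pod gets a unique DNS name of the form pod-name.service-name.namespace.svc.cluster.local.

By specifying serviceName, the StatefulSet is tied to the headless Service, and every Pod gets a DNS record under that Service's name. Pods can then reach one another through these stable names without knowing each other's IP addresses, which change whenever a Pod is recreated.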
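The naming rule can be written down as a small shell function. This is only a sketch: `pod_dns_name` is a hypothetical helper, and the `cluster.local` suffix assumes the cluster uses the default cluster domain.

```shell
# Build the DNS name of one StatefulSet Pod.
# usage: pod_dns_name <statefulset> <ordinal> <service> [namespace]
# Without a namespace the short form is printed, which only resolves
# from Pods in the same namespace as the Service.
pod_dns_name() {
  if [ -n "${4:-}" ]; then
    # Assumes the default cluster domain "cluster.local".
    printf '%s-%s.%s.%s.svc.cluster.local\n' "$1" "$2" "$3" "$4"
  else
    printf '%s-%s.%s\n' "$1" "$2" "$3"
  fi
}

pod_dns_name nginx-statefulset 0 nginx-stateful-service
# prints nginx-statefulset-0.nginx-stateful-service
pod_dns_name nginx-statefulset 0 nginx-stateful-service default
# prints nginx-statefulset-0.nginx-stateful-service.default.svc.cluster.local
```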




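After applying this file, you can see that the Pods are created sequentially: pod 0 first, then pod 1, and only then pod 2. This is a major difference from how a Deployment creates its Pods!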


[gateman@manjaro-x13 statefulsets]$ kubectl apply -f stateful-nginx-without-pvc.yaml 
service/nginx-stateful-service created
statefulset.apps/nginx-statefulset created
[gateman@manjaro-x13 statefulsets]$ bash show.sh 
+ kubectl get pvc
No resources found in default namespace.
+ kubectl get svc
NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)           AGE
bq-api-service-1         NodePort    10.103.40.130   <none>        32111:30604/TCP   15h
kubernetes               ClusterIP   10.96.0.1       <none>        443/TCP           76d
nginx-stateful-service   ClusterIP   None            <none>        80/TCP            6s
+ kubectl get sts
NAME                READY   AGE
nginx-statefulset   1/3     6s
+ kubectl get pods
NAME                                        READY   STATUS              RESTARTS   AGE
bq-api-service-deployment-978b76fcf-4qcsp   1/1     Running             0          5h20m
bq-api-service-deployment-978b76fcf-7x54w   1/1     Running             0          5h20m
bq-api-service-deployment-978b76fcf-xfxcw   1/1     Running             0          5h20m
nginx-statefulset-0                         1/1     Running             0          7s
nginx-statefulset-1                         0/1     ContainerCreating   0          5s
[gateman@manjaro-x13 statefulsets]$ 
[gateman@manjaro-x13 statefulsets]$ bash show.sh 
+ kubectl get pvc
No resources found in default namespace.
+ kubectl get svc
NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)           AGE
bq-api-service-1         NodePort    10.103.40.130   <none>        32111:30604/TCP   15h
kubernetes               ClusterIP   10.96.0.1       <none>        443/TCP           76d
nginx-stateful-service   ClusterIP   None            <none>        80/TCP            15s
+ kubectl get sts
NAME                READY   AGE
nginx-statefulset   2/3     15s
+ kubectl get pods
NAME                                        READY   STATUS              RESTARTS   AGE
bq-api-service-deployment-978b76fcf-4qcsp   1/1     Running             0          5h20m
bq-api-service-deployment-978b76fcf-7x54w   1/1     Running             0          5h20m
bq-api-service-deployment-978b76fcf-xfxcw   1/1     Running             0          5h20m
nginx-statefulset-0                         1/1     Running             0          16s
nginx-statefulset-1                         1/1     Running             0          14s
nginx-statefulset-2                         0/1     ContainerCreating   0          7s
[gateman@manjaro-x13 statefulsets]$ bash show.sh 
+ kubectl get pvc
No resources found in default namespace.
+ kubectl get svc
NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)           AGE
bq-api-service-1         NodePort    10.103.40.130   <none>        32111:30604/TCP   15h
kubernetes               ClusterIP   10.96.0.1       <none>        443/TCP           76d
nginx-stateful-service   ClusterIP   None            <none>        80/TCP            52s
+ kubectl get sts
NAME                READY   AGE
nginx-statefulset   3/3     52s
+ kubectl get pods
NAME                                        READY   STATUS    RESTARTS   AGE
bq-api-service-deployment-978b76fcf-4qcsp   1/1     Running   0          5h20m
bq-api-service-deployment-978b76fcf-7x54w   1/1     Running   0          5h20m
bq-api-service-deployment-978b76fcf-xfxcw   1/1     Running   0          5h20m
nginx-statefulset-0                         1/1     Running   0          53s
nginx-statefulset-1                         1/1     Running   0          51s
nginx-statefulset-2                         1/1     Running   0          44s

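Key point: unlike the Deployment's Pods listed above, which are created in no particular order and whose names end in a meaningless random string, the StatefulSet's Pods are created strictly in order, and each Pod name ends with a numeric index.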




Testing the Service

Once the Service has been created,
the Pod DNS names cannot be resolved directly from the nodes by default, for example:

root@k8s-master:~# ping nginx-statefulset-0.nginx-stateful-service
ping: nginx-statefulset-0.nginx-stateful-service: Name or service not known
root@k8s-master:~# ping nginx-statefulset-0.nginx-stateful-service.default.svc.cluster.local
ping: nginx-statefulset-0.nginx-stateful-service.default.svc.cluster.local: Name or service not known

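As mentioned above, the Service in this example exists to give the Pods stable DNS names.
However, neither
the short form: $podName.$serviceName
nginx-statefulset-0.nginx-stateful-service

nor the full form: $podName.$serviceName.$nameSpace.svc.cluster.local
nginx-statefulset-0.nginx-stateful-service.default.svc.cluster.local

can be resolved on k8s-master.

The reason is that these names are served by the cluster's DNS and can only be resolved from inside Pods in the cluster!

To test this, we temporarily create a Pod running busybox: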

kubectl run -it --image busybox:1.28 dns-test --restart=Never /bin/sh

Note that busybox ships without bash and curl by default, but it does include ping and nslookup.

Now the ping succeeds:


/ # ping nginx-statefulset-0.nginx-stateful-service
PING nginx-statefulset-0.nginx-stateful-service (10.244.2.109): 56 data bytes
64 bytes from 10.244.2.109: seq=0 ttl=64 time=14.367 ms
64 bytes from 10.244.2.109: seq=1 ttl=64 time=0.093 ms

nslookup results
Inside a container in the cluster, the DNS server is 10.96.0.10 (kube-dns.kube-system.svc.cluster.local).

It resolves nginx-statefulset-0.nginx-stateful-service
to 10.244.2.109 nginx-statefulset-0.nginx-stateful-service.default.svc.cluster.local

10.244.2.109 is that Pod's IP, and the Pod is given the DNS name nginx-statefulset-0.nginx-stateful-service.default.svc.cluster.local

/ # nslookup nginx-statefulset-0.nginx-stateful-service
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      nginx-statefulset-0.nginx-stateful-service
Address 1: 10.244.2.109 nginx-statefulset-0.nginx-stateful-service.default.svc.cluster.local
/ # nslookup nginx-statefulset-1.nginx-stateful-service
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      nginx-statefulset-1.nginx-stateful-service
Address 1: 10.244.3.62 nginx-statefulset-1.nginx-stateful-service.default.svc.cluster.local
/ # nslookup nginx-statefulset-2.nginx-stateful-service
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      nginx-statefulset-2.nginx-stateful-service
Address 1: 10.244.1.53 nginx-statefulset-2.nginx-stateful-service.default.svc.cluster.local
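The three lookups above can be run in one loop from the busybox shell. This is a small sketch: `lookup_pods` is a hypothetical helper, and the `RESOLVER` variable exists only so the loop can be dry-run with `echo`; it defaults to `nslookup`.

```shell
# Look up every Pod of a StatefulSet by its short DNS name.
# usage: lookup_pods <statefulset> <service> <replicas>
# RESOLVER is an assumption of this sketch, not a busybox or kubectl setting.
lookup_pods() {
  i=0
  while [ "$i" -lt "$3" ]; do
    "${RESOLVER:-nslookup}" "$1-$i.$2"
    i=$((i + 1))
  done
}

# Inside the busybox Pod:
#   lookup_pods nginx-statefulset nginx-stateful-service 3
# Dry run that only prints the names that would be resolved:
#   RESOLVER=echo; lookup_pods nginx-statefulset nginx-stateful-service 3
```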

From inside the container, the nginx service can be reached through that DNS name:

/ # wget http://nginx-statefulset-0.nginx-stateful-service.default.svc.cluster.local
Connecting to nginx-statefulset-0.nginx-stateful-service.default.svc.cluster.local (10.244.2.109:80)
index.html           100% |******************************************************************************************************************************************************************|   615   0:00:00 ETA
/ # cat index.html 
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>




Scaling a StatefulSet: Scale Up and Scale Down

There are two commands:

kubectl scale statefulset $statefulsetName --replicas=5
# another method
kubectl patch statefulset $statefulsetName -p '{"spec":{"replicas":5}}'

Either command can scale up or down.

For example,
use the first command to change replicas from 3 to 10:

[gateman@manjaro-x13 statefulsets]$ kubectl scale statefulset nginx-statefulset --replicas=10
statefulset.apps/nginx-statefulset scaled
[gateman@manjaro-x13 statefulsets]$ 
[gateman@manjaro-x13 statefulsets]$ kubectl describe sts nginx-statefulset
Name:               nginx-statefulset
Namespace:          default
CreationTimestamp:  Sat, 22 Jun 2024 19:48:11 +0800
Selector:           app=nginx-stateful
Labels:             app=nginx-stateful
Annotations:        <none>
Replicas:           10 desired | 9 total
Update Strategy:    RollingUpdate
  Partition:        0
Pods Status:        9 Running / 1 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  app=nginx-stateful
  Containers:
   nginx-container:
    Image:         nginx:1.25.4
    Port:          80/TCP
    Host Port:     0/TCP
    Environment:   <none>
    Mounts:        <none>
  Volumes:         <none>
  Node-Selectors:  <none>
  Tolerations:     <none>
Volume Claims:     <none>
Events:
  Type    Reason            Age                From                    Message
  ----    ------            ----               ----                    -------
  Normal  SuccessfulCreate  36s (x2 over 98s)  statefulset-controller  create Pod nginx-statefulset-3 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  35s (x2 over 97s)  statefulset-controller  create Pod nginx-statefulset-4 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  34s                statefulset-controller  create Pod nginx-statefulset-5 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  32s                statefulset-controller  create Pod nginx-statefulset-6 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  31s                statefulset-controller  create Pod nginx-statefulset-7 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  29s                statefulset-controller  create Pod nginx-statefulset-8 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  28s                statefulset-controller  create Pod nginx-statefulset-9 in StatefulSet nginx-statefulset successful

The describe output confirms that the desired replica count is now 10 (9 running and 1 still being created),
and that the Pods are created in order.
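The ordering on scale-up, and the mirror-image ordering on scale-down, can be modeled in a few lines of shell. This sketch only prints a plan and never talks to the cluster. `scale_plan` is a hypothetical helper, and it assumes the StatefulSet's default ordered Pod management, under which Pods are created from the lowest new ordinal upward and removed from the highest ordinal downward.

```shell
# Print the order in which Pods would be created or deleted.
# usage: scale_plan <statefulset> <current_replicas> <desired_replicas>
scale_plan() {
  name=$1 current=$2 desired=$3
  if [ "$desired" -gt "$current" ]; then
    # Scale up: lowest new ordinal first.
    i=$current
    while [ "$i" -lt "$desired" ]; do
      echo "create $name-$i"
      i=$((i + 1))
    done
  else
    # Scale down: highest ordinal first.
    i=$((current - 1))
    while [ "$i" -ge "$desired" ]; do
      echo "delete $name-$i"
      i=$((i - 1))
    done
  fi
}

scale_plan nginx-statefulset 3 5   # create ...-3, then create ...-4
scale_plan nginx-statefulset 5 3   # delete ...-4, then delete ...-3
```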




Rolling Updates of a StatefulSet

As with a Deployment, a rolling update can be triggered with set image:

[gateman@manjaro-x13 bq-api-service]$ kubectl set image statefulset/nginx-statefulset nginx-container=nginx:1.26.1
statefulset.apps/nginx-statefulset image updated

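The events in the describe output show that the rolling update also follows the Pods' ordinals, but in reverse: it starts at the highest-numbered Pod and works down: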

Events:
  Type    Reason            Age                From                    Message
  ----    ------            ----               ----                    -------
  Normal  SuccessfulCreate  13m (x2 over 23m)  statefulset-controller  create Pod nginx-statefulset-9 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  13m (x2 over 23m)  statefulset-controller  create Pod nginx-statefulset-8 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulDelete  13m                statefulset-controller  delete Pod nginx-statefulset-7 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  13m (x2 over 23m)  statefulset-controller  create Pod nginx-statefulset-7 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulDelete  12m                statefulset-controller  delete Pod nginx-statefulset-6 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  12m (x2 over 23m)  statefulset-controller  create Pod nginx-statefulset-6 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  12m (x2 over 24m)  statefulset-controller  create Pod nginx-statefulset-5 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulDelete  12m                statefulset-controller  delete Pod nginx-statefulset-5 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulDelete  12m (x2 over 24m)  statefulset-controller  delete Pod nginx-statefulset-4 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  12m (x3 over 25m)  statefulset-controller  create Pod nginx-statefulset-4 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulDelete  12m (x2 over 24m)  statefulset-controller  delete Pod nginx-statefulset-3 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  12m (x3 over 25m)  statefulset-controller  create Pod nginx-statefulset-3 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulDelete  12m                statefulset-controller  delete Pod nginx-statefulset-2 in StatefulSet nginx-statefulset successful
  Normal  SuccessfulCreate  12m (x2 over 3h)   statefulset-controller  create Pod nginx-statefulset-2 in StatefulSet nginx-statefulset successful
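The events above show the Pods being replaced from the highest ordinal downward. The describe output earlier also listed `Partition: 0` under the RollingUpdate strategy: only Pods whose ordinal is greater than or equal to the partition are updated. The sketch below models that rule. `rolling_update_order` is a hypothetical helper that just prints names and does not contact the cluster.

```shell
# Print the Pods a RollingUpdate would replace, in order.
# usage: rolling_update_order <statefulset> <replicas> [partition]
# The partition defaults to 0, which means every Pod is updated.
rolling_update_order() {
  name=$1 replicas=$2 partition=${3:-0}
  i=$((replicas - 1))
  while [ "$i" -ge "$partition" ]; do
    echo "$name-$i"
    i=$((i - 1))
  done
}

rolling_update_order nginx-statefulset 10 7
# prints nginx-statefulset-9, nginx-statefulset-8, nginx-statefulset-7
```

With a partition of 7, Pods 0 through 6 would stay on the old image, which is one way to try a new version on a few Pods first.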

kubectl annotate can likewise be used to record a change cause:

[gateman@manjaro-x13 bq-api-service]$ kubectl annotate statefulset/nginx-statefulset kubernetes.io/change-cause="downgraded to 1.26.1"
statefulset.apps/nginx-statefulset annotated

Unfortunately, unlike with a Deployment, the change-cause annotation does not appear in the revision list:

[gateman@manjaro-x13 bq-api-service]$ kubectl rollout history statefulset/nginx-statefulset 
statefulset.apps/nginx-statefulset 
REVISION  CHANGE-CAUSE
1         <none>
2         <none>

Another difference from a Deployment:
a StatefulSet's revisions have no corresponding ReplicaSets

[gateman@manjaro-x13 statefulsets]$ kubectl get rs -o wide
NAME                                   DESIRED   CURRENT   READY   AGE   CONTAINERS                 IMAGES                                                                       SELECTOR
bq-api-service-deployment-6f6ffc7866   0         0         0       18h   bq-api-service-container   europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo/bq-api-service:1.1.7   app=bq-api-service,pod-template-hash=6f6ffc7866
bq-api-service-deployment-978b76fcf    3         3         3       8h    bq-api-service-container   europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo/bq-api-service:1.1.8   app=bq-api-service,pod-template-hash=978b76fcf
bq-api-service-deployment-c4979b697    0         0         0       18h   bq-api-service-container   europe-west2-docker.pkg.dev/jason-hsbc/my-docker-repo/bq-api-service:1.1.6   app=bq-api-service,pod-template-hash=c4979b697

Only the Deployment's ReplicaSets are listed.
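A StatefulSet keeps its revision history in ControllerRevision objects rather than ReplicaSets, so that is the resource to list instead. The wrapper below is a minimal sketch: `list_statefulset_revisions` is a hypothetical helper, and the `KUBECTL` variable is an assumption added so the command can be dry-run with `echo`. The underlying command is simply `kubectl get controllerrevisions`.

```shell
# List the ControllerRevision objects in a namespace.
# usage: list_statefulset_revisions [namespace]
# KUBECTL is a hypothetical override for this sketch; it defaults to kubectl.
list_statefulset_revisions() {
  "${KUBECTL:-kubectl}" get controllerrevisions -n "${1:-default}"
}

# Against a real cluster:
#   list_statefulset_revisions default
# Dry run that only prints the arguments:
#   KUBECTL=echo; list_statefulset_revisions default
```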
