1. Environment Preparation
- Ubuntu 20.04 x5
- Etcd 3.4.16
- Kubernetes 1.21.1
- Containerd 1.3.3
1.1 Handling IPVS
Since kube-proxy will run in IPVS mode in this cluster (see the KubeProxyConfiguration later on), we need to make sure the kernel loads the IPVS modules. The following commands configure the relevant modules to be loaded automatically at boot; a reboot is required once they are in place.
```bash
# Kernel modules
cat > /etc/modules-load.d/50-kubernetes.conf <<EOF
# Load some kernel modules needed by kubernetes at boot
nf_conntrack
br_netfilter
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
EOF

# sysctl
cat > /etc/sysctl.d/50-kubernetes.conf <<EOF
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
fs.inotify.max_user_watches=525000
EOF
```
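If you prefer to apply the settings right away rather than waiting for the reboot, the modules and sysctl values can also be loaded on the fly; this is a minimal sketch whose module list simply mirrors the file above:

```bash
# Load the IPVS-related modules immediately, without a reboot
for mod in nf_conntrack br_netfilter ip_vs ip_vs_lc ip_vs_wlc ip_vs_rr ip_vs_wrr \
           ip_vs_lblc ip_vs_lblcr ip_vs_dh ip_vs_sh ip_vs_fo ip_vs_nq ip_vs_sed; do
    modprobe "$mod"
done

# Re-read all sysctl configuration, including the new 50-kubernetes.conf
sysctl --system
```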
After the reboot, be sure to check that the modules are loaded and the kernel parameters are applied:
```bash
# check ipvs modules
➜  ~ lsmod | grep ip_vs
ip_vs_sed              16384  0
ip_vs_nq               16384  0
ip_vs_fo               16384  0
ip_vs_sh               16384  0
ip_vs_dh               16384  0
ip_vs_lblcr            16384  0
ip_vs_lblc             16384  0
ip_vs_wrr              16384  0
ip_vs_rr               16384  0
ip_vs_wlc              16384  0
ip_vs_lc               16384  0
ip_vs                 155648  22 ip_vs_wlc,ip_vs_rr,ip_vs_dh,ip_vs_lblcr,ip_vs_sh,ip_vs_fo,ip_vs_nq,ip_vs_lblc,ip_vs_wrr,ip_vs_lc,ip_vs_sed
nf_conntrack          139264  1 ip_vs
nf_defrag_ipv6         24576  2 nf_conntrack,ip_vs
libcrc32c              16384  5 nf_conntrack,btrfs,xfs,raid456,ip_vs

# check sysctl
➜  ~ sysctl -a | grep ip_forward
net.ipv4.ip_forward = 1
net.ipv4.ip_forward_update_priority = 1
net.ipv4.ip_forward_use_pmtu = 0

➜  ~ sysctl -a | grep bridge-nf-call
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
```
1.2 Installing Containerd
Containerd is included in Ubuntu 20.04's default official repositories, so a plain apt install is all that is needed:
```bash
# The extra packages may come in handy later, so install them while we are at it
apt install containerd bridge-utils nfs-common tree -y
```
After installation you can verify it by running `ctr images ls`. This section does not cover Containerd configuration; the Containerd config file will be set up as part of the Kubernetes installation.
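A quick sanity check that containerd is up and the CLI can reach it (the exact version output depends on the packaged containerd):

```bash
# Verify the containerd service and the ctr CLI
systemctl status containerd --no-pager
ctr version
ctr images ls
```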
2. Installing Kubernetes
2.1 Installing the Etcd Cluster
Etcd is the very core of Kubernetes, so I personally prefer to install it directly on the host. To make host installation easier, I have packaged a set of `*-pack` tool bundles for quick setup:
Install CFSSL and Etcd
```bash
# Download the installer packages
wget https://github.com/mritd/etcd-pack/releases/download/v3.4.16/etcd_v3.4.16.run
wget https://github.com/mritd/cfssl-pack/releases/download/v1.5.0/cfssl_v1.5.0.run

# Install cfssl and etcd
chmod +x *.run
./etcd_v3.4.16.run install
./cfssl_v1.5.0.run install
```
After installation, adjust the IPs in /etc/cfssl/etcd/etcd-csr.json as needed, then run the create.sh script in the same directory to generate the certificates.
```bash
➜  ~ cat /etc/cfssl/etcd/etcd-csr.json
{
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "O": "etcd",
      "OU": "etcd Security",
      "L": "Beijing",
      "ST": "Beijing",
      "C": "CN"
    }
  ],
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "localhost",
    "*.etcd.node",
    "*.kubernetes.node",
    "10.0.0.11",
    "10.0.0.12",
    "10.0.0.13"
  ]
}

# Copy the certificates to the 3 master nodes
➜  ~ for ip in `seq 1 3`; do scp /etc/cfssl/etcd/*.pem root@10.0.0.1$ip:/etc/etcd/ssl; done
```
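For reference, what create.sh does is a standard cfssl flow; the sketch below is only an approximation of it, and the CA CSR file, config file, and profile names are assumptions rather than the pack's actual contents:

```bash
# Rough shape of the cfssl certificate generation performed by create.sh
cd /etc/cfssl/etcd

# Generate the CA (etcd-ca-csr.json is an assumed file name)
cfssl gencert -initca etcd-ca-csr.json | cfssljson -bare etcd-ca

# Sign the etcd certificate from etcd-csr.json
# (ca-config.json and the profile name are assumptions)
cfssl gencert \
  -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem \
  -config=ca-config.json -profile=etcd \
  etcd-csr.json | cfssljson -bare etcd
```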
Once the certificates are generated, adjust the Etcd config file on each machine, then fix the permissions and start the service.
```bash
# Copy the config
for ip in `seq 1 3`; do scp /etc/etcd/etcd.cluster.yaml root@10.0.0.1$ip:/etc/etcd/etcd.yaml; done

# Fix permissions
for ip in `seq 1 3`; do ssh root@10.0.0.1$ip chown -R etcd:etcd /etc/etcd; done

# Start on every machine
systemctl start etcd
```
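The pack ships its own etcd.cluster.yaml; for orientation, the per-node config on 10.0.0.11 would look roughly like the sketch below. Only the IPs, member names, and certificate paths come from this post; the data directory and cluster token are assumptions:

```bash
# Sketch of a possible /etc/etcd/etcd.yaml for 10.0.0.11 (etcd1)
cat > /etc/etcd/etcd.yaml <<EOF
name: etcd1
data-dir: /var/lib/etcd
listen-peer-urls: https://10.0.0.11:2380
listen-client-urls: https://10.0.0.11:2379,https://127.0.0.1:2379
initial-advertise-peer-urls: https://10.0.0.11:2380
advertise-client-urls: https://10.0.0.11:2379
initial-cluster: etcd1=https://10.0.0.11:2380,etcd2=https://10.0.0.12:2380,etcd3=https://10.0.0.13:2380
initial-cluster-token: etcd-cluster
initial-cluster-state: new
client-transport-security:
  cert-file: /etc/etcd/ssl/etcd.pem
  key-file: /etc/etcd/ssl/etcd-key.pem
  trusted-ca-file: /etc/etcd/ssl/etcd-ca.pem
  client-cert-auth: true
peer-transport-security:
  cert-file: /etc/etcd/ssl/etcd.pem
  key-file: /etc/etcd/ssl/etcd-key.pem
  trusted-ca-file: /etc/etcd/ssl/etcd-ca.pem
  client-cert-auth: true
EOF
```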
After startup, verify the cluster status with `etcdctl`:
```bash
# To be really safe you should run `etcdctl endpoint health` as well
➜  ~ etcdctl member list
55fcbe0adaa45350, started, etcd3, https://10.0.0.13:2380, https://10.0.0.13:2379, false
cebdf10928a06f3c, started, etcd1, https://10.0.0.11:2380, https://10.0.0.11:2379, false
f7a9c20602b8532e, started, etcd2, https://10.0.0.12:2380, https://10.0.0.12:2379, false
```
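For a stricter check, query every member's health explicitly; the flags below use the endpoints and certificate paths from this post:

```bash
ETCDCTL_API=3 etcdctl \
  --endpoints=https://10.0.0.11:2379,https://10.0.0.12:2379,https://10.0.0.13:2379 \
  --cacert=/etc/etcd/ssl/etcd-ca.pem \
  --cert=/etc/etcd/ssl/etcd.pem \
  --key=/etc/etcd/ssl/etcd-key.pem \
  endpoint health
```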
2.2 Installing kubeadm
For kubeadm, users in mainland China are advised to use the Aliyun mirror:
```bash
# kubeadm
apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt update

# kubelet may need ebtables and ethtool; I forget the details, but the official docs list them
apt install kubelet kubeadm kubectl ebtables ethtool -y
```
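To keep a routine `apt upgrade` from bumping the cluster components unexpectedly, the packages can optionally be pinned (not part of the original steps, just common practice):

```bash
apt-mark hold kubelet kubeadm kubectl
```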
2.3 Installing kube-apiserver-proxy
kube-apiserver-proxy is an Nginx binary I compiled myself with only Layer 4 (stream) proxying enabled; it listens on 127.0.0.1:6443 and load-balances to all of the API Server addresses (which listen on 0.0.0.0:5443):
```bash
wget https://github.com/mritd/kube-apiserver-proxy-pack/releases/download/v1.20.0/kube-apiserver-proxy_v1.20.0.run
chmod +x *.run
./kube-apiserver-proxy_v1.20.0.run install
```
After installation, adjust the Nginx configuration for your own IP addresses and then start the service:
```bash
➜  ~ cat /etc/kubernetes/apiserver-proxy.conf
error_log syslog:server=unix:/dev/log notice;

worker_processes auto;

events {
    multi_accept on;
    use epoll;
    worker_connections 1024;
}

stream {
    upstream kube_apiserver {
        least_conn;
        server 10.0.0.11:5443;
        server 10.0.0.12:5443;
        server 10.0.0.13:5443;
    }

    server {
        listen 0.0.0.0:6443;
        proxy_pass kube_apiserver;
        proxy_timeout 10m;
        proxy_connect_timeout 1s;
    }
}

systemctl start kube-apiserver-proxy
```
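A quick way to confirm the proxy is in place: check the listener now, and hit the health endpoint once at least one apiserver is actually running behind it (before `kubeadm init` the curl will simply fail):

```bash
# Nginx should be listening on 6443
ss -lntp | grep 6443

# With an apiserver up behind the proxy this returns "ok"
curl -k https://127.0.0.1:6443/healthz
```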
2.4 Installing kubeadm-config
kubeadm-config bundles a set of configuration files together with the images required for the kubeadm installation; after installation it automatically configures Containerd, crictl, and so on:
```bash
wget https://github.com/mritd/kubeadm-config-pack/releases/download/v1.21.1/kubeadm-config_v1.21.1.run
chmod +x *.run

# The --load option loads the images required by kubeadm into containerd
./kubeadm-config_v1.21.1.run install --load
```
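After `--load` completes, the kubeadm images should show up in containerd's `k8s.io` namespace; the exact list depends on what the pack bundles:

```bash
ctr -n k8s.io images ls | grep -E 'kube-|pause|coredns'
```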
2.4.1 containerd configuration
The Containerd configuration lives at /etc/containerd/config.toml and looks like this:
```toml
version = 2

# Storage root directory
root = "/data/containerd"
state = "/run/containerd"

# OOM score
oom_score = -999

[grpc]
  address = "/run/containerd/containerd.sock"

[metrics]
  address = "127.0.0.1:1234"

[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    # sandbox image
    sandbox_image = "k8s.gcr.io/pause:3.4.1"
    [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "overlayfs"
      default_runtime_name = "runc"
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
          # Enable systemd cgroup
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true
```
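After changing the file, containerd has to be restarted for the settings to take effect; the CRI plugin's view of the config can then be inspected (the exact field names in the `crictl info` dump can vary slightly between containerd versions):

```bash
systemctl restart containerd

# Check that the CRI plugin picked up the systemd cgroup setting
crictl info | grep -i systemd
```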
2.4.2 crictl configuration
After switching to Containerd, the old `docker` command is no longer available. containerd ships with its own `ctr` command, and the CRI tooling provides a `crictl` command; the `crictl` configuration file is stored at /etc/crictl.yaml:
```yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
pull-image-on-create: true
```
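With this file in place crictl talks to containerd directly, which gives a docker-like experience for day-to-day debugging:

```bash
# List pods, containers, and images as seen through the CRI
crictl pods
crictl ps -a
crictl images

# Show the logs of a container by ID
crictl logs <container-id>
```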
2.4.3 kubeadm configuration
There are currently two kubeadm configs: an init config used for the initial bootstrap, and a join config used by the other nodes to join the cluster. The more important init config looks like this:
```yaml
# /etc/kubernetes/kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
# kubeadm token create
bootstrapTokens:
- token: "c2t0rj.cofbfnwwrb387890"
nodeRegistration:
  # CRI socket (Containerd)
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    runtime-cgroups: "/system.slice/containerd.service"
    rotate-server-certificates: "true"
localAPIEndpoint:
  advertiseAddress: "10.0.0.11"
  bindPort: 5443
# kubeadm certs certificate-key
certificateKey: 31f1e534733a1607e5ba67b2834edd3a7debba41babb1fac1bee47072a98d88b
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
clusterName: "kubernetes"
kubernetesVersion: "v1.21.1"
certificatesDir: "/etc/kubernetes/pki"
# Other components of the current control plane only connect to the apiserver on the current host.
# This is the expected behavior, see: https://github.com/kubernetes/kubeadm/issues/2271
controlPlaneEndpoint: "127.0.0.1:6443"
etcd:
  external:
    endpoints:
    - "https://10.0.0.11:2379"
    - "https://10.0.0.12:2379"
    - "https://10.0.0.13:2379"
    caFile: "/etc/etcd/ssl/etcd-ca.pem"
    certFile: "/etc/etcd/ssl/etcd.pem"
    keyFile: "/etc/etcd/ssl/etcd-key.pem"
networking:
  serviceSubnet: "10.66.0.0/16"
  podSubnet: "10.88.0.0/16"
  dnsDomain: "cluster.local"
apiServer:
  extraArgs:
    v: "4"
    alsologtostderr: "true"
    # audit-log-maxage: "21"
    # audit-log-maxbackup: "10"
    # audit-log-maxsize: "100"
    # audit-log-path: "/var/log/kube-audit/audit.log"
    # audit-policy-file: "/etc/kubernetes/audit-policy.yaml"
    authorization-mode: "Node,RBAC"
    event-ttl: "720h"
    runtime-config: "api/all=true"
    service-node-port-range: "30000-50000"
    service-cluster-ip-range: "10.66.0.0/16"
    # insecure-bind-address: "0.0.0.0"
    # insecure-port: "8080"
    # The fraction of requests that will be closed gracefully(GOAWAY) to prevent
    # HTTP/2 clients from getting stuck on a single apiserver.
    goaway-chance: "0.001"
  # extraVolumes:
  # - name: "audit-config"
  #   hostPath: "/etc/kubernetes/audit-policy.yaml"
  #   mountPath: "/etc/kubernetes/audit-policy.yaml"
  #   readOnly: true
  #   pathType: "File"
  # - name: "audit-log"
  #   hostPath: "/var/log/kube-audit"
  #   mountPath: "/var/log/kube-audit"
  #   pathType: "DirectoryOrCreate"
  certSANs:
  - "*.kubernetes.node"
  - "10.0.0.11"
  - "10.0.0.12"
  - "10.0.0.13"
  timeoutForControlPlane: 1m
controllerManager:
  extraArgs:
    v: "4"
    node-cidr-mask-size: "19"
    deployment-controller-sync-period: "10s"
    experimental-cluster-signing-duration: "8670h"
    node-monitor-grace-period: "20s"
    pod-eviction-timeout: "2m"
    terminated-pod-gc-threshold: "30"
scheduler:
  extraArgs:
    v: "4"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
oomScoreAdj: -900
cgroupDriver: "systemd"
kubeletCgroups: "/system.slice/kubelet.service"
nodeStatusUpdateFrequency: 5s
rotateCertificates: true
evictionSoft:
  "imagefs.available": "15%"
  "memory.available": "512Mi"
  "nodefs.available": "15%"
  "nodefs.inodesFree": "10%"
evictionSoftGracePeriod:
  "imagefs.available": "3m"
  "memory.available": "1m"
  "nodefs.available": "3m"
  "nodefs.inodesFree": "1m"
evictionHard:
  "imagefs.available": "10%"
  "memory.available": "256Mi"
  "nodefs.available": "10%"
  "nodefs.inodesFree": "5%"
evictionMaxPodGracePeriod: 30
imageGCLowThresholdPercent: 70
imageGCHighThresholdPercent: 80
kubeReserved:
  "cpu": "500m"
  "memory": "512Mi"
  "ephemeral-storage": "1Gi"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# kube-proxy specific options here
clusterCIDR: "10.88.0.0/16"
mode: "ipvs"
oomScoreAdj: -900
ipvs:
  minSyncPeriod: 5s
  syncPeriod: 5s
  scheduler: "wrr"
```
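The `token` and `certificateKey` values in this file have to be generated beforehand; kubeadm can produce both, as the comments in the config above already hint:

```bash
# Generate a bootstrap token for the bootstrapTokens token field
kubeadm token generate

# Generate a key for the certificateKey field
kubeadm certs certificate-key
```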
For the exact meaning of each init option, please refer to the official documentation. Compared with the init config, the join config is fairly simple; note that the `controlPlane` section is required when joining as a master, and should be commented out otherwise.
```yaml
# /etc/kubernetes/kubeadm-join.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
controlPlane:
  localAPIEndpoint:
    advertiseAddress: "10.0.0.12"
    bindPort: 5443
  certificateKey: 31f1e534733a1607e5ba67b2834edd3a7debba41babb1fac1bee47072a98d88b
discovery:
  bootstrapToken:
    apiServerEndpoint: "127.0.0.1:6443"
    token: "c2t0rj.cofbfnwwrb387890"
    # Please replace with the "--discovery-token-ca-cert-hash" value printed
    # after the kubeadm init command is executed successfully
    caCertHashes:
    - "sha256:97590810ae34a82501717e33acfca76f16044f1a365c5ad9a1c66433c386c75c"
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    runtime-cgroups: "/system.slice/containerd.service"
    rotate-server-certificates: "true"
```
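If the hash printed by `kubeadm init` was not saved, the `caCertHashes` value can be recomputed from the cluster CA on the first master:

```bash
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex \
  | sed 's/^.* //'
```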
2.5 Bringing up the first master
With the configuration adjusted, bringing up the master node takes a single command:
```bash
kubeadm init --config /etc/kubernetes/kubeadm.yaml --upload-certs --ignore-preflight-errors=Swap
```
Once the node is up, remember to save the printed tokens for later use.
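If the output was not saved, the join information can be regenerated on the first master at any time:

```bash
# Print a fresh worker join command (including a new token)
kubeadm token create --print-join-command

# Re-upload the control-plane certificates and print a new certificate key
kubeadm init phase upload-certs --upload-certs
```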
2.6 Bringing up the other masters
After the first master is up, join the other masters with the `join` command; note that `caCertHashes` in kubeadm-join.yaml must be replaced with the `discovery-token-ca-cert-hash` value printed when the first master was brought up.
```bash
kubeadm join 127.0.0.1:6443 --config /etc/kubernetes/kubeadm-join.yaml --ignore-preflight-errors=Swap
```
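Once the join finishes, the new control-plane members should be visible from any master:

```bash
kubectl get nodes -o wide
kubectl get pods -n kube-system -o wide
```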
2.7 Bringing up the worker nodes
Bringing up worker nodes works the same way as joining additional masters; the only difference is that the `controlPlane` section of the config must be commented out.
```yaml
# /etc/kubernetes/kubeadm-join.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
#controlPlane:
#  localAPIEndpoint:
#    advertiseAddress: "10.0.0.12"
#    bindPort: 5443
#  certificateKey: 31f1e534733a1607e5ba67b2834edd3a7debba41babb1fac1bee47072a98d88b
discovery:
  bootstrapToken:
    apiServerEndpoint: "127.0.0.1:6443"
    token: "c2t0rj.cofbfnwwrb387890"
    # Please replace with the "--discovery-token-ca-cert-hash" value printed
    # after the kubeadm init command is executed successfully
    caCertHashes:
    - "sha256:97590810ae34a82501717e33acfca76f16044f1a365c5ad9a1c66433c386c75c"
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    runtime-cgroups: "/system.slice/containerd.service"
    rotate-server-certificates: "true"
```
```bash
kubeadm join 127.0.0.1:6443 --config /etc/kubernetes/kubeadm-join.yaml --ignore-preflight-errors=Swap
```
2.8 Other housekeeping
Because the kubelet has certificate rotation enabled, a brand-new cluster will generate a large number of CSRs; just approve them in bulk:
```bash
kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
```
To allow the master nodes to schedule pods as well, adjust the taints:
```bash
kubectl taint nodes --all node-role.kubernetes.io/master-
```
Follow-up steps such as installing a CNI are outside the scope of this post.
3. Common Containerd Operations
```bash
# List images
ctr images ls

# List images in the k8s namespace
ctr -n k8s.io images ls

# Import an image
ctr -n k8s.io images import xxxx.tar

# Export an image
ctr -n k8s.io images export kube-scheduler.tar k8s.gcr.io/kube-scheduler:v1.21.1
```
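One more operation that comes up often: pulling an image directly into the `k8s.io` namespace used by the CRI plugin, so the kubelet can use it without pulling it again (the nginx image here is only an example):

```bash
ctr -n k8s.io images pull docker.io/library/nginx:alpine
```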
4. Resource Repositories
The repositories for all the `*-pack` bundles used in this post are listed below: