Since v1.20, Kubernetes has deprecated Docker as a container runtime. See "Don't Panic: Kubernetes and Docker".
Given the impact of this change, Kubernetes adopted an extended deprecation timeline: dockershim would not be removed before v1.22, meaning the earliest release it could disappear in was v1.23, due at the end of 2021. Update: dockershim is now scheduled for removal in Kubernetes v1.24.
Installing a Kubernetes cluster with containerd on CentOS
Prerequisites
- Disable swap
# Disable temporarily
sudo swapoff -a
# Disable permanently
sudo sed -ri 's/.*swap.*/#&/' /etc/fstab
- Disable the firewall (commands differ between distributions)
# Stop now
systemctl stop firewalld.service
# Disable permanently
systemctl disable --now firewalld.service
- Disable SELinux
# Set SELinux in permissive mode (effectively disabling it)
sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
- Set the hostnames (optional); adjust to your environment
master
# Set the master's hostname and edit the hosts file
sudo hostnamectl set-hostname k8s-master
vi /etc/hosts
192.168.8.51 k8s-master
192.168.8.61 k8s-work1
192.168.8.62 k8s-work2
worker
# Set the hostname of worker01/02 and edit the hosts file
sudo hostnamectl set-hostname k8s-work1
sudo hostnamectl set-hostname k8s-work2
vi /etc/hosts
192.168.8.51 k8s-master
192.168.8.61 k8s-work1
192.168.8.62 k8s-work2
Install containerd
Reference
Prerequisites
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Setup required sysctl params, these persist across reboots.
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
# Apply sysctl params without reboot
sudo sysctl --system
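To confirm the modules are loaded and the sysctl values took effect, a quick check matching the parameters set above:
# Verify the kernel modules are loaded
lsmod | grep br_netfilter
lsmod | grep overlay
# Verify the sysctl values
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward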
Install containerd
sudo yum install -y yum-utils
sudo yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install containerd.io -y
Adjust the configuration
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
Use the systemd cgroup driver: in /etc/containerd/config.toml, set SystemdCgroup = true
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
...
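If you prefer not to edit the file by hand, a sed one-liner can make the same change. This is a sketch that assumes the default generated config, where the option is present as SystemdCgroup = false:
# Flip SystemdCgroup from false to true in the runc options
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml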
Change the sandbox image to a domestic mirror; the exact version may differ, so check the Aliyun registry for the latest tag
...
[plugins]
...
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.5"
...
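The sandbox_image change can be scripted the same way; a sketch assuming the pause tag shown above (adjust the tag to your Kubernetes version):
# Point the pause image at the Aliyun mirror
sudo sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.5"#' /etc/containerd/config.toml
# Confirm both edits
grep -E 'SystemdCgroup|sandbox_image' /etc/containerd/config.toml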
Restart containerd
sudo systemctl restart containerd
Enable it as a system service
sudo systemctl enable --now containerd
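A minimal check that containerd is up:
containerd --version
sudo systemctl is-active containerd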
Install kubeadm
Reference: CentOS7中用kubeadm安装Kubernetes
Install kubelet, kubeadm, kubectl, and related tools, switching to the Aliyun repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Set SELinux in permissive mode (effectively disabling it)
sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
sudo systemctl enable --now kubelet
## Pin a specific version
# export KUBE_VERSION=1.22.0
# sudo yum install -y kubelet-${KUBE_VERSION} kubeadm-${KUBE_VERSION} kubectl-${KUBE_VERSION} --disableexcludes=kubernetes
crictl
Installing kubeadm pulls in crictl as well; use crictl for debugging.
File /etc/crictl.yaml
cat <<EOF > /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
Verify with crictl pods.
Everything above must be executed on both the master and the worker nodes.
On the master
Generate a default config file with kubeadm
kubeadm config print init-defaults --kubeconfig ClusterConfiguration > kubeadm.yml
Edit the kubeadm.yml file; after editing it looks like the following (it may differ depending on your environment):
- Change criSocket: by default Docker is used as the runtime; point it at containerd.sock to use containerd instead
- Change imageRepository to the Aliyun image registry
- Change podSubnet and serviceSubnet to match your own environment
- Set cgroupDriver to systemd; it now defaults to systemd, so this can be left unchanged
- Change nodeRegistration.name to the master node's name
An example file:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 0.0.0.0
bindPort: 6443
nodeRegistration:
criSocket: /run/containerd/containerd.sock
imagePullPolicy: IfNotPresent
# master hostname
name: k8s-master
taints: null
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.22.0
networking:
dnsDomain: cluster.local
# kube-flannel subnet (default)
podSubnet: 10.244.0.0/16
serviceSubnet: 10.96.0.0/12
scheduler: {}
- List and pull the required images (can be skipped; using local images is recommended, because kubeadm init may fail to pull the coredns image — see the next step)
kubeadm config images list --image-repository registry.aliyuncs.com/google_containers
kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers
- After pulling, if you use local images, adjust the image pull policy (imagePullPolicy) in kubeadm.yml accordingly
- With the configuration ready, run the following command to initialize the master control plane
kubeadm init --config=kubeadm.yml --upload-certs --ignore-preflight-errors=ImagePull
- --ignore-preflight-errors: with the default imagePullPolicy the coredns image fails to pull, so I chose to ignore that error; it is still better to run the image-pull step above first and use local images — the init time is about the same
- At this point the basic setup is done; barring anything unusual, the master control plane should start normally
- crictl should already show the related pods running (see the quick check after the profile step below)
- Add KUBECONFIG to the profile
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /etc/profile
source /etc/profile
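With KUBECONFIG in place, a quick sanity check of the control plane (a minimal sketch; pod names will differ in your environment):
# Nodes should appear (the master may stay NotReady until the network addon is applied)
kubectl get nodes
# Control-plane pods in kube-system
kubectl get pods -n kube-system
# The same pods as seen by the container runtime
crictl pods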
- Configure the network
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
- The pod network is 10.244.0.0/16: add --pod-network-cidr 10.244.0.0/16 to kubeadm init, or set networking.podSubnet: 10.244.0.0/16 in kubeadm.yml
- Do this before running kubeadm join
- When joining the master via kubeadm join, add the parameter --cri-socket /run/containerd/containerd.sock (it can also be omitted); this selects containerd as the container runtime
- The command looks like the one below; the master address, token, and CA cert hash are printed once kubeadm init finishes on the master — append --cri-socket /run/containerd/containerd.sock to it
kubeadm join 172.17.18.237:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:f63893c3540a5b5032ba25c86293d54f118728476a6544478a93e8d051984c55 --cri-socket /run/containerd/containerd.sock
- To regenerate the join token: kubeadm token create --print-join-command; to list tokens: kubeadm token list
On the workers
# Adjust to your actual environment
kubeadm join 172.17.18.237:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:f63893c3540a5b5032ba25c86293d54f118728476a6544478a93e8d051984c55
# Copy /etc/kubernetes/admin.conf from the master node to the workers
Troubleshooting
kubectl get nodes
kubectl get pods -n kube-system
kubectl get pods -A
kubectl get pods --all-namespaces
crictl pods
crictl ps -a
# Inspect kubelet error logs
journalctl -fu kubelet
journalctl -xeu kubelet
# Describe a pod for details
kubectl describe pod kube-flannel-ds-jpp96
# View a pod's error logs (-n namespace, -c container)
kubectl -n kube-system logs kube-flannel-ds-jpp96 -c install-cni
# List pods across all namespaces with details
kubectl get pod --all-namespaces -o wide
# Redeploy coredns
kubectl -n kube-system rollout restart deployment coredns
Resetting the kubeadm configuration
# Remove kube-flannel
kubectl delete -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# Reset kubeadm
kubeadm reset
# Inspect the network interfaces
ip link
ip address
ip l
ip a
# Delete the interfaces that were added
ip link del cni0
ip link del flannel.1
# A more thorough teardown
# Reset kubeadm, on all nodes
kubeadm reset -f
# Flush iptables rules, on all nodes
iptables -F
iptables -X
# Clear ipvs rules, on all nodes
ipvsadm -C
# Remove the CNI configuration, on all nodes
rm -rf /etc/cni/net.d
# Remove the kubeconfig, on all nodes
rm -rf $HOME/.kube/config
Removing kubeadm
#!/bin/bash
kubeadm reset -f
modprobe -r ipip
lsmod
rm -rf ~/.kube/
rm -rf /etc/kubernetes/
rm -rf /etc/systemd/system/kubelet.service.d
rm -rf /etc/systemd/system/kubelet.service
rm -rf /usr/bin/kube*
rm -rf /etc/cni
rm -rf /opt/cni
rm -rf /var/lib/etcd
rm -rf /var/etcd
yum -y remove kubeadm kubectl kubelet
Install the Dashboard
# Install the dashboard
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/aio/deploy/recommended.yaml
# Access URL
http://127.0.0.1:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
# get token
# This token has permission problems
# kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | awk '/^deployment-controller-token-/{print $1}') | awk '$1=="token:"{print $2}'
# Expose the access URL
kubectl proxy &
Permissions
# Remove any previous binding
kubectl delete clusterrolebinding serviceaccount-cluster-admin
# Create the cluster role binding
kubectl create clusterrolebinding serviceaccount-cluster-admin \
--clusterrole=cluster-admin \
--user=system:serviceaccount:kubernetes-dashboard:kubernetes-dashboard
# Get the token used to log in
kubectl describe secret kubernetes-dashboard -n kubernetes-dashboard | awk '$1=="token:"{print $2}'
You can now log in to the dashboard with the token above.
If you want to log in to the dashboard with a kubeconfig instead, use the steps below (they assume the serviceaccount-cluster-admin binding and its token already exist).
# API server address
export KUBE_APISERVER="https://192.168.205.200:16443"
# Output file name
export DASHBOARD_FILE_PATH="/root/dashboard-admin.conf"
# Set cluster parameters
kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/pki/ca.crt \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${DASHBOARD_FILE_PATH}
# Set client credentials
kubectl config set-credentials kubernetes-dashboard \
--token=$(kubectl describe secret kubernetes-dashboard -n kubernetes-dashboard | awk '$1=="token:"{print $2}') \
--kubeconfig=${DASHBOARD_FILE_PATH}
# Set the context
kubectl config set-context kubernetes-dashboard@kubernetes \
--cluster=kubernetes \
--user=kubernetes-dashboard \
--namespace=kubernetes-dashboard \
--kubeconfig=${DASHBOARD_FILE_PATH}
# Use this context by default
kubectl config use-context kubernetes-dashboard@kubernetes \
--kubeconfig=${DASHBOARD_FILE_PATH}
# Finally, copy /root/dashboard-admin.conf to the client and use it to log in
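Copying it off the master can be as simple as the following sketch; k8s-master is the hostname assumed earlier, so adjust to your environment:
# Run on the client machine
scp root@k8s-master:/root/dashboard-admin.conf ~/dashboard-admin.conf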
kubectl
Set the namespace preference
kubectl config set-context --current --namespace=<namespace>
# Verify the configuration
kubectl config view | grep namespace:
# List all resource names and their short forms
kubectl api-resources
Deleting pods
kubectl get pods shows that many pods are in a bad state.
- To delete all pods in Evicted or OutOfCpu state, run the following:
# Print all Evicted pods
kubectl get pods | grep Evicted | awk '{print $1}'
# Delete them in bulk
kubectl get pods | grep Evicted | awk '{print $1}' | xargs kubectl delete pod
# Print all OutOfcpu pods
kubectl get pods | grep OutOfcpu | awk '{print $1}'
# Delete them in bulk
kubectl get pods | grep OutOfcpu | awk '{print $1}' | xargs kubectl delete pod
- For pods stuck in Terminating, use kubectl's force delete:
# Delete a pod
kubectl delete pod PODNAME --force --grace-period=0
# Delete a namespace
kubectl delete namespace NAMESPACENAME --force --grace-period=0
- If the above still fails, delete the objects directly from etcd:
# Delete the pod named pod-to-be-deleted-0 in the default namespace
ETCDCTL_API=3 etcdctl del /registry/pods/default/pod-to-be-deleted-0
# Delete the namespace in question
etcdctl del /registry/namespaces/NAMESPACENAME
Installing haproxy and keepalived on the masters
The official docs offer two high-availability topologies: one with an external etcd (a separate etcd cluster), and a stacked deployment where etcd runs alongside the apiserver. We use the stacked approach: it is simpler to deploy, needs no separate etcd cluster, and saves servers.
Server plan
Prepare a few servers. The plan is three masters plus keepalived and haproxy; to save machines, keepalived and haproxy run on the masters themselves, and the k8s-vip address is claimed by keepalived. The plan is as follows:
Role | IP address |
---|---|
k8s-vip | 192.168.8.100 |
k8s-master01 | 192.168.8.7 |
k8s-master02 | 192.168.8.8 |
k8s-master03 | 192.168.8.9 |
Environment preparation
Set the hostname
hostnamectl set-hostname <hostname>
Edit the /etc/hosts file
vi /etc/hosts
## Add the following entries
192.168.8.7 k8s-master01
192.168.8.8 k8s-master02
192.168.8.9 k8s-master03
192.168.8.100 k8s-vip
Disable the firewall
## Configure kernel parameters so that bridged IPv4 traffic is passed to the iptables chains
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
## Reload the configuration manually
sysctl --system
## Stop and disable the firewall
systemctl stop firewalld
systemctl disable firewalld
## Set SELinux to permissive mode (effectively disabling it)
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
## Disable swap
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
## Enable IP forwarding
echo '1' > /proc/sys/net/ipv4/ip_forward
Install keepalived and haproxy
Install the keepalived and haproxy packages
yum install keepalived haproxy -y
Configure keepalived
cat <<EOF > /etc/keepalived/keepalived.conf
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
}
vrrp_script check_apiserver {
script "/etc/keepalived/check_apiserver.sh"
interval 3
weight -2
fall 10
rise 2
}
vrrp_instance VI_1 {
state ${STATE}
interface ${INTERFACE}
virtual_router_id ${ROUTER_ID}
priority ${PRIORITY}
authentication {
auth_type PASS
auth_pass ${AUTH_PASS}
}
virtual_ipaddress {
${APISERVER_VIP}
}
track_script {
check_apiserver
}
}
EOF
In the file above, replace the placeholders with your own values:
- ${STATE} is MASTER on the primary node and BACKUP on the others; here k8s-master01 is MASTER, while k8s-master02 and k8s-master03 are BACKUP.
- ${INTERFACE} is the network interface, i.e. the server's NIC; all of my servers use eth0.
- ${ROUTER_ID} only has to be the same across the keepalived cluster; I use the default of 51.
- ${PRIORITY} is the priority; it just needs to be higher on the master than on the backups. My master uses 100 and the backups 50.
- ${AUTH_PASS} only has to be the same across the keepalived cluster.
- ${APISERVER_VIP} is the VIP address; mine is 192.168.8.100.
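For reference, a filled-in sketch for k8s-master01 using the values above, written as a heredoc like the template (the auth_pass below is a placeholder — use your own; on the backups change state to BACKUP and priority to 50):
# Example for k8s-master01 (state MASTER, priority 100)
cat <<EOF > /etc/keepalived/keepalived.conf
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    authentication {
        auth_type PASS
        auth_pass changeme
    }
    virtual_ipaddress {
        192.168.8.100
    }
    track_script {
        check_apiserver
    }
}
EOF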
Configure the keepalived health check
cat <<EOF > /etc/keepalived/check_apiserver.sh
#!/bin/sh
errorExit() {
echo "*** $*" 1>&2
exit 1
}
curl --silent --max-time 2 --insecure https://localhost:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://localhost:${APISERVER_DEST_PORT}/"
if ip addr | grep -q ${APISERVER_VIP}; then
curl --silent --max-time 2 --insecure https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/"
fi
EOF
Where:
- ${APISERVER_VIP} is the VIP address, 192.168.8.100.
- ${APISERVER_DEST_PORT} is the port used to reach the apiserver, i.e. the port HAProxy binds to. Since HAProxy runs on the same nodes as Kubernetes, it needs a distinct port; I use 16443, which comes up again below.
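Filled in with those values (VIP 192.168.8.100, port 16443), the script can be written the same way; a sketch:
cat <<EOF > /etc/keepalived/check_apiserver.sh
#!/bin/sh
errorExit() {
    echo "*** \$*" 1>&2
    exit 1
}
curl --silent --max-time 2 --insecure https://localhost:16443/ -o /dev/null || errorExit "Error GET https://localhost:16443/"
if ip addr | grep -q 192.168.8.100; then
    curl --silent --max-time 2 --insecure https://192.168.8.100:16443/ -o /dev/null || errorExit "Error GET https://192.168.8.100:16443/"
fi
EOF
chmod +x /etc/keepalived/check_apiserver.sh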
Configure haproxy
Edit /etc/haproxy/haproxy.cfg
# /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
log /dev/log local0
log /dev/log local1 notice
daemon
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 1
timeout http-request 10s
timeout queue 20s
timeout connect 5s
timeout client 20s
timeout server 20s
timeout http-keep-alive 10s
timeout check 10s
#---------------------------------------------------------------------
# apiserver frontend which proxys to the masters
#---------------------------------------------------------------------
frontend apiserver
bind *:${APISERVER_DEST_PORT}
mode tcp
option tcplog
default_backend apiserver
#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserver
option httpchk GET /healthz
http-check expect status 200
mode tcp
option ssl-hello-chk
balance roundrobin
server ${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} check
Adjust the configuration above to your own setup:
- ${APISERVER_DEST_PORT} is the same value as in the health-check script above; I use 16443.
- ${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} describes your Kubernetes master nodes.
For example, my configuration is:
### server ${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} check
server k8s-master01 192.168.8.7:6443 check
server k8s-master02 192.168.8.8:6443 check
server k8s-master03 192.168.8.9:6443 check
After finishing the configuration above, start keepalived and haproxy and enable them at boot.
systemctl enable haproxy --now
systemctl enable keepalived --now
Then install Kubernetes as before. In the ClusterConfiguration section of kubeadm.yml, add the VIP address: controlPlaneEndpoint: "192.168.8.100:16443"
...
kind: ClusterConfiguration
controlPlaneEndpoint: "192.168.8.100:16443"
...
When the other master nodes join, pass --control-plane; workers do not need it.
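Under these assumptions, the join commands look roughly like this (token, cert hash, and certificate key are placeholders printed by kubeadm init --upload-certs):
# Additional control-plane (master) node
kubeadm join 192.168.8.100:16443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <certificate-key> \
    --cri-socket /run/containerd/containerd.sock
# Worker node
kubeadm join 192.168.8.100:16443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --cri-socket /run/containerd/containerd.sock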
Install ingress
This article uses the bitnami chart
# install helm
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
# add repository
helm repo add bitnami https://charts.bitnami.com/bitnami
# install
# helm install my-release bitnami/nginx-ingress-controller
# Default parameters can be overridden via values
# curl -O https://raw.githubusercontent.com/bitnami/charts/master/bitnami/nginx-ingress-controller/values.yaml
# Edit the parameters in values.yaml, e.g. switch the kind to DaemonSet
# kind: Deployment
# kind: DaemonSet
# and set daemonset.useHostPort to true
cat <<EOF > nginx-ingress-values.yaml
kind: DaemonSet
daemonset:
useHostPort: true
EOF
# Install
helm upgrade --install nginx-ingress-controller -f nginx-ingress-values.yaml bitnami/nginx-ingress-controller --namespace nginx-ingress --create-namespace
# Uninstall
helm uninstall nginx-ingress-controller
Test that it works
# Create a deployment
kubectl create deployment demo --image=httpd --port=80
kubectl expose deployment demo
# Create an ingress
kubectl create ingress demo --class=nginx --rule="www.demo.io/*=demo:80"
# Clean up
kubectl delete deployment demo
kubectl delete service demo
kubectl delete ingress demo
Ingress node high availability
Install keepalived on the Kubernetes nodes to make ingress-nginx highly available, for example with the VIP 192.168.205.200 on interface ens33. See Kubernetes Ingress-Nginx实现高可用.
# Install
yum -y install keepalived
# Edit the configuration
vim /etc/keepalived/keepalived.conf
# ---------------------------------
# Configuration for the master
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 51
priority 110
advert_int 1
authentication {
auth_type PASS
auth_pass g81aoqu74w0CevMR
}
virtual_ipaddress {
192.168.205.200
}
}
# ---------------------------------
# Configuration for the backup
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 50
advert_int 1
authentication {
auth_type PASS
auth_pass g81aoqu74w0CevMR
}
virtual_ipaddress {
192.168.205.200
}
}
# ---------------------------------
# Start the service
systemctl enable keepalived --now
Accessing the dashboard through ingress
See "ingress configuration for dashboard"
cat <<EOF > dashboard-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: kubernetes-dashboard-ingress
namespace: kubernetes-dashboard
annotations:
kubernetes.io/ingress.class: "nginx"
nginx.ingress.kubernetes.io/ssl-passthrough: "true"
nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
tls:
- hosts:
- "dashboard.my.example.com"
secretName: kubernetes-dashboard-secret
rules:
- host: "dashboard.my.example.com"
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: kubernetes-dashboard
port:
number: 443
EOF
# Apply
kubectl apply -f dashboard-ingress.yaml
# Delete the ingress
# kubectl delete -f dashboard-ingress.yaml
Add xxx.xxx.xxx.xxx dashboard.my.example.com to your local hosts file,
then open https://dashboard.my.example.com.
Solving the certificate warning with a self-signed certificate
# Create a self-signed certificate
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /tmp/nginx.key -out /tmp/nginx.crt -subj "/CN=dashboard.my.example.com/O=dashboard.my.example.com" -addext "subjectAltName = DNS:dashboard.my.example.com"
# Create the secret referenced by the ingress
kubectl create secret tls kubernetes-dashboard-secret -nkubernetes-dashboard --key /tmp/nginx.key --cert /tmp/nginx.crt
Install the certificate locally and the warning goes away.
To avoid installing a certificate for every domain, generate a root CA first and install only the root certificate; then sign each application's certificate with it.
export MY_SSL_FILE_ROOT=/tmp/k8s-tls
[ -d ${MY_SSL_FILE_ROOT} ] || mkdir ${MY_SSL_FILE_ROOT}
# Generate the CA root certificate
openssl req -x509 -nodes -days 365 -newkey rsa:4096 -keyout ${MY_SSL_FILE_ROOT}/my_ca.pem -out ${MY_SSL_FILE_ROOT}/my_ca.crt -subj "/CN=myca/O=myca"
Do this once per application
# Generate a key for a given service
export MY_SECRET=kubernetes-dashboard-secret
export MY_NAMESPACE=kubernetes-dashboard
export MY_HOST=dashboard.my.example.com
export MY_SSL_FILE_ROOT=/tmp/k8s-tls
[ -d ${MY_SSL_FILE_ROOT} ] || mkdir ${MY_SSL_FILE_ROOT}
echo subjectAltName = DNS:${MY_HOST} >> ${MY_SSL_FILE_ROOT}/extfile.cnf
echo extendedKeyUsage = serverAuth >> ${MY_SSL_FILE_ROOT}/extfile.cnf
openssl genrsa -out ${MY_SSL_FILE_ROOT}/${MY_SECRET}.pem 4096
openssl req -subj "/CN=${MY_HOST}" -sha256 -new -key ${MY_SSL_FILE_ROOT}/${MY_SECRET}.pem -out ${MY_SSL_FILE_ROOT}/${MY_SECRET}.csr
openssl x509 -req -days 365 -sha256 -in ${MY_SSL_FILE_ROOT}/${MY_SECRET}.csr -CA ${MY_SSL_FILE_ROOT}/my_ca.crt -CAkey ${MY_SSL_FILE_ROOT}/my_ca.pem -CAcreateserial -out ${MY_SSL_FILE_ROOT}/${MY_SECRET}-cert.pem -extfile ${MY_SSL_FILE_ROOT}/extfile.cnf
cat ${MY_SSL_FILE_ROOT}/${MY_SECRET}-cert.pem ${MY_SSL_FILE_ROOT}/my_ca.crt > ${MY_SSL_FILE_ROOT}/${MY_SECRET}-chain.pem
# Delete the old secret
kubectl delete secret ${MY_SECRET} -n ${MY_NAMESPACE}
# Create the secret
kubectl create secret tls ${MY_SECRET} -n ${MY_NAMESPACE} --key ${MY_SSL_FILE_ROOT}/${MY_SECRET}.pem --cert ${MY_SSL_FILE_ROOT}/${MY_SECRET}-chain.pem
# Inspect the certificate
openssl x509 -noout -text -in ${MY_SSL_FILE_ROOT}/${MY_SECRET}-chain.pem
Adding TLS to the ingress
...
spec:
tls:
- hosts:
- "dashboard.my.example.com"
secretName: kubernetes-dashboard-secret
...
Alternatively, use cert-manager as follows.
Install cert-manager
# Add the helm repository
helm repo add jetstack https://charts.jetstack.io
# Update
helm repo update
# Install
helm install \
cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.6.1 \
--set installCRDs=true
# Uninstall
helm --namespace cert-manager delete cert-manager
Install a ClusterIssuer. An Issuer can only issue certificates in its own namespace, whereas a ClusterIssuer can issue certificates in any namespace, much like Role vs ClusterRole in Kubernetes.
# Generate the CA certificate as described above
# Create the secret
kubectl create secret tls ca-key-pair \
--cert=/tmp/k8s-tls/my_ca.crt \
--key=/tmp/k8s-tls/my_ca.pem \
--namespace=cert-manager
# Create the issuing authority
cat <<EOF > cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: ca-issuer
namespace: cert-manager
spec:
ca:
secretName: ca-key-pair
EOF
# Apply it
kubectl apply -f cluster-issuer.yaml
nginx-ingress with automatic certificate issuance
cat <<EOF > ingress-ca-demo.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-nginx
spec:
replicas: 1
selector:
matchLabels:
run: my-nginx
template:
metadata:
labels:
run: my-nginx
spec:
containers:
- name: my-nginx
image: nginx
imagePullPolicy: IfNotPresent
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: my-nginx
labels:
app: my-nginx
spec:
ports:
- port: 80
protocol: TCP
name: http
selector:
run: my-nginx
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: my-nginx
annotations:
kubernetes.io/ingress.class: "nginx"
kubernetes.io/tls-acme: "true"
cert-manager.io/cluster-issuer: "ca-issuer"
spec:
rules:
- host: nginx.local
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: my-nginx
port:
number: 80
tls:
- secretName: nginx-secret
hosts:
- nginx.local
EOF
# Apply
kubectl apply -f ingress-ca-demo.yaml
# Access
https://nginx.local
# Delete the example
kubectl delete -f ingress-ca-demo.yaml
rancher
Install rancher with helm
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
kubectl create namespace cattle-system
# install cert-manager
# helm install
helm install rancher rancher-stable/rancher \
--namespace cattle-system \
--set hostname=rancher.my.org \
--set bootstrapPassword=admin
# waiting
kubectl -n cattle-system rollout status deploy/rancher
# verify
kubectl -n cattle-system get deploy rancher
Install rancher with docker
# docker install
curl https://releases.rancher.com/install-docker/20.10.sh | sh
See "Installing Rancher on a Single Node Using Docker"
# Run the rancher server container
docker run -d --restart=unless-stopped \
-p 80:80 -p 443:443 \
--privileged \
rancher/rancher:latest
Creating a rancher cluster with RKE
In Cluster Management, click Create and choose "Use existing nodes and create a cluster using RKE" -> Custom. Set the Cluster Name, click Next, and create the cluster.
Select etcd, Control Plane, and Worker, then run the following on the other docker hosts:
sudo docker run -d --privileged --restart=unless-stopped \
--net=host -v /etc/kubernetes:/etc/kubernetes \
-v /var/run:/var/run rancher/rancher-agent:v2.6.2 \
--server https://192.168.205.161 --token dv48n7zw6ktmszqfzwv8tg2x6jmbldxdgvhswjcfjls2hfvnlwvggj \
--ca-checksum 65e8326e19d8f7072c503775b4d5c871237823da179701ee33120ef14049f3be \
--etcd --controlplane --worker
Uninstalling a rancher cluster
To remove a cluster installed through docker, see "Removing Kubernetes Components from Nodes"
# clean cluster
# docker
docker rm -f $(docker ps -qa)
# docker rmi -f $(docker images -q)
docker volume rm $(docker volume ls -q)
# umount
for mount in $(mount | grep tmpfs | grep '/var/lib/kubelet' | awk '{ print $3 }') /var/lib/kubelet /var/lib/rancher; do umount $mount; done
# Directories and files
rm -rf /etc/ceph \
/etc/cni \
/etc/kubernetes \
/opt/cni \
/opt/rke \
/run/secrets/kubernetes.io \
/run/calico \
/run/flannel \
/var/lib/calico \
/var/lib/etcd \
/var/lib/cni \
/var/lib/kubelet \
/var/lib/rancher/rke/log \
/var/log/containers \
/var/log/kube-audit \
/var/log/pods \
/var/run/calico
# network
ip link delete flannel.1
rancher with a custom domain
rancher server; see 从0开始安装rancher通过自签名证书
docker run -d --restart=unless-stopped \
-p 80:80 -p 443:443 \
-v /tmp/k8s-tls/rancher-my-local-cert.pem:/etc/rancher/ssl/cert.pem \
-v /tmp/k8s-tls/rancher-my-local.pem:/etc/rancher/ssl/key.pem \
-v /tmp/k8s-tls/my_ca.crt:/etc/rancher/ssl/cacerts.pem \
-v /tmp/k8s-tls:/container/certs \
-e SSL_CERT_DIR="/container/certs" \
--privileged \
rancher/rancher:v2.6.2
rancher agent; for the DNS error see "Agent 无法连接 Rancher server,业务集群无法连接 Rancher 自定义域名"
# Add the custom certificate location
sudo docker run -d --privileged --restart=unless-stopped \
--net=host -v /etc/kubernetes:/etc/kubernetes \
-v /tmp/k8s-tls:/container/certs \
-e SSL_CERT_DIR="/container/certs" \
-v /var/run:/var/run rancher/rancher-agent:v2.6.2 \
--server https://rancher.my.local --token dv48n7zw6ktmszqfzwv8tg2x6jmbldxdgvhswjcfjls2hfvnlwvggj \
--ca-checksum 65e8326e19d8f7072c503775b4d5c871237823da179701ee33120ef14049f3be \
--etcd --controlplane --worker
# Startup will report an error because no DNS server resolves the custom domain
# Use the following command to check for: ERROR: https://rancher.my.local/ping is not accessible (Could not resolve host: rancher.my.local)
docker logs -f $(docker ps --filter label=io.kubernetes.container.name=cluster-register -aq)
# Exec into the docker container
docker exec -it $(docker ps --filter label=io.kubernetes.container.name=agent -q) bash
# Then run
export rancher_server_hostname=rancher.my.local
export rancher_server_ip=192.168.205.161
kubectl -n cattle-system patch deployments cattle-cluster-agent --patch '{
"spec": {
"template": {
"spec": {
"hostAliases": [
{
"hostnames":
[
"'${rancher_server_hostname}'"
],
"ip": "'${rancher_server_ip}'"
}
]
}
}
}
}'
kubectl -n cattle-system patch daemonsets cattle-node-agent --patch '{
"spec": {
"template": {
"spec": {
"hostAliases": [
{
"hostnames":
[
"'${rancher_server_hostname}'"
],
"ip": "'${rancher_server_ip}'"
}
]
}
}
}
}'
# Or download the kubeconfig to the local machine
docker run --rm --net=host -v $(docker inspect kubelet --format '{{ range .Mounts }}{{ if eq .Destination "/etc/kubernetes" }}{{ .Source }}{{ end }}{{ end }}')/ssl:/etc/kubernetes/ssl:ro --entrypoint bash $(docker inspect $(docker images -q --filter=label=io.cattle.agent=true) --format='{{index .RepoTags 0}}' | tail -1) -c 'kubectl --kubeconfig /etc/kubernetes/ssl/kubecfg-kube-node.yaml get configmap -n kube-system full-cluster-state -o json | jq -r .data.\"full-cluster-state\" | jq -r .currentState.certificatesBundle.\"kube-admin\".config | sed -e "/^[[:space:]]*server:/ s_:.*_: \"https://127.0.0.1:6443\"_"' > ~/.kube/config
# Then run kubectl on this machine
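For example, a minimal check against the downloaded kubeconfig (a sketch, assuming it was saved to ~/.kube/config as in the command above):
kubectl get nodes -o wide
kubectl get pods -A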
References
- Bootstrapping clusters with kubeadm
- CentOS7中用kubeadm安装Kubernetes
- K8s 部署(基于v1.22.0 + containerd)
- centos install containerd
- 容器运行时containerd
- 安装扩展(Addons)
- 集群网络系统
- github flannel
- flannel网络的安装和删除
- kubeadm/kube-flannel.yaml
- kubernetes使用flannel网络插件服务状态显示CrashLoopBackOff
- coredns状态为CrashLoopBackOff并不断重启
- Web 界面 (Dashboard)
- kubeadm的方式搭建k8s集群
- Kubernetes 网络排错指南
- 对 kubeadm 进行故障排查
- 了解 Kubernetes 的 limits 和 requests
- dashboard Creating sample user
- 在 Windows 下使用 WSL2 搭建 Kubernetes 集群
- PV、PVC、StorageClass讲解
- Kubernetes hostpath和local volume区别
- k8s高可用集群搭建
- High Availability Considerations
- kubernetes系列教程(十六)基于nginx ingress实现服务暴露
- kubernetes系列教程(十七)基于haproxy实现ingress服务暴露
- 图解 Kubernetes Ingress