GroupBlog

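Hardware and Software Requirements

The hardware requirements are as follows:

CPU:
- Master: at least 2 cores; 4 or more recommended
- Node: at least 4 cores
Memory:
- Master: at least 4 GB
- Node: at least 4 GB; 16 GB or more recommended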
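
The minimums above can be checked before installing anything. The following is a minimal sketch, not part of the original procedure; the function name meets_minimum and its arguments are hypothetical, and only the minimum values (not the recommended ones) are checked.

```shell
# meets_minimum ROLE CPU_CORES MEM_GB
# Succeeds (exit 0) when the given resources satisfy the minimums listed above.
meets_minimum() {
  role="$1"; cores="$2"; mem_gb="$3"
  case "$role" in
    master) min_cores=2; min_mem_gb=4 ;;
    node)   min_cores=4; min_mem_gb=4 ;;
    *)      echo "unknown role: $role" >&2; return 2 ;;
  esac
  [ "$cores" -ge "$min_cores" ] && [ "$mem_gb" -ge "$min_mem_gb" ]
}

# Example on a live host (MemTotal is slightly below the installed RAM,
# so the value is rounded to the nearest GB):
#   mem_gb=$(awk '/^MemTotal:/ {printf "%d", $2 / 1024 / 1024 + 0.5}' /proc/meminfo)
#   meets_minimum master "$(nproc)" "$mem_gb" && echo "OK for master"
```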

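System Requirements

Operating System

An x86_64 Linux distribution with kernel version 3.10 or later; RHEL 7 / CentOS 7 is recommended.

Disable Swap

First, turn swap off temporarily: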
```
[root@localhost ~]# swapoff -a
```
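
To confirm that swapoff took effect, the kernel's swap table can be inspected. This is a small sketch under the assumption that /proc/swaps is available (as it is on typical Linux systems); the function name swap_is_off is hypothetical.

```shell
# swap_is_off [SWAPS_FILE]
# Succeeds when the swaps table lists no active swap device. /proc/swaps
# contains one header line followed by one line per active swap area.
swap_is_off() {
  swaps_file="${1:-/proc/swaps}"
  active=$(awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$swaps_file")
  [ "$active" -eq 0 ]
}

# Usage: swap_is_off && echo "swap is disabled"
```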

Stop the system from mounting swap at boot

Open /etc/fstab and comment out the line containing /dev/mapper/centos-swap:

/dev/mapper/centos-root /                       xfs     defaults        0 0
UUID=34ceffd3-4311-4135-bf10-863c6a39568e /boot                   xfs     defaults        0 0
# /dev/mapper/centos-swap swap                    swap    defaults        0 0
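
The same edit can be scripted instead of done by hand. The sketch below is one possible approach, assuming a sed that accepts -i.bak and -E (GNU sed does); the function name comment_out_swap is hypothetical. It leaves lines that are already commented untouched, so running it twice is harmless.

```shell
# comment_out_swap FSTAB_PATH
# Prefixes every uncommented swap entry with "# " and keeps a .bak backup.
comment_out_swap() {
  sed -i.bak -E '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/# /' "$1"
}

# Usage (as root): comment_out_swap /etc/fstab
```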

Disable the Firewall

[root@localhost ~]# systemctl stop firewalld
[root@localhost ~]# systemctl disable firewalld
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@localhost ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)

7月 05 11:49:14 localhost.localdomain systemd[1]: Starting firewalld - dynamic firewall d.....
7月 05 11:49:15 localhost.localdomain systemd[1]: Started firewalld - dynamic firewall daemon.
7月 05 04:29:02 localhost.localdomain systemd[1]: Stopping firewalld - dynamic firewall d.....
7月 05 04:29:02 localhost.localdomain systemd[1]: Stopped firewalld - dynamic firewall daemon.
Hint: Some lines were ellipsized, use -l to show in full.

If systemctl status firewalld reports Active as inactive (dead), the firewall has been stopped successfully.

Disable SELinux

Edit /etc/sysconfig/selinux, change the value of SELINUX from enforcing to disabled, and then reboot the system.
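
A scripted version of this edit might look like the sketch below. It makes two assumptions that are not in the original text: on CentOS 7, /etc/sysconfig/selinux is typically a symlink to /etc/selinux/config, and sed -i replaces a symlink with a regular file, so the sketch targets the real file instead. The function name set_selinux_disabled is hypothetical.

```shell
# set_selinux_disabled CONFIG_PATH
# Rewrites the SELINUX= line to "disabled"; the SELINUXTYPE= line and
# comments are left untouched. A .bak backup is kept.
set_selinux_disabled() {
  sed -i.bak -E 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Usage (as root):
#   set_selinux_disabled /etc/selinux/config
#   setenforce 0   # switches the running system to permissive mode until the reboot
```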

iptables Settings

Open /etc/sysctl.conf and add the following:

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

Run the following command to apply the settings:

[root@master ~]# sysctl -p
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
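
The two settings can also be added idempotently from a script. The sketch below is an illustration with a hypothetical helper, ensure_sysctl_line. One assumption worth stating: the net.bridge.* keys are provided by the br_netfilter kernel module, so if sysctl -p reports that they do not exist, loading that module first is the usual remedy.

```shell
# ensure_sysctl_line CONF_PATH KEY VALUE
# Appends "KEY = VALUE" unless a line for KEY is already present, so the
# function can be run repeatedly. Dots in KEY act as regex wildcards in the
# grep pattern, which is harmless for this purpose.
ensure_sysctl_line() {
  conf="$1"; key="$2"; value="$3"
  if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf" 2>/dev/null; then
    printf '%s = %s\n' "$key" "$value" >> "$conf"
  fi
}

# Usage (as root):
#   modprobe br_netfilter
#   ensure_sysctl_line /etc/sysctl.conf net.bridge.bridge-nf-call-ip6tables 1
#   ensure_sysctl_line /etc/sysctl.conf net.bridge.bridge-nf-call-iptables 1
#   sysctl -p
```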

Installing a Kubernetes Cluster with kubeadm

Install Docker

Configure the Docker yum repository

[root@localhost ~]# yum install -y yum-utils # install yum-utils, which provides yum-config-manager
[root@localhost ~]# yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo # add the Aliyun Docker yum repository
[root@localhost ~]# yum makecache # build the yum cache

Install docker-ce

[root@localhost ~]# yum install -y docker-ce

Start Docker

[root@localhost ~]# systemctl enable docker
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
[root@localhost ~]# systemctl start docker
[root@localhost ~]# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: active (running) since 日 2020-07-05 04:58:22 EDT; 7s ago
     Docs: https://docs.docker.com
 Main PID: 1651 (dockerd)
    Tasks: 8
   Memory: 142.5M
   CGroup: /system.slice/docker.service
           └─1651 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

7月 05 04:58:22 localhost.localdomain dockerd[1651]: time="2020-07-05T04:58:22.366983035-...pc
7月 05 04:58:22 localhost.localdomain dockerd[1651]: time="2020-07-05T04:58:22.366996606-...pc
7月 05 04:58:22 localhost.localdomain dockerd[1651]: time="2020-07-05T04:58:22.367003590-...pc
7月 05 04:58:22 localhost.localdomain dockerd[1651]: time="2020-07-05T04:58:22.382422122-...."
7月 05 04:58:22 localhost.localdomain dockerd[1651]: time="2020-07-05T04:58:22.500901033-...s"
7月 05 04:58:22 localhost.localdomain dockerd[1651]: time="2020-07-05T04:58:22.555902364-...."
7月 05 04:58:22 localhost.localdomain dockerd[1651]: time="2020-07-05T04:58:22.656392032-...12
7月 05 04:58:22 localhost.localdomain dockerd[1651]: time="2020-07-05T04:58:22.656461464-...n"
7月 05 04:58:22 localhost.localdomain dockerd[1651]: time="2020-07-05T04:58:22.681482187-...k"
7月 05 04:58:22 localhost.localdomain systemd[1]: Started Docker Application Container Engine.
Hint: Some lines were ellipsized, use -l to show in full.

Note: Docker must be installed on both the master and the nodes.
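
The kubeadm pre-flight checks shown later in this article warn that Docker is using the "cgroupfs" cgroup driver and recommend "systemd". The original walkthrough proceeds with the warning in place. For readers who prefer to follow the recommendation, the sketch below shows one way to do it before running kubeadm init; the function name write_docker_daemon_json is hypothetical, and the function overwrites the target file, so existing settings would need to be merged by hand.

```shell
# write_docker_daemon_json PATH
# Writes a minimal daemon.json that switches Docker to the systemd cgroup driver.
# Note: this overwrites PATH.
write_docker_daemon_json() {
  cat > "$1" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# Usage (as root):
#   mkdir -p /etc/docker
#   write_docker_daemon_json /etc/docker/daemon.json
#   systemctl restart docker
#   docker info | grep -i 'cgroup driver'
```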

Install kubeadm and Related Tools

Configure a China-based yum mirror for kubeadm

[root@localhost ~]# cat /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes Repository
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
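
The listing above only displays the repository file with cat. One way to create it is a heredoc, sketched below with a hypothetical function name, write_kubernetes_repo. Note that yum spells the option enabled, and that gpgcheck=0, kept here from the original, skips package signature verification.

```shell
# write_kubernetes_repo PATH
# Writes the Kubernetes yum repository definition to PATH.
write_kubernetes_repo() {
  cat > "$1" <<'EOF'
[kubernetes]
name=Kubernetes Repository
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
EOF
}

# Usage (as root): write_kubernetes_repo /etc/yum.repos.d/kubernetes.repo
```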

Install the kubeadm tools

[root@localhost ~]# yum makecache
[root@localhost ~]# yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes

Note: the kubeadm tools must be installed on both the master and the nodes.
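
Without a version, yum installs the newest packages in the repository, which is why the node listings later in this article report v1.18.5 even though the initialization file requests v1.18.3. If matching versions are preferred, the packages can be pinned. The helper below is a hypothetical sketch; the version string is an example taken from the initialization file used later. Enabling the kubelet service also avoids the "kubelet service is not enabled" warning that kubeadm join prints.

```shell
# k8s_packages VERSION
# Prints the yum package specs for kubelet, kubeadm and kubectl at VERSION.
k8s_packages() {
  echo "kubelet-$1 kubeadm-$1 kubectl-$1"
}

# Usage (as root):
#   yum install -y $(k8s_packages 1.18.3) --disableexcludes=kubernetes
#   systemctl enable kubelet
```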

Write the Initialization File

Run the following command to print the default initialization parameters:

[root@localhost ~]# kubeadm config print init-defaults
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 1.2.3.4
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: localhost.localdomain
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.18.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: {}

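As you can see, there are a great many parameters. The initialization file does not need to be this complicated: it only has to define the settings that differ from the defaults. Here I change only the image repository, the Kubernetes version, and the network settings, so the file looks like this:

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers # use the Aliyun image mirror
kubernetesVersion: v1.18.3 # Kubernetes version
networking:
  podSubnet: "10.100.0.0/16" # Pod network CIDR; choose a range different from the host's own network so the two are easy to tell apart

Save the content above as init-config.yaml; it is used later to initialize the master.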
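
Whether a host address falls inside the chosen Pod range can be checked with a little shell arithmetic. The sketch below is illustrative only: ip_to_int and ip_in_cidr are hypothetical helpers, they handle IPv4 only, and they test a single address rather than full subnet overlap.

```shell
# ip_to_int A.B.C.D
# Converts a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# ip_in_cidr IP CIDR
# Succeeds when IP lies inside CIDR (for example 10.100.0.0/16).
ip_in_cidr() {
  ip_int=$(ip_to_int "$1")
  net_int=$(ip_to_int "${2%/*}")
  prefix=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  [ $(( ip_int & mask )) -eq $(( net_int & mask )) ]
}

# Usage, with the master address that appears in the kubeadm init output:
#   ip_in_cidr 192.168.199.234 10.100.0.0/16 && echo "host address is inside podSubnet"
```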

Download the Kubernetes Images

Note: this step is optional; pulling the images in advance simply shortens the initialization time.

[root@localhost ~]# kubeadm config images pull --config=init-config.yaml
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.18.3
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.18.3
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.18.3
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.3
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.7

Install the Master

[root@localhost ~]# hostnamectl set-hostname master # change the hostname to master, then log in again
[root@master ~]# kubeadm init --config=init-config.yaml
[init] Using Kubernetes version: v1.18.3
[preflight] Running pre-flight checks
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.199.234]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master localhost] and IPs [192.168.199.234 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master localhost] and IPs [192.168.199.234 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0705 05:37:48.201119   10375 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0705 05:37:48.201899   10375 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 16.003542 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: uj5bld.l2df2pehvs951bao
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.199.234:6443 --token uj5bld.l2df2pehvs951bao \
    --discovery-token-ca-cert-hash sha256:6a3f5ca4896fd21e2bf85b8cccff0f8a77b091c554eb1f9167fdc830a54612c9

Once the kubeadm join instructions appear, the installation has succeeded. As the output suggests, we still need to run the following commands:

[root@master ~]# mkdir -p $HOME/.kube
[root@master ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@master ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config

After these steps the master is essentially installed. Next, we install the node.
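
The join command printed by kubeadm init is needed on every node, so it can be convenient to pull it out of saved output. The sketch below assumes the output was captured to a file, for example with kubeadm init --config=init-config.yaml | tee kubeadm-init.log; both the file name and the function extract_join_command are hypothetical. The default token TTL shown in the init defaults is 24h, so on a later day running kubeadm token create --print-join-command on the master produces a fresh command instead.

```shell
# extract_join_command LOG_FILE
# Prints the "kubeadm join" command from saved kubeadm init output as a
# single line, joining lines that end with a backslash continuation.
extract_join_command() {
  awk '
    /^kubeadm join/ { capture = 1 }
    capture {
      line = $0
      continued = (line ~ /\\$/)
      sub(/[ \t]*\\$/, "", line)
      sub(/^[ \t]+/, "", line)
      printf "%s%s", sep, line
      sep = " "
      if (!continued) { print ""; exit }
    }
  ' "$1"
}

# Usage: extract_join_command kubeadm-init.log
```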

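Install the Node and Join the Cluster

Note: installing Docker and the kubeadm tools and starting Docker work the same way as above, so they are not repeated here.

Joining the cluster is very simple, because the output of the successful master installation already contains the command a node needs to run: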

[root@node01 ~]# kubeadm join 192.168.199.234:6443 --token uj5bld.l2df2pehvs951bao \
>     --discovery-token-ca-cert-hash sha256:6a3f5ca4896fd21e2bf85b8cccff0f8a77b091c554eb1f9167fdc830a54612c9
W0705 06:03:40.671730    9662 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
	[WARNING Hostname]: hostname "node01" could not be reached
	[WARNING Hostname]: hostname "node01": lookup node01 on 192.168.199.1:53: no such host
	[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

The node has joined the cluster; it really is that simple. Running kubectl get nodes on the master shows the node information:

[root@master ~]# kubectl get nodes
NAME     STATUS     ROLES    AGE     VERSION
master   NotReady   master   16m     v1.18.5
node01   NotReady   <none>   4m28s   v1.18.5

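We can see that node01 has joined the cluster.

Install the Network Plugin

Running kubectl get nodes shows that every node is in the NotReady state. This is because no CNI network plugin has been installed yet: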

[root@master ~]# kubectl get nodes
NAME     STATUS     ROLES    AGE   VERSION
master   NotReady   master   27m   v1.18.5
node01   NotReady   <none>   14m   v1.18.5

Run the following command to install the CNI network plugin:

[root@master ~]# kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.apps/weave-net created
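
After the plugin is applied, the nodes take a little while to become Ready. The sketch below shows one way to wait for that from a script; the helper not_ready_nodes is hypothetical and simply filters kubectl output, treating any status other than exactly "Ready" (including "Ready,SchedulingDisabled") as not ready.

```shell
# not_ready_nodes
# Reads `kubectl get nodes --no-headers` output on stdin and prints the
# name of every node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage: poll every 5 seconds until all nodes are Ready
#   while [ -n "$(kubectl get nodes --no-headers | not_ready_nodes)" ]; do sleep 5; done
```

kubectl wait --for=condition=Ready nodes --all --timeout=300s is an alternative that avoids parsing text output.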

Verify the Kubernetes Installation

Run the following command to verify that all Kubernetes-related Pods have been created and are running normally:

[root@master ~]# kubectl get pods --all-namespaces
NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
kube-system   coredns-546565776c-9vw5w         1/1     Running   0          32m
kube-system   coredns-546565776c-gnbg9         1/1     Running   0          32m
kube-system   etcd-master                      1/1     Running   0          32m
kube-system   kube-apiserver-master            1/1     Running   0          32m
kube-system   kube-controller-manager-master   1/1     Running   0          32m
kube-system   kube-proxy-9v55c                 1/1     Running   0          32m
kube-system   kube-proxy-fgv5f                 1/1     Running   0          20m
kube-system   kube-scheduler-master            1/1     Running   0          32m
kube-system   weave-net-htxvq                  2/2     Running   0          2m32s
kube-system   weave-net-vbd4b                  2/2     Running   0          2m32s

If any Pod is in an error state, run kubectl --namespace=kube-system describe pod <pod_name> to find the cause.
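
With many Pods, picking out the problematic ones by eye gets tedious. The following sketch filters the listing down to Pods that need attention; unready_pods is a hypothetical helper, and it also flags Pods in the Completed state, which may be perfectly healthy for one-off jobs.

```shell
# unready_pods
# Reads `kubectl get pods --all-namespaces --no-headers` output on stdin and
# prints "namespace/name" for every Pod that is not Running or whose READY
# column (for example 1/2) shows fewer ready containers than expected.
unready_pods() {
  awk '{
    split($3, ready, "/")
    if ($4 != "Running" || ready[1] != ready[2]) print $1 "/" $2
  }'
}

# Usage: describe every Pod that needs attention
#   kubectl get pods --all-namespaces --no-headers | unready_pods |
#     while IFS=/ read -r ns name; do
#       kubectl --namespace="$ns" describe pod "$name"
#     done
```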

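If Pods are missing or their status never recovers, use kubeadm reset to restore the host to its original state, then run kubeadm init again to reinitialize.

If the number and status of the Pods are all normal, running kubectl get nodes shows that every node is now Ready: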

[root@master ~]# kubectl get nodes
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   35m   v1.18.5
node01   Ready    <none>   23m   v1.18.5

With that, the Kubernetes cluster installation is complete.

References:
Kubernetes权威指南：从Docker到Kubernetes实践全接触（第4版）, Chapter 2, Kubernetes安装配置指南 (Kubernetes installation and configuration guide)