Preface
kubeadm is the official Kubernetes tool for quickly deploying a Kubernetes cluster. Its approach is to containerize the Kubernetes components (as Kubernetes static Pods) to simplify deployment. kubeadm is currently in beta and is not recommended for production use (for example, etcd is a single point of failure). Deploying a Kubernetes cluster with kubeadm is simple and convenient; this article records the full process of deploying a secure Kubernetes cluster with kubeadm on Red Hat 7. The process on CentOS 7 is similar.
If you need kubeadm to deploy a highly available cluster, you can use the following approach:
kubeadm init --api-advertise-addresses=vip --external-etcd-endpoints=http://x.x.x.x:2379,http://x.x.x.x:2379,http://x.x.x.x:2379 --pod-network-cidr 10.244.0.0/16
Note: --api-advertise-addresses itself supports multiple API server IPs, but kubeadm join then fails to add nodes to the cluster, so only a single VIP is configured for external access.
Deployment example
etcd: 3.0.17
kubeadm: v1.7.1
Kubernetes: v1.7.1
Flannel: v0.8.0
Docker: 17.03.1-ce
Prerequisites
1. Red Hat Enterprise Linux Server release 7.1 (Maipo), with 1 GB+ RAM
2. Network connectivity between all cluster machines
Goals
1. Deploy a secure Kubernetes v1.7.1 cluster
2. Deploy a pod network so that pods can communicate with each other
Deployment steps
Install Docker
kubeadm has not yet been validated against newer Docker releases such as 1.13 or 17.03+; Kubernetes officially recommends Docker 1.10, 1.11, or 1.12. On CentOS, the recommended Docker storage driver is devicemapper in direct-lvm mode. See the Docker website for details on Docker and direct-lvm.
For Docker installation, see the official documentation: https://docs.docker.com/engine … ntos/
Check the Docker version with docker version:
sudo docker version
Client:
 Version:      17.03.1-ce
 API version:  1.27
 Go version:   go1.7.5
 Git commit:   c6d412e
 Built:        Mon Mar 27 17:05:44 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.03.1-ce
 API version:  1.27 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   c6d412e
 Built:        Mon Mar 27 17:05:44 2017
 OS/Arch:      linux/amd64
 Experimental: false
direct-lvm setup is covered in the official documentation and omitted here: https://docs.docker.com/engine … iver/
... ...
Server Version: 17.03.1-ce
Storage Driver: devicemapper
 Pool Name: docker-thinpool
 Pool Blocksize: 524.3 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file:
 Metadata file:
 Data Space Used: 1.429 GB
 Data Space Total: 306 GB
 Data Space Available: 304.6 GB
 Metadata Space Used: 782.3 kB
 Metadata Space Total: 3.217 GB
 Metadata Space Available: 3.216 GB
 Thin Pool Minimum Free Space: 30.6 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: journald
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: host null bridge overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
... ...
Install kubelet and kubeadm
1. Add the yum repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
2. Disable SELinux
sudo setenforce 0
3. Install kubelet and kubeadm with yum
sudo yum install -y kubelet kubeadm
Or install specific versions as needed; to list the available versions of kubeadm, kubelet, and kubernetes-cni:
yum list kubeadm kubelet kubernetes-cni --showduplicates | sort -r
4. Edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and change the cgroup-driver value from systemd to cgroupfs, matching Docker's cgroup driver (cgroupfs, as shown in the docker info output above)
KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs
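The edit in step 4 can also be scripted. Below is a minimal sketch run against a throwaway copy of the drop-in file (the original Environment line is an assumption for illustration); on a real node you would point CONF at /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and follow up with systemctl daemon-reload:

```shell
# Work on a local copy of the drop-in for illustration; on a real node,
# set CONF=/etc/systemd/system/kubelet.service.d/10-kubeadm.conf instead.
CONF=./10-kubeadm.conf
cat > "$CONF" <<'EOF'
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
EOF

# Switch the kubelet cgroup driver from systemd to cgroupfs to match Docker.
sed -i 's/--cgroup-driver=systemd/--cgroup-driver=cgroupfs/' "$CONF"
grep -- '--cgroup-driver' "$CONF"
# After editing the real file:
#   sudo systemctl daemon-reload && sudo systemctl restart kubelet
```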
5. Enable and start kubelet
sudo systemctl enable kubelet && sudo systemctl start kubelet
6. Check whether kubelet started successfully: sudo systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Sat 2017-07-15 01:22:15 UTC; 20min ago
     Docs: http://kubernetes.io/docs/
 Main PID: 16755 (kubelet)
   Memory: 41.6M
   CGroup: /system.slice/kubelet.service
           ├─16755 /usr/bin/kubelet --kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --ne...
           └─16891 journalctl -k -f
Note: the systemctl start kubelet command must be run. kubelet may fail to start with the error below; don't worry, systemd will restart kubelet successfully once kubeadm init or kubeadm join has been executed.
error: failed to run Kubelet: invalid kubeconfig: stat /etc/kubernetes/kubelet.conf: no such file or directory
Initialize the cluster
1. Initialize the cluster with sudo kubeadm init. You can specify the Kubernetes master IP with --apiserver-advertise-address=<ip-address>. If you choose Flannel as the pod network, you must also specify --pod-network-cidr=10.244.0.0/16. For example:
sudo kubeadm init --apiserver-advertise-address 192.168.17.139 --pod-network-cidr 10.244.0.0/16
kubeadm init runs a series of pre-flight checks to ensure the machine satisfies the cluster deployment requirements; the whole process takes a few minutes.
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.7.1
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks
[preflight] WARNING: docker version is greater than the most recently validated version. Docker version: 17.03.1-ce. Max validated version: 1.12
[preflight] WARNING: docker service is not enabled, please run 'systemctl enable docker.service'
[preflight] Starting the kubelet service
[certificates] Generated CA certificate and key.
[certificates] Generated API server certificate and key.
[certificates] API Server serving cert is signed for DNS names [bjo-ep-dep-039.dev.fwmrm.net kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.17.139]
[certificates] Generated API server kubelet client certificate and key.
[certificates] Generated service account token signing key and public key.
[certificates] Generated front-proxy CA certificate and key.
[certificates] Generated front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[apiclient] Created API client, waiting for the control plane to become ready
[apiclient] All control plane components are healthy after 31.001311 seconds
[token] Using token: 472def.6bbb304791b76492
[apiconfig] Created RBAC rules
[addons] Applied essential addon: kube-proxy
[addons] Applied essential addon: kube-dns

Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run (as a regular user):

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  http://kubernetes.io/docs/admin/addons/

You can now join any number of machines by running the following on each node as root:

  kubeadm join --token 472def.6bbb304791b76492 192.168.17.139:6443
Kubernetes generates the static pods for the API server, controller manager, scheduler, and so on from the manifests in the /etc/kubernetes/manifests directory.
sudo ls /etc/kubernetes/manifests
etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml
Taking kube-apiserver.yaml as an example, you can see the parameters needed to start kube-apiserver (customizable by editing this file; kubelet watches the file for changes and regenerates the pod as soon as it changes), as well as the image, health-check probe, and QoS configuration. You can also pull the image in advance to speed up deployment.
spec:
  containers:
  - command:
    - kube-apiserver
    - --experimental-bootstrap-token-auth=true
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --requestheader-allowed-names=front-proxy-client
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --secure-port=6443
    - --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota
    - --requestheader-group-headers=X-Remote-Group
    - --allow-privileged=true
    - --requestheader-username-headers=X-Remote-User
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --insecure-port=0
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --service-cluster-ip-range=10.96.0.0/12
    - --authorization-mode=Node,RBAC
    - --advertise-address=192.168.17.139
    - --etcd-servers=http://127.0.0.1:2379
    - --service-node-port-range=20000-65535
    image: gcr.io/google_containers/kube-apiserver-amd64:v1.7.1
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-apiserver
    resources:
      requests:
        cpu: 250m
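As mentioned above, pulling the images ahead of time speeds up deployment. Below is a small sketch that generates the docker pull commands for the control-plane images; the kube-apiserver image name comes from the manifest, while the other component image names are assumptions following the same naming pattern:

```shell
# Generate docker pull commands for the control-plane images.
# Run the printed commands on each node before kubeadm init.
VERSION=v1.7.1
REGISTRY=gcr.io/google_containers
for component in kube-apiserver kube-controller-manager kube-scheduler kube-proxy; do
  echo "docker pull ${REGISTRY}/${component}-amd64:${VERSION}"
done
```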
Because the --insecure-port option in kube-apiserver.yaml is set to 0, kube-apiserver does not listen on the default HTTP port 8080; instead it listens on HTTPS port 6443:
sudo netstat -nltp | grep 6443
tcp   0   0 0.0.0.0:6443   0.0.0.0:*   LISTEN   20936/kube-apiserve
Note: kube-dns runs as a pod but stays in Pending state, mainly because the flannel pod network has not been deployed yet; in addition, the Master Isolation feature described below leaves kube-dns with no node to be scheduled on. Either joining a node or lifting Master Isolation will let kube-dns run successfully and reach Running state.
2. Record the "kubeadm join --token 472def.6bbb304791b76492 192.168.17.139:6443" command printed at the end of kubeadm init; it is used to join nodes to the cluster. The token can also be retrieved later with sudo kubeadm token list:
sudo kubeadm token list
TOKEN                     TTL         EXPIRES   USAGES                   DESCRIPTION
472def.6bbb304791b76492   <forever>   <never>   authentication,signing   The default bootstrap token generated by 'kubeadm init'.
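A kubeadm bootstrap token consists of a 6-character ID and a 16-character secret separated by a dot. A quick sanity check, sketched in shell, before pasting a token into kubeadm join:

```shell
# Validate the bootstrap token format: 6 lowercase alphanumeric chars,
# a dot, then 16 lowercase alphanumeric chars.
TOKEN="472def.6bbb304791b76492"
if echo "$TOKEN" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'; then
  echo "token format OK"
else
  echo "malformed token" >&2
fi
```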
Note: if this machine was previously initialized with 'sudo kubeadm init' or joined with 'sudo kubeadm join', the pre-flight checks will fail; use sudo kubeadm reset to revert:
[preflight] Running pre-flight checks
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Removing kubernetes-managed containers
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/lib/etcd]
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
3. After kubeadm init finishes, run the following commands (note: do not use sudo for mkdir, and do not substitute a value for $HOME):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
List the pods in all namespaces:
kubectl get pods --all-namespaces
NAMESPACE     NAME                                                   READY     STATUS    RESTARTS   AGE
kube-system   etcd-bjo-ep-dep-039.dev.fwmrm.net                      1/1       Running   0          2h
kube-system   kube-apiserver-bjo-ep-dep-039.dev.fwmrm.net            1/1       Running   0          2h
kube-system   kube-controller-manager-bjo-ep-dep-039.dev.fwmrm.net   1/1       Running   0          2h
kube-system   kube-dns-2425271678-8k4dn                              0/3       Pending   0          2h
kube-system   kube-proxy-vd39t                                       1/1       Running   0          2h
kube-system   kube-scheduler-bjo-ep-dep-039.dev.fwmrm.net            1/1       Running   0          2h
Master Isolation
For security reasons, pods are not scheduled onto the master node by default. You can lift this restriction with: kubectl taint nodes --all node-role.kubernetes.io/master- (the trailing '-' removes the taint)
kubectl taint nodes --all node-role.kubernetes.io/master-
node "bjo-ep-dep-039.dev.fwmrm.net" untainted
Install the Flannel pod network
1. Install Flannel with the following two commands (shown with their output):
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml
clusterrole "flannel" created
clusterrolebinding "flannel" created
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
serviceaccount "flannel" created
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created
2. Verify that DNS works
Create busybox.yaml with the following content:
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - image: busybox
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
    name: busybox
  restartPolicy: Always
Create the pod with kubectl create -f busybox.yaml, then verify that DNS works with kubectl exec -ti busybox -- nslookup kubernetes.default:
kubectl exec -ti busybox -- nslookup kubernetes.default
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes.default
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local
Join nodes to the cluster
1. Join a node using the token produced earlier by kubeadm init: sudo kubeadm join 192.168.17.139:6443 --token 472def.6bbb304791b76492
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Running pre-flight checks
[preflight] WARNING: hostname "" could not be reached
[preflight] WARNING: hostname "" lookup : no such host
[preflight] Some fatal errors occurred:
	hostname "" a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
[preflight] If you know what you are doing, you can skip pre-flight checks with `--skip-preflight-checks`
The pre-flight check reports a fatal error; this appears to be a bug in kubeadm 1.7. You can skip the pre-flight checks with --skip-preflight-checks:
sudo kubeadm join --skip-preflight-checks --token 472def.6bbb304791b76492 192.168.17.139:6443
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Skipping pre-flight checks
[discovery] Trying to connect to API Server "192.168.17.139:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.17.139:6443"
[discovery] Cluster info signature and contents are valid, will use API Server "https://192.168.17.139:6443"
[discovery] Successfully established connection with API Server "192.168.17.139:6443"
[bootstrap] Detected server version: v1.7.1
[bootstrap] The server supports the Certificates API (certificates.k8s.io/v1beta1)
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server, generating KubeConfig...
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"

Node join complete:
* Certificate signing request sent to master and response received.
* Kubelet informed of new secure connection details.

Run 'kubectl get nodes' on the master to see this machine join.
Note: kubeadm must be installed on the node, and kubelet must be installed and started; otherwise join reports success but the node does not actually join, as explained earlier.
2. Check that the node has successfully joined the cluster with kubectl get nodes
kubectl get nodes
NAME                           STATUS    AGE       VERSION
bjo-ep-dep-039.dev.fwmrm.net   Ready     1d        v1.7.1
bjo-ep-dep-040.dev.fwmrm.net   Ready     19m       v1.7.1
bjo-ep-svc-017.dev.fwmrm.net   Ready     1h        v1.7.1
3. Test that the cluster works correctly
kubectl create -f https://raw.githubusercontent.com/kubernetes/kubernetes.github.io/master/docs/concepts/workloads/controllers/nginx-deployment.yaml
Check with kubectl get po -o wide: the three Nginx replicas have been scheduled across the cluster nodes (here two of them landed on the same node):
kubectl get po -o wide
NAME                               READY     STATUS    RESTARTS   AGE       IP            NODE
nginx-deployment-431080787-2z167   1/1       Running   0          3m        10.244.0.15   bjo-ep-dep-039.dev.fwmrm.net
nginx-deployment-431080787-55fl8   1/1       Running   0          3m        10.16.103.5   bjo-ep-svc-017.dev.fwmrm.net
nginx-deployment-431080787-bcmfx   1/1       Running   0          3m        10.16.103.4   bjo-ep-svc-017.dev.fwmrm.net
Accessing the cluster from outside
By default, for security reasons, the cluster cannot be operated from outside. If external access is needed, you can do the following:
scp root@<master ip>:/etc/kubernetes/admin.conf .
kubectl --kubeconfig ./admin.conf get nodes
For example:
xiazhang-mac:~ xiazhang$ kubectl --kubeconfig ./admin.conf -n kube-system get nodes
NAME                           STATUS    AGE       VERSION
bjo-ep-dep-039.dev.fwmrm.net   Ready     2d        v1.7.1
bjo-ep-dep-040.dev.fwmrm.net   Ready     22h       v1.7.1
bjo-ep-svc-017.dev.fwmrm.net   Ready     23h       v1.7.1
Or set up a proxy with kubectl proxy:
scp root@<master ip>:/etc/kubernetes/admin.conf .
kubectl --kubeconfig ./admin.conf proxy
For example:
xiazhang-mac:~ xiazhang$ kubectl --kubeconfig ./admin.conf proxy
Starting to serve on 127.0.0.1:8001
kubectl config set-cluster default-cluster --server=http://localhost:8001
Open another terminal and run kubectl get nodes:
xiazhang-mac:~ xiazhang$ kubectl get nodes
NAME                           STATUS    AGE       VERSION
bjo-ep-dep-039.dev.fwmrm.net   Ready     2d        v1.7.1
bjo-ep-dep-040.dev.fwmrm.net   Ready     22h       v1.7.1
bjo-ep-svc-017.dev.fwmrm.net   Ready     23h       v1.7.1
Kubernetes Dashboard
1. Use the official yaml file from https://github.com/kubernetes/dashboard, adding a NodePort to expose the service port; this article uses 31000 as an example.
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 9090
    nodePort: 31000
  selector:
    k8s-app: kubernetes-dashboard
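Note that the nodePort must fall within the API server's --service-node-port-range, which the kube-apiserver manifest shown earlier sets to 20000-65535. A quick range check, sketched in shell:

```shell
# Check that a chosen nodePort lies inside the configured service-node-port-range.
RANGE="20000-65535"   # from --service-node-port-range in kube-apiserver.yaml
PORT=31000            # the nodePort used in this article
LOW=${RANGE%-*}
HIGH=${RANGE#*-}
if [ "$PORT" -ge "$LOW" ] && [ "$PORT" -le "$HIGH" ]; then
  echo "nodePort $PORT is in range $RANGE"
else
  echo "nodePort $PORT is outside range $RANGE" >&2
fi
```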
2. Since Kubernetes 1.6, the API server has RBAC authorization enabled. kubernetes-dashboard.yaml does not define a ServiceAccount with the required permissions, so accessing the Dashboard at http://ClusterIP:NodePort is rejected:
User "system:serviceaccount:kube-system:default" cannot list statefulsets.apps in the namespace "default". (get statefulsets.apps)
Define dashboard-rbac.yaml as follows and apply it with kubectl create -f dashboard-rbac.yaml:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: dashboard-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: kube-system
3. Deploy Heapster in the cluster
First download the following files:
https://github.com/kubernetes/heapster/blob/master/deploy/kube-config/rbac/heapster-rbac.yaml
https://github.com/kubernetes/heapster/tree/master/deploy/kube-config/influxdb
Copy heapster-rbac.yaml into the influxdb folder, then run:
kubectl create -f deploy/kube-config/influxdb
Note: if the grafana deployment fails with the following error:
Starting a utility program that will configure Grafana
Starting Grafana in foreground mode
t=2017-07-17T07:28:47+0000 lvl=crit msg="Failed to parse /etc/grafana/grafana.ini, open /etc/grafana/grafana.ini: no such file or directory%!(EXTRA []interface {}=[])"
it can be fixed by replacing the image; edit deploy/kube-config/influxdb/grafana.yaml:
spec:
  containers:
  - name: grafana
    #image: gcr.io/google_containers/heapster-grafana-amd64:v4.2.0
    image: gcr.io/google_containers/heapster-grafana-amd64:v4.0.2
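The image swap can also be done non-interactively with sed. A sketch against a throwaway one-line copy of the manifest (on a real checkout you would run the sed command against deploy/kube-config/influxdb/grafana.yaml):

```shell
# Create a minimal stand-in for grafana.yaml, then rewrite the image tag.
FILE=./grafana.yaml
cat > "$FILE" <<'EOF'
        image: gcr.io/google_containers/heapster-grafana-amd64:v4.2.0
EOF
sed -i 's#heapster-grafana-amd64:v4.2.0#heapster-grafana-amd64:v4.0.2#' "$FILE"
grep image "$FILE"
```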
4. Access the Dashboard UI: http://192.168.17.139:31000/%2 … D_all
The cluster deployment is now complete. Questions and comments are welcome.
References
https://kubernetes.io/docs/set … eadm/
https://kubernetes.io/docs/admin/kubeadm/
https://kubernetes.io/docs/set … eadm/
https://kubernetes.io/docs/tas … ectl/
https://stackoverflow.com/ques … board
http://www.cnblogs.com/caiwenhao/p/6196014.html