Ceph Rook Install
Installing Rook itself is straightforward, but the configuration has to be adapted to your own environment and workload.
$ git clone --depth 1 -b v1.5.11 https://github.com/rook/rook
This is the latest stable Rook release at the time of writing; the environment runs Kubernetes 1.18.6 and Ceph 15.2.11.
First, deploy the CRD resources:
$ cd rook/cluster/examples/kubernetes/ceph
$ kubectl apply -f crds.yaml -f common.yaml
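As an optional check before moving on, you can confirm that the Ceph CRDs were registered (the grep pattern below is just a convenience; the exact CRD list depends on the Rook version):

$ kubectl get crd | grep ceph.rook.io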
Deploy the operator:
$ kubectl apply -f operator.yaml
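Before continuing, it helps to wait until the operator pod is running; assuming the default label app=rook-ceph-operator set by operator.yaml:

$ kubectl -n rook-ceph get pod -l app=rook-ceph-operator -w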
Next, modify the cluster configuration (cluster.yaml). The main changes are:
...
# do not use SSL for the dashboard
dashboard:
  enabled: true
  # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
  # urlPrefix: /ceph-dashboard
  # serve the dashboard at the given port.
  # port: 8443
  # serve the dashboard using SSL
  ssl: false
...
# expose the services via host networking
network:
  # enable host networking
  provider: host
...
# node affinity scheduling
placement:
  mon:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: ceph-mon
            operator: In
            values:
            - enabled
  osd:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: ceph-osd
            operator: In
            values:
            - enabled
  mgr:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: ceph-mgr
            operator: In
            values:
            - enabled
...
# custom OSD configuration
storage: # cluster level storage configuration and selection
  useAllNodes: false   # must be false when OSD nodes are listed explicitly, otherwise Rook scans every available node
  useAllDevices: false # must be false when OSD devices are listed explicitly, otherwise Rook scans every available device
  #deviceFilter:
  config:
    # crushRoot: "custom-root" # specify a non-default root label for the CRUSH map
    # metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
    # databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
    # journalSizeMB: "1024" # uncomment if the disks are 20 GB or smaller
    # osdsPerDevice: "1" # this value can be overridden at the node or device level
    # encryptedDevice: "true" # the default value for this option is "false"
  # node names and the device to use on each node; this environment runs on AWS SSD (EBS) volumes
  nodes:
  - name: "ip-10-200-1-111.ap-southeast-1.compute.internal"
    devices:
    - name: "nvme2n1"
  - name: "ip-10-200-1-125.ap-southeast-1.compute.internal"
    devices:
    - name: "nvme2n1"
  - name: "ip-10-200-1-169.ap-southeast-1.compute.internal"
    devices:
    - name: "nvme2n1"
...
First, label the nodes:
# mon label
$ kubectl label nodes {ip-10-200-1-111.ap-southeast-1.compute.internal,ip-10-200-1-125.ap-southeast-1.compute.internal,ip-10-200-1-169.ap-southeast-1.compute.internal} ceph-mon=enabled
# osd label
$ kubectl label nodes {ip-10-200-1-111.ap-southeast-1.compute.internal,ip-10-200-1-125.ap-southeast-1.compute.internal,ip-10-200-1-169.ap-southeast-1.compute.internal} ceph-osd=enabled
# mgr label
$ kubectl label nodes ip-10-200-1-111.ap-southeast-1.compute.internal ceph-mgr=enabled
These labels tell Rook which nodes the corresponding services may be scheduled on; they match the nodeAffinity rules above.
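The labels can be verified with a standard selector query, for example:

$ kubectl get nodes -l ceph-osd=enabled
$ kubectl get nodes --show-labels | grep ceph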
Deploy the cluster:
$ kubectl apply -f cluster.yaml
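It can take several minutes for the mon, mgr, and osd pods to come up. A quick way to follow progress (plain kubectl, nothing version-specific assumed):

$ kubectl -n rook-ceph get pod -w
$ kubectl -n rook-ceph get cephcluster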
Deploy the toolbox:
$ kubectl apply -f toolbox.yaml
The toolbox pod is mainly used as a shell from which the ceph command-line tools can be run.
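For example, assuming the default deployment name rook-ceph-tools from toolbox.yaml, cluster health can be checked like this:

$ kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status
$ kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd tree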
Deploy the dashboard (exposed over plain HTTP, matching ssl: false above):
$ kubectl apply -f dashboard-external-http.yaml
The dashboard password can be retrieved as follows:
$ kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
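The default dashboard user is admin. To find the externally exposed port, list the NodePort service created by dashboard-external-http.yaml (the service name below is what the bundled manifest creates; adjust it if yours differs):

$ kubectl -n rook-ceph get svc rook-ceph-mgr-dashboard-external-http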
Using CephFS
To use CephFS, a filesystem must be deployed:
$ kubectl apply -f filesystem.yaml
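To let workloads consume the filesystem through PVCs, a CephFS StorageClass is also needed. The Rook examples directory ships one; the path below assumes you are still in cluster/examples/kubernetes/ceph and that the CephFilesystem in filesystem.yaml keeps its default name (myfs):

$ kubectl apply -f csi/cephfs/storageclass.yaml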
Removing Rook
To remove Rook, run the following steps:
$ kubectl delete -f crds.yaml -f common.yaml
$ kubectl delete -f operator.yaml
$ kubectl delete -f cluster.yaml
$ kubectl delete -f toolbox.yaml
$ kubectl delete -f dashboard-external-http.yaml
$ for i in $(kubectl get node --no-headers | awk '{print $1}'); do kubectl label nodes $i ceph-mon- ; done
$ for i in $(kubectl get node --no-headers | awk '{print $1}'); do kubectl label nodes $i ceph-osd- ; done
$ for i in $(kubectl get node --no-headers | awk '{print $1}'); do kubectl label nodes $i ceph-mgr- ; done
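Note that deleting the Kubernetes objects does not remove the state Rook keeps on each host. If you plan to reinstall, also wipe the data directory on every node (see the troubleshooting section below for the full cleanup), for example:

$ rm -rf /var/lib/rook/*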
Troubleshooting
Solutions to problems encountered during deployment.
OSD initialization
When adding an OSD, if the OSD prepare (init) container prints a log like the following, the disk device needs to be initialized (wiped) manually:
2021-05-06 06:15:40.661913 D | cephosd: &{Name:nvme0n1p1 Parent:nvme0n1 HasChildren:false DevLinks:/dev/disk/by-uuid/8c1540fa-e2b4-407d-bcd1-59848a73e463 /dev/disk/by-path/pci-0000:00:04.0-nvme-1-part1 /dev/disk/by-ebs-volumeid/vol-0c8cc319d4648eb4d-p1 /dev/disk/by-id/nvme-nvme.1d0f-766f6c3063386363333139643436343865623464-416d617a6f6e20456c617374696320426c6f636b2053746f7265-00000001-part1 /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol0c8cc319d4648eb4d-part1 Size:53686025728 UUID: Serial:Amazon Elastic Block Store_vol0c8cc319d4648eb4d Type:part Rotational:false Readonly:false Partitions:[] Filesystem:xfs Vendor: Model:Amazon Elastic Block Store WWN:nvme.1d0f-766f6c3063386363333139643436343865623464-416d617a6f6e20456c617374696320426c6f636b2053746f7265-00000001 WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/nvme0n1p1 KernelName:nvme0n1p1 Encrypted:false}
2021-05-06 06:15:40.661935 D | cephosd: &{Name:nvme1n1 Parent: HasChildren:false DevLinks:/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol0d61b51eb5f62f6df /dev/disk/by-path/pci-0000:00:1f.0-nvme-1 /dev/disk/by-ebs-volumeid/vol-0d61b51eb5f62f6df /dev/disk/by-id/nvme-nvme.1d0f-766f6c3064363162353165623566363266366466-416d617a6f6e20456c617374696320426c6f636b2053746f7265-00000001 /dev/disk/by-uuid/fcb8dc60-ac05-4230-a0b7-ae839b499555 Size:214748364800 UUID:0fcc3869-b534-444c-ba52-d255c3da041b Serial:Amazon Elastic Block Store_vol0d61b51eb5f62f6df Type:disk Rotational:false Readonly:false Partitions:[] Filesystem:ext4 Vendor: Model:Amazon Elastic Block Store WWN:nvme.1d0f-766f6c3064363162353165623566363266366466-416d617a6f6e20456c617374696320426c6f636b2053746f7265-00000001 WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/nvme1n1 KernelName:nvme1n1 Encrypted:false}
2021-05-06 06:15:40.661952 D | cephosd: &{Name:nvme2n1 Parent: HasChildren:false DevLinks:/dev/disk/by-ebs-volumeid/vol-03745fb74671495da /dev/disk/by-path/pci-0000:00:1e.0-nvme-1 /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol03745fb74671495da /dev/disk/by-id/nvme-nvme.1d0f-766f6c3033373435666237343637313439356461-416d617a6f6e20456c617374696320426c6f636b2053746f7265-00000001 Size:1074815565824 UUID:2bbfdfd9-fc51-4327-b91b-3f6d3656c57e Serial:Amazon Elastic Block Store_vol03745fb74671495da Type:disk Rotational:false Readonly:false Partitions:[] Filesystem: Vendor: Model:Amazon Elastic Block Store WWN:nvme.1d0f-766f6c3033373435666237343637313439356461-416d617a6f6e20456c617374696320426c6f636b2053746f7265-00000001 WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/nvme2n1 KernelName:nvme2n1 Encrypted:false}
2021-05-06 06:15:40.661963 I | cephosd: skipping device "nvme0n1p1" because it contains a filesystem "xfs"
2021-05-06 06:15:40.661969 I | cephosd: skipping device "nvme1n1" because it contains a filesystem "ext4"
2021-05-06 06:15:40.661975 D | exec: Running command: lsblk /dev/nvme2n1 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME
2021-05-06 06:15:40.663566 D | exec: Running command: ceph-volume inventory --format json /dev/nvme2n1
2021-05-06 06:15:41.248150 I | cephosd: skipping device "nvme2n1": ["Has BlueStore device label"].
2021-05-06 06:15:41.248249 I | cephosd: configuring osd devices: {"Entries":{}}
2021-05-06 06:15:41.248259 I | cephosd: no new devices to configure. returning devices already configured with ceph-volume.
2021-05-06 06:15:41.248396 D | exec: Running command: stdbuf -oL ceph-volume --log-path /tmp/ceph-log lvm list --format json
2021-05-06 06:15:41.555152 D | cephosd: {}
2021-05-06 06:15:41.555183 I | cephosd: 0 ceph-volume lvm osd devices configured on this node
2021-05-06 06:15:41.555191 W | cephosd: skipping OSD configuration as no devices matched the storage settings for this node "ip-10-200-1-111.ap-southeast-1.compute.internal"
2021-05-06 06:15:41.555212 I | op-k8sutil: format and nodeName longer than 63 chars, nodeName ip-10-200-1-111.ap-southeast-1.compute.internal will be 9ecc18a5536c50fc84120a0fa51b18d1
Wipe the device by running:
$ sgdisk --zap-all /dev/nvme2n1
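If the prepare container still reports "Has BlueStore device label" after zapping, a more aggressive cleanup of the old BlueStore signature usually helps. This is an extra step based on common practice, not part of the original procedure; replace /dev/nvme2n1 with your device:

# zero the start of the disk where the BlueStore label lives
$ dd if=/dev/zero of=/dev/nvme2n1 bs=1M count=100 oflag=direct,dsync
# re-read the partition table, then let the operator retry
$ partprobe /dev/nvme2n1
$ kubectl rollout restart deployment/rook-ceph-operator -n rook-ceph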
Services fail to deploy
If the following problem appears while deploying the cluster:
$ kubectl get pod -n rook-ceph
...
rook-ceph-crashcollector-9ecc18a5536c50fc84120a0fa51b18d1-tdr6h   0/1   Init:0/2   0   4m47s   10.200.1.111   ip-10-200-1-111.ap-southeast-1.compute.internal   <none>   <none>

$ kubectl describe pod -n rook-ceph rook-ceph-crashcollector-9ecc18a5536c50fc84120a0fa51b18d1-tdr6h
Warning  FailedMount  62s (x10 over 5m12s)  kubelet, ip-10-200-1-111.ap-southeast-1.compute.internal  MountVolume.SetUp failed for volume "rook-ceph-crash-collector-keyring" : secret "rook-ceph-crash-collector-keyring" not found
it means that stale Ceph state from a previous deployment is left on the hosts; the old data must be removed and the cluster redeployed. Solution:
# remove the leftover data on the affected node
$ ll /var/lib/rook/ /var/lib/kubelet/plugins/ /var/lib/kubelet/plugins_registry/
$ rm -rf /var/lib/rook/* /var/lib/kubelet/plugins/* /var/lib/kubelet/plugins_registry/*
If the problem does not resolve itself automatically, restart the operator manually:
$ kubectl rollout restart deployment/rook-ceph-operator -n rook-ceph
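After the operator restarts, the missing keyring secret should be re-created and the crashcollector pods should start. A quick check:

$ kubectl -n rook-ceph get secret rook-ceph-crash-collector-keyring
$ kubectl -n rook-ceph get pod | grep crashcollector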