Deploying CDI on K8S
https://github.com/kubevirt/containerized-data-importer
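Prerequisites
Before CDI was deployed, a Ceph cluster was already running, and the Ceph CSI RBD driver had been deployed on K8S so that a StorageClass backed by Ceph RBD could be created.
Deployment
Containerized Data Importer (CDI)
Put simply, CDI is a tool for importing virtual machine images into KubeVirt; it simplifies creating virtual machines from VM images.
CDI introduces a new resource, DataVolume. A DataVolume describes a PVC together with the source used to populate it.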
[root@base-k8s-master-1 ~]# export VERSION=$(basename $(curl -s -w %{redirect_url} https://github.com/kubevirt/containerized-data-importer/releases/latest))
[root@base-k8s-master-1 ~]# kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-operator.yaml
namespace/cdi created
customresourcedefinition.apiextensions.k8s.io/cdis.cdi.kubevirt.io created
clusterrole.rbac.authorization.k8s.io/cdi-operator-cluster created
clusterrolebinding.rbac.authorization.k8s.io/cdi-operator created
serviceaccount/cdi-operator created
role.rbac.authorization.k8s.io/cdi-operator created
rolebinding.rbac.authorization.k8s.io/cdi-operator created
deployment.apps/cdi-operator created
[root@base-k8s-master-1 ~]# kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-cr.yaml
cdi.cdi.kubevirt.io/cdi created
[root@base-k8s-master-1 ~]# kubectl get cdi -n cdi
NAME AGE PHASE
cdi 13h Deployed
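The order of the two manifests matters: cdi-operator.yaml registers the CDI CustomResourceDefinition, and only after that can cdi-cr.yaml be created. The following is a minimal sketch of how the release tag and the manifest URLs are derived. The helper names `cdi_release_tag` and `cdi_manifest_url` are hypothetical, and the tag `v1.61.0` used in the comments is only an example value.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative and not part of CDI.

# Extract the release tag from the URL that /releases/latest redirects to,
# for example .../releases/tag/v1.61.0 -> v1.61.0 (example tag only).
cdi_release_tag() {
  basename "$1"
}

# Build the download URL of a release manifest
# (cdi-operator.yaml or cdi-cr.yaml).
cdi_manifest_url() {
  local version="$1" file="$2"
  echo "https://github.com/kubevirt/containerized-data-importer/releases/download/${version}/${file}"
}
```

With VERSION set as in the transcript, `kubectl create -f "$(cdi_manifest_url "$VERSION" cdi-operator.yaml)"` has to run before the same command for `cdi-cr.yaml`.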
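One caveat: if the underlying container runtime is containerd, one setting has to be changed:
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    device_ownership_from_security_context = true
Restart containerd after the change. The reason is explained under Troubleshooting.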
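Before restarting, it can help to confirm that the setting really is in the file. The sketch below is only a text match, not a TOML parse. The helper name `device_ownership_enabled` is hypothetical, and the default path `/etc/containerd/config.toml` is an assumption about where the config lives on the node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given containerd config file sets
# device_ownership_from_security_context = true.
# The default path is an assumption; pass the real path as the first argument.
device_ownership_enabled() {
  local config="${1:-/etc/containerd/config.toml}"
  grep -Eq '^[[:space:]]*device_ownership_from_security_context[[:space:]]*=[[:space:]]*true' "$config"
}
```

On a systemd-managed node (also an assumption), `device_ownership_enabled && systemctl restart containerd` would restart the runtime only when the setting is present.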
Testing
Create a DataVolume
[root@base-k8s-master-1 kubevirt]# cat dv_fedora.yml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: "fedora"
spec:
  storage:
    storageClassName: csi-rbd-sc
    resources:
      requests:
        storage: 5Gi
  source:
    http:
      url: "https://download.fedoraproject.org/pub/fedora/linux/releases/41/Cloud/x86_64/images/Fedora-Cloud-Base-Generic-41-1.4.x86_64.qcow2"
[root@base-k8s-master-1 kubevirt]# kubectl apply -f dv_fedora.yml
datavolume.cdi.kubevirt.io/fedora created
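The manifest uses `spec.storage` without access modes or a volume mode. With that API, CDI fills in the missing values from the StorageProfile it keeps for each StorageClass. A small sketch for inspecting what CDI recorded for `csi-rbd-sc` follows; the helper name `storage_profile_claims` is hypothetical, and the output depends on the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the access mode / volume mode combinations
# that CDI recorded for a StorageClass in its StorageProfile.
storage_profile_claims() {
  local storage_class="$1"
  kubectl get storageprofile "$storage_class" -o jsonpath='{.status.claimPropertySets}'
}
```

Calling `storage_profile_claims csi-rbd-sc` shows which modes the PVC created for the DataVolume is likely to receive.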
Watching the import
Check the DataVolume and pod status
[root@base-k8s-master-1 kubevirt]# kubectl get datavolumes.cdi.kubevirt.io fedora
NAME     PHASE             PROGRESS   RESTARTS   AGE
fedora   ImportScheduled   N/A                   18s
[root@base-k8s-master-1 kubevirt]# kubectl get pod
NAME READY STATUS RESTARTS AGE
importer-prime-bfcf0ea4-cb66-463e-ad44-5d8a9e2c5250 1/1 Running 0 23s
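Instead of polling by hand, the import can be awaited with `kubectl wait`, since a DataVolume exposes a `Ready` condition. This is a sketch under assumptions: the helper name `wait_for_datavolume` is hypothetical, and the 30 minute default timeout is an arbitrary value chosen for a slow download.

```shell
#!/usr/bin/env bash
# Hypothetical helper: block until the DataVolume reports the Ready
# condition or the timeout expires (default 30m is an arbitrary choice).
wait_for_datavolume() {
  local name="$1" timeout="${2:-30m}"
  kubectl wait "datavolume/${name}" --for=condition=Ready --timeout="$timeout"
}
```

For the example above the call would be `wait_for_datavolume fedora`.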
Check the PVs and PVCs
[root@base-k8s-master-1 ~]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE
pvc-52573710-e9b7-48c6-a6fd-fecf7e2c896c 5Gi RWX Delete Bound default/prime-bfcf0ea4-cb66-463e-ad44-5d8a9e2c5250 csi-rbd-sc <unset> 5m43s
pvc-b74fec1a-f0be-40fd-a9e7-be8e2eb277c6 6Gi RWO Delete Bound default/prime-bfcf0ea4-cb66-463e-ad44-5d8a9e2c5250-scratch csi-rbd-sc <unset> 5m26s
[root@base-k8s-master-1 ~]# kubectl get pvc
NAME                                                 STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
fedora                                               Pending                                                                        csi-rbd-sc     <unset>                 4m33s
prime-bfcf0ea4-cb66-463e-ad44-5d8a9e2c5250           Bound     pvc-52573710-e9b7-48c6-a6fd-fecf7e2c896c   5Gi        RWX            csi-rbd-sc     <unset>                 4m33s
prime-bfcf0ea4-cb66-463e-ad44-5d8a9e2c5250-scratch   Bound     pvc-b74fec1a-f0be-40fd-a9e7-be8e2eb277c6   6Gi        RWO            csi-rbd-sc     <unset>                 4m7s
Check the pod logs
[root@base-k8s-master-1 ~]# kubectl logs -f importer-prime-6505ab8c-88b7-4e31-8367-535154f0d8eb
I0206 06:05:39.321826 1 importer.go:107] Starting importer
I0206 06:05:39.322993 1 importer.go:182] begin import process
I0206 06:05:46.409367 1 data-processor.go:348] Calculating available size
I0206 06:05:46.410506 1 data-processor.go:356] Checking out block volume size.
I0206 06:05:46.410528 1 data-processor.go:373] Target size 5368709120.
I0206 06:05:46.433475 1 data-processor.go:247] New phase: TransferScratch
I0206 06:05:46.435235 1 util.go:96] Writing data...
I0206 06:05:47.434201 1 prometheus.go:78] 0.42
I0206 06:05:48.434419 1 prometheus.go:78] 0.85
I0206 06:05:49.434666 1 prometheus.go:78] 0.99
I0206 06:05:50.436862 1 prometheus.go:78] 1.00
I0206 06:05:51.437004 1 prometheus.go:78] 1.00
I0206 06:05:52.438039 1 prometheus.go:78] 1.01
I0206 06:05:53.438889 1 prometheus.go:78] 1.01
I0206 06:05:54.439891 1 prometheus.go:78] 1.62
I0206 06:05:55.439985 1 prometheus.go:78] 2.82
I0206 06:05:56.440440 1 prometheus.go:78] 3.11
I0206 06:05:57.440840 1 prometheus.go:78] 3.44
I0206 06:05:58.442006 1 prometheus.go:78] 3.73
I0206 06:05:59.442894 1 prometheus.go:78] 4.01
...output omitted...
I0206 06:11:01.667658 1 prometheus.go:78] 99.99
I0206 06:11:02.686701 1 prometheus.go:78] 99.99
I0206 06:11:03.687612 1 prometheus.go:78] 99.99
I0206 06:11:04.687787 1 prometheus.go:78] 100.00
I0206 06:11:05.687949 1 prometheus.go:78] 100.00
I0206 06:11:06.688683 1 prometheus.go:78] 100.00
I0206 06:11:07.689410 1 prometheus.go:78] 100.00
I0206 06:11:09.954670 1 data-processor.go:247] New phase: Convert
I0206 06:11:09.954699 1 data-processor.go:253] Validating image
E0206 06:11:09.977108 1 prlimit.go:156] failed to kill the process; os: process already finished
I0206 06:11:09.977230 1 qemu.go:115] Running qemu-img with args: [convert -t writeback -p -O raw /scratch/tmpimage /dev/cdi-block-volume]
I0206 06:11:09.982507 1 qemu.go:273] 0.00
I0206 06:11:11.483424 1 qemu.go:273] 1.02
I0206 06:11:13.256227 1 qemu.go:273] 2.03
I0206 06:11:16.628050 1 qemu.go:273] 3.05
I0206 06:11:18.574501 1 qemu.go:273] 4.06
I0206 06:11:20.336702 1 qemu.go:273] 5.08
I0206 06:11:20.962854 1 qemu.go:273] 6.09
I0206 06:11:24.076379 1 qemu.go:273] 7.11
I0206 06:11:28.753410 1 qemu.go:273] 8.12
I0206 06:11:30.064958 1 qemu.go:273] 9.14
I0206 06:11:34.008080 1 qemu.go:273] 10.16
I0206 06:11:35.155598 1 qemu.go:273] 11.17
I0206 06:11:36.735839 1 qemu.go:273] 12.19
I0206 06:11:38.180840 1 qemu.go:273] 13.20
I0206 06:11:39.695558 1 qemu.go:273] 14.22
...output omitted...
I0206 06:14:23.062024 1 qemu.go:273] 90.39
I0206 06:14:24.714937 1 qemu.go:273] 91.41
I0206 06:14:25.997528 1 qemu.go:273] 92.42
I0206 06:14:27.465442 1 qemu.go:273] 93.44
I0206 06:14:29.116782 1 qemu.go:273] 94.45
I0206 06:14:30.876332 1 qemu.go:273] 95.47
I0206 06:14:32.805174 1 qemu.go:273] 96.48
I0206 06:14:34.427788 1 qemu.go:273] 97.50
I0206 06:14:35.923245 1 qemu.go:273] 98.52
I0206 06:14:37.523749 1 qemu.go:273] 99.53
E0206 06:14:39.052165 1 prlimit.go:156] failed to kill the process; os: process already finished
I0206 06:14:39.052218 1 data-processor.go:247] New phase: Resize
I0206 06:14:39.053624 1 data-processor.go:247] New phase: Complete
I0206 06:14:39.053893 1 importer.go:231] {"scratchSpaceRequired":false,"preallocationApplied":false,"message":"Import Complete"}
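The log shows the importer moving through the phases TransferScratch, Convert, Resize and Complete, with a percentage printed roughly once per second during download (`prometheus.go`) and conversion (`qemu.go`). A minimal sketch for summarizing such a log follows. Both function names are hypothetical, and the patterns are based only on the log format shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that read importer logs on stdin.

# Print the importer phases in the order they were logged.
importer_phases() {
  sed -n 's/.*New phase: //p'
}

# Print the last progress percentage reported by the download
# (prometheus.go) or by qemu-img convert (qemu.go).
importer_last_progress() {
  awk '/(prometheus|qemu)\.go:[0-9]+\]/ { last = $NF } END { if (last != "") print last }'
}
```

For example, `kubectl logs <importer-pod> | importer_phases` lists the phases the pod has gone through so far.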
Creating a virtual machine from the DataVolume imported by CDI
Create the virtual machine
cloud-init reference: https://platform9.com/blog/how-to-setup-kubevirt-with-pmk/
[root@base-k8s-master-1 kubevirt]# cat vm-dv.yml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vm-dv
  namespace: default
spec:
  runStrategy: Halted
  template:
    metadata:
      labels:
        kubevirt.io/domain: vm-dv
        kubevirt.io/size: small
    spec:
      architecture: amd64
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: containerdisk
          - disk:
              bus: virtio
            name: cloudinitdisk
          interfaces:
          - masquerade: {}
            name: default
        machine:
          type: q35
        resources:
          requests:
            memory: 2048M
      networks:
      - name: default
        pod: {}
      volumes:
      - dataVolume:
          name: fedora
        name: containerdisk
      - cloudInitNoCloud:
          userData: |-
            #cloud-config
            chpasswd: { expire: False }
            hostname: vm-dv
            fqdn: vm-dv.example.com
            password: fedora
            ssh_authorized_keys:
              - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCqK9cDuDDHO8NNIDhXvxcIAUDrzC+RY8ANhrXJ3fe7PGbslBNXBiPpNZNUahIUez4qrz92MXDOhyywf0OTjIYFsWqmof0ytCRzZeBRpeE0uEQIj793YN52yKIbQ797mThqmOCFAFx3ES9+/HtB6H/PWYGMvhyqmiFYNu46ttEKlu8TGQ6nh79eSc1AK+YT1UMVRm06aGp5mfhp2AuL5KTPsPHuJ9XXgPeam2rnaE1erde912qxX6k4PWtQknZ7ZSzlzgfbO2vVheNYoG9SLTZQHbvOws75pfHdI1T2SF9N3WeNRaIySBCGqxhPihavHa+Lg94kxFxLpqMdyKJzKpfT5NSpZZN1qV62JQbRWDzyOcio4ZLmUalF7IXaiD62wvYN64vS90zvUYxr4Gfzkrx9/Cp4+qPBS00KlJGcVU1vfQzdUv57HqVlYQN4Y0tqT7sYAa9/9tnSs/fYPol0NBYbuqxatmRaJltMbs46bOO6plYQJm6RCYbdanh7BqMcg4M= [email protected]
            runcmd:
              - "sudo echo data > /root/install.log"
        name: cloudinitdisk
[root@base-k8s-master-1 kubevirt]# kubectl apply -f vm-dv.yml
virtualmachine.kubevirt.io/vm-dv created
[root@base-k8s-master-1 kubevirt]# kubectl get vms
NAME AGE STATUS READY
vm-dv 12s Stopped False
[root@base-k8s-master-1 kubevirt]# virtctl start vm-dv
VM vm-dv was scheduled to start
[root@base-k8s-master-1 ~]# kubectl get vms
NAME AGE STATUS READY
vm-dv 68s Running True
[root@base-k8s-master-1 ~]# kubectl get vmi
NAME AGE PHASE IP NODENAME READY
vm-dv 76s Running 10.100.223.24 base-k8s-worker-1.example.com True
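Because the manifest sets `runStrategy: Halted`, the VM is created in the Stopped state and has to be started explicitly. The start and the readiness check can be combined as in the sketch below. The helper name `start_vm_and_wait` is hypothetical, and the 5 minute default timeout is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: start a VirtualMachine with virtctl, then wait until
# it reports the Ready condition (default timeout 5m is an arbitrary choice).
start_vm_and_wait() {
  local name="$1" timeout="${2:-5m}"
  virtctl start "$name" &&
    kubectl wait "vm/${name}" --for=condition=Ready --timeout="$timeout"
}
```

For the VM above the call would be `start_vm_and_wait vm-dv`.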
Verify the virtual machine
[root@base-k8s-master-1 kubevirt]# kubectl get pod
NAME READY STATUS RESTARTS AGE
virt-launcher-vm-dv-wj6pw 2/2 Running 0 17m
# Verify console login
[root@base-k8s-master-1 kubevirt]# virtctl console vm-dv
Successfully connected to vm-dv console. The escape sequence is ^]
vm-dv login: fedora
Password:
Last login: Thu Feb 6 09:05:51 on ttyS0
[fedora@vm-dv ~]$
# Verify SSH login
[root@base-k8s-master-1 kubevirt]# kubectl expose pod virt-launcher-vm-dv-wj6pw --port 22
service/virt-launcher-vm-dv-wj6pw exposed
[root@base-k8s-master-1 kubevirt]# ssh [email protected]
Last login: Thu Feb 6 08:50:59 2025 from 10.100.239.0
# Verify the hostname and the command executed on first boot
[fedora@vm-dv ~]$ sudo -i
[root@vm-dv ~]# cat /root/install.log
data
[root@vm-dv ~]# hostnamectl
Static hostname: vm-dv.example.com
Icon name: computer-vm
Chassis: vm 🖴
Machine ID: 42a14858ef495668bd65cb2b6c06b9a5
Boot ID: 258efbebe7924078b9726ffc938b2090
Product UUID: 42a14858-ef49-5668-bd65-cb2b6c06b9a5
Virtualization: kvm
Operating System: Fedora Linux 41 (Cloud Edition)
CPE OS Name: cpe:/o:fedoraproject:fedora:41
OS Support End: Tue 2025-05-13
OS Support Remaining: 3month 4d
Kernel: Linux 6.11.4-301.fc41.x86_64
Architecture: x86-64
Hardware Vendor: KubeVirt
Hardware Model: None
Firmware Version: 1.16.3-2.el9
Firmware Date: Tue 2014-04-01
Firmware Age: 10y 10month 1w
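`kubectl expose pod` creates a ClusterIP service named after the virt-launcher pod, and that service address is the one to use for SSH from inside the cluster. A short sketch for looking it up follows; the helper name `service_cluster_ip` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ClusterIP of a service.
service_cluster_ip() {
  local service="$1"
  kubectl get service "$service" -o jsonpath='{.spec.clusterIP}'
}
```

With the service created above, `ssh "fedora@$(service_cluster_ip virt-launcher-vm-dv-wj6pw)"` would connect as the `fedora` user. Note that the pod name, and with it the service name, changes whenever the VM gets a new virt-launcher pod.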
Troubleshooting
The import pod crashes while CDI imports an image
While importing an image with CDI, the import pod crashed. The pod logs were:
[root@base-k8s-master-1 kubevirt]# kubectl logs importer-prime-68f6a7e0-07fd-4cef-962a-0a2e79ceff2e
I0206 05:34:53.988263 1 importer.go:107] Starting importer
E0206 05:34:53.993562 1 importer.go:137] exit status 1, blockdev: cannot open /dev/cdi-block-volume: Permission denied
kubevirt.io/containerized-data-importer/pkg/util.GetAvailableSpaceBlock
pkg/util/file.go:135
kubevirt.io/containerized-data-importer/pkg/util.GetAvailableSpaceByVolumeMode
pkg/util/util.go:99
main.main
cmd/cdi-importer/importer.go:135
runtime.main
GOROOT/src/runtime/proc.go:271
runtime.goexit
src/runtime/asm_amd64.s:1695
The process has no permission on the block device, so check the pod's security context.
[root@base-k8s-master-1 kubevirt]# kubectl get pod importer-prime-68f6a7e0-07fd-4cef-962a-0a2e79ceff2e -o jsonpath='{.spec.containers[].securityContext}' | jq
{
"allowPrivilegeEscalation": false,
"capabilities": {
"drop": [
"ALL"
]
},
"runAsNonRoot": true,
"runAsUser": 107,
"seccompProfile": {
"type": "RuntimeDefault"
}
}
So the cause is that the container runs as a non-root user (UID 107). Searching for the error message turned up:
https://github.com/longhorn/longhorn/issues/8527
https://github.com/kubevirt/containerized-data-importer/issues/2378
https://kubernetes.io/blog/2021/11/09/non-root-containers-and-devices/
My cluster happens to use the containerd runtime, so I changed the containerd configuration:
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    device_ownership_from_security_context = true
After the change and a containerd restart, the problem was fixed.
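To tell quickly whether a failed importer pod hit this particular problem, its logs can be checked for the error message quoted above. This is only a sketch: the helper name `has_block_device_permission_error` is hypothetical, and the match relies on the exact message and device path from the log in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads importer logs on stdin and succeeds if they
# contain the block-device permission error shown in this section.
has_block_device_permission_error() {
  grep -q 'cannot open /dev/cdi-block-volume: Permission denied'
}
```

For example, `kubectl logs <importer-pod> | has_block_device_permission_error && echo "check the containerd setting"`.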