CEPH - Mapping PGs and OSDs of a VM


This article shows how to find the list of objects behind an RBD block device (e.g. a VM disk or a volume), the placement group of each object, and which OSDs they are stored on. Whenever a VM hits a performance issue, you can trace the VM's OSD locations and check whether they correlate with the OSDs (i.e. hard disks) in question. In my case, the VMs run on OpenStack backed by Ceph Storage.
Find out the RBD location of the VM
In this example, I retrieve the details of the VM instance-00002495, hosted on one of the OpenStack hypervisor nodes, nova1-030.
root@nova1-030:~# virsh dumpxml instance-00002495 | grep disk
<disk type='network' device='disk'>
<source protocol='rbd' name='pool-cinder-volume-1/3c25453e-76a9-4d18-b0b3-98aa9e74efb1_disk'>
<alias name='virtio-disk0'/>
</disk>
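If you don't want to eyeball the XML, a rough one-liner like the following (just a sketch, assuming a single rbd-backed disk per instance) pulls the pool/image pair straight out of the domain definition:
root@nova1-030:~# virsh dumpxml instance-00002495 | grep -o "protocol='rbd' name='[^']*'" | cut -d"'" -f4
pool-cinder-volume-1/3c25453e-76a9-4d18-b0b3-98aa9e74efb1_disk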
The rbd pool name: pool-cinder-volume-1
The rbd block device name: 3c25453e-76a9-4d18-b0b3-98aa9e74efb1_disk
- Go to the Ceph Monitor node, ceph1-001. Double check the RBD pool name:
root@ceph1-001:~# ceph osd lspools
3 pool-cinder-volume-1.bad,5 .rgw,7 .rgw.control,8 .rgw.gc,9 .log,10 .intent-log,11 .usage,12 .users,13 .users.email,14 .users.swift,15 .users.uid,16 .rgw.root,17 ,18 .rgw.buckets.index,20 .rgw.buckets,22 pool-cinder-volume-1,23 pool-glance-image-1,24 bench_test,
- Get the information of the RBD device
root@ceph1-001:~# rbd -p pool-cinder-volume-1 info 3c25453e-76a9-4d18-b0b3-98aa9e74efb1_disk
rbd image '3c25453e-76a9-4d18-b0b3-98aa9e74efb1_disk':
size 51200 MB in 25600 objects
order 21 (2048 kB objects)
block_name_prefix: rbd_data.edc7802bfaa117
format: 2
features: layering
parent: pool-glance-image-1/83fe12ea-3556-4408-a5b2-0c2eb75eb321@snap
overlap: 2048 MB
The RBD size is 50 GB, split into 25600 objects. However, the actual number of objects stored will usually be less than 25600, since RBD is thin-provisioned and objects are only created as data is written. The 'block_name_prefix' is the naming prefix of all 25600 objects in this RBD device, which means the names of all objects for this RBD start with rbd_data.edc7802bfaa117
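As a convenience, the prefix can also be captured programmatically for the later steps. This is only a sketch assuming jq is installed; the block_name_prefix field comes from the JSON output of rbd info, and the exact field name may vary between Ceph releases:
root@ceph1-001:~# PREFIX=$(rbd -p pool-cinder-volume-1 info 3c25453e-76a9-4d18-b0b3-98aa9e74efb1_disk --format json | jq -r '.block_name_prefix')
root@ceph1-001:~# echo $PREFIX
rbd_data.edc7802bfaa117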
- We can list all the objects associated with this RBD device and write them to a file (see the redirect example after the listing below).
root@ceph1-001:~# rados -p pool-cinder-volume-1 ls | grep rbd_data.edc7802bfaa117 | head -5
rbd_data.edc7802bfaa117.00000000000002a4
rbd_data.edc7802bfaa117.000000000000031a
rbd_data.edc7802bfaa117.0000000000000313
rbd_data.edc7802bfaa117.000000000000018e
rbd_data.edc7802bfaa117.000000000000007b
- Now that we have the full object list of this VM, you can find out the placement group and the OSD locations of each object. Let's take a look at one object, rbd_data.edc7802bfaa117.00000000000002a4:
root@ceph1-001:~# ceph osd map pool-cinder-volume-1 rbd_data.edc7802bfaa117.00000000000002a4
osdmap e782165 pool 'pool-cinder-volume-1' (22) object 'rbd_data.edc7802bfaa117.00000000000002a4' -> pg 22.38464020 (22.20) -> up ([168,48,141], p168) acting ([168,48,141], p168)
Now you can see that this particular object is assigned to PG 22.20, which maps to OSDs 168, 48 and 141, where 168 is the primary OSD for this PG.
- You can simply double check the PG info like this:
root@ceph1-001:~# ceph pg map 22.20
osdmap e782168 pg 22.20 (22.20) -> up [168,48,141] acting [168,48,141]
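To relate those OSD IDs back to physical hosts (and ultimately to the disks you suspect), ceph osd find reports a given OSD's address and CRUSH location, including its host. For example, for the primary OSD above:
root@ceph1-001:~# ceph osd find 168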
- Each object may be assigned to a different PG, and therefore to different OSD devices, as calculated by CRUSH. Using this approach you can generate the list of PGs and OSDs associated with this VM, which gives you some clues when troubleshooting slow-performing VMs, as sketched below.
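Putting it all together, a small loop like the following (a rough sketch reusing the example object list file from earlier) maps every object of the VM to its PG and OSD set, then gives a crude count of how often each "up" set appears; the grep pattern matches the osd map output format shown above and may need adjusting for other Ceph releases:
root@ceph1-001:~# while read -r obj; do ceph osd map pool-cinder-volume-1 "$obj"; done < /tmp/instance-00002495_objects.txt > /tmp/instance-00002495_pg_map.txt
root@ceph1-001:~# grep -o 'up (\[[0-9,]*\]' /tmp/instance-00002495_pg_map.txt | sort | uniq -c | sort -rn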