df showing wrong size for gluster volume

hcker2000

So I have a 2x2 distributed replicated setup on two nodes. I originally had two 32 GB pen drives in each node. I replaced one 32 GB pen drive on gfs1 with a 1 TB drive and let gluster replicate the files back from gfs2. Once that was done I reversed the process and let it finish. All that went fine, except now when I run df -h on both nodes the bricks show the correct size but the gluster mount is not showing the correct total space, or used space for that matter.

 

ubuntu@gfs1:~/glusterfs-tools$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            407M     0  407M   0% /dev
tmpfs            91M  4.3M   87M   5% /run
/dev/mmcblk0p2   29G  3.5G   24G  13% /
tmpfs           455M     0  455M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           455M     0  455M   0% /sys/fs/cgroup
/dev/loop0       79M   79M     0 100% /snap/core/8271
/dev/loop1       58M   58M     0 100% /snap/lxd/13165
/dev/loop3       60M   60M     0 100% /snap/lxd/13256
/dev/loop2       79M   79M     0 100% /snap/core/8042
/dev/mmcblk0p1  253M  127M  127M  51% /boot/firmware
/dev/sdb        932G   31G  901G   4% /data/glusterfs/gfs/brick3_1
/dev/sda         29G  274M   29G   1% /data/glusterfs/gfs/brick1
localhost:/gfs  480G   21G  460G   5% /mnt/gfs
tmpfs            91M     0   91M   0% /run/user/1000

 

Here is the output when I get a detailed status of the volume:

ubuntu@gfs1:~/glusterfs-tools$ sudo gluster volume status gfs detail
Status of volume: gfs
------------------------------------------------------------------------------
Brick                : Brick gfs1:/data/glusterfs/gfs/brick1/brick
TCP Port             : 49152
RDMA Port            : 0
Online               : Y
Pid                  : 1469
File System          : xfs
Device               : /dev/sda
Mount Options        : rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
Inode Size           : 512
Disk Space Free      : 28.6GB
Total Disk Space     : 28.9GB
Inode Count          : 15157248
Free Inodes          : 15157182
------------------------------------------------------------------------------
Brick                : Brick gfs2:/data/glusterfs/gfs/brick2/brick
TCP Port             : 49155
RDMA Port            : 0
Online               : Y
Pid                  : 12723
File System          : xfs
Device               : /dev/sdb
Mount Options        : rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
Inode Size           : 512
Disk Space Free      : 28.6GB
Total Disk Space     : 28.9GB
Inode Count          : 15157248
Free Inodes          : 15157150
------------------------------------------------------------------------------
Brick                : Brick gfs1:/data/glusterfs/gfs/brick3_1/brick
TCP Port             : 49153
RDMA Port            : 0
Online               : Y
Pid                  : 1477
File System          : xfs
Device               : /dev/sdb
Mount Options        : rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
Inode Size           : 512
Disk Space Free      : 900.2GB
Total Disk Space     : 931.0GB
Inode Count          : 488364544
Free Inodes          : 488364488
------------------------------------------------------------------------------
Brick                : Brick 192.168.0.124:/data/glusterfs/gfs/brick4_1/brick
TCP Port             : 49156
RDMA Port            : 0
Online               : Y
Pid                  : 12743
File System          : xfs
Device               : /dev/sda
Mount Options        : rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
Inode Size           : 512
Disk Space Free      : 900.2GB
Total Disk Space     : 931.0GB
Inode Count          : 488364544
Free Inodes          : 488364488

I have tried running a heal, checked that there are no quotas in place, restarted the volume, rebooted the node, and ran a rebalance (roughly the commands below), all with no change. Anyone have any thoughts on what to try?
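For reference, these are roughly the commands I used for those steps (from memory, so treat them as a sketch rather than an exact transcript):

sudo gluster volume heal gfs
sudo gluster volume heal gfs info        # nothing left pending
sudo gluster volume quota gfs list       # no quotas configured
sudo gluster volume stop gfs && sudo gluster volume start gfs
sudo gluster volume rebalance gfs start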


I know little about GlusterFS operationally, but perhaps this thread could help: https://bugzilla.redhat.com/show_bug.cgi?id=1517260

It discusses a bug with upgrading from an earlier version that had no FSID values, and a mismatch of the shared brick count. 

 

So possibly check (substituting your own volume name for volumedisk1):

grep -n "shared-brick-count" /var/lib/glusterd/vols/volumedisk1/*

 


45 minutes ago, Jarsky said:

I know little about GlusterFS operationally, but perhaps this thread could help: https://bugzilla.redhat.com/show_bug.cgi?id=1517260

It discusses a bug with upgrading from an earlier version that had no FSID values, and a mismatch of the shared brick count

Thanks for the tip. I ran that and it seems there may be something to this, as I get:

 

/var/lib/glusterd/vols/gfs/gfs.192.168.0.124.data-glusterfs-gfs-brick4_1-brick.vol:4:    option shared-brick-count 0
/var/lib/glusterd/vols/gfs/gfs.gfs1.data-glusterfs-gfs-brick1-brick.vol:4:    option shared-brick-count 2
/var/lib/glusterd/vols/gfs/gfs.gfs1.data-glusterfs-gfs-brick3_1-brick.vol:4:    option shared-brick-count 2
/var/lib/glusterd/vols/gfs/gfs.gfs2.data-glusterfs-gfs-brick2-brick.vol:4:    option shared-brick-count 0

Now the real task is to learn what this means and how to resolve it, as the shell script attached to that bug report looks to just be setting them all to 1.
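From skimming the bug report, the workaround seems to boil down to forcing the count back to 1 in the generated volfiles and restarting glusterd. Something like this, I think, though I haven't tried it on my nodes yet, so treat it as a sketch of the idea rather than a known-good fix:

# rewrite shared-brick-count to 1 in every brick volfile for this volume
sudo sed -i 's/option shared-brick-count [0-9]*/option shared-brick-count 1/' /var/lib/glusterd/vols/gfs/*.vol
# restart glusterd so the corrected volfiles get picked up
sudo systemctl restart glusterd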


This seems to be related to me accidentally setting a mount point to the wrong directory, specifically one that was already used by brick3 and brick4 (a quick check for that is below).
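If anyone else hits this, a quick way to confirm each brick directory really sits on its own filesystem (and isn't silently sharing a device because of a bad fstab entry) is something like:

# each brick path should resolve to a different source device
findmnt -T /data/glusterfs/gfs/brick1
findmnt -T /data/glusterfs/gfs/brick3_1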

 

After resetting my test setup I get this:

ubuntu@ubuntu:/mnt/gfs$ sudo grep -n "shared-brick-count" /var/lib/glusterd/vols/gfs/*
grep: /var/lib/glusterd/vols/gfs/bricks: Is a directory
/var/lib/glusterd/vols/gfs/gfs.gfs1.data-glusterfs-gfs-brick1-brick.vol:4:    option shared-brick-count 1
/var/lib/glusterd/vols/gfs/gfs.gfs1.data-glusterfs-gfs-brick2_1-brick.vol:4:    option shared-brick-count 1
/var/lib/glusterd/vols/gfs/gfs.gfs2.data-glusterfs-gfs-brick1-brick.vol:4:    option shared-brick-count 0
/var/lib/glusterd/vols/gfs/gfs.gfs2.data-glusterfs-gfs-brick2_1-brick.vol:4:    option shared-brick-count 0

 


OK, so I have gotten this to happen again, this time when adding two more drives to the test system, each 32 GB. This is the command I ran to add the new pair of drives:

 

sudo gluster volume add-brick gfs gfs1:/data/glusterfs/gfs/brick3/brick/ gfs2:/data/glusterfs/gfs/brick3/brick/
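(A quick sanity check before looking at df, just to confirm the new pair actually joined the volume:)

sudo gluster volume info gfs    # should now list all six bricks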

Here is the disk space after. Also, the used space is way off as far as I can tell; right now I have five tiny text files on /mnt/gfs.

ubuntu@gfs1:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            877M     0  877M   0% /dev
tmpfs           185M  4.3M  181M   3% /run
/dev/mmcblk0p2   29G  2.7G   25G  10% /
tmpfs           925M     0  925M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           925M     0  925M   0% /sys/fs/cgroup
/dev/loop1       79M   79M     0 100% /snap/core/8271
/dev/loop0       79M   79M     0 100% /snap/core/8042
/dev/loop2       49M   49M     0 100% /snap/lxd/12634
/dev/loop3       60M   60M     0 100% /snap/lxd/13302
/dev/mmcblk0p1  253M  127M  127M  50% /boot/firmware
/dev/sda        932G  982M  931G   1% /data/glusterfs/gfs/brick2_1
/dev/sdb         29G   62M   29G   1% /data/glusterfs/gfs/brick1
localhost:/gfs  509G  5.7G  504G   2% /mnt/gfs
tmpfs           185M     0  185M   0% /run/user/1000
/dev/sdc         29G   62M   29G   1% /data/glusterfs/gfs/brick3

As you can see, with one brick from each replica pair counting toward capacity, the mount should show roughly 932G (sda) + 29G (sdb) + 29G (sdc) ≈ 990G, not 509G.

 

Here is my detailed volume status

 

ubuntu@gfs1:~$ sudo gluster volume status gfs detail
Status of volume: gfs
------------------------------------------------------------------------------
Brick                : Brick gfs1:/data/glusterfs/gfs/brick1/brick
TCP Port             : 49152
RDMA Port            : 0
Online               : Y
Pid                  : 1496
File System          : xfs
Device               : /dev/sdb
Mount Options        : rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
Inode Size           : 512
Disk Space Free      : 28.8GB
Total Disk Space     : 28.9GB
Inode Count          : 15157248
Free Inodes          : 15157220
------------------------------------------------------------------------------
Brick                : Brick gfs2:/data/glusterfs/gfs/brick1/brick
TCP Port             : 49152
RDMA Port            : 0
Online               : Y
Pid                  : 1492
File System          : xfs
Device               : /dev/sdb
Mount Options        : rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
Inode Size           : 512
Disk Space Free      : 28.8GB
Total Disk Space     : 28.9GB
Inode Count          : 15157248
Free Inodes          : 15157220
------------------------------------------------------------------------------
Brick                : Brick gfs1:/data/glusterfs/gfs/brick2_1/brick
TCP Port             : 49153
RDMA Port            : 0
Online               : Y
Pid                  : 1507
File System          : xfs
Device               : /dev/sda
Mount Options        : rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
Inode Size           : 512
Disk Space Free      : 930.1GB
Total Disk Space     : 931.0GB
Inode Count          : 488364544
Free Inodes          : 488364475
------------------------------------------------------------------------------
Brick                : Brick gfs2:/data/glusterfs/gfs/brick2_1/brick
TCP Port             : 49153
RDMA Port            : 0
Online               : Y
Pid                  : 1501
File System          : xfs
Device               : /dev/sda
Mount Options        : rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
Inode Size           : 512
Disk Space Free      : 930.1GB
Total Disk Space     : 931.0GB
Inode Count          : 488364544
Free Inodes          : 488364475
------------------------------------------------------------------------------
Brick                : Brick gfs1:/data/glusterfs/gfs/brick3/brick
TCP Port             : 49154
RDMA Port            : 0
Online               : Y
Pid                  : 6558
File System          : xfs
Device               : /dev/sdc
Mount Options        : rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
Inode Size           : 512
Disk Space Free      : 28.8GB
Total Disk Space     : 28.9GB
Inode Count          : 15157248
Free Inodes          : 15157222
------------------------------------------------------------------------------
Brick                : Brick gfs2:/data/glusterfs/gfs/brick3/brick
TCP Port             : 49154
RDMA Port            : 0
Online               : Y
Pid                  : 3673
File System          : xfs
Device               : /dev/sdc
Mount Options        : rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
Inode Size           : 512
Disk Space Free      : 28.8GB
Total Disk Space     : 28.9GB
Inode Count          : 15157248
Free Inodes          : 15157222

I tried running sudo gluster volume rebalance gfs fix-layout start, which finished fine but with no additional space showing up.
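For what it's worth, completion can also be double-checked with:

sudo gluster volume rebalance gfs status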

 

I also reran sudo grep -n "shared-brick-count" /var/lib/glusterd/vols/gfs/*, which looks OK:

 

ubuntu@gfs1:~$ sudo grep -n "shared-brick-count" /var/lib/glusterd/vols/gfs/*
grep: /var/lib/glusterd/vols/gfs/bricks: Is a directory
/var/lib/glusterd/vols/gfs/gfs.gfs1.data-glusterfs-gfs-brick1-brick.vol:4:    option shared-brick-count 1
/var/lib/glusterd/vols/gfs/gfs.gfs1.data-glusterfs-gfs-brick2_1-brick.vol:4:    option shared-brick-count 1
/var/lib/glusterd/vols/gfs/gfs.gfs1.data-glusterfs-gfs-brick3-brick.vol:4:    option shared-brick-count 1
/var/lib/glusterd/vols/gfs/gfs.gfs2.data-glusterfs-gfs-brick1-brick.vol:4:    option shared-brick-count 0
/var/lib/glusterd/vols/gfs/gfs.gfs2.data-glusterfs-gfs-brick2_1-brick.vol:4:    option shared-brick-count 0
/var/lib/glusterd/vols/gfs/gfs.gfs2.data-glusterfs-gfs-brick3-brick.vol:4:    option shared-brick-count 0
grep: /var/lib/glusterd/vols/gfs/rebalance: Is a directory

 


OK, so I think I worked out what was happening. When replacing a faulty brick you MUST follow these instructions from the docs: https://docs.gluster.org/en/latest/Administrator Guide/Managing Volumes/#replace-faulty-brick

 

Create a directory on the mount point that doesn't already exist, then delete that directory, and do the same for the metadata changelog using setfattr. This operation marks the pending changelog, which tells the self-heal daemon/mounts to perform self-heal from /home/gfs/r2_1 to /home/gfs/r2_5.

mkdir /mnt/r2/<name-of-nonexistent-dir>
rmdir /mnt/r2/<name-of-nonexistent-dir>
setfattr -n trusted.non-existent-key -v abc /mnt/r2
setfattr -x trusted.non-existent-key  /mnt/r2
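Adapted to my mount point, that works out to something like the following (the directory name is just a placeholder, and these run against the FUSE mount, not the brick path):

sudo mkdir /mnt/gfs/some-nonexistent-dir
sudo rmdir /mnt/gfs/some-nonexistent-dir
sudo setfattr -n trusted.non-existent-key -v abc /mnt/gfs
sudo setfattr -x trusted.non-existent-key /mnt/gfs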

 
