I want more NVMe RAID0 performance

hash02

I have 3x 3.8 TB NVMe PCIe 3.0 enterprise SSDs in RAID0 (mdadm on Linux), and I'm getting read speeds the same as or below those of the individual underlying drives.
These drives are rated at 2.4-2.5 GB/s maximum sequential read, so in a RAID0 setup I should get more (ideally closer to 3 × 2.4 ≈ 7 GB/s), but sometimes I get less.

Setup:
EPYC 7502P
3x 3.8 TB Samsung NVMe PCIe 3.0 SSDs

128 GB DDR4

Debian 10, stock default kernel

 

On the software side: mdadm RAID0 across all three devices with XFS on top. No special flags passed at XFS creation, defaults only.
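For reference, a rough reconstruction of how the array and filesystem would have been created with defaults (not the exact commands used; device names match the nvme list below, the chunk size matches the mdadm output, and the mount point is just an example):

mdadm --create /dev/md0 --level=0 --raid-devices=3 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1
mkfs.xfs /dev/md0         # defaults; mkfs.xfs picks up sunit/swidth from the md device
mount /dev/md0 /mnt/raid  # example mount point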

Tested using hdparm and dd. 
nvme list:
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev  
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     S438N*********       SAMSUNG MZQLB3T8HALS-00007               1           1.92  TB /   3.84  TB    512   B +  0 B   EDA5502Q
/dev/nvme1n1     S438N*********       SAMSUNG MZQLB3T8HALS-00007               1           1.92  TB /   3.84  TB    512   B +  0 B   EDA5502Q
/dev/nvme2n1     S438N*********       SAMSUNG MZQLB3T8HALS-00007               1           1.92  TB /   3.84  TB    512   B +  0 B   EDA5502Q

xfs_info output:

meta-data=/dev/md0               isize=512    agcount=32, agsize=87904768 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=2812952576, imaxpct=5
         =                       sunit=128    swidth=384 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

mdadm --detail output:
 

/dev/md0:
           Version : 1.2
     Creation Time : Fri Jun 18 18:11:41 2021
        Raid Level : raid0
        Array Size : 11251817472 (10730.57 GiB 11521.86 GB)
      Raid Devices : 3
     Total Devices : 3
       Persistence : Superblock is persistent

       Update Time : Fri Jun 18 18:11:41 2021
             State : clean 
    Active Devices : 3
   Working Devices : 3
    Failed Devices : 0
     Spare Devices : 0

        Chunk Size : 512K

Consistency Policy : none

              Name : epyc-server-01:0  (local to host epyc-server-01)
              UUID : 9c1a9795:81025516:ab1b34a2:4d27d794
            Events : 0

    Number   Major   Minor   RaidDevice State
       0     259        1        0      active sync   /dev/nvme1n1
       1     259        2        1      active sync   /dev/nvme2n1
       2     259        0        2      active sync   /dev/nvme0n1

Speed: 

hdparm -t /dev/md0: average 2000 MB/s

hdparm -t /dev/nvme0n1 (or any of the other drives): average 2300 MB/s

What's wrong here?

2 hours ago, hash02 said:

Tested using hdparm and dd. 

Those aren't great benchmarks; try using fio here.
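For example, something along these lines gives a much more realistic picture of sequential read throughput than hdparm (the device path, queue depth and job count here are only illustrative):

fio --name=seqread --filename=/dev/md0 --rw=read --bs=1M \
    --ioengine=libaio --direct=1 --iodepth=32 --numjobs=4 \
    --runtime=60 --time_based --group_reporting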

 

What board are you using, and how are the drives connected to it?
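You can also check what link each drive actually negotiated, e.g. (the PCI address below is just a placeholder; take the real ones from the first command):

lspci | grep -i 'non-volatile'                        # list the NVMe controllers and their PCI addresses
lspci -vv -s 0000:41:00.0 | grep -E 'LnkCap|LnkSta'   # placeholder address; compare max vs. negotiated link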

  • 1 year later...

I've just been given the task of analyzing a serious performance issue on an mdadm RAID10 array.

 

Very similar setup. Probably hitting the same problem.

 

Motherboard: Supermicro H12DSU-iN

2 x AMD EPYC 7352 24-Core Processor (CPU1 and CPU2)

 

Storage (nvme list):
 

Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     BTLJ113301G41P0FGN   INTEL SSDPE2KX010T8                      1           1.00  TB /   1.00  TB    512   B +  0 B   VDV10170
/dev/nvme1n1     BTLJ1133031W1P0FGN   INTEL SSDPE2KX010T8                      1           1.00  TB /   1.00  TB    512   B +  0 B   VDV10170
/dev/nvme2n1     S438NC0R600133       SAMSUNG MZQLB3T8HALS-00007               1           3.09  TB /   3.84  TB    512   B +  0 B   EDA5402Q
/dev/nvme3n1     S438NC0R600129       SAMSUNG MZQLB3T8HALS-00007               1           3.09  TB /   3.84  TB    512   B +  0 B   EDA5402Q
/dev/nvme4n1     S438NC0R600128       SAMSUNG MZQLB3T8HALS-00007               1           3.09  TB /   3.84  TB    512   B +  0 B   EDA5402Q
/dev/nvme5n1     S438NC0R600131       SAMSUNG MZQLB3T8HALS-00007               1           3.09  TB /   3.84  TB    512   B +  0 B   EDA5402Q
/dev/nvme6n1     S438NC0R600124       SAMSUNG MZQLB3T8HALS-00007               1           3.09  TB /   3.84  TB    512   B +  0 B   EDA5402Q
/dev/nvme7n1     S438NC0R600126       SAMSUNG MZQLB3T8HALS-00007               1           3.09  TB /   3.84  TB    512   B +  0 B   EDA5402Q
 

Software RAID setup (/proc/mdstat):

md0 : active raid10 nvme4n1[2] nvme7n1[5] nvme5n1[3] nvme6n1[4] nvme3n1[1] nvme2n1[0]
      11251817472 blocks super 1.2 512K chunks 2 near-copies [6/6] [UUUUUU]
      bitmap: 24/84 pages [96KB], 65536KB chunk

md2 : active raid1 nvme1n1p3[1] nvme0n1p3[0] -> Root
      975578112 blocks super 1.2 [2/2] [UU]
      bitmap: 4/8 pages [16KB], 65536KB chunk

md1 : active raid1 nvme1n1p2[1] nvme0n1p2[0]
      1046528 blocks super 1.2 [2/2] [UU]

 

Right now all of the NVMe drives are attached to CPU1.

Read latency on /dev/md0 is very bad.
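To confirm where each controller actually sits, something like this should print the NUMA node per NVMe controller (sysfs paths assumed; they can differ slightly between kernel versions):

for c in /sys/class/nvme/nvme*; do
    pci=$(readlink -f "$c/device")                    # PCI device behind this controller
    echo "$c -> NUMA node $(cat "$pci/numa_node")"
done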

 

I was planning to spread the NVMe drives across both CPUs, but I need to know which devices mdadm is mirroring and which it is striping across.
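My understanding of the near=2 layout is that the two copies of each chunk go to adjacent RaidDevice slots, so the mirror pairs should be slots (0,1), (2,3) and (4,5) in the device table shown by:

mdadm --detail /dev/md0    # RaidDevice column: with near=2, consecutive slots hold the two copies of each chunk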

 

Any clues or suggestions? Thanks!

 

 

 
