
FreeNAS 10G RAIDZ3 speeds not what I expected

I'm here with another FreeNAS quest to reach that 1 GB/s (10-gigabit) holy-grail transfer speed.

 

I have an unusual issue that has come up.  I've expanded (well, destroyed and recreated) my FreeNAS drive setup from 6 disks to 10 disks, and I've changed it from RAIDZ2 to RAIDZ3 (actually by accident, and I transferred data before I caught it).  Now here's the interesting question: why am I getting better writes than reads?  And why am I not getting faster results with 10 disks?  My reads top out at about 230-300 MB/s real world (copying a 30 GB video file from the NAS to a 5-SSD RAID 5 array on my local PC) over 10 gigabit.  I've run iperf on the connection and I get 9.12 Gbits/sec.  I've run CrystalDiskMark and I get 637 MB/s sequential reads and 571 MB/s sequential writes.  But when it comes to real-world tests I'm getting terrible reads (230 MB/s to 300 MB/s) while my writes hold steady just above 400 MB/s.  This just doesn't make any sense to me.

 

I'm using an Intel X540-T1 in my PC and an Intel X540-T2 in my FreeNAS box. They are both direct-connected at the moment; I had to RMA my 10G switch.  The same issue was happening with the 10G switch as well.  Currently the IPs are 192.168.2.1 and 192.168.2.2.
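For reference, the iperf number above came from a plain TCP test between the two boxes, roughly like this (assuming iperf3 on both ends; the exact flags aren't critical):

iperf3 -s   (run on the FreeNAS side)

iperf3 -c <FreeNAS IP> -t 30 -P 4   (run on the Windows side, with a few parallel streams)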

 

On the PC side I have maxed out the send and receive buffers, disabled interrupt moderation, turned on jumbo frames, and set 16 RSS queues (in my testing there is no change between 8 and 16).

I'm running a six-core 3930K CPU, and the Intel card and RAID card are both in x16 slots running at full speed (both cards run at x8 just due to their individual architecture).

 

On the FreeNAS side I've only assigned it an IP and set the MTU to 9014 to match the PC NIC.  Under Tunables I followed the 45 Drives guide:

kern.ipc.maxsockbuf 1677216 sysctl
net.inet.ip.intr_queue_maxlen 2048 sysctl
net.inet.tcp.recvbuf_inc 524288 sysctl
net.inet.tcp.recvbuf_max 16777216 sysctl
net.inet.tcp.recvspace 4194304 sysctl
net.inet.tcp.sendbuf_inc 32768 sysctl
net.inet.tcp.sendbuf_max 16777216 sysctl
net.inet.tcp.sendspace 2097152 sysctl
net.route.netisr_maxqlen 2048 sysctl
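For anyone checking their own box, the applied values can be read back from the FreeNAS shell with a single sysctl call, e.g.:

sysctl kern.ipc.maxsockbuf net.inet.tcp.recvbuf_max net.inet.tcp.sendbuf_max net.inet.tcp.recvspace net.inet.tcp.sendspace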
 
I've also bound SMB to the 10gig network interface
 
I just don't get why these read speeds are so terrible. (By the way, I'm only accessing it with one client, running Windows 10 Pro.)
 
 
 

5 minutes ago, Strikermed said:

I've also bound SMB to the 10gig network interface

Your above testing is all using SMB and not iSCSI or NFS?

 

6 minutes ago, Strikermed said:

Why am I getting better writes than reads? 

Write caching, and async vs. sync I/O. Operating systems cache writes (that includes Windows), and ZFS has some caching of its own going on too, though that can depend on configuration etc.

 

Quote

To cut a long story short, an operating system often buffers write operations in the main memory, if the files are opened in asynchronous mode. This is not to be confused with ZFS’ actual write cache, ZIL.

https://linuxhint.com/configuring-zfs-cache/

 

Before diving into potential Samba performance tuning, create a RAM disk, create a new pool from that, and do some testing to see what you can get.


3 hours ago, Strikermed said:

But when it comes to real-world tests I'm getting terrible reads (230 MB/s to 300 MB/s) while my writes hold steady just above 400 MB/s.  This just doesn't make any sense to me.

Can the local PC you're copying stuff to from the NAS actually handle writing to its own storage at such high speeds? I mean, even if the NAS could supply you with the full 1 GB/s, you're still limited by your local PC's storage performance! It can read from its own local storage faster than it can write to it, so that's why you're getting faster transfers when reading from the local PC and writing to the NAS than the other way around!


3 hours ago, leadeater said:

Write caching, and async vs. sync I/O. Operating systems cache writes (that includes Windows), and ZFS has some caching of its own going on too, though that can depend on configuration etc.

No, his local PC's storage just can't handle those write-speeds.


1 minute ago, WereCatf said:

No, his local PC's storage just can't handle those write-speeds.

Extremely unlikely, considering the client side is a 5 SSD array. The RAIDZ3 having slow writes would make sense, but it's got better writes than reads. I can assure you, though, that Windows does write caching: when you copy a large file and Explorer finishes, watch the Disk tab in Resource Monitor and it'll still be writing and/or reading, depending on where you copied the data to.


Just now, leadeater said:

Extremely unlikely considering the client side is a 5 SSD array.

Shit, I missed that completely. I'm still groggy because I just woke up. Still, I'd test the transfer speeds when writing to a RAM disk in any case.


Here are some questions from me:

  1. Do you have sync enabled on the ZFS pool?
  2. What type of disks are you using in the pool, e.g. SSD or HDD/SSHD?
  3. Do you utilise a dedicated ZFS SLOG (ZIL) drive?
  4. Do you have SMB3 configured for multi-channel?

Could you also kindly post the configuration you're using for the ZFS pool: which disks and what Z3 layout you have configured?
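Something like the following from the FreeNAS shell should capture most of that (just a sketch; no arguments needed, so it covers every pool and dataset):

zpool status -v
zpool list -v
zfs get sync,compression,recordsize,atime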


Try removing your video card and anything else that is on your Northbridge chip beyond the raid card.  You are splitting resources and bottlenecking in your PC.

That's my guess.  But it's just a guess.


I would test your pool locally within FreeNAS. Try the below; be sure to create a new dataset and disable compression on that dataset. Also, RAIDZ3 with 10 disks isn't an "optimal" configuration; I believe I read a guide somewhere saying you want multiples of 3. Still, I would expect better read speeds than that. Is this one large vdev, or possibly 2x five-disk vdevs in a pool?

 

https://www.ixsystems.com/community/threads/notes-on-performance-benchmarks-and-cache.981/

 

Navigate to your dataset and run this to test your array's write speed (it writes all zeros, so again, be sure compression is disabled):

dd if=/dev/zero of=tmp.dat bs=2048k count=50k

 

Afterwards, this will read the file back into /dev/null:

dd if=tmp.dat of=/dev/null bs=2048k count=50k
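If you'd rather just toggle compression on an existing test dataset instead of creating a new one, it's a single property (sketch only; pool/testdataset below is a stand-in for your actual path):

zfs set compression=off pool/testdataset
zfs get compression pool/testdataset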


What's the ashift value for the drives?
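If you're not sure how to check, something like this usually shows it for every vdev (on FreeNAS you may need to point zdb at the pool cache file with -U if it complains):

zdb | grep ashift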


On 3/31/2019 at 12:39 AM, leadeater said:

Your above testing is all using SMB and not iSCSI or NFS?

That is correct, it's all over an SMB share.  Do you know of a way I could try an NFS share with Windows?

 

On 3/31/2019 at 12:39 AM, leadeater said:

Before diving into potential Samba performance tuning, create a RAM disk, create a new pool from that, and do some testing to see what you can get.

Should I create that on the PC side?  Or can I create a RAM Disk on Freenas?  If so, how do I do that?


On 3/31/2019 at 4:13 AM, WereCatf said:

Can the local PC you're copying stuff to from the NAS actually handle writing to its own storage at such high speeds? I mean, even if the NAS could supply you with the full 1 GB/s, you're still limited by your local PC's storage performance! It can read from its own local storage faster than it can write to it, so that's why you're getting faster transfers when reading from the local PC and writing to the NAS than the other way around!

I'm reading and writing to a RAID 5 5 SSD array.  Testing with a RAM disk I can reach 1500MB/s Reads and 900MB/s writes


28 minutes ago, Strikermed said:

Should I create that on the PC side?  Or can I create a RAM Disk on Freenas?  If so, how do I do that?

Can't remember exactly how, but it's pretty easy on the Linux/BSD side of things; for Windows I use StarWind RAM Disk.
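From memory it's something along these lines on the FreeBSD side (just a sketch, size and pool name made up; destroy it when you're done):

mdconfig -a -t swap -s 16g   (creates a memory-backed disk and prints its name, e.g. md0)
zpool create ramtest md0   (throwaway pool on the RAM disk)
zpool destroy ramtest   (cleanup afterwards)
mdconfig -d -u 0   (detach the memory disk)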


On 3/31/2019 at 4:06 PM, Falconevo said:

Here would be some questions from me;

  1. Do you have sync enabled on the ZFS pool?
  2. What type of disks are you using in the pool, e.g SSD or HDD/SHDD?
  3. Do you utilise a ZFS SLog or ZIL drive?
  4. Do you have SMB3 configuration for multi-channel?

Could you also kindly output the configuration you are using for the zfs pool, what disks and Z3 layout you have configured.

1.  Trying to determine this.  I didn't enable it, if it's something you need to enable.

 

2.  I'm using all HDD 7200 RPM.  

 

3.  I am not using ZFS SLog or ZIL drive. 

 

4.  I don't believe so.  I never made any configurations for multi-channel, and from what I've read it only applies to multiple aggregate connections

 

I'm trying to figure out where the debug file outputs to, but I'm having trouble locating it.  I'm able to get into the shell and run freenas-debug -c and various other commands, but I just don't know where to find the actual reports.
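For now I'm just hunting for them with a blanket search from the shell, something like:

find /var /tmp -name '*debug*' -mmin -60 2>/dev/null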

 

 


10 hours ago, Strikermed said:

1.  Trying to determine this.  I didn't enable it, if it's something you need to enable.

 

2.  I'm using all HDD 7200 RPM.  

 

3.  I am not using ZFS SLog or ZIL drive. 

 

4.  I don't believe so.  I never made any configurations for multi-channel, and from what I've read it only applies to multiple aggregate connections

 

I'm trying to figure out where the debug file outputs to, but I'm having trouble locating it.  I'm able to get into the shell and run freenas-debug -c and various other commands, but I just don't know where to find the actual reports.

 

 

Datasets are set to "either or" for synchronous writes by default (sync=standard), meaning if the protocol wants to use sync, it can. The other option is to explicitly disable it, which runs the risk of data loss during a power outage if you don't have a UPS + auto-shutdown in place.
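If you want to check or flip it for a test, it's just a dataset property (sketch; pool/dataset is a stand-in for your actual path, and remember to set it back to standard afterwards):

zfs get sync pool/dataset
zfs set sync=disabled pool/dataset
zfs set sync=standard pool/dataset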

 

Have you tested the dataset's read/write speed locally yet?


On 4/2/2019 at 6:44 AM, shoutingsteve said:

Try removing your video card and anything else that is on your Northbridge chip beyond the raid card.  You are splitting resources and bottlenecking in your PC.

That's my guess.  But it's just a guess.

This same configuration has allowed over 600 MB/s with fewer drives.  The only changes have been some network changes and an increase in drive count.


OK, so I've done a fresh install of FreeNAS due to some lagging issues.  I also figured a clean install and fresh settings would remove any unknowns.

 

This made no change on the speed.

 

I then changed the atime setting for the whole pool, and this improved my speeds to 350-400 MB/s on a large single-file transfer.  Surprisingly, writes have jumped to the rates I'm getting in iperf tests: 750-820 MB/s.
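(For anyone wondering, atime is just a pool/dataset property; from the shell it's toggled with something like zfs set atime=off RAIDz2, RAIDz2 being my pool name.)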

 

I created a Test dataset and did some large file copies of about 30-60 GB.  I disabled sync, and I have compression disabled.  I got 420-500 MB/s transfers on average.

 

I'm following the tunables from 45 Drives: http://45drives.blogspot.com/2016/05/how-to-tune-nas-for-direct-from-server.html ... As of right now, these tunables have made absolutely no change.

 

 

 


On 4/2/2019 at 7:09 AM, Mikensan said:

I would test your pool locally within FreeNAS. Try the below; be sure to create a new dataset and disable compression on that dataset. Also, RAIDZ3 with 10 disks isn't an "optimal" configuration; I believe I read a guide somewhere saying you want multiples of 3. Still, I would expect better read speeds than that. Is this one large vdev, or possibly 2x five-disk vdevs in a pool?

 

https://www.ixsystems.com/community/threads/notes-on-performance-benchmarks-and-cache.981/

 

Navigate to your dataset and run this to test your array's write speed (it writes all zeros, so again, be sure compression is disabled):

dd if=/dev/zero of=tmp.dat bs=2048k count=50k

 

Afterwards, this will read the file back into /dev/null:

dd if=tmp.dat of=/dev/null bs=2048k count=50k

I don't have time to try this tonight... Mostly because I'm not familiar with these commands, so I'll need to do my homework before I proceed.


On 4/6/2019 at 1:43 AM, Strikermed said:

I don't have time to try this tonight... Mostly because I'm not familiar with these commands, so I'll need to do my homework before I proceed.

It's not too bad and won't take more than 30 minutes total. You navigate to the dataset you want to test and copy/paste. /dev/zero just means "write zeros", and /dev/null is a void. The only cleanup at the end is to delete tmp.dat, as it'll eat up ~100 GB. If your test results show up in the gigabytes per second, then compression is likely on.

It is odd that your write speeds are that much faster than your read speeds. How much RAM does your ZFS box have?


On 4/8/2019 at 9:09 AM, Mikensan said:

It's not too bad and won't take more than 30 minutes total. You navigate to the dataset you want to test and copy/paste. /dev/zero just means "write zeros", and /dev/null is a void. The only cleanup at the end is to delete tmp.dat, as it'll eat up ~100 GB. If your test results show up in the gigabytes per second, then compression is likely on.

It is odd that your write speeds are that much faster than your read speeds. How much RAM does your ZFS box have?

It has 64 GB of RAM.  I'll try this test and report back


On 4/8/2019 at 9:09 AM, Mikensan said:

It's not too bad and won't take more than 30 minutes total. You navigate to the dataset you want to test and copy/paste. /dev/zero just means "write zeros", and /dev/null is a void. The only cleanup at the end is to delete tmp.dat, as it'll eat up ~100 GB. If your test results show up in the gigabytes per second, then compression is likely on.

It is odd that your write speeds are that much faster than your read speeds. How much RAM does your ZFS box have?

I attempted to run that command, but I got permission denied. 

 

Since I'm new at this, I want to be clear.  

 

In the GUI shell I navigated to my Test dataset (ignore the RAIDz2 for the pool, it was a mistake I caught too late)

 

cd /mnt/RAIDz2/Test

 

I then ran the command from above

 

/dev/zero

 

I got this message:

"bash: /dev/zero: Permission denied"

 

Any thoughts?


7 hours ago, Strikermed said:

I attempted to run that command, but I got permission denied. 

 

Since I'm new at this, I want to be clear.  

 

In the GUI shell I navigated to my Test dataset (ignore the RAIDz2 for the pool, it was a mistake I caught too late)

 

cd /mnt/RAIDz2/Test

 

I then ran the command from above

 

/dev/zero

 

I got this message:

"bash: /dev/zero: Permission denied"

 

Any thoughts?

I've never worked out of the GUI shell - if possible I would work from the console and hit... 9, I think, for Shell, which will log you in as root. Or SSH into the box using an account that's part of the root/wheel group.
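e.g. from the Windows box it's just something like the below, assuming the SSH service is enabled in FreeNAS and 192.168.2.2 is the FreeNAS side of your link:

ssh root@192.168.2.2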

 

the full command you ran was dd if=/dev/zero of=tmp.dat bs=2048k count=50k right?


Also, with 64 GB of RAM you'll need to work with files larger than 64 GB to get the real write speeds of the disks, because of how the ARC works. This is why the dd test I'm recommending uses a 100 GB file.


  • 1 month later...
On 4/2/2019 at 7:09 AM, Mikensan said:

I would test your pool locally within FreeNAS. Try the below; be sure to create a new dataset and disable compression on that dataset. Also, RAIDZ3 with 10 disks isn't an "optimal" configuration; I believe I read a guide somewhere saying you want multiples of 3. Still, I would expect better read speeds than that. Is this one large vdev, or possibly 2x five-disk vdevs in a pool?

 

https://www.ixsystems.com/community/threads/notes-on-performance-benchmarks-and-cache.981/

 

Navigate to your dataset and run this to test your array's write speed (it writes all zeros, so again, be sure compression is disabled):

dd if=/dev/zero of=tmp.dat bs=2048k count=50k

 

Afterwards, this will read the file back into /dev/null:

dd if=tmp.dat of=/dev/null bs=2048k count=50k

I finally got around to running this...

 

My results were:

51200+0 records in

51200+0 records out

107374182400 bytes transferred in 70.260040 secs (1528239696 bytes/sec)

 

Any other suggestions?  If this is accurate, this is saying that I'm getting 1500MB/s
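(Doing the math: 107,374,182,400 bytes / 70.26 seconds ≈ 1,528,000,000 bytes/sec, so roughly 1.5 GB/s, or about 1.42 GiB/s.)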


On 5/23/2019 at 11:55 PM, Strikermed said:

I finally got around to running this...

 

My results were:

51200+0 records in

51200+0 records out

107374182400 bytes transferred in 70.260040 secs (1528239696 bytes/sec)

 

Any other suggestions?  If this is accurate, this is saying that I'm getting 1500MB/s

This means you have compression turned on (lz4 is the default) - you'll have to either create a temporary dataset with it turned off, or temporarily turn off compression on the dataset you're testing on.

 

*Edit - if this is your SSD array then 1500 is correct, I get about 1300 across 3 SSDs.

