
Look, mum, I'm on TV! - High performance Samba configuration

gsuberland

I was watching "We Finally Did It Properly" today, and was surprised to see a screenshot of my blog in the video at 12:00. I'm glad it was useful 🙂

 

I'm not sure whether the LTT folks routinely check the forums, but I thought I'd drop a post here just in case, since there's plenty more opportunity to squeeze performance out of (New New) Whonnock 3.

 

Those 100G ConnectX-6 NICs you have are well-equipped for RoCE (RDMA over Converged Ethernet), which can be utilised with SMB Direct. I'd highly recommend looking into enabling that alongside SMB Multichannel, as it'll vastly reduce your CPU usage on the server when communicating with other RoCE-enabled systems that also have SMB Direct enabled. For server-to-server transfers (e.g. backups) and ingest systems you'd get a fairly significant performance boost by configuring SMB Direct on both sides. You don't need to do anything else with your networking config - it all runs over IP, unlike InfiniBand. The only thing to be careful of is the cryptography settings. I'm not 100% sure whether Mellanox's drivers support SMB cryptographic offload features like Chelsio's cards do, so you might need to ask Mellanox how RoCE / SMB Direct interacts with encryption and message signing (authenticity) in Samba. Alternatively, you could just benchmark with them on and off to see what happens. Network endpoints that don't support RoCE (e.g. editing workstations) will use SMB Multichannel in exactly the same way they do now, so there's no change there either. RoCE performance is sensitive to switching latency, so you'll want to minimise the number of hops between the two endpoints, but it should (insert handwave here) always be faster than a non-RoCE setup anyway.

 

Windows has native support for SMB Direct, just as it does for SMB Multichannel. You can enable it with PowerShell commands (e.g. Enable-NetAdapterRdma on the relevant adapter). The documentation for the feature on Samba is lacking, but SMB Direct should now be available in the latest versions of Samba. You should be able to enable it by adding an extra capability directive to your interface config, e.g. "capability=RSS,capability=RDMA". This sets the FSCTL_NET_IFACE_RDMA_CAPABLE flag internally on the interface, which appears to automatically enable RoCE support, which in turn should enable SMB Direct. I've not had time to fully investigate this, though, so you may need to ask for clarification on the Samba mailing list.
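
To make the Samba side a bit more concrete, here's a rough sketch (in Python, purely for illustration) of how the interface directive with those capability flags might be assembled into an smb.conf [global] section. The interface name enp65s0f0, the helper function names, and the 100G speed value are all my own placeholders, not anything from the LTT setup, and I'm assuming the usual "name;option,option" syntax of the smb.conf interfaces option. Whether the RDMA capability actually gives you working SMB Direct depends on your Samba build, so treat the output as a starting point to check against the smb.conf man page rather than a known-good config.

```python
def samba_interface_entry(name, capabilities=("RSS", "RDMA"), speed_bps=None):
    """Build one quoted entry for the smb.conf `interfaces` option.

    `name` is a hypothetical interface name; `speed_bps` is the link
    speed in bits per second (optional).
    """
    options = []
    if speed_bps is not None:
        options.append(f"speed={speed_bps}")
    # One capability=... item per flag, as in "capability=RSS,capability=RDMA".
    options.extend(f"capability={cap}" for cap in capabilities)
    if not options:
        return name
    joined = ",".join(options)
    return f'"{name};{joined}"'


def render_global_section(entries):
    """Render a minimal [global] fragment with multichannel switched on."""
    lines = [
        "[global]",
        "    server multi channel support = yes",
        "    interfaces = " + " ".join(entries),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # enp65s0f0 and the 100 Gbit/s speed are placeholders for illustration.
    entry = samba_interface_entry("enp65s0f0", speed_bps=100_000_000_000)
    print(render_global_section([entry]))
```

You'd paste the printed fragment into smb.conf, restart Samba, and then benchmark with and without the RDMA capability to see whether it makes any difference on your version.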

 

Another thing you may want to look into is direct DMA transfer between the NIC and NVMe. On supporting system architectures this allows the NIC to talk directly to the NVMe drive over the PCIe bus, without data being copied to system memory (RAM). The CPU sets up the initial communications channel and enables an IOMMU rule to allow cross-device DMA transactions, and then ceases to be involved in the transfer. The data flow path either goes straight through the PEX controller, or is handled by the CPU's IMC without actually going out to RAM. This is pretty bleeding-edge tech, intended for the HPC industry, but I know Chelsio have been working on it since the early 2010s. It requires kernel support, and from what I've read BSD has been the primary target environment for this work, with Red Hat also being of interest. I'm also not sure how it plays with ZFS or other software RAID solutions. Unfortunately, without a relationship with Mellanox or Chelsio, I've been unable to find any information on practical implementations or get any clarification on the status or availability of these features. LMG are far better positioned to find out than I am.

 

Hope this is of use, and results in some blazing fast transfers! 🙂 

 

(P.S. LMG folks, feel free to drop me a DM on Twitter if you have questions about this stuff - my username is the same as here)

Workstation: 2x Xeon 8276L (56 cores), 192GB ECC RAM, 4x NVMe SSD in VROC, 2080Ti, Chelsio T520 2x10G NIC

NAS: TrueNAS, 8x 6TB WD Red Pro in RAIDZ2, 96GB RAM, 1TB NVMe L2ARC, Chelsio T520 2x10G NIC


24 minutes ago, gsuberland said:

--snippet--

I think they check the forums once in a while, yes. Maybe less so bossman Linus, but the LMG team checks in once in a while, I think.

:old-smile:
