
Helping Linus with Microsoft SMB 3.0 multipath/channel

.:MARK:.

Hi,

 

So last night Linus expressed his issues with trying to utilise more links on each Windows machine to achieve higher throughput on SMB file transfers.

 

Now I thought I would take it upon myself to research and find a solution for him, although I'm not quite sure what Linus has done already to find a solution. So it would be helpful if @LinusTech could respond here with what he's tried.

 

So, on with my preliminary findings. I'll list a bunch of links and resources below which go into more depth.

 

SMB 3.0 multipath (Microsoft calls it SMB Multichannel) on Windows 8.1/Server 2012 R2 is quite intelligent in that it basically requires no config to get it running. In most cases you install the multiple NICs and hook them up, and when you initiate a transfer, a handshake is done to determine the best NICs to use on both hosts and multiple TCP sessions are created for 2x, 3x, 4x... throughput.

 

AFAIK no LAG or LACP is needed for this to work, so all Ethernet interfaces can have different local IPs. In fact, you can go and unplug Ethernet cables all over the place and the transfer will just scale down, then back up again when they are replugged.

 

There may be a required feature the NICs must have: RSS or RDMA. Windows will prefer RDMA NICs over RSS NICs, so check what you have. Preferably turn off all NIC teaming in Windows and in the drivers, and disable LACP and LAG. Make sure the interfaces are all of the same kind, e.g. 2 x 1GbE RDMA/RSS.
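To make that preference order concrete, here is a toy model in Python. It is only a sketch of the idea (RDMA first, then RSS, then plain NICs, and only matching interfaces used together), not Microsoft's actual selection logic. The `Nic` class, `capability_rank` and `pick_nics` are names made up for illustration, and breaking ties by link speed is an assumption on my part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nic:
    """Hypothetical description of one network interface."""
    name: str
    speed_gbps: float
    rdma: bool = False
    rss: bool = False


def capability_rank(nic):
    """Higher is better: RDMA beats RSS, RSS beats a plain NIC."""
    if nic.rdma:
        return 2
    if nic.rss:
        return 1
    return 0


def pick_nics(nics):
    """Return every NIC that shares the best (capability, speed) class.

    Assumption: within one capability class the faster links win, and
    only interfaces of the same kind are used together.
    """
    if not nics:
        return []
    best = max((capability_rank(n), n.speed_gbps) for n in nics)
    return [n for n in nics if (capability_rank(n), n.speed_gbps) == best]
```

With two 1GbE RSS ports and one 1GbE port without RSS, `pick_nics` returns just the two RSS ports, which is the "same kind" pairing recommended above.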

 

I won't go over LACP and LAG or even Teaming until I know more what @LinusTech has done so far.

 

A few good links:

 

http://blogs.technet.com/b/josebda/archive/2012/05/13/the-basics-of-smb-multichannel-a-feature-of-windows-server-2012-and-smb-3-0.aspx

 

http://blog.workinghardinit.work/2013/04/24/smb-3-0-multichannel-auto-configuration-in-action-with-rdma-smb-direct/

 

https://social.technet.microsoft.com/Forums/en-US/68bdf37d-4ebf-4e35-b066-c8ff77bac573/how-does-smb-3-multipath-actualy-work?forum=winserver8gen

 

https://forums.servethehome.com/index.php?threads/smb-3-0-and-bandwidth-aggregation.1046/

 

 

Also @Slick if @LinusTech doesn't see this.

Comb it with a brick


Hello, I remember helping my father implement something similar in a school a while back. Hope this helps.

  • We did ultimately manage to work around this and have been extremely happy with storage spaces and SMB3 since then.

Problem: the summary for us was as follows:

  • Extremely poor sequential write performance with standard drives. Even with 8x SAS 7.2k in a mirror it struggles.
  • Old SATA SSDs that simply were not up to scratch.
  • A small number of SSDs causing a low column count for the matching slow HDD tier. In future I will be more wary of this, especially when speccing for general-purpose file storage (for VM storage this is fine).
  • The new 10Gb NICs we used had an issue with certain offload features. After extensive testing we found the correct combination of TCP/UDP offload settings.

For us VMQ was not an issue as this was traffic between hyper-v hosts and the SOFS.  Not vm traffic.

Hope this helps

 


 

  • Solution: we designed and implemented the following:

    Hardware:

    • 2x SOFS nodes with 1x 4-port 10Gb Chelsio RDMA adapter each (2 ports for SMB 3.0 and 1 for cluster comms)
    • 1x JBOD with 4x 400GB enterprise SSDs and 8x 4TB enterprise 7.2k SAS drives

    Storage Spaces:

    • 1x tiered space with 2 columns: 1.5TB of mirrored SSD space and 1.5TB of mirrored HDD space.
      This has insane IOPS and is great for VMs but is poor on sequential throughput when not using the SSD tier (50MB/s or worse). We currently run over 30 VMs on this single virtual drive over SMB.
    • 1x non-tiered space with 4 columns using just the 7.2k drives. This gives us approx. 10TB of bulk storage for large VHDXs such as file servers and other large applications (WSUS etc.). We get 100MB/s of throughput on this due to the extra columns. Note this space has a 10GB write-back cache on the SSDs to absorb small peaks in writes.

    This solution has allowed us to support a school of 1200 pupils and 150 staff with no problems and great performance.  We have capacity to double all the drives and double capacity and performance.  

    My tips if I was starting again:

    • Use faster spinning disks (10k/15k); the performance difference is large.
    • Add more SSDs so we can have 4 columns across the board (8x 200GB instead of 4x 400GB). This would allow us to keep to our original plan of having a single tiered space for all VHDXs.
    • Pester MS to allow us to tier a storage space with different column counts per tier (2 for SSD and 4-8 for HDD, for example).
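The column arithmetic behind those tips can be sketched in a few lines of Python. This assumes a two-way mirror, where each column needs one disk per data copy, and it assumes that both tiers of a tiered space must share one column count, which is the limitation described in the post. The function names are made up for illustration.

```python
def max_columns(disks, data_copies=2):
    """Largest column count a mirror can use: one disk per copy per column."""
    return disks // data_copies


def tiered_columns(ssd_disks, hdd_disks, data_copies=2):
    """A tiered space is limited by whichever tier has fewer disks."""
    return min(max_columns(ssd_disks, data_copies),
               max_columns(hdd_disks, data_copies))


# The setup from the post: 4 SSDs and 8 HDDs in a two-way mirror.
print(tiered_columns(4, 8))   # tiered space, held back by the 4 SSDs
print(max_columns(8))         # HDD-only space
print(tiered_columns(8, 8))   # with 8 smaller SSDs instead of 4
```

Under these assumptions the 4 SSDs cap the tiered space at 2 columns, while the HDD-only space (or a tier with 8 SSDs) can reach 4.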

    Hope this helps/makes sense!


http://dottech.org/26628/how-to-force-windows-to-use-100-of-your-network-bandwidth-how-to-guide/

 

If Windows isn't using the full bandwidth then you can use the guide above to force it to.

 

Update: below is a link to Microsoft's site telling you how to utilize multiple network cards in Windows.

 

http://windows.microsoft.com/en-us/windows/configuring-multiple-network-gateways#1TC=windows-7

 

 

I'd try the bottom link first and see if it helps. If not, it could be a variety of issues, from how your network is configured to whether the server is functioning properly. Or maybe Windows 8.1 has a bug in the OS that is causing your issue. Either way it will involve extensive testing.

 

Update: 3?  The below is also another thing I would try..

 

http://www.makeuseof.com/tag/7-simple-steps-diagnose-network-problem/


Why don't they just use 10Gb NICs instead of 4x gigabit, which gives less performance and needs 4 cables?

My Sightings on LTT : June 6th 2014 WAN Show After Party: Mario Kart 8 July 31st 2015 WAN Show: Tesla Topic   August 14th 2015 WAN Show: ESL Topic 
My Rig: i7 4770K | Z87 Sabertooth | 32GB Corsair Vengeance | EVGA GTX 780Ti SC ACX | Samsung 840 Pro 128GB | WD 4TB Black | Noctua NH-U14S | Corsair 750D | Corsair RM850  \
Peripherals: Triple VG248QE (1080p 144hz) | Corsair RGB K95 MX Blues | Razer Deathadder Chroma | ATH-M50X | JBL LSR305 | Mod Mic 4.0
Devices:Mac Book Pro Retina|iPad Mini (32GB) | HTC One M9 (160GB) Moto 360 (Black Leather) Nvidia Shield (80GB) Go Pro Hero 3+ Black

Because those NICs have to be plugged into switches...

Comb it with a brick


I have managed to get dual teamed 1 Gigabit NICs, each on its own PCIe card, working in my server with SMB 3.0 Multichannel.

 

Test case: 3 clients with dual NICs each, all of them sending exactly 100MB files generated by concatenating zero bytes (self-made program), plus a second test using an exactly 100B video file.

Network has 1 router and 1 switch capable of putting the server NICs in an EtherChannel.

Server is running a 160GB partition on a spare 2x 120GB Samsung 840 EVOs in RAID 0. OS is Windows Server 2012 R2 with various roles installed including SMB 3.0 and NAP. The share does not have encrypted transfer enabled, for latency reasons.

I have tested both a Windows network share and iSCSI.

 

And now I need sleep.


Lol, as a guy who took networking and security your response was laughable.. But I thank MARK for answering it anyways. (Not trying to sound mean, but that's something I'd read in the forum's topic "Experiences with non-techies".)

If they did that they would only run into more issues, and we all know they are already at their wits' end with their current setup as is. (I would imagine anyway, since you don't have a networking expert on hand.)

Sorry, you didn't really give me a response. I wasn't telling them to do that; I was asking for my own education. No reason to lol at me. I seriously was just asking.

Also, I wouldn't consider this a non-techie question, because 90% of the people on this forum probably don't know. On the network section many probably do, but I'm not very experienced with networking, so I figured I'd ask to educate myself. It's more of an "experience with a non-network guy" question.



I did some research to educate myself. I was under the impression that he was doing quad 1Gbit from the NAS to the server to each PC, lol. I didn't think that one through. I'm an idiot. I get that he would need a 10Gbit server and switch, and every PC would need a 10Gbit NIC. But I'm assuming he isn't looking for a 4Gbit connection to each PC, just 4Gbit in total, so when everyone connects they each get close to a gigabit instead of sharing one gigabit and getting a few hundred Mbit on each client.


What model number NIC card is Linus using?

Can Anybody Link A Virtual Machine while I go download some RAM?

 


Take a look at this SMB Multichannel  Windows Server 2012 and SMB 3.0

 

http://blogs.technet.com/b/josebda/archive/2012/05/13/the-basics-of-smb-multichannel-a-feature-of-windows-server-2012-and-smb-3-0.aspx

 

Question: does Multichannel actually still work with NIC teaming enabled, or is it SMB Direct (which uses RDMA) that breaks if you turn on NIC teaming?

That is funny, because when I enable teaming with my Intel Gigabit CT adapters, SMB Multichannel stops working with that team. I suppose you could create two separate teams and then the teams would work together through SMB Multichannel, but other than that, it appears to break it.

If this is your issue then you need to find or re-download a copy of Windows (make sure Windows isn't corrupted) and re-install everything.

 Mark, was this link the solution to your problem? http://windows.microsoft.com/en-us/windows/configuring-multiple-network-gateways#1TC=windows-7

 

Did it help at all? 

 

Heh, it's not my problem, it's Linus' problem. But he asked the community, so I thought I'd make a thread about it.

 

And to answer some other questions, there are many people who like computer hardware and gaming on this forum, but relatively few who are knowledgeable about networking and programming.

Comb it with a brick


I want to know the rest of the setup. 

 

What model NIC is he using? (client and server)

Where/what client is connecting?

How many clients?

What service(s) needs the SMB?

Is this a dedicated server for another server?

Can Anybody Link A Virtual Machine while I go download some RAM?

 


 


 

 

As far as I can tell teaming will increase fault tolerance but won't do much for throughput, because multichannel will create a separate TCP session on each interface and intelligently handle them all. So I think Linus should not try teaming at all, or LAG or LACP, and just try hooking up the NICs on fresh installs and letting Windows do the rest.

 

That link you showed is the first one I put in OP xD

Comb it with a brick



 

I'd like to know his NICs and the relevant network devices between the machines. So @LinusTech, perhaps you should show up here.

Comb it with a brick



Yes, but this is simply a test to get Multichannel working because, to be frank, he said flat out he has not gotten it to work at all. So I recommend doing it to see if Multichannel even works at all. I mean, at this point you have nothing to lose. Think of it as a diagnostic test to rule out an OS issue or a network issue.

Now that I think about it, weren't your Ethernet cables custom made? http://www.incentre.net/tech-support/other-support/ethernet-cable-color-coding-diagram/

Depending on which type you used, either T568A or T568B, check all ends to make sure the color code is in the correct order and that no RJ-45 connector pinout was wired incorrectly. (This is quite a common problem.) Also check to make sure the connector wasn't put on backwards.


There could be one of a few things going on here. 

 

1. The teaming protocol being used. Switch independent (the default on Windows Server 2012) is mostly for redundancy. Switch independent is fast out, slow in. This is due to the switch trying to prevent a switching loop/storm caused by sending data from 1 session to multiple places.

2. Spanning tree. Even if it's not a smart switch, it's not a hub; it is going to do some mid-level network tech and protocols. Setting up a team on a switch (depending on brand) will disable STP on those ports.

3. The switch has Green mode or some other such bullshit. Power saving modes tend to cause problems with all advanced network protocols.

4. Security crap. Some switches now include SPI, which will destroy any and all teaming and multichannel services.

5. Switch backplane splitting. This is very much a problem with low-end switches: a 12-port switch only has the hardware to run 8 ports at max speed, and/or it's just three 4-port switches glued together. The only fix is to get a new switch.
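Point 5 is easy to check against a switch's datasheet. Here is a rough Python sketch of the arithmetic, assuming the usual convention that a non-blocking switch needs a fabric capacity of ports x port speed x 2 (full duplex). The function names and the 16 Gbps figure are hypothetical, chosen to match the "12 ports but only enough hardware for 8" example.

```python
def nonblocking_gbps(ports, port_gbps):
    """Fabric capacity needed for every port to send and receive at line rate."""
    return ports * port_gbps * 2


def oversubscription(ports, port_gbps, fabric_gbps):
    """Ratio above 1.0 means the ports can ask for more than the fabric can carry."""
    return nonblocking_gbps(ports, port_gbps) / fabric_gbps


# Hypothetical 12-port gigabit switch whose fabric only covers 8 ports.
fabric = nonblocking_gbps(8, 1)          # capacity of an 8-port fabric
print(oversubscription(12, 1, fabric))   # 12 ports sharing that fabric
```

If the ratio comes out at 1.0 the switch is non-blocking on paper; anything higher means the ports will compete once enough of them are busy.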

 

I believe Linus is using a Netgear XS. These support LACP; it's an L2+ switch, so it has some manageability. Note that client versions of Windows can't team natively.

Sigh. As someone who does this for a living (SAN storage for virtualization and the media industry): using a cheap switch for teaming, or to get anywhere near the max of the standard, is not going to work. Buying a cheap switch and attaching high-end, high-speed gear to it is like buying a no-name power supply. Sorry, rant off.



 

Multichannel does work. I verified it.


I am using X540-T2 NICs on both ends.

 

 

I am creating basic file shares (not using Storage Spaces. Just normal network shares) then I'm doing a simple network file transfer using Explorer.

 

I have tried (fresh installs in all cases) Win8.1 to Win8.1, Server 2012 R2 to Win8.1, and Server 2012 R2 to Server 2012 R2.

 

I have tried (on the S2012R2 to S2012R2 setup) no link aggregation, LACP through Dashboard, and LACP through the Intel control panel. In all cases, the ports were configured to correspond to the LAG setup I was using on the Netgear XS712T.

 

I've read (almost) all of the blog posts linked in here. I've been working on this for a while. I know how it's SUPPOSED to work.. It just isn't.

 

Glorfendel has pointed out that the switch might be doing something funky in between though... Very interesting. Didn't think of this since I thought it was supposed to not matter at all. So I'll try a Gigabit dumb switch (I don't have another 10Gbit switch to try) and see if I can at least get 2Gbit transfers and two SMB channels to handshake. If that works, then I need to play around with the switch.
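For the dumb-switch test, a quick way to read the result is to turn the copy size and time into Gbit/s and ask how many links that implies at minimum. The Python below is a rough sketch with made-up function names. It ignores protocol overhead, so it only gives a lower bound on the number of links in use.

```python
import math


def measured_gbps(bytes_moved, seconds):
    """Convert a timed copy into gigabits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_moved * 8 / seconds / 1e9


def min_links_in_use(bytes_moved, seconds, link_gbps=1.0):
    """Fewest links of the given speed that could have carried this copy."""
    return max(1, math.ceil(measured_gbps(bytes_moved, seconds) / link_gbps))


# Example: 2.25 GB copied in 10 seconds over 1GbE ports.
print(measured_gbps(2_250_000_000, 10))
print(min_links_in_use(2_250_000_000, 10))
```

Anything that needs more than one link's worth of bandwidth means a second channel is carrying traffic. On the Windows side, the `Get-SmbMultichannelConnection` PowerShell cmdlet lists the connections actually in use.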


If the servers are close enough, just try connecting them directly also.

 

Those NICs don't support RDMA, and no mention of RSS yet.

Can Anybody Link A Virtual Machine while I go download some RAM?

 


In theory Multichannel should work without RDMA.

Also, RSS is enabled by default on them :(



 

Is the SMB server dedicated to one client?

Is the server in a cluster?

Are the server & client machines on a domain?

Can Anybody Link A Virtual Machine while I go download some RAM?

 



In my testing the server is only being used for one client. It is not in a cluster, and no they are not domain members.


If the SMB server will remain dedicated to one client...

I have another way to spread the traffic over the multiple NICs using block-level storage.

That is, if you cannot get SMB Multichannel to work.

Can Anybody Link A Virtual Machine while I go download some RAM?

 


How are you connecting from the client to the server?

 

Are you typing an IP address or hostname to connect to the share?

If you're using the hostname, have you set up multiple DNS entries for the server?
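One way to answer the hostname question from the client is to list every address the server name resolves to. Below is a small Python sketch using the standard library's `socket.getaddrinfo`. The function name `resolve_all` is made up, and port 445 (SMB over TCP) is only passed so the lookup is for a TCP service.

```python
import socket


def resolve_all(hostname, port=445):
    """Return the sorted, de-duplicated addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_all("fileserver")` (a placeholder name) on the client should show one address per server NIC if multiple DNS entries exist. A single address means the client only learns about one interface from the name lookup.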

Can Anybody Link A Virtual Machine while I go download some RAM?

 

