
Unable to get full 2Gbps full-duplex speeds

sazrocks
Solved by brwainer

This is more of a proof of concept and a fun experiment than anything practical, but I'm still curious why it isn't working.

 

I have two 1Gbps Intel NICs (on the same card) teamed together using static link aggregation, and I have set up the ports they connect to on my switch as a LAG group. I am running two instances of the iperf3 client and two instances of the iperf3 server on my desktop, in combination with two other devices on the network, each running one iperf3 server instance and one iperf3 client instance. The client arguments I'm using across all of them are

-w 1024KB -p PORT -t 60
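
For reference, the full four-instance layout looks roughly like this (the IP addresses and port numbers are placeholders, not my actual ones):

# On my desktop (teamed NICs, assumed 192.168.1.50): one server per remote device
iperf3 -s -p 5201 &
iperf3 -s -p 5202 &

# On device A (assumed 192.168.1.11): a server, plus a client sending to the desktop
iperf3 -s -p 5203 &
iperf3 -c 192.168.1.50 -p 5201 -w 1024KB -t 60

# On device B (assumed 192.168.1.12): likewise
iperf3 -s -p 5204 &
iperf3 -c 192.168.1.50 -p 5202 -w 1024KB -t 60

# Back on the desktop: one client per remote device
iperf3 -c 192.168.1.11 -p 5203 -w 1024KB -t 60
iperf3 -c 192.168.1.12 -p 5204 -w 1024KB -t 60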

Good:

I can get 2Gbps TX (1Gbps to each device)

I can get 1Gbps TX and 1Gbps RX simultaneously for each client individually (edit: actually I can't)

Bad:

I can only get ~1Gbps total RX with both remote devices sending at once (speeds fluctuate randomly on each, but they always add up to ~1Gbps)

I can only get ~1.6Gbps peak TX and ~800Mbps peak RX simultaneously when running all four iperf3 instances at once.

 

Questions:

1. There seems to be some bottleneck around 1Gbps for RX. Is there a way to fix this?

2. There seems to be some bottleneck around 2.5-3.0 Gbps for combined RX and TX. Is there a way to fix this?

3. Is there a better way I could title this thread?

Current LTT F@H Rank: 98 | Score: 2,285,387,178

Yes, I have 9 monitors.

My main PC (Hybrid Windows 10/Arch Linux):

OS: Arch Linux w/ XFCE DE (VFIO-patched kernel) as host OS, Windows 10 as guest

CPU: Ryzen 9 3900X w/PBO on (6c 12t for host, 6c 12t for guest)

Cooler: Noctua NH-D15

Mobo: Asus X470-F Gaming

RAM: 32GB G-Skill Ripjaws V @ 3200MHz (12GB for host, 20GB for guest)

GPU: Guest: EVGA RTX 3070 FTW3 ULTRA Host: 2x Radeon HD 8470

PSU: EVGA G2 650W

SSDs: Guest: Samsung 850 EVO 120GB, Samsung 860 EVO 1TB; Host: Samsung 970 EVO 500GB NVMe

HDD: Guest: WD Caviar Blue 1 TB

Case: Fractal Design Define R5 Black w/ Tempered Glass Side Panel Upgrade

Other: White LED strip to illuminate the interior. Extra Fractal intake fan for positive pressure.

 

unRAID server (Plex, Windows 10 VM, NAS, Duplicati, game servers):

OS: unRAID 6.11.2

CPU: Ryzen 7 2700X @ stock

Cooler: Noctua NH-U9S

Mobo: Asus Prime X470-Pro

RAM: 16GB G-Skill Ripjaws V + 16GB HyperX Fury Black @ stock

GPU: EVGA GTX 1080 FTW2

PSU: EVGA G3 850W

SSDs: Samsung 970 EVO NVMe 250GB, Samsung 860 EVO SATA 1TB

HDDs: 4x HGST Deskstar NAS 4TB @ 7200RPM (3 data, 1 parity)

Case: Silverstone GD08B

Other: Added 3x Noctua NF-F12 intake, 2x Noctua NF-A8 exhaust, Inateck 5-port USB 3.0 expansion card with USB 3.0 front-panel header

Details: 12GB RAM, the GTX 1080, and the USB card are passed through to the Windows 10 VM. The VM's OS drive is the SATA SSD. The rest of the resources are for Plex, Duplicati, Spaghettidetective, Nextcloud, and game servers.


What algorithm are you using on your switch for LACP? By default you should get more server-to-multiple-hosts bandwidth, but single-host-to-single-host traffic usually hashes out to one interface.
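
If the teaming is done with the Linux bonding driver, the host-side half of that algorithm can at least be inspected and changed; a quick sketch, assuming the bond is named bond0 (your teaming stack may differ, and the switch's own hash for inbound traffic is a separate matter):

# Show which header fields the bond hashes on for outbound traffic
grep "Transmit Hash Policy" /proc/net/bonding/bond0

# layer3+4 hashes on IPs and ports rather than MACs, which usually
# spreads multiple flows between the same two hosts across both members
echo layer3+4 > /sys/class/net/bond0/bonding/xmit_hash_policy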

PC: 3600 · Crosshair VI WiFi · 2x16GB RGB 3200 · 1080Ti SC2 · 1TB WD SN750 · EVGA 1600G2 · Define C


17 minutes ago, beersykins said:

What algorithm are you using on your switch for LACP? By default you should get more server-to-multiple-hosts bandwidth, but single-host-to-single-host traffic usually hashes out to one interface.

Unfortunately that isn't documented anywhere for my switch (TP-Link TL-SG108E) and there doesn't seem to be any way of changing it. I'm guessing that means there's nothing I can do?


The switch is probably using a MAC-based algorithm to distribute traffic. Your host controls which port outbound traffic leaves on, but the switch controls which port inbound traffic comes in on. Try doing your tests with more than one other device sending traffic to the computer with the teamed interfaces; the different transmitting MACs will hopefully cause the switch to load-balance the two connections onto different ports in the team.
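
To illustrate why the distribution can be lopsided: a typical cheap-switch scheme (hypothetical here, since TP-Link doesn't document theirs) XORs the low bits of the source and destination MAC addresses and takes the result modulo the number of ports in the LAG, so two senders can easily land on the same member port:

# Hypothetical 2-port hash: (last octet of src MAC) XOR (last octet of dst MAC), mod 2
echo $(( (0x5E ^ 0xA3) % 2 ))   # sender A -> member port 1
echo $(( (0x5C ^ 0xA3) % 2 ))   # sender B -> member port 1 as well, so RX stays capped at 1Gbps
echo $(( (0x5D ^ 0xA3) % 2 ))   # a luckier source MAC -> member port 0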

EDIT: This is exactly what @beersykins was referring to as well.

Looking to buy GTX690, other multi-GPU cards, or single-slot graphics cards: 

 


7 minutes ago, sazrocks said:

Unfortunately that isn't documented anywhere for my switch (TP-Link TL-SG108E) and there doesn't seem to be any way of changing it. I'm guessing that means there's nothing I can do?

I don't see any settings for the LAG in the user manual.


5 minutes ago, brwainer said:

The switch is probably using a MAC-based algorithm to distribute traffic. Your host controls which port outbound traffic leaves on, but the switch controls which port inbound traffic comes in on. Try doing your tests with more than one other device sending traffic to the computer with the teamed interfaces; the different transmitting MACs will hopefully cause the switch to load-balance the two connections onto different ports in the team.

Unfortunately, when two other devices (each individually capable of 1Gbps) send to my desktop over the 2Gbps teamed connection, the total caps out at 1Gbps, with that bandwidth split randomly between the two senders.

 

I'm guessing this would be the result of a crap LACP algorithm?


3 hours ago, sazrocks said:

Unfortunately, when two other devices (each individually capable of 1Gbps) send to my desktop over the 2Gbps teamed connection, the total caps out at 1Gbps, with that bandwidth split randomly between the two senders.

 

I'm guessing this would be the result of a crap LACP algorithm?

More specifically, it is the result of a crap static LAG algorithm; this switch is too basic to actually support LACP. With a static LAG you force the two ports together, and there is no verification that the two ports you forced together actually go to the same destination. With LACP you enable the ports for LACP, and the two ends then verify that they are indeed connected to each other on both ports; they can also inform each other of which traffic-splitting algorithm they are using.
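
To make the distinction concrete, here is roughly how the two modes look with the Linux bonding driver (bond0, eth0, and eth1 are assumed names; you would pick one mode, and the switch has to be configured to match):

# Static LAG: frames are striped blindly across the members; nothing
# checks that both cables actually reach the same switch or LAG
ip link add bond0 type bond mode balance-xor miimon 100

# LACP: the bond exchanges LACPDUs with the switch, so both ends verify
# that the member links really terminate on the same aggregate
ip link add bond0 type bond mode 802.3ad miimon 100 lacp_rate fast

# Either way, the physical NICs are then enslaved to the bond
ip link set eth0 down && ip link set eth0 master bond0
ip link set eth1 down && ip link set eth1 master bond0
ip link set bond0 up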

