The Pure Solid State Server Build Log

1 hour ago, leadeater said:

Incoming yes, outgoing no.

That explains the behaviour I see on the Class C network. The server can ping clients but not vice versa. The two Class A networks can't do either.

 

Hopefully it's just a matter of configuration. I tried going into Network Locations and using //10.0.0.1 (the IP of the FreeNAS box) but it claimed the server didn't respond. I was hoping to make more progress than this before hitting yet another brick wall.

 

At this time the only drive I can use that's fast enough to test 10/20Gbit would be a RAM disk, and with two CPUs running in single-channel I'm uncertain how fast that will actually be. I have to troubleshoot the network first, though.


1 hour ago, Windows7ge said:

That explains the behaviour I see on the Class C network. The server can ping clients but not vice versa. The two Class A networks can't do either.

It's just a default firewall rule. Go enable the default rule for ICMP under the File and Printer Sharing category; after that you'll be able to ping it.
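
For reference, enabling that from PowerShell should look something like the following; the rule's exact display name can vary by Windows version and locale, so treat it as an assumption:

Enable-NetFirewallRule -DisplayName "File and Printer Sharing (Echo Request - ICMPv4-In)"
# Or enable the whole File and Printer Sharing group at once:
Enable-NetFirewallRule -DisplayGroup "File and Printer Sharing"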


I've reached a point where I simply cannot figure out any other reasonable troubleshooting step. The NICs are behaving (as in, they show as active) but absolutely refuse to acknowledge the network they're connected to. I've tried everything: looking up guides online, editing firewall rules, switching between private and public network profiles. Nothing has made any change at all.

 

Things I can still try but don't expect a solution from include:

  • A P2P connection with my laptop. This is in the event there's some sort of issue with the network switch.
  • Connect a router to the switch and let DHCP do its thing. That would tell me if the network is working AT ALL.
  • Reset the network switch (pursuing the possibility that it's at fault)
  • Put FreeNAS on the server again, as that will tell me if it really is just a Windows thing.

I've been trying to troubleshoot this all day with no progress made.

 

I think I'll poke around with those bullets tomorrow. I'll try adding a DHCP server first, and if that suddenly brings the network to life I'll at least know it isn't more hardware problems (hopefully).


55 minutes ago, Bitter said:

Boot a Linux USB stick and check from there, sounds less arduous than installing FreeNAS.

I did that with Ubuntu when I was trying to troubleshoot the first motherboard. Ubuntu didn't have the drivers for the X540. If I have to go through the work of locating, downloading, and installing Linux drivers, I might as well just do my own install using a version of Linux that comes with a compatible driver.


47 minutes ago, Windows7ge said:

I did that with Ubuntu when I was trying to troubleshoot the first motherboard. Ubuntu didn't have the drivers for the X540. If I have to go through the work of locating, downloading, and installing Linux drivers, I might as well just do my own install using a version of Linux that comes with a compatible driver.

How old of a version of Ubuntu were you using? The X540-AT2 is listed in several articles summarizing officially supported server systems for Ubuntu 18.04.


4 minutes ago, Bitter said:

How old of a version of Ubuntu were you using? The X540-AT2 is listed in several articles summarizing officially supported server systems for Ubuntu 18.04.

It was 18.04. I booted the USB and chose "Try Ubuntu". After going into the terminal and installing net-tools (because apparently they aren't installed in the live session) the 10Gbit NICs did not show up.

 

Although with that in mind, perhaps "Try Ubuntu" doesn't load the very drivers I'm discussing, but they ARE there when performing an actual install.

 

That brings us back to where we started, though, where I have to do an install to test the NIC in Linux.

 

Installing FreeNAS to USB only takes a few minutes. Really, watching the OS boot takes longer.


7 minutes ago, Windows7ge said:

It was 18.04. I booted the USB and chose "Try Ubuntu". After going into the terminal and installing net-tools (because apparently they aren't installed in the live session) the 10Gbit NICs did not show up.

 

Although with that in mind, perhaps "Try Ubuntu" doesn't load the very drivers I'm discussing, but they ARE there when performing an actual install.

 

That brings us back to where we started, though, where I have to do an install to test the NIC in Linux.

 

Installing FreeNAS to USB only takes a few minutes. Really, watching the OS boot takes longer.

That makes sense, bummer though. Keep a spare SSD with a few OSes installed for testing, I guess. SSDs are so cheap now I always keep a couple of spares around.


Holy whatever it is that you pray to, there are FINALLY some signs of life.

[Screenshot_2.png]

Letting Windows create its own /16 network (like it does when it doesn't find a DHCP server) and going P2P with my laptop enables me to ping my laptop on both interfaces.

 

You know what that means?

 

It means I've wasted the past day and a half troubleshooting something that was working perfectly fine when I should have been troubleshooting my network switch.

 

So here we go, now I have to figure out what the switch is up to. I know I played with VLANs a while ago but I disabled them recently.


We have a functioning network!

[Screenshot_7.png]

[Screenshot_8.png]

Kind of a nice thing about Windows here is that multiple local NICs can be on the same network and work in unison without separate subnets (see the PowerShell sketch after the addresses below).

 

Desktop:

NIC 1: 10.0.0.2/24

NIC 2: 10.0.0.3/24

 

Server:

NIC 1: 10.0.0.4/24

NIC 2: 10.0.0.5/24
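
For reference, a minimal PowerShell sketch of setting those addresses statically; the interface aliases ("NIC 1", "NIC 2") are placeholders and will differ on a real system:

# On the server (repeat on the desktop with 10.0.0.2 and 10.0.0.3):
New-NetIPAddress -InterfaceAlias "NIC 1" -IPAddress 10.0.0.4 -PrefixLength 24
New-NetIPAddress -InterfaceAlias "NIC 2" -IPAddress 10.0.0.5 -PrefixLength 24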

 

I don't know what the switch's problem was. I reset it (since I don't really utilize its managed functionality) and everything started working from there.

 

So, FINALLY moving on. I cannot create the SSD pool until the 10TB HDD I ordered comes in; I want two copies of my data in existence at any one time, so I don't want to take apart the SSD pool just yet.

 

Due to this, the only other thing I have that's fast enough to test 10/20Gbit is a RAM disk. For that I used a tool called ImDisk Toolkit, which is a free alternative to SoftPerfect RAM Disk (a more popular RAM disk tool).
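
For anyone following along, the RAM disk can also be created from ImDisk's command line; a hedged sketch (the size, drive letter, and format options are whatever you pick, not necessarily what I used):

imdisk -a -s 64G -m D: -p "/fs:ntfs /q /y"
# -a attaches a new virtual disk, -s sets its size, -m picks the mount point,
# and -p passes format options so the disk comes up formatted as NTFS.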

 

Once created, it shows up like any other attached drive (Local Disk (D:)).

[Screenshot_5.png]

 

Creating a 64GB RAM disk (half my usable RAM) and running CrystalDiskMark over it showed

[Screenshot_1.png]

that even 2x single-channel DDR4 is seriously fast. I am very curious as to why the write speeds are so much faster, since reads are usually faster than writes in any other memory/storage application.

 

To test the network performance I wanted to go into Server Manager and create an SMB share, but as it turns out...

[Screenshot_4.png]

[Screenshot_6.png]

The RAM disk does not mount as a disk that Server Manager can recognize.

 

Falling back on my alternative option, I decided to share the drive with the network via Properties > Sharing > Advanced Sharing... > Share this folder.

I have no way of knowing if this uses the same SMB protocol, but it's worth a shot.
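
For what it's worth, Explorer's "Share this folder" does create a standard SMB share under the hood. A hedged PowerShell equivalent (the share name here is illustrative):

New-SmbShare -Name "RAMDisk" -Path "D:\" -FullAccess "Everyone"
# Confirm the share exists:
Get-SmbShare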

 

With the NICs online, the share mounted as a mapped drive on my PC, and CrystalDiskMark running, my network performance looked like this:

[Screenshot_3.png]

It can be seen (left) that both 10Gbit interfaces are being utilized evenly. This tells me that SMB 3.0 Multichannel is active and working.

 

As shown in CrystalDiskMark:

[Screenshot_2.png]

This looks like what I'd expect from a 10Gbit link with some overhead.

It's unfortunate to see that it isn't scaling across both links. I'll have to tinker with the settings to see what I can make it do.

 

For now this is the last big update. If I get the NICs to scale (go above 1GB/s throughput) I'll post again. Regardless, I'll post again when I install the actual SSDs, but FedEx says that won't happen for at least another week (waiting on the backup HDD).


5 minutes ago, Windows7ge said:

Kind of a nice thing about Windows here is that multiple local NICs can be on the same network and work in unison without separate subnets.

 

Desktop:

NIC 1: 10.0.0.2/24

NIC 2: 10.0.0.3/24

 

Server:

NIC 1: 10.0.0.4/24

NIC 2: 10.0.0.5/24

 

5 minutes ago, Windows7ge said:

For now this is the last big update. If I get the NICs to scale (go above 1GB/s throughput) I'll post again.

 

That configuration isn't actually correct for SMB3 multi-channel. I suspect what's happening is that multi-channel is only active and working on one end, and the established session endpoint is to the same IP address. That's why you're only getting 10Gbps rather than 20Gbps. Put the NIC 2 on each computer onto, say, 10.0.1.x/24 and test it again.
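
A hedged sketch of that change in PowerShell (the aliases and addresses are examples, not the exact values in use here):

# Desktop NIC 2: move from 10.0.0.3/24 to 10.0.1.2/24
Remove-NetIPAddress -InterfaceAlias "NIC 2" -IPAddress 10.0.0.3 -Confirm:$false
New-NetIPAddress -InterfaceAlias "NIC 2" -IPAddress 10.0.1.2 -PrefixLength 24
# Server NIC 2: likewise, e.g. 10.0.1.3/24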


1 hour ago, leadeater said:

 

 

That configuration isn't actually correct for SMB3 multi-channel. I suspect what's happening is that multi-channel is only active and working on one end, and the established session endpoint is to the same IP address. That's why you're only getting 10Gbps rather than 20Gbps. Put the NIC 2 on each computer onto, say, 10.0.1.x/24 and test it again.

Both resulted in identical scores.


26 minutes ago, Windows7ge said:

Both resulted in identical scores.

Hmm, isn't there some other kind of limit, like the PCIe slot the NIC cards are in?

 

While doing the transfer, check Resource Monitor's Network tab and see which IP addresses are being used and by how much. Also run this PowerShell command on the server and client during the transfer: Get-SmbMultichannelConnection.
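
Note that Get-SmbMultichannelConnection reports from the client's point of view, so it can come back empty when run on the server. A hedged pair of checks that multi-channel is actually enabled on both ends:

# On the client:
Get-SmbClientConfiguration | Select-Object EnableMultiChannel
Get-SmbMultichannelConnection
# On the server:
Get-SmbServerConfiguration | Select-Object EnableMultiChannel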

 

 


13 minutes ago, leadeater said:

Hmm, isn't there some other kind of limit, like the PCIe slot the NIC cards are in?

 

While doing the transfer, check Resource Monitor's Network tab and see which IP addresses are being used and by how much. Also run this PowerShell command on the server and client during the transfer: Get-SmbMultichannelConnection.

 

 

The X540 is built into the server's motherboard and all settings related to it in the BIOS are at their defaults.

The Broadcom BCM57810S in my desktop is set to PCIe 3.0 and should have all 8 lanes available.

 

I just reset the MTU on my desktop's NICs (since the switch got reset; the server is set to 1514) and I've suddenly gone up about +150/50 MB/s:

[Screenshot_9.png]

Still not above 10Gbit though.
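
For reference, a hedged way to inspect and set the MTU / jumbo frame setting from PowerShell; the advanced-property display name ("Jumbo Packet") and the values vary by driver, so treat them as assumptions:

Get-NetAdapterAdvancedProperty -Name "*" -DisplayName "Jumbo Packet"
# "Ethernet 2" and "9014 Bytes" below are illustrative; use the names/values the Get shows:
Set-NetAdapterAdvancedProperty -Name "Ethernet 2" -DisplayName "Jumbo Packet" -DisplayValue "9014 Bytes"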

 

On both systems the load is being perfectly split 50/50 across the two interfaces.

As for the command you had me run: the server accepted it but displayed nothing; it just brought me back to the prompt. As for the client:

Server Name     Selected Client IP Server IP Client Interface Index Server Interface Index Client RSS Capable Client RDMA Capable
-----------     -------- --------- --------- ---------------------- ---------------------- ------------------ -------------------
Win-tjkk3ulqouj True     10.1.0.2  10.1.0.3  11                     13                     True               False
Win-tjkk3ulqouj True     10.0.0.2  10.0.0.3  6                      15                     True               False

 


8 minutes ago, Windows7ge said:

As for the command you had me run: the server accepted it but displayed nothing; it just brought me back to the prompt. As for the client:

Yeah, that looks correct to me: two sessions from two different IP src/dst pairs.

 

11 minutes ago, Windows7ge said:

The Broadcom BCM57810S in my desktop is set to PCIe 3.0 and should have all 8 lanes available.

Is the PCIe slot coming off the chipset (so going through the DMI interface), or is it direct off the CPU?


3 minutes ago, leadeater said:

Is the PCIe slot coming off the chipset (so going through the DMI interface), or is it direct off the CPU?

According to the motherboard's manual diagram, it's using lanes through the CPU.

 

I have to say, though, I don't think it's a hardware bottleneck. I had a similar issue when we were trying to get multi-channel going on FreeNAS. Eventually we got it working and I was pushing about 1.35GB/s with the hardware config I'm using right now, so I have good reason to believe we're dealing with a software config issue.


4 minutes ago, Windows7ge said:

According to the motherboard's manual diagram, it's using lanes through the CPU.

 

I have to say, though, I don't think it's a hardware bottleneck. I had a similar issue when we were trying to get multi-channel going on FreeNAS. Eventually we got it working and I was pushing about 1.35GB/s with the hardware config I'm using right now, so I have good reason to believe we're dealing with a software config issue.

Just hoping for a nice easy move-the-card-to-a-different-slot solution.

 

I must be lucky, because for me it just works each time I do it, 9Gbps-ish on both interfaces. I just can't think of a good reason why it's capping out at ~10Gbps.


5 minutes ago, leadeater said:

Just hoping for a nice easy move-the-card-to-a-different-slot solution.

 

I must be lucky, because for me it just works each time I do it, 9Gbps-ish on both interfaces. I just can't think of a good reason why it's capping out at ~10Gbps.

It is ALWAYS my luck for nothing to happen easily. You can read the history of the server build thus far. You can think back to how many times I asked for your help with the FreeNAS server. You can look forward to the problems that I don't yet know I'm going to have with the future Proxmox server. That's just how my luck goes.

 

In case you were hoping a different slot were the solution here: the only other free slot shares bandwidth with my M.2 and goes through the PCH, so if I used it, it'd slow down the M.2 and run the card's remaining lanes at 2.0 through the PCH. Basically, everything would lose.

 

I have a couple of things to tinker with. I also need to test transferring an actual file. We also need to consider that this "disk" isn't the final pool; it's just a quick "this is what we can kind of expect" test. Also, I just shared the drive; it's not a proper SMB share, which may or may not have something to do with it. And for real-world applications I find CrystalDiskMark isn't really all that accurate, so I have to try some actual files.

 

I'm more interested in what we'll see from the actual SSD pool. The configuration plan there is striping with mirrors. With 12 disks it will be an effective RAID0 of 6 disks, which should give a theoretical max of about 3GB/s (six mirrored pairs at roughly 500MB/s per SATA SSD). From there we can really tinker around, knowing it's the final hardware configuration.


So the drive showed up quite early.

[IMAG0437.jpg]

It's a beautiful-looking drive; it's like textured chrome on all sides.

 

Performance-wise it doesn't do too badly. For most of my files it did between 175 and 215MB/s.

 

I dumped my 4.2TB life collection onto it, and for simplicity I decided to just do it from the server.

 

Windows Defender on the server didn't like some of my files.

[Screenshot_10.png] why?

That's not a big deal. I'll just go unquarantine them aaaaaannd...

[Screenshot_11.png]

That's fine I'll just pull them off the other server aaaaaannd...

 

They're gone...

 

Windows Defender said, "Nope. Not on this server. You know what else? NOT ON THAT OTHER SERVER EITHER!" *deletes the remote files*

 

I had to recover them from the remote server's backup pool, which had not yet updated, so they still existed.

 

-1 point to Windows Defender.

 

For future projects I want a larger boot SSD for the server than the one I'm using right now (64GB), so we're going to wait on that. When it shows up we'll do the actual SSD pool move and go over what the performance metrics look like.


The new drive has arrived. (right)

[IMAG0440.jpg]

It is a 480GB Intel DC S4500. This should be more than enough boot drive storage.

With this we can begin.

 

The first order of business is to remove the drives from the FreeNAS server.

[IMAG0446.jpg]

They. Are. Dusty.

 

Once removed and dusted off, they look a lot better.

[IMAG0447.jpg]

These are Intel DC S4500 960GB SSDs.

 

I'm going to need to expand the pool soon, and I think I'll be expanding it using the Intel D3-S4510 960GB SSDs. The specs only differ in a few select areas like IOPS, and they cost 50% less. I think it's because of putting more bits on each individual cell, kind of like the SLC, MLC, & TLC distinction. They're 3D2 instead of just 3D, but I'm uncertain what that means. I expect them to live for a very long time even if each cell gets written to more often, so I'll take it for a 50% price reduction.

 

Got them mounted in their new cages.

[IMAG0448.jpg]

and put in the server.

 

The next thing is setting up Storage Spaces. I wanted to test each form available including Simple (RAID0), Dual Parity (RAID6), and 2-Way Mirrors (RAID10). I also wanted to have a go with ReFS since it's an option.
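
For reference, a minimal PowerShell sketch of how the 2-Way Mirror variant can be built; the pool name, volume name, drive letter, and size are placeholders, not the exact values used here:

$disks = Get-PhysicalDisk -CanPool $true   # the 12 SSDs
New-StoragePool -FriendlyName "SSDPool" -StorageSubSystemFriendlyName "Windows Storage*" -PhysicalDisks $disks
# Six columns of two-way mirrors = six RAID1 pairs striped together:
New-Volume -StoragePoolFriendlyName "SSDPool" -FriendlyName "Data" -ResiliencySettingName Mirror -NumberOfColumns 6 -FileSystem ReFS -DriveLetter E -Size 5TB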

 

The results of a 12-SSD Simple volume:

[Screenshot_2.png]

Love it but I want some redundancy.

 

I was told Storage Spaces has issues with parity write operations, but I wanted to check for myself and oh my god...

[Screenshot_3.png]

Yep. Nope. Parity is not an option.

 

Lastly, 2-Way Mirror. With 12 drives this is six RAID1s striped in RAID0.

[Screenshot_4.png]

I was hoping for 3GB/s writes but I can see there's quite a bit of overhead.

Still, we can saturate 20Gbit, so if we get the networking figured out we'll be good to go there.

 

Speaking of the network, we have to set up SMB.

Following some online tutorials you have to go to Server Manager > File and Storage Services > Shares

Alright, sounds easy enough.

[Screenshot_1-edit.png] Nope.

 

Some more Googling explains that I needed to run this in PowerShell:

Install-WindowsFeature -Name FS-FileServer -IncludeAllSubFeature -IncludeManagementTools
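
(To check first, something like this should show whether the role is present; a hedged example:)

Get-WindowsFeature -Name FS-FileServer
# The Install State column reads Available vs. Installed.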

 

Basically the role for sharing wasn't installed. Why doesn't that come installed stock?

 

Anyway, it's there now.

[Screenshot_1.png]

 

After fiddling with share permissions for a while, I finally got to perform my first over-the-network test and...

[Screenshot_6.png]

Not that much different from when I shared the RAM-disk.

A REAL test with my typical files shows around 675MB/s max. That's like 3~4 7200RPM HDDs.

 

I'll have to do some tinkering. In the meantime, I was looking to enable deduplication on the pool, but even after installing the role I ran into this issue:

[Screenshot_5.png]

It's grayed out. Some research says it's because I'm using ReFS and that I can only enable dedup from PowerShell.

 

So I tried, and it just spat an error at me saying I need to use NTFS, so I don't know what to do here.
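
For the record, the PowerShell attempt looks something like this (the drive letter is a placeholder); per the reply below, anything older than Server 2019 rejects the Enable call on ReFS with an NTFS requirement:

Install-WindowsFeature -Name FS-Data-Deduplication
Enable-DedupVolume -Volume "E:" -UsageType Default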


6 minutes ago, Windows7ge said:

It's grayed out. Some research says it's because I'm using ReFS and that I can only enable dedup from PowerShell.

 

So I tried, and it just spat an error at me saying I need to use NTFS, so I don't know what to do here.

You need Server 2019 if you want ReFS and dedup.

 

1 hour ago, Windows7ge said:

Not that much different from when I shared the RAM-disk.

For file copies RAM disk to network RAM disk I get around 1.3GB/s to 1.67GB/s. To my SSD array I get around 800MB/s to 1GB/s. Explorer isn't that great at file copies and stress tests, though, not if you want the maximum possible results. For that you can use IOmeter, which can load up a bunch of parallel high-queue-depth I/O against the share.


6 hours ago, leadeater said:

You need Server 2019 if you want ReFS and dedup.

Eh, the files that take up the most space are .mp4s, which I don't think can or will dedup very well. I'd rather play with ReFS if that's the case.

 

6 hours ago, leadeater said:

For file copies RAM disk to network RAM disk I get around 1.3GB/s to 1.67GB/s. To my SSD array I get around 800MB/s to 1GB/s. Explorer isn't that great at file copies and stress tests, though, not if you want the maximum possible results. For that you can use IOmeter, which can load up a bunch of parallel high-queue-depth I/O against the share.

Testing parallel processing capabilities isn't worth it when I'm the only client; I'd just be looking up numbers for the sake of the numbers. Plus, I don't know how to use IOmeter.

 

I remember when LMG was experimenting with 100Gbit InfiniBand they were able to push 4GB/s over File Explorer, so I don't think that's my bottleneck. Not unless File Explorer has some sort of direct influence on the storage media being used (besides sending the data in that direction). For the sake of testing, though, do you know of another protocol or program for network file transfers besides File Explorer that may be less restrictive, if that's the case?


Rsync might be worth a shot? I think a version for Windows exists.


5 hours ago, Windows7ge said:

I remember when LMG was experimenting with 100Gbit InfiniBand they were able to push 4GB/s over File Explorer, so I don't think that's my bottleneck.

InfiniBand has way lower latency and RDMA support; you need RDMA to get the really high throughput. The X540s don't have it.

