TopHatProductions115

HomeLab/HomeDatacentre and PowerUsers of the Forum


Posted · Original Poster (OP)

Hey - thanks for dropping by! This thread is for discussion of personal setups that involve (or revolve around) enterprise hardware/software (ESXi, Windows Server, etc.), distributed computing/HPC clusters, workstations, server racks/farms, mainframes, multi-device management/administration, MS AD (Active Directory), database access/management, hypervisor management, VPN/VLAN, VPS/multi-user remote access, and other service-provisioning tasks. Think of it as r/homelab or r/homedatacenter, without the cancer risk of going to Reddit. That said, those subreddits are actually better than most, so don't be scared to visit them if you need something that isn't available here - and then report back here with your findings. The only restriction is that the setup's primary purpose (or at least one of its main purposes) has to be for high-performance computation of some sort, and not just for bragging rights.


Also might be time to go storm r/homedatacenter to keep it alive :3

For instance, I currently use my workstation for Plex Media Server, Simple DNSCrypt (acting as a local DNS resolver), hosting the occasional game server, multimedia encoding/livestreaming, and Moonlight game streaming (like a personal version of Google Stadia). I'm also working toward a personal initiative that should be ready by the middle of 2022 if everything goes as planned :D

This thread is for heavy tasks that either benefit from or require the compute capabilities afforded by high core-count CPUs, considerable RAM capacities, and/or considerable GPU acceleration (i.e., NVIDIA CUDA). Software H.264/H.265 and CUDA-accelerated video encoding counts. AV1 also counts, because that thing's a monster 😂 If you own a public-facing web server or file server, that counts too. Nextcloud and VPS setups count as well (as long as they're port-forwarded and accessible over the Internet). In short, if it's publicly facing (accessible over the Internet) and can be used by multiple people, the server counts :)

NVENC is not included, since almost every modern NVIDIA card can do it. AMD's VCE is also out, for similar reasons. Intel Quick Sync only counts if it's used for tasks like video editing and transcoding. If your configuration doesn't meet any of the above points, there is one more possibility - the GPU itself. If you use your GPU for non-gaming tasks (GPGPU, ANN, ML/AI, PCI passthrough, or other specialised workloads) on a regular basis (at least monthly), you can still post here. Radeon Pro/FirePro, Quadro/Tesla, and other workstation cards are welcome if you're using them for non-consumer workloads regularly (and not just for ePeen).

Crypto mining is not covered in this thread - please leave that for a thread of its own. With that, here are a few related threads on the forum:

 

 

If you, by some rare chance, want your thread listed here, please feel free to say so below :) If your computer has a Xeon (or i7 equivalent), Threadripper (or Ryzen 9 equivalent), EPYC, Opteron, or other specialised CPU in it, it's definitely welcome. Just make sure it's actually doing something (like video transcoding or 3D rendering). Mobile workstations (like HP's EliteBook 8770W and Dell's Precision M7720) are allowed as well. 

 


 

If you want, please feel free to post your benchmarking results below. Please note that the following test suites and hardware loads are expected here:

  • 3DMark
  • Cinebench R15 (Vanilla/modded) and R20
  • Unigine Valley/Heaven/Superposition
  • V-Ray 
  • FurMark
  • HWBOT HDBC and ffmpeg
  • CrystalDiskMark
  • F@H/BOINC
  • UserBenchmark (no longer accepted)

If you want to use other tools, feel free to do so. However, it may be more difficult to get comparison numbers (for relative performance). Please tell us what settings you used when benchmarking, to allow for easier comparisons. Otherwise, it defeats the purpose of comparing benchmarks.
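For the ffmpeg entry above, one easy way to make results comparable is to post the exact command line along with the score. Something like this (just a sketch - the input clip name is a placeholder, so mention whatever source and settings you actually used):

# Software x265 encode, timed, with the output thrown away so disk speed doesn't skew things
ffmpeg -benchmark -i sample_1080p.mp4 \
       -c:v libx265 -preset medium -crf 23 \
       -f null -

Report the preset, CRF, and source clip alongside the resulting fps/speed figure.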

 

 

 

 

Have fun!!!

 


So just to clarify, are you looking for people to share what they have done? Or projects currently underway? Or is this to contain links to other threads where the content is discussed? Just wanting to better understand the contributions you are looking for / the intent of this thread. 


My home test lab where I mess with things before pushing them to the production rack.


Build Logs: Cotton Candy Threads | Lucid Visions | Matte Machine (4U Rackmount) | Noc Noc | NUC | Storage Log

 

Cotton Candy Threads - CPU AMD Threadripper 2950X | GPU EVGA FTW3 RTX 2080 Ti | MOBO Asus ROG Zenith Extreme | MEMORY 128GB (8x 16GB) Corsair Vengeance RGB 3200 | STORAGE 3x Samsung 960 Evo SSD + 4x Crucial P1 1TB + 2x Seagate Ironwolf 8TB 7.2k HDDs | PSU Corsair HX1200i w/ Cablemod Pro Extensions | COOLING Cooler Master TR4 ML360 | CASE Lian Li O11 Dynamic Black | LIGHTING 9x Corsair HD120 Fans, 4x Corsair Addressable RGB Strips, 2x Corsair Commander Pro | PCPP
 
LUCID VISIONS - CPU AMD Ryzen 9 3950X | GPU PowerColor 5700XT Liquid Devil | MOBO Crosshair VII Hero | MEMORY 64GB (4x 16GB) Trident-Z Neo @ 3733 | STORAGE Samsung 960 Pro SSD | PSU Corsair RM1000i | COOLING EKWB Custom Loop | CASE Define S2 Vision | LIGHTING 4x Fractal Prisma AL-14 Fans, Built-in RGB Strip | PCPP
 
Just NCASE mITX - CPU Intel Core i7 8700K @ 5.2GHz | GPU EVGA RTX 2080 Ti XC | MOBO Asus Z370-I Gaming | MEMORY 16GB (2x 8GB) G.Skill Trident-Z RGB 3000 | STORAGE Samsung 960 Evo 500GB SSD + Crucial MX500 1TB M.2 SSD | PSU Corsair SF600 | COOLING Noctua NH-U9S w/ Redux Push/Pull Fans | CASE NCase M1v5 | LIGHTING 2x Cablemod Addressable RGB Strips | PCPP
 
Noc Noc, Who's There? - CPU AMD Threadripper 1950X | GPU ASUS RTX 2080 Ti OC | MOBO ASRock X399M Taichi | MEMORY 32GB (4x 8GB) GSkill Trident-Z 3200 | STORAGE Samsung 970 Evo SSD | PSU Corsair HX1000i w/ Cablemod Pro B&W Kit | COOLING Noctua U9 TR4 w/ 2x Redux 92mm | CASE Corsair 280X White | FANS 6x Noctua 140mm Redux | PCPP

Not quite sure I'm clear on what this topic is for? It's extremely broad, and for the things that do and do not count - count towards what? Confused.

Posted · Original Poster (OP)

@leadeater
 

Quote

Think of it as r/homelab, without the cancer risk of going to Reddit. The only restriction is that the setup's primary purpose (or at least one of its main purposes) has to be for high-performance computation of some sort, and not just for bragging rights.

 


So I finally got a lancache VM up and running tonight after working on it for a good part of the day. The documentation is so-so and partially out of date in places, due to the project going from just Steam caching to multi-service caching. It's not generally needed for 1-2 users, but I find my game installs break sometimes, especially if I reinstall Linux, and I have very slow download speeds (currently 5 Mbps, going to 25 after the next billing cycle). Having to redownload games that are potentially 80+ GB is painful (looking at you, ESO). I have it set up to go client -> lancache -> Pi-hole -> upstream DNS. This is also easier to maintain than some sort of backup scheme in place of caching.
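In case anyone wants to try the same thing, this is roughly what the deployment boils down to with the official Docker images (an untested-as-written sketch - the IPs and paths are placeholders for my setup, and the env variable names should be double-checked against the current lancache docs):

# The monolithic container serves the cached content over HTTP/HTTPS
docker run -d --name lancache \
  -v /srv/lancache/cache:/data/cache \
  -v /srv/lancache/logs:/data/logs \
  -p 80:80 -p 443:443 \
  lancachenet/monolithic:latest

# The DNS container points supported CDNs at the cache IP and forwards
# everything else to the upstream resolver (Pi-hole, in my case)
docker run -d --name lancache-dns \
  -e LANCACHE_IP=192.168.1.10 \
  -e UPSTREAM_DNS=192.168.1.2 \
  -p 53:53/udp \
  lancachenet/lancache-dns:latest

Then the clients just need to use the lancache-dns box as their DNS server.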

 

My next project will probably be getting the Sickbeard MP4 Automator Python script to automatically encode my media files into a Plex direct-play-friendly format upon download (yarr harr), if they aren't already.
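For reference, the end result I'm after is basically what a single ffmpeg invocation like this produces (a hand-rolled sketch, not the actual Automator script - file names are placeholders): H.264 video plus AAC audio in an MP4 with the moov atom up front is about as direct-play-friendly as Plex gets.

# Re-encode to H.264/AAC MP4 so Plex clients can direct play it
ffmpeg -i input.mkv \
       -c:v libx264 -preset slow -crf 20 \
       -c:a aac -b:a 192k \
       -movflags +faststart \
       output.mp4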


[Out-of-date] Want to learn how to make your own custom Windows 10 image?

 

Desktop: AMD R9 3900X | ASUS ROG Strix X570-F | Radeon RX 5700 XT | 16GB Trident Z 3200MHz | 256GB 840 EVO | 960GB Corsair Force LE | EVGA P2 650W

Laptop: Intel M-5Y10c | Intel HD Graphics | 8GB RAM | 250GB Micron SSD | Asus UX305FA

Server: Intel Xeon D 1541 | ASRock Rack D1541D4I-2L2T | 32GB Hynix ECC DDR4 | 2x1TB 2x8TB Western Digital HDDs


Do you mean like this?

 


 

Specs: (copied and pasted from my other post)

 

Equipment list:

Supermicro 6027TR-HTRF (four dual-socket, half-width blade nodes in a 2U rackmount). Each node has:

  • dual Intel Xeon E5-2690 (v1) (8 cores, 2.9 GHz stock, 3.3 GHz max all-core turbo, 3.6 GHz max turbo)
  • 8x Samsung 16 GB DDR3-1866 ECC Reg. 2Rx4 RAM (128 GB per node, 512 GB total for the whole system), running at DDR3-1600 speed (because it's 2R)
  • SATA SSD for the OS, HGST 7200 rpm SATA HDD for data
  • Mellanox ConnectX-4 dual-port 100 Gbps (4x EDR InfiniBand) NIC

(The storage configuration varies a little depending on which OS I am booting into - I keep them physically separate with different OS SSDs and data HDDs.)

 

Mellanox 36-port 100 Gbps 4x EDR Infiniband externally managed switch (MSB-7890)

 

Qnap TS-832X 8-bay NAS (8x 10 TB HGST SATA 7.2krpm drives, in RAID5)

Qnap TS-832X 8-bay NAS (7x 6 TB HGST SATA 7.2krpm drives, in RAID5)

(those two are tied together via dual SFP+ to SFP+ 10GbE connections)

 

Buffalo Linkstation 441DE (4x 6 TB HGST SATA 7.2 krpm drives, in RAID5)

 

Netgear GS116 16-port 1 GbE switch

Netgear GS208 8-port 1 GbE switch

 

There's a bunch of other stuff that's not pictured here (4 workstations, another NAS, and some old, decommissioned servers). I'll have to get longer cables before I can bring those systems back up online.

 

==end of copy and paste==

 

The relatively "new(er)" thing that I'm going to be testing is different ways of building/compiling OpenFOAM on CentOS 7.6.1810 with InfiniBand enabled - specifically, whether I build OpenMPI as part of the "normal" build process as outlined in OpenFOAM's OpenFOAM v6/CentOS 7 instructions, or whether I DISABLE building OpenMPI 2.2.1 per those instructions and instead use the OpenMPI 1.10.7 from the CentOS repo, in order to try to resolve an issue I was having when running the Motorbike OpenFOAM benchmark.
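Roughly, the system-OpenMPI variant I'm going to try looks like this (a sketch of the plan, not verified instructions - the module name and the exact spot in etc/bashrc may differ between CentOS/OpenFOAM versions):

# Use the distro's OpenMPI 1.10.7 instead of building the ThirdParty one
yum install -y openmpi openmpi-devel environment-modules
module load mpi/openmpi-x86_64

# In OpenFOAM-6/etc/bashrc, point the build at the system MPI, i.e.:
#   export WM_MPLIB=SYSTEMOPENMPI
# then source the environment and build as per the normal instructions
cd $HOME/OpenFOAM/OpenFOAM-6
source etc/bashrc
./Allwmake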

 

(Ironically, using the exact same physical hardware but Ubuntu 18.04 LTS, I had no problems getting everything to work. It's CentOS that I have an issue with; the idea is that I don't want to have to switch to a different OS whenever I want to run a different application - I want it all to run on CentOS. So I'm doing some testing/research for the OpenFOAM development team, since I submitted a bug report about the issue I was seeing after running through their install instructions.)

 

I'm starting with OpenFOAM and that benchmark because it's a case that's readily and publicly available, and I can use it to make sure the system is up and running as it should be. I'm also looking into moving to Salome for FEA, and I've already used GPGPU on my other workstations with DualSPHysics for SPH/particle modelling and simulation (my last run was a simulation of an offshore power-generation installation).

 

Here is an animation of a CFD simulation of a Wankel internal combustion engine that ran on this system (it took just shy of 22 hours to run). On my 8-core workstation, the same case would have taken almost 7.5 days (~176 hours), so this system is cutting the run times of my HPC/CAE/CFD/FEA applications down SIGNIFICANTLY.

[attached image: IMG_1880.JPG]


The small update from this past weekend: I now have a 52-port L2 managed GbE switch (Netgear GSM7248) in place of the 16-port GbE switch that I had previously (Netgear GS116).

 

This is in preparation for my office moving to the basement from the room it currently occupies.

 

The proposal is that I might actually end up consolidating almost all of my centralised network/computing equipment (with very few exceptions) into the rack, now that I have it up and running.

 

It's going to be quite the PITA to take down though, when we eventually move to a bigger house.


The other small news is that over the last month or two, I've managed to kill all of my Intel SSDs by burning through the write endurance limit on every one of the drives.

 

So now I'm looking at what I can do about it, as all of the consumer-grade drives are being pulled from the micro cluster.

 

The latest round of SSD deaths occurred after a little over two years of ownership (out of a 5-year warranty), and based on the power-on hours data from SMART, the actual usage is even less than that - about 1.65 years.
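(For anyone who wants to keep an eye on their own drives, the wear data I'm referring to comes straight out of SMART via smartmontools - something like the below. The exact attribute names vary by vendor; Intel SATA drives report things like Media_Wearout_Indicator and Host_Writes, other brands use Total_LBAs_Written, and NVMe drives report "Percentage Used" instead.)

# SATA/SAS drive: dump SMART and pick out the hours and wear/write counters
smartctl -a /dev/sda | grep -Ei 'power_on_hours|wearout|host_writes|total_lbas_written'

# NVMe drive: the health log carries "Percentage Used" and "Data Units Written"
smartctl -a /dev/nvme0 | grep -Ei 'percentage used|data units written|power on hours'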

 

So yeah...that happened.

 

Anybody here ever played with gluster/pNFS/Ganesha before?

7 hours ago, alpha754293 said:

Anybody here ever played with gluster/pNFS/Ganesha before?

Limited amount, Ganesha backed by a Ceph cluster. Gluster itself is pretty simple/basic so I don't think you'll have any issue with that.

9 hours ago, leadeater said:

Limited amount, Ganesha backed by a Ceph cluster. Gluster itself is pretty simple/basic so I don't think you'll have any issue with that.

Interesting.

 

Thanks.

 

Yeah, I'm trying to decide on what I want to do ever since I wore through the write endurance of all of my Intel SSDs.

 

I'm trying to decide whether I want to switch over to data-centre/enterprise-grade SSDs (which support 3 DWPD), or create a tmpfs on each of my compute nodes and export it through a parallel/distributed file system like GlusterFS, Ceph, or pNFS (although the pNFS server isn't supported in CentOS 7.6.1810). I'm not sure which is better for my usage scenario.

Upside with tmpfs (RAM drive) is that I won't have the write endurance issue that I am currently facing with SSDs (even enterprise grade SSDs).

Downside with tmpfs is that it's volatile memory which means that there is a potential risk for data loss, even with high availability (especially if the nodes are physically connected to the same power supply/source).

 

On the other hand, using the new usage data that I have from the freshly worn-out SSDs: IF my usage pattern persists, I might actually be able to get away with replacing them with enterprise-grade SSDs, and those would be sufficient over the life of the system. I'm not really sure yet, partly because the enterprise-grade SSDs are larger capacity, so I might be inclined to use them more, especially if I DO end up deploying either GlusterFS or Ceph. Alternatively, all of the enterprise-grade drives could go into a new head node for the cluster and just be a "conventional" NFSoRDMA export, which would simplify things for me. (A new head node might also have the fringe benefit of letting me take advantage of NVMe.)

 

Decisions, decisions, decisions (especially when, again, I'm trying to get the best bang for the buck, and working with a VERY limited budget.)

 

 

6 hours ago, alpha754293 said:

-snip-

For what you want I wouldn't go with Ceph; it's more resilient than, say, Gluster and has great throughput over the cluster, but it's not good at low latency, and per-client throughput can be lower than other options. It's a lot harder to get really good performance out of Ceph compared to, say, Gluster with underlying ZFS volumes etc. Lustre is another option for you.

20 hours ago, leadeater said:

For what you want I wouldn't go with Ceph; it's more resilient than, say, Gluster and has great throughput over the cluster, but it's not good at low latency, and per-client throughput can be lower than other options. It's a lot harder to get really good performance out of Ceph compared to, say, Gluster with underlying ZFS volumes etc. Lustre is another option for you.

Yeah, I was reading about the difference between the two, and at least one source I found online said that GlusterFS is better for large, sequential transfers whereas Ceph works better for lots of smaller files or more random transfers.

 

Yeah, I think that I've mentioned Lustre in my other thread as well.

 

Still trying to decide between a parallel/distributed RAM drive and enterprise SSDs - or whether I just get the SSDs and build a new head node that pretty much only does that (presenting the enterprise SSDs as a single RAID0 volume to the network as a "standard/vanilla" NFSoRDMA export).
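If I go the head-node route, the rough shape of it would be something like this (a sketch only - device names, paths, and the RDMA module/port details are assumptions that would need to be checked against the distro's NFSoRDMA docs):

# Stripe the enterprise SSDs into a single RAID0 md device
mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sd[b-e]
mkfs.xfs /dev/md0
mkdir -p /export/scratch && mount /dev/md0 /export/scratch

# Export it over NFS
echo '/export/scratch *(rw,no_root_squash,async)' >> /etc/exports
exportfs -ra

# Enable the RDMA transport for the NFS server (NFSoRDMA normally listens on 20049)
modprobe svcrdma
echo 'rdma 20049' > /proc/fs/nfsd/portlist

# And on each compute node, mount over RDMA
mount -o rdma,port=20049 headnode:/export/scratch /mnt/scratch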

On 8/13/2019 at 4:08 PM, leadeater said:

For what you want I wouldn't go with Ceph; it's more resilient than, say, Gluster and has great throughput over the cluster, but it's not good at low latency, and per-client throughput can be lower than other options. It's a lot harder to get really good performance out of Ceph compared to, say, Gluster with underlying ZFS volumes etc. Lustre is another option for you.

For Gluster, can the data servers also be the clients or is the assumed model that the data servers are separate from the clients?

 

Thanks.

 

(I was trying to play with pNFS over the weekend and I couldn't figure out how to make the data servers and the clients export to the same mount point.)

5 hours ago, alpha754293 said:

For Gluster, can the data servers also be the clients or is the assumed model that the data servers are separate from the clients?

You can - most probably don't do that, but there isn't anything that would stop you. I suspect what you're wanting to do isn't anything different from hyper-converged infrastructure.

6 hours ago, leadeater said:

You can - most probably don't do that, but there isn't anything that would stop you. I suspect what you're wanting to do isn't anything different from hyper-converged infrastructure.

Sort of.

 

The idea currently stems from this: I've got four nodes, and I want to allocate half of each node's RAM to a RAM drive (tmpfs) and then use that as the source of the space (the "data server"), which in turn serves those same four nodes as clients.

In other words, each node right now has 128 GB of RAM.

 

If I only allocate half of the RAM to each local node, then each node will only get 64 GB.

 

But if I can pool them together using GlusterFS, then all four nodes would be able to see and address a total of 256 GB (combined) which is more than any single node can address/provide.

 

I'm not sure if that really means "hyperconverged" because I thought that converged meant something different.

11 minutes ago, alpha754293 said:

But if I can pool them together using GlusterFS, then all four nodes would be able to see and address a total of 256 GB (combined) which is more than any single node can address/provide.

 

I'm not sure if that really means "hyperconverged" because I thought that converged meant something different.

Yea, pretty much. All it means in the VM hosting world is that each node serves as both storage and compute in a scale-out fashion, so when you add a node you're adding storage capacity, storage performance, and compute resources.

14 hours ago, leadeater said:

Yea, pretty much. All it means in the VM hosting world is that each node serves as both storage and compute in a scale-out fashion, so when you add a node you're adding storage capacity, storage performance, and compute resources.

Thanks.


For those who might be interested, here are my current test results with GlusterFS:

 

For those that might be following the saga, here's an update:

 

I was unable to mount tmpfs using pNFS.

 

Other people (here and elsewhere) suggested that I use GlusterFS, so I've deployed that and am testing it now.

 

On my compute nodes, I created a 64 GB RAM drive on each node:

 

# mount -t tmpfs -o size=64g tmpfs /bricks/brick1


and edited my /etc/fstab to match. *edit* - I ended up removing this line from /etc/fstab, due in part to the bricks being on volatile memory; recreating the brick mount points from scratch on each reboot helped keep the configuration clean. (E.g., if I deleted a GlusterFS volume and then tried to create another one using the same brick mount points, it wouldn't let me.)
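(For reference, the other way around the "brick is already part of a volume" complaint - without recreating the brick mounts - is to clear the GlusterFS markers off the old brick path. This is to the best of my understanding, so treat it as a sketch:)

# Remove the per-brick metadata left behind by the deleted volume
setfattr -x trusted.glusterfs.volume-id /bricks/brick1/gv0
setfattr -x trusted.gfid /bricks/brick1/gv0
rm -rf /bricks/brick1/gv0/.glusterfs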

 

I then created the mount points for the GlusterFS volume and then created said volume:

 

# gluster volume create gv0 transport rdma node{1..4}:/bricks/brick1/gv0


but that was a no-go when I tried to mount it, so I disabled SELinux (based on the error message that was being written to the log file), deleted the volume, and created it again with:

 

# gluster volume create gv0 transport tcp,rdma node{1..4}:/bricks/brick1/gv0


Started the volume up, and this time I was able to mount it with:

 

# mount -t glusterfs -o transport=rdma,direct-io-mode=enable node1:/gv0 /mnt/gv0


Out of all of the test trials, here's the best result that I've been able to get so far. (The results are VERY sporadic and kind of all over the map - I haven't quite figured out why just yet.)

 

[root@node1 gv0]# for i in `seq -w 1 4`; do dd if=/dev/zero of=10Gfile$i bs=1024k count=10240; done
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 5.47401 s, 2.0 GB/s
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 5.64206 s, 1.9 GB/s
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 5.70306 s, 1.9 GB/s
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 5.56882 s, 1.9 GB/s


Interestingly enough, when I try to do the same thing on /dev/shm, I only max out at around 2.8 GB/s.

 

So at best right now, with GlusterFS, I'm able to get about 16 Gbps throughput on four 64 GB RAM drives (for a total of 256 GB split across four nodes).

 

Note that this IS with a distributed volume for the time being.

 

Here are the results with the dispersed volume:

 

[root@node1 gv1]# for i in `seq -w 1 4`; do dd if=/dev/zero of=10Gfile$i bs=1024k count=10240; done
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 19.7886 s, 543 MB/s
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 20.9642 s, 512 MB/s
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 20.6107 s, 521 MB/s
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 21.7163 s, 494 MB/s


It's quite a lot slower.


Looks like I missed this thread being started. I'll have to share my gear when I find the time. It's too bad the server enthusiasts here are in the minority on the forum. Seeing more discussions like this would be great.


Guides & Tutorials:

VFIO GPU Pass-though w/ Looking Glass KVM on Ubuntu 19.04

A How-To Guide: Building a Rudimentary Disk Enclosure

Three Methods to Resetting a Windows Login Password

A Beginners Guide to Debian CLI Based File Servers

A Beginners Guide to PROXMOX

How to Use Rsync on Microsoft Windows for Cross-platform Automatic Data Replication

A How To Guide: Setting up SMB3.0 Multichannel on FreeNAS

How You can Reset Your Windows Login Password with Hiren's BootCD - (Depreciated)

 

Guide/Tutorial in Progress:

How to Setup Drive Sharing in Windows 10

 

In the Queue:

How to Format a HDD/SSD in Windows

How to Flash a RAID Card to IT Mode

 

Don't see what you need? Check the Full List or *PM me, if I haven't made it I'll add it to the list.

*NOTE: I'll only add it to the list if the request is something I know I can do.

