
DNA Processing Server... A $30,000 adventure

So I fairly recently started working in a research lab that works with microbial genomes. Basically it’s a ton of data processing. My boss and I got talking about PC building being one of my hobbies, and he asked me if I would be interested in leading the search for a new server. With a rough budget of 30,000 USD, I pretty much agreed just for the adventure. Only thing is, I’m very new to the server space.

 

Here are the key specs he’s after:

At least 80 threads (more would be better)

At least 1TB RAM (once again, more is good)

Somewhere above 150TB of storage (in a RAID array, ~200TB ideally)

 

We already run two other servers with less storage that are 24c/48t and 40c/80t. Do you wonderful people have any recommendations on where to look, either for the right parts or for pre-built servers matching these specs? Dell Enterprise and HPE just aren’t cutting it, especially since I know Epyc Rome exists. I’ve thought about either doing it all in one box like the existing servers or setting up separate processing and storage servers. What do you all think?


Just go for EPYC. That will literally give you up to 128 threads and support for up to 4TB of RAM at a lower cost than going with Xeons.

I'm with the mentality of "IF I'M NOT SURE IF IT'S ENOUGH COOLING, GO OVERKILL"

 

CURRENT PC SPECS

CPU             : Ryzen 5 3600 (formerly Ryzen 3 1200)

GPU             : ASUS RX 580 Dual OC (formerly an ASUS GTX 1060, but it got corroded for some odd reason)

GPU COOLER      : ID-Cooling Frostflow 120 VGA (stock cooler overheats even when undervolted :()

MOBO            : MSI B350M Bazooka

MEMORY          : Team Group Elite TUF DDR4 3600MHz CL16
STORAGE         : Seagate Barracuda 1TB and Kingston SSD
PSU             : Thermaltake Litepower 550W (going to change soon as I don't trust this)
CASE            : Rakk Anyag Frost
CPU COOLER      : ID-Cooling SE 207
CASE FANS       : Mix of ID-Cooling fans, Corsair fans and Rakk Ounos (planned change to ID-Cooling)
DISPLAY         : SpectrePro XTNS24 144Hz curved VA panel
MOUSE           : Logitech G603 Lightspeed
KEYBOARD        : Rakk Lam Ang

HEADSET         : Plantronics RIG 500HD

Kingston HyperX Stinger

 

and a whole lot of LEDs everywhere (behind the monitor, behind the desk, behind the shelf the PC sits on, and inside the case)


Maybe consider the Gigabyte R282 series, with EPYC 7002 series CPUs?

You could add an external disk shelf for storage via external SAS connectivity. 

 

 

Spoiler

Desktop: Ryzen9 5950X | ASUS ROG Crosshair VIII Hero (Wifi) | EVGA RTX 3080Ti FTW3 | 32GB (2x16GB) Corsair Dominator Platinum RGB Pro 3600Mhz | EKWB EK-AIO 360D-RGB | EKWB EK-Vardar RGB Fans | 1TB Samsung 980 Pro, 4TB Samsung 980 Pro | Corsair 5000D Airflow | Corsair HX850 Platinum PSU | Asus ROG 42" OLED PG42UQ + LG 32" 32GK850G Monitor | Roccat Vulcan TKL Pro Keyboard | Logitech G Pro X Superlight  | MicroLab Solo 7C Speakers | Audio-Technica ATH-M50xBT2 LE Headphones | TC-Helicon GoXLR | Audio-Technica AT2035 | LTT Desk Mat | XBOX-X Controller | Windows 11 Pro

 

Spoiler

Server: Fractal Design Define R6 | Ryzen 3950x | ASRock X570 Taichi | EVGA GTX1070 FTW | 64GB (4x16GB) Corsair Vengeance LPX 3000Mhz | Corsair RM850v2 PSU | Fractal S36 Triple AIO + 4 Additional Venturi 120mm Fans | 14 x 20TB Seagate Exos X22 20TB | 500GB Aorus Gen4 NVMe | 2 x 2TB Samsung 970 Evo Plus NVMe | LSI 9211-8i HBA

 


4 minutes ago, Apollod said:

Dell Enterprise and HPE just aren’t cutting it, especially since I know Epyc Rome exists.

They both offer EPYC Rome servers e.g.

 https://www.hpe.com/nz/en/product-catalog/servers/apollo-systems/pip.hpe-apollo-35-v2-system.1012136313.html

https://www.hpe.com/nz/en/product-catalog/servers/proliant-servers/pip.hpe-proliant-dl385-gen10-server.1010268408.html

 

Apollo is HPE's HPC product line, and that Apollo 35 system supports 4 server nodes in a single chassis with 1TB of RAM supported per node.

 

Ideally you would have a storage server or storage array, but $30K isn't going to stretch very far when you need 1TB of RAM and two high-end CPUs.

 

A Dell server with two 7532 CPUs and 1TB of RAM is $27k without storage.


1 hour ago, Apollod said:

research lab that works with microbial genomes.

I work in a place that does the same thing.

When I get back to the office, I'll list the specs of our system(s).

NOTE: I no longer frequent this site. If you really need help, PM/DM me and my email will alert me.


R282-Z90

dual 64-core CPUs

2TB of RAM

12x 16TB Seagate drives

That's about $35.5k on some server-buying sites.
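
As a quick sanity check that a layout like that clears the ~150-200TB target, here's a rough usable-capacity sketch (my assumptions: dual-parity RAID 6 / RAID-Z2, ignoring filesystem overhead and the TB-vs-TiB difference, so treat the numbers as optimistic):

```python
# Rough usable capacity for a dual-parity array (RAID 6 / RAID-Z2 style):
# usable = (drives - parity) * drive size.
def usable_tb(num_drives: int, drive_tb: float, parity: int = 2) -> float:
    return (num_drives - parity) * drive_tb

print(usable_tb(12, 16))  # 12 x 16TB, dual parity -> 160 TB usable
print(usable_tb(16, 16))  # a hypothetical 16-bay chassis with the same drives -> 224 TB usable
```

So 12 drives lands just above the 150TB floor, and a few more bays would get comfortably past 200TB.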

Good luck, Have fun, Build PC, and have a last gen console for use once a year. I should answer most of the time between 9 to 3 PST

NightHawk 3.0: R7 5700x @, B550A vision D, H105, 2x32gb Oloy 3600, Sapphire RX 6700XT  Nitro+, Corsair RM750X, 500 gb 850 evo, 2tb rocket and 5tb Toshiba x300, 2x 6TB WD Black W10 all in a 750D airflow.
GF PC: (nighthawk 2.0): R7 2700x, B450m vision D, 4x8gb Geli 2933, Strix GTX970, CX650M RGB, Obsidian 350D

Skunkworks: R5 3500U, 16gb, 500gb Adata XPG 6000 lite, Vega 8. HP probook G455R G6 Ubuntu 20. LTS

Condor (MC server): 6600K, z170m plus, 16gb corsair vengeance LPX, samsung 750 evo, EVGA BR 450.

Spirt  (NAS) ASUS Z9PR-D12, 2x E5 2620V2, 8x4gb, 24 3tb HDD. F80 800gb cache, trueNAS, 2x12disk raid Z3 stripped

PSU Tier List      Motherboard Tier List     SSD Tier List     How to get PC parts cheap    HP probook 445R G6 review

 

"Stupidity is like trying to find a limit of a constant. You are never truly smart in something, just less stupid."

Camera Gear: X-S10, 16-80 F4, 60D, 24-105 F4, 50mm F1.4, Helios44-m, 2 Cos-11D lavs


12 hours ago, leadeater said:

They both offer EPYC Rome servers e.g.

 https://www.hpe.com/nz/en/product-catalog/servers/apollo-systems/pip.hpe-apollo-35-v2-system.1012136313.html

https://www.hpe.com/nz/en/product-catalog/servers/proliant-servers/pip.hpe-proliant-dl385-gen10-server.1010268408.html

 

Apollo is HPE's HPC product line, and that Apollo 35 system supports 4 server nodes in a single chassis with 1TB of RAM supported per node.

 

Ideally you would have a storage server or storage array, but $30K isn't going to stretch very far when you need 1TB of RAM and two high-end CPUs.

 

A Dell server with two 7532 CPUs and 1TB of RAM is $27k without storage.

I must’ve missed the HP ones. With Dell it just doesn’t seem like we can get everything we want within the budget. Yeah, $30k sounds like a lot, but after two CPUs you’re already up to ~$10k.


12 hours ago, Jarsky said:

Maybe consider the Gigabyte R282 series, with EPYC 7002 series CPUs?

You could add an external disk shelf for storage via external SAS connectivity. 

 

 

Would doing the disks this way basically make it all act as one machine? Could it be worth getting a separate NAS and processing server? (Possible expansions are down the road.)


58 minutes ago, Apollod said:

Would doing the disks this way basically make it all act as one machine? Could it be worth getting a separate NAS and processing server? (Possible expansions are down the road.)

I'd go with separate servers, so the compute box can just have an SSD cache on board, paired with a big file storage server.

But the budget will have to go up.

45Drives has cheap large-array servers.

 

1 hour ago, Apollod said:

I must’ve missed the HP ones. With Dell it just doesn’t seem like we can get everything we want within the budget. Yeah, $30k sounds like a lot, but after two CPUs you’re already up to ~$10k.

HP also has the DL385 Gen10 Plus, which is 2nd-gen EPYC, but you'll need to contact them for pricing.



5 hours ago, GDRRiley said:

I'd go with separate servers, so the compute box can just have an SSD cache on board, paired with a big file storage server.

But the budget will have to go up.

45Drives has cheap large-array servers.

 

HP also has the DL385 Gen10 Plus, which is 2nd-gen EPYC, but you'll need to contact them for pricing.

Having all the files in one place would be amazing; I may have to make some budget magic happen, though! For caching, is there any typical rule for how large the cache should be? Would it be an issue if multiple users were using the server at once?


Just now, Apollod said:

Having all the files in one place would be amazing; I may have to make some budget magic happen, though! For caching, is there any typical rule for how large the cache should be? Would it be an issue if multiple users were using the server at once?

The cache should be bigger than RAM; beyond that it just comes down to your usage.

You'd rather have enough RAM, but if you are running projects back to back the cache can help get files out of memory so they can be written to disk.

You are also going to want a 40Gb or 100Gb network.
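
To give a rough feel for why the network speed matters, here's a quick back-of-the-envelope sketch (line rates are idealized and the 10 GB file size is just an illustrative assumption; protocol overhead and disk speed will make real numbers worse):

```python
# Ideal-case transfer time for one large input file at different link speeds.
FILE_GB = 10  # illustrative file size in gigabytes

for gbit_per_s in (1, 10, 40, 100):
    seconds = FILE_GB * 8 / gbit_per_s  # gigabytes -> gigabits, divided by line rate
    print(f"{gbit_per_s:>3} GbE: ~{seconds:5.1f} s per {FILE_GB} GB file")
```

At 1GbE that's over a minute per file; at 40GbE or 100GbE it drops to a couple of seconds or less.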



23 minutes ago, Apollod said:

Would it be an issue if multiple users were using the server at once?

Do you have any job queue management like Slurm? You'll probably want to look into something like that at some point if you do go down the dedicated-storage-server-with-compute-nodes route.

 

Also, how are you accessing the files? Are the running jobs going to read them directly from the storage server, or are you going to stage the required files for the job into a local cache on the compute node running the job?


7 hours ago, Apollod said:

Would doing the disks this way basically make it all act as one machine? Could it be worth getting a separate NAS and processing server? (Possible expansions are down the road.)

Yes, it would act as one machine; the idea of using a disk shelf over SAS is that it eliminates the need for iSCSI, FCoE, or something like that.

You'd need to consider your budget more carefully if you go with dedicated storage, as you'll also need to factor in the networking. Ideally, for the storage-access speeds you need, you'll be wanting SFP+ or QSFP(+) switching.

 


2 minutes ago, Jarsky said:

Yes, it would act as one machine; the idea of using a disk shelf over SAS is that it eliminates the need for iSCSI, FCoE, or something like that.

You'd need to consider your budget more carefully if you go with dedicated storage, as you'll also need to factor in the networking. Ideally, for the storage-access speeds you need, you'll be wanting SFP+ or QSFP(+) switching.

You can get older 40Gb switches second-hand now for cheap.

This would be a good place to save some money, and buy two.



For networking at those speeds, consider DAC cables if the compute and storage machines are physically close (same rack or so), or fiber optics otherwise. On AliExpress I found fiber switches (4 or 8 SFP ports, apparently unmanaged), but I suspect they only go up to 1 Gb/s, so they're of little use in this case. Maybe on Alibaba (the enterprise counterpart of AliExpress) more suitable switches can be had for cheap-ish. No idea on quality, though.

"You don't need eyes to see, you need vision"

 

(Faithless, 'Reverence' from the 1996 Reverence album)


On 2/26/2020 at 9:23 PM, leadeater said:

Do you have any job queue management like Slurm? You'll probably want to look into something like that at some point if you do go down the dedicated-storage-server-with-compute-nodes route.

 

Also, how are you accessing the files? Are the running jobs going to read them directly from the storage server, or are you going to stage the required files for the job into a local cache on the compute node running the job?

We haven't made a call on how the files will be managed yet; we've got a few months before the grant money comes in. I know local caching can really improve the workflow on the compute server, but occasionally we deal with up to 10 files that are 8-14 gigabytes each, and I'm not sure how those would work with the cache. In that case I wonder what the performance impact of reading them directly from the storage server would be.

 

We don't use any job queues currently; to my knowledge it's never been something we've even considered, and I don't know much about that area. We can typically run several workloads at once, with a couple of notable exceptions that easily take up our entire capacity. We currently run Ubuntu on our two servers and work through an SSH terminal to run our jobs. What are the overhead and benefits of a job queue? Would it prevent us from running things simultaneously?
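
Just so I'm picturing the staging option correctly, here's a rough sketch of what I imagine it would look like (the mount points and project paths are made-up placeholders, not our actual setup): copy a job's inputs from the shared storage server to node-local scratch, run against the local copies, then clean up. Even at 10 files of 8-14 GB each, that's only ~100-140 GB per job, which should fit on a modest local NVMe scratch drive.

```python
import shutil
from pathlib import Path

# Hypothetical paths -- the mount points and project name are placeholders.
SHARED = Path("/mnt/storage-server/project_x")  # export from the storage server
SCRATCH = Path("/scratch/project_x")            # node-local SSD/NVMe scratch

def stage_inputs(filenames):
    """Copy a job's input files to local scratch and return the local paths."""
    SCRATCH.mkdir(parents=True, exist_ok=True)
    local = []
    for name in filenames:
        dst = SCRATCH / name
        if not dst.exists():                    # crude cache: skip files already staged
            shutil.copy2(SHARED / name, dst)
        local.append(dst)
    return local

def clean_scratch():
    """Free local scratch once results have been copied back to shared storage."""
    shutil.rmtree(SCRATCH, ignore_errors=True)
```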


2 hours ago, Apollod said:

What are the overhead and benefits of a job queue? Would it prevent us from running things simultaneously?

More that it'll allow you to automate the process and, if you do go with a local storage cache on the compute nodes, copy the required files over for each job. This is still something you have to set up yourself, though; it won't just work out of the box.

 

A cluster job manager will run as many jobs as you define. It's not actually something you need with only 1-3 compute nodes, but once you have 3 or more it does make it simpler to manage your jobs and allocate them to servers; doing it manually gets annoying at first and then becomes impossible to do reliably.

 

On the side, I'd suggest spinning up 4 VMs on a workstation or something, setting up a cluster manager plus 3 compute instances, and giving Slurm a try. Don't worry about speed or letting the jobs complete or anything; just do it to get a working understanding of Slurm and whether it's even worth pursuing further for you.
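
To give you an idea of what submitting work looks like once something like Slurm is running, here's a minimal sketch of a Python wrapper around Slurm's sbatch command; the partition name, resource numbers, and the placeholder command are illustrative assumptions, not anything specific to your pipeline:

```python
import subprocess

# Hypothetical example: submit one job per sample to Slurm via sbatch.
SAMPLES = ["sample_A", "sample_B", "sample_C"]

BATCH_TEMPLATE = """#!/bin/bash
#SBATCH --job-name=asm_{sample}
#SBATCH --partition=compute
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=24:00:00
#SBATCH --output=logs/asm_{sample}_%j.out

echo "Processing {sample} on $(hostname)"  # placeholder for the real pipeline command
"""

for sample in SAMPLES:
    script = BATCH_TEMPLATE.format(sample=sample)
    # sbatch reads the batch script from stdin when no script file is given.
    result = subprocess.run(["sbatch"], input=script, text=True,
                            capture_output=True, check=True)
    print(result.stdout.strip())  # e.g. "Submitted batch job 1234"
```

Slurm then queues the jobs and runs them as nodes and resources free up, which is basically the "several workloads at once" behaviour you described, just automated.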

