
Large Parallel Streaming Writes: PCIe SSD vs HDD RAID

An interesting question was posed to me on Twitter last night, and I was curious what the LTT community's take on it would be.

 

When working with large parallel streaming writes, which is better: a big RAID array of spinning drives, or a single PCIe SSD?

 

Data-wise, we're talking about multiple machines writing multiple streams of uncompressed video to a SAN. In my specific case, I have a couple of PCs each capturing multiple uncompressed 1080p60 streams to my SAN. After editing, I use a group of 4 machines to render out the final file that I upload to YouTube, and then I'll usually delete the source footage (or sometimes I'll run it through the render farm so I have H.264 copies in case I ever need them).

 

My current SAN  has 56 spinning drives (mainly because they were cheap) in RAID-50.  Overall it can sustain 1.6GB/s of throughput, although 10GbE limits me to 1.25GB/s.
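For context, here's a rough back-of-the-envelope for what each capture stream needs. It assumes 8-bit 4:2:2 (2 bytes per pixel), which is an assumption on my part; the real figure depends on the capture pixel format:

# Rough per-stream bandwidth for uncompressed 1080p60 capture.
# Assumes 8-bit 4:2:2 (e.g. UYVY), i.e. 2 bytes per pixel; 10-bit formats need more.
width, height, fps = 1920, 1080, 60
bytes_per_pixel = 2
stream_mb_s = width * height * fps * bytes_per_pixel / 1e6
print(f"per stream: ~{stream_mb_s:.0f} MB/s")                      # ~249 MB/s
for streams in (1, 2, 4):
    print(f"{streams} stream(s): ~{streams * stream_mb_s / 1000:.2f} GB/s")
# 4 simultaneous streams come to roughly 1.0 GB/s, which fits inside both the
# 1.25 GB/s 10GbE cap and the array's 1.6 GB/s sustained figure.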

 

So - would I have been better served by putting the $1200 I spent on this towards a PCIe SSD (and sacrificing most of my space doing that, since a 6TB SSD is $22K!), or is the path I chose the best I could do within reason?



As you said, you're already limited by the 10GbE network connection anyway, so going with an SSD wouldn't improve speeds. So would it be better? Well, in the sense that it's simpler and SSD-based (so lower latency), yes, but it's also less practical since it costs significantly more and/or offers significantly less space.



Well, fundamentally a large number of HDDs will give excellent sequential read and write performance.

 

Are you able to give more details on the current system? Is it a ZFS-based server or hardware RAID?

 

Depending on what you have, you can use SSD write caching to significantly improve performance. However, be aware that the SSD cache actually needs to be faster than the HDDs, and once we're talking about a hundred or more disks, no single SSD, not even an NVMe one, will be faster sequentially.

 

For a SATA-class HDD we calculate on a baseline of 80 IOPS per disk and 100MB/s sustained transfer speed; this takes into account efficiency loss over a large number of disks and is more realistic than what the manufacturer claims.
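As an illustration, here's that baseline applied to the 56-disk RAID-50 from the first post (just a rough sketch using the 8 x 7 layout described later in the thread; the actual drives may well fall short of these figures):

# Applying the 80 IOPS / 100 MB/s per-disk baseline to a 56-disk RAID-50
# built as 8 striped RAID-5 spans of 7 disks (one disk of parity per span).
disks, spans = 56, 8
per_disk_mb_s, per_disk_iops = 100, 80
data_disks = disks - spans               # one parity disk per RAID-5 span
print(f"sequential ceiling: ~{data_disks * per_disk_mb_s / 1000:.1f} GB/s")   # ~4.8 GB/s
print(f"random ceiling:     ~{disks * per_disk_iops} IOPS")                   # 4480 IOPS
# On paper the spindles can stream far more than the 1.6 GB/s quoted above;
# the interconnect, not the disks, is what caps that array.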

 

The real benefit of SSDs is latency, IOPS, and the small number of disks needed to deliver very large throughput. Cost per GB is much worse, obviously.

 

Edit:

Also, for HDDs, performance decreases as the I/O demand goes up, so that is something to be aware of. SSDs don't really suffer from this, and in fact most of them need that heavy demand to reach their full potential performance.


11 minutes ago, leadeater said:

Are you able to give more details on the current system? Is it a ZFS-based server or hardware RAID?

Alright - more info on the setup:

 

The setup is a mix of hardware & software RAID. I have 8 arrays of 7 drives in hardware RAID-5; each array has 512MB of cache, plus each drive has 8MB onboard. Each array is connected to the host server via 2-gigabit Fibre Channel. The 8 logical "drives" are then put in a software RAID-0 within macOS's Disk Utility. The host machine is a quad-core G5 (older, but still plenty fast as a file server) with 16GB of RAM, 2 SSDs in software RAID-1 for the OS, and a Myricom 10GbE card. File sharing is handled via NFS, so no issues there.

 

The drives themselves are IDE (I said this was cheap, didn't I?), but with 7 drives behind a 2-gigabit FC link, individual drive speed is effectively moot since the link is saturated (about 200MB/s after overhead).
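For what it's worth, the numbers line up; here's a quick sketch of where the 1.6GB/s figure comes from, assuming roughly 200MB/s of usable bandwidth per 2-gigabit FC link as noted above:

# Aggregate ceiling for 8 RAID-5 arrays of 7 drives, each on its own
# 2 Gb Fibre Channel link (~200 MB/s usable), striped together with RAID-0.
arrays, drives_per_array = 8, 7
fc_mb_s = 200                             # per-link throughput after overhead
print(f"aggregate FC ceiling: ~{arrays * fc_mb_s / 1000:.1f} GB/s")   # ~1.6 GB/s
usable = (drives_per_array - 1) / drives_per_array
print(f"usable capacity: ~{usable:.0%} of raw")                       # ~86%
# The Fibre Channel links, not the IDE drives, set the 1.6 GB/s ceiling,
# and the 10GbE front end trims that to 1.25 GB/s on the network side.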

 

I don't know how much the hardware RAID, RAM caching, and other factors play a part in masking any latency, but I've yet to drop a frame even with 4 streams recording.


1 hour ago, FaultyWarrior said:


Thanks for the extra information.

 

The only thing I would recommend is to start looking into a cheap used SAS disk shelf so you can migrate to SATA disks when you need to. Stick with a higher number of disks rather than the fewest possible, i.e. don't go below 8 disks for what you are doing.

 

An IBM EXP3000 is a good choice at around $150 USD on eBay, and you can chain the shelves off a single RAID card, so you could also drop the software RAID-0 since you won't need it.

