
Windows 11 Storage Spaces dual parity and performance

Kriz
Go to solution: Solved by leadeater (answer quoted in full further down the thread).

Hi

 

Thanks to this forum and users @leadeater and @Electronics Wizardy I have successfully deployed a 14TB Windows 11 parity Storage Space: 5 columns, 16KB interleave, 64KB AUS, 5x 4TB Toshiba N300. Performance ranges from good to great on reads (500-800 MB/s); writes are a little slower but still acceptable (270-350 MB/s). It works well for now, but as a compulsive tech enthusiast I'm already thinking about an upgrade path. The obvious way is to evacuate the data to backup, double the HDD count (when I need more space), and make it a dual parity Storage Space. My question is: how will write speed scale? Reads will most likely double on both the low and high end, but what about writes? Parity calculations will be more complex with two parity stripes.
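For context, a space with a custom column count and interleave like this is set up in PowerShell; a rough sketch of the kind of commands involved (the pool and friendly names below are placeholders, not necessarily the real ones):

# Rough sketch only - pool/disk friendly names are placeholders
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "Pool" `
    -StorageSubSystemFriendlyName (Get-StorageSubSystem).FriendlyName `
    -PhysicalDisks $disks

# 5 columns of single parity with a 16KB interleave
New-VirtualDisk -StoragePoolFriendlyName "Pool" -FriendlyName "ParitySpace" `
    -ResiliencySettingName Parity -PhysicalDiskRedundancy 1 `
    -NumberOfColumns 5 -Interleave 16KB -ProvisioningType Thin -Size 14TB

# 4 data columns x 16KB interleave = 64KB, so format with a matching 64KB AUS
Get-VirtualDisk -FriendlyName "ParitySpace" | Get-Disk |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -UseMaximumSize -AssignDriveLetter |
    Format-Volume -FileSystem NTFS -AllocationUnitSize 65536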

 

Also, is there a way to control the RAM write cache? During a file transfer I can see that it is working, but it stops fairly quickly even though there is plenty of RAM left in my system (16 GB, with about 10 GB free most of the time). I know there is an option in Device Manager to disable write-cache buffer flushing for each disk, and in PowerShell you can tell Storage Spaces that the server is power protected, but how effective would that be for improving write speed?
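For what it's worth, the power-protected flag is a per-pool setting in PowerShell; a minimal sketch, assuming the pool is called "Pool" (only do this with a UPS, since it tells Storage Spaces it can skip forced cache flushes):

# Check the current setting
Get-StoragePool -FriendlyName "Pool" | Select-Object FriendlyName, IsPowerProtected

# Mark the pool as power protected so writes can be cached more aggressively
Set-StoragePool -FriendlyName "Pool" -IsPowerProtected $true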

 

I don't own a UPS, but if the gain were significant I would consider buying one.

 

Thanks for all the help, everyone!


4 minutes ago, Kriz said:

Thanks to this forum and users leadeater and ElectronicsWizardry I have successfully deployed a 14TB Windows 11 parity Storage Space. [...]

Type an @ and then their name if you want to holler at them: @leadeater and @Electronics Wizardy.

"Do what makes the experience better" - in regards to PCs and Life itself.

 

Onyx AMD Ryzen 7 7800x3d / MSI 6900xt Gaming X Trio / Gigabyte B650 AORUS Pro AX / G. Skill Flare X5 6000CL36 32GB / Samsung 980 1TB x3 / Super Flower Leadex V Platinum Pro 850 / EK-AIO 360 Basic / Fractal Design North XL (black mesh) / AOC AGON 35" 3440x1440 100Hz / Mackie CR5BT / Corsair Virtuoso SE / Cherry MX Board 3.0 / Logitech G502

 

7800X3D - PBO -30 all cores, 4.90GHz all core, 5.05GHz single core, 18286 C23 multi, 1779 C23 single

 

Emma : i9 9900K @5.1Ghz - Gigabyte AORUS 1080Ti - Gigabyte AORUS Z370 Gaming 5 - G. Skill Ripjaws V 32GB 3200CL16 - 750 EVO 512GB + 2x 860 EVO 1TB (RAID0) - EVGA SuperNova 650 P2 - Thermaltake Water 3.0 Ultimate 360mm - Fractal Design Define R6 - TP-Link AC1900 PCIe Wifi

 

Raven: AMD Ryzen 5 5600x3d - ASRock B550M Pro4 - G. Skill Ripjaws V 16GB 3200Mhz - XFX Radeon RX6650XT - Samsung 980 1TB + Crucial MX500 1TB - TP-Link AC600 USB Wifi - Gigabyte GP-P450B PSU -  Cooler Master MasterBox Q300L -  Samsung 27" 1080p

 

Plex : AMD Ryzen 5 5600 - Gigabyte B550M AORUS Elite AX - G. Skill Ripjaws V 16GB 2400Mhz - MSI 1050Ti 4GB - Crucial P3 Plus 500GB + WD Red NAS 4TBx2 - TP-Link AC1200 PCIe Wifi - EVGA SuperNova 650 P2 - ASUS Prime AP201 - Spectre 24" 1080p

 

Steam Deck 512GB OLED

 

OnePlus: 

OnePlus 11 5G - 16GB RAM, 256GB NAND, Eternal Green

OnePlus Buds Pro 2 - Eternal Green

 

Other Tech:

- 2021 Volvo S60 Recharge T8 Polestar Engineered - 415hp/495tq 2.0L 4cyl. turbocharged, supercharged and electrified.

Lenovo 720S Touch 15.6" - i7 7700HQ, 16GB RAM 2400MHz, 512GB NVMe SSD, 1050Ti, 4K touchscreen

MSI GF62 15.6" - i7 7700HQ, 16GB RAM 2400 MHz, 256GB NVMe SSD + 1TB 7200rpm HDD, 1050Ti

- Ubiquiti Amplifi HD mesh wifi

 


3 hours ago, Kriz said:

The obvious way is to evacuate the data to backup, double the HDD count (when I need more space), and make it a dual parity Storage Space. My question is: how will write speed scale? [...]

 

One thing to note is that you need to create a new virtual disk when changing to a different parity level. You can also just add drives and it will spread the data across all the new drives and keep the single parity.
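Adding the drives to the pool is essentially a one-liner; something like this, with the pool name being a placeholder:

# Add any new, not-yet-pooled disks to the existing pool
Add-PhysicalDisk -StoragePoolFriendlyName "Pool" `
    -PhysicalDisks (Get-PhysicalDisk -CanPool $true)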

 

I haven't tested that exact config, but I believe a column count of 10, with 8 data and 2 parity, should perform well for this setup.

 

3 hours ago, Kriz said:

Also, is there a way to control the RAM write cache? [...]

 

I will note that memory used for caching is often shown as free, so check that you're reading it correctly.

 

Also, you probably don't want to cache writes too heavily in RAM, as you can lose a lot of data in a power outage or crash.

 

I don't know of a way to change the write cache settings other than the usual ones already mentioned, and Windows already seems to maintain a decent-sized write cache for tasks like file copies that don't call sync.

 


5 hours ago, Kriz said:

Performance ranges from good to great on reads (500-800 MB/s); writes are a little slower but still acceptable (270-350 MB/s). [...] My question is: how will write speed scale? [...]

What CPU are you using?

Single-core performance is going to make a difference on writes because of the parity calculation.

Writes are always going to be slower than reads.

Not sure if it was a platform or drive difference, but with TrueNAS SCALE and Seagate IronWolf NAS drives in a 4-disk test array I was hitting about 350 MB/s with a single parity drive.


20 hours ago, m9x3mos said:

What CPU are you using? Single-core performance is going to make a difference on writes because of the parity calculation. [...]

My CPU is a Core i3 4330. With single parity, CPU usage is well under control, so there should be plenty of headroom. If need be I could use a 4790K, but I would rather keep it for something else.

22 hours ago, Electronics Wizardy said:

You can also just add drives and it will spread the data across all the new drives and keep the single parity. [...]

I know that Storage Spaces is quite flexible, but if you want to keep your data I'm pretty sure the number of columns and interleave size is fixed, so I'm not sure what would happen when I add a 6th disk to the array. Would it work? Probably. Would it perform well? Rather unlikely, because with 5 stripes (4 data and 1 parity) the interleave size translates perfectly to the AUS you can format the array with. I haven't tested that though, so it's pure speculation. So I think the only way to keep your write performance is to update and verify your backup, nuke the array, and start from scratch with double the number of disks and new PowerShell settings.

 

One more question: can you recommend any HBA cards (about 8 SATA devices would suffice) from a reputable brand like LSI? There is plenty of this stuff around, but finding the right gear is not easy, especially when manufacturers often provide conflicting information in different places. Doing my own research I found the LSI SAS 9205-8i H220 HBA card. It's not enough on its own to drive 10 disks (without some SATA multipliers), but with Storage Spaces using two controllers shouldn't be a problem.

 

Oh, and once again thanks for all the help along the way! It was a big help in getting my foot in the door.

 


28 minutes ago, Kriz said:

I'm pretty sure the number of columns and interleave size is fixed, so I'm not sure what would happen when I add a 6th disk to the array. [...]

The interleave and number of columns are fixed when a virtual disk is made. If you add disks, it just spreads the data across more disks, and performance should stay about the same.
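You can check what a virtual disk was created with at any time, for example:

# Show the fixed layout parameters of every virtual disk in the system
Get-VirtualDisk | Select-Object FriendlyName, ResiliencySettingName, PhysicalDiskRedundancy, NumberOfColumns, Interleave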

 

29 minutes ago, Kriz said:

One more question: can you recommend any HBA cards (about 8 SATA devices would suffice) from a reputable brand like LSI? [...]

 

I'd probably try to get newer cards now, like the 9305-8i, as the 9200 series is getting pretty old and its drivers aren't included in Windows 11 by default anymore. Using multiple controllers isn't an issue, but if you need more drives, look into the 16-port models.


1 hour ago, Electronics Wizardy said:

The interleave and number of columns are fixed when a virtual disk is made. If you add disks, it just spreads the data across more disks, and performance should stay about the same.

 

That's interesting. So if I add a 6th drive to the pool and optimize usage (to spread data equally over all disks), the data would still be read from 4 data stripes and written as 4 data stripes + parity, but some stripes would be offset by one disk in an alternating pattern, so performance can stay the same and the additional drive's capacity can be used effectively? I'm not sure whether I'm explaining clearly how I imagine this, but I hope you get the gist of it.

 

That would be good news. If I needed just a few gigs of extra space, adding a new drive would be easy, and I still have a free SATA port on the same controller, but I was afraid the trick for getting good write performance would stop working.

1 hour ago, Electronics Wizardy said:

I'd probably try to get newer cards now, like the 9305-8i, as the 9200 series is getting pretty old and its drivers aren't included in Windows 11 by default anymore. Using multiple controllers isn't an issue, but if you need more drives, look into the 16-port models.

I was afraid you were going to say that. I will eventually have to get one of these cards, but the newer 16-port models are quite pricey, like this one: https://www.ebay.com/itm/404443071143?hash=item5e2aaf82a7:g:fFoAAOSwNGdjrEIY&amdata=enc%3AAQAIAAAA8PUyDw8VKH5rWKGcPC8Y8sX1DcXNpwhNRZhag62tkNnyq6SNIiB2Jfg9%2F2DZlNvfhOGK4jfLGRd1SrIq8tGKrdbpiyq5lNNDs0pUZ0UzW3EByV9UOOcnIbxSrbaxthTZyC2t%2BtwiTUFojJjsIx7AxfKB2L%2Bv9pO9ldsfe0ctcAOHIg98FMu4Ijxl2kOcDvmvFQiMChafnLzkCXsCVb%2Fp7ekRNx0YLoe7WPSHIZOcJ8SzCyKEHqvRA4poK30NhIije2ZFZ2RSbXPC%2F6ZStcy0TAH%2Bv1ixBC0pv2f%2FOYbqTwpaIYnEY6KZ84eP3G6pueY9nw%3D%3D|tkp%3ABk9SR7TqlcHaYg.


36 minutes ago, Kriz said:

So if I add a 6th drive to the pool and optimize usage (to spread data equally over all disks), the data would still be read from 4 data stripes and written as 4 data stripes + parity, so performance can stay the same and the additional drive's capacity can be used effectively? [...]

You can also mix drive sizes. 4TB drives are pretty small these days, so I'd probably try to get bigger drives instead of more drives. It also saves on power.

 

That's pretty much correct; it just changes which drives the data lands on, so it's spread out and can use all the space of pools bigger than the number of columns.

 

36 minutes ago, Kriz said:

I will eventually have to get one of these cards, but the newer 16-port models are quite pricey. [...]

I'd probably go with the 9300-16i here. Newer gen and fairly cheap from what I see.


12 hours ago, Kriz said:

I'm pretty sure the number of columns and interleave size is fixed, so I'm not sure what would happen when I add a 6th disk to the array. [...]

I would have to test it, but I'm fairly sure adding disks to the pool increases capacity and performance without having to change the virtual disk, columns, etc. The performance increase isn't as significant and is only achieved at higher queue depths/outstanding IOs though, so for lighter workloads no increase would be expected, i.e. a single file transfer using Explorer.
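If you want to see that effect yourself, a tool like Microsoft's diskspd lets you control queue depth; for example (the file path, size and duration are just illustrative):

# Heavy load: sequential 1MiB writes, 16 outstanding I/Os, OS/drive caches bypassed
diskspd.exe -c64G -b1M -d60 -t1 -o16 -w100 -Sh D:\diskspd-test.dat

# Light load for comparison: a single outstanding I/O
diskspd.exe -c64G -b1M -d60 -t1 -o1 -w100 -Sh D:\diskspd-test.dat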

 

It is something I could test, but realistically I'm not sure I would actually get around to doing it. If you did double your disk count, though, you would be able to test it out before creating a new virtual disk configuration and migrating the data.

 

As you already know, allocation unit alignment is far more important anyway, more so than anything else.

 

10 hours ago, Kriz said:

So if I add a 6th drive to the pool and optimize usage, the data would still be read from 4 data stripes and written as 4 data stripes + parity, but some stripes would be offset by one disk in an alternating pattern? [...]

The data is only redistributed after you run a Storage Optimization task, either via PowerShell or as the scheduled task in Task Scheduler. Since the column count and interleave do not change, it's only where the data chunks are physically stored that changes, so the number of disks involved in a single I/O operation stays the same. However, if you are issuing multiple concurrent I/O operations, those can be spread across different physical disks, which means that even if you don't have exactly double your column count of physical disks, there should be a measurable performance increase in heavy I/O workloads.

 

It's been a long time since I tested this, so this is just theory and going off memory, but either way it's a situation that isn't really applicable at home, where everything is going to be lighter workloads with low queue depths/outstanding IOs.

 

Fortunately, after you add disks you can create a test virtual disk with different configuration parameters and benchmark the resulting performance to figure out what actually works best. Storage Spaces is nice like that; it's a fairly unique capability compared to other single-server software storage solutions.
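For example, something along these lines (the friendly name, layout numbers and size are placeholders) to trial a candidate layout before committing to it:

# Create a small throwaway virtual disk with the layout you want to evaluate
New-VirtualDisk -StoragePoolFriendlyName "Pool" -FriendlyName "LayoutTest" `
    -ResiliencySettingName Parity -PhysicalDiskRedundancy 2 `
    -NumberOfColumns 10 -Interleave 32KB -ProvisioningType Thin -Size 500GB

# ...format it, benchmark it, then clean up
Remove-VirtualDisk -FriendlyName "LayoutTest"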


3 hours ago, leadeater said:

I would have to test it, but I'm fairly sure adding disks to the pool increases capacity and performance without having to change the virtual disk, columns, etc. [...] The data is only redistributed after you run a Storage Optimization task, either via PowerShell or as the scheduled task in Task Scheduler. [...] Fortunately, after you add disks you can create a test virtual disk with different configuration parameters and benchmark the resulting performance to figure out what actually works best.

 

Well, my NAS is mostly for big file transfers to and from my main PC (hence I'm using 10GbE adapters as a point-to-point connection, in addition to general 1GbE access for all other devices), so no heavy I/O operations. So for additional disks to be utilized, I have to run disk optimization to redistribute data across the physical disks?

 

@Electronics Wizardy is pointing out that you can even use different-size disks. I get how unRAID works (with the biggest drives as parity and data basically spanned across the data disks), so there it will work, but in Storage Spaces with data striping? What if you filled the array up to the size of the smallest drive, what then? And what about the parity data? At least with mismatched columns/physical disks I have some idea how it might work.

 

Oh, and another question: I know the GUI will not let you create a dual parity space with fewer than 7 columns, but what about PowerShell; is it possible there, or is it a hard-coded minimum? For five 4TB disks, dual parity is probably overkill, but with 8TB drives I would feel much better with dual parity, even at the cost of storage efficiency.

 

12 hours ago, Electronics Wizardy said:

You can also mix drive sizes. 4TB drives are pretty small these days, so I'd probably try to get bigger drives instead of more drives. It also saves on power.

 

My justification, flawed as it may be, was to use 5x 4TB drives instead of 3x 8TB drives for better speed (and speed is quite important to me), but I didn't factor in the need for further expansion, and it will definitely be needed down the line. Since I need time to put aside money and hardware for an upgrade, I'd like to plan ahead; better to start now rather than later.

 

If I use, let's say, an 8TB drive, will the array let me use the last 4TB that isn't reflected in the smaller disks, or will that space be reserved for additional parity until all the other disks are replaced with equal or larger drives? I can't imagine how data can be pulled from 4 stripes at once from 4 different disks when all the other disks are much smaller.

 

PS: I imagine a parity space with 5 columns and 6 disks works something like this:

 

Disk:  1   2   3   4   5   6
       a   b   c   d   p   p2
       a   b   c   p   a2  d
       a   b   p   b2  c   d
       a   p   c2  b   c   d
       p   d2  a   b   c   d

       a2  a   b   c   d   p

Numbers represent physical disks and letters represent columns/stripes (a-d are data, p is parity, with a2-d2 and p2 being a second layer of stripes to utilize the full space without compromising performance). I might be dead wrong, but I can't imagine a different way it could work.


4 hours ago, Kriz said:

So for additional disks to be utilized, I have to run disk optimization to redistribute data across the physical disks?

Sort of; new data will use the new disks, and existing data will stay where it is. The PowerShell command is a really simple one-liner, so there's no reason not to run it, and the GUI does it by default.
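The one-liner in question is the pool optimize/rebalance command, i.e. something like this (pool name is a placeholder):

# Redistribute existing data slabs across all physical disks in the pool
Optimize-StoragePool -FriendlyName "Pool"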

 

4 hours ago, Kriz said:

@Electronics Wizardy is pointing out that you can even use different-size disks. I get how unRAID works (with the biggest drives as parity and data basically spanned across the data disks), so there it will work, but in Storage Spaces with data striping?

Yes, and this is where the column count really matters a lot. If your virtual disk's column configuration requires 4 physical disks and you have 4x 4TB, then later add 4x 8TB, your virtual disk can utilize the equivalent of 12x 4TB of capacity; the 4x 8TB disks can hold twice as much as the 4TB disks, and everything fits nicely within the physical disk boundaries.

 

So mixed disk sizes can absolutely be used, but you have to think about the physical layout and column sizes and where they can fit; it gets really complicated and confusing quickly, unfortunately. The lower the column count, the easier it is and the more efficient/effective capacity you will gain and be able to use when adding disk(s) to existing virtual disks.

 

4 hours ago, Kriz said:

Oh, and another question: I know the GUI will not let you create a dual parity space with fewer than 7 columns, but what about PowerShell; is it possible there, or is it a hard-coded minimum?

7 is the absolute minimum. Single and dual parity are not RAID parity in Storage Spaces, FYI; it's actually erasure coding, and the minimums are based on how that works plus, I believe, what Microsoft has deemed sensible configuration minimums. A dual parity configuration that allowed fewer disks would give the same usable capacity as a two-way mirror, for example, but require more disks and perform worse, so there is no reason to offer/allow it.

 

4 hours ago, Kriz said:

If I use, let's say, an 8TB drive, will the array let me use the last 4TB that isn't reflected in the smaller disks, or will that space be reserved for additional parity until all the other disks are replaced with equal or larger drives? I can't imagine how data can be pulled from 4 stripes at once from 4 different disks when all the other disks are much smaller.

Fundamentally, disk size does not matter; it's only used to check whether there is capacity to place data into at the moment of a write, and to show you the maximum size a virtual disk can be when doing resize/creation operations. Other than that, if the interleave size is 256MB, then all that is required is a number of disks matching the column count, each with 256MB free.

 

You can start a pool with 3x 8TB disks and create a single parity virtual disk that uses 3 columns. Add 1x 1TB and you gain 0.64TB usable for the virtual disk; add another 1x 2TB and you gain 1.38TB usable; then add another 1x 3TB and you gain 2.07TB usable, for a total of 4.09TB of usable capacity added. Now, if you add a fourth disk of 2TB you will gain 1.34TB usable for the virtual disk. If you then add another 2TB disk and remove the 1TB disk, you will gain 0.65TB usable for the virtual disk.

 

Confused yet? lol. Easy rule of thumb: if you add disk(s)/capacity, you will gain usable capacity that isn't too far off the theoretical maximum (see below).

 

Single Parity 3 Column

 

8, 8, 8 = 15.84TB (100%)

Initial creation using 3x 8TB VHDX files and a Fixed virtual disk type (required for this demo) with PowerShell.
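A rough sketch of what that initial creation looks like in PowerShell (names are placeholders, and the three 8TB VHDX files are assumed to already be attached and poolable):

# Pool the three 8TB (virtual) disks
New-StoragePool -FriendlyName "DemoPool" `
    -StorageSubSystemFriendlyName (Get-StorageSubSystem).FriendlyName `
    -PhysicalDisks (Get-PhysicalDisk -CanPool $true)

# Fixed provisioning, single parity, 3 columns, using all available capacity
New-VirtualDisk -StoragePoolFriendlyName "DemoPool" -FriendlyName "Disk1" `
    -ResiliencySettingName Parity -PhysicalDiskRedundancy 1 `
    -NumberOfColumns 3 -ProvisioningType Fixed -UseMaximumSize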

 

8, 8, 8 + 1 = 16.48TB of 16.62TB (99.16%)

Resize-VirtualDisk -FriendlyName Disk1 -Size 16.48TB

0.255TB not usable for the original virtual disk

 

8, 8, 8, 1 + 2 = 17.86TB of 17.95TB (99.5%)

Resize-VirtualDisk -FriendlyName Disk1 -Size 17.86TB

 

8, 8, 8, 1, 2 + 3 = 19.93TB (100%, within error/overhead)

Resize-VirtualDisk -FriendlyName Disk1 -Size 19.93TB

 

8, 8, 8, 1, 2, 3 + 2 = 21.27TB (100%, within error/overhead)

Resize-VirtualDisk -FriendlyName Disk1 -Size 21.27TB

 

8, 8, 8, 1, 2, 3, 2 + 2 - 1 = 21.92TB of 21.93TB (99.95%)

Resize-VirtualDisk -FriendlyName Disk1 -Size 21.92TB

 

 


1 hour ago, leadeater said:

Fundamentally, disk size does not matter... Confused yet? lol. Easy rule of thumb: if you add disk(s)/capacity, you will gain usable capacity that isn't too far off the theoretical maximum. [...]

Damn, that was a lot more detailed than I was expecting. I would be lying if I said I understand everything now, but hey, that has never stopped me from tinkering before. So what would be a sensible upgrade path then? I have 5x 4TB Toshiba N300s (16K interleave, the smallest PowerShell will allow as far as I have tested, and 64K AUS for some efficiency; in my limited testing this setup proved sufficient for decent speeds).

My plan was, if need be, to create another space with 3x 4TB N300s, and when even more space is needed, add two more drives and merge all of these disks into 10x 4TB N300s with dual parity, to not only preserve but hopefully improve write speed (currently I'm getting about 270-350 MB/s, usually at the high end of that range). That would necessitate backing everything up and rebuilding; I'm well aware of that. But as @Electronics Wizardy pointed out, it requires twice the power for twice the storage. Speed, however, should be a lot better, especially on reads.

But if I could strategically add disks to the existing pool without losing any performance, that would be the preferable path until I got close to 10 disks, at which point it would make sense to rebuild the array from scratch as a dual parity space. What's your take on that?

 

Oh, and thanks for the time you spent testing; I couldn't have done it myself.

 


27 minutes ago, Kriz said:

But if I could strategically add disks to the existing pool without losing any performance, that would be the preferable path until I got close to 10 disks, at which point it would make sense to rebuild the array from scratch as a dual parity space. What's your take on that?

Add disks to the existing pool; there is no reason not to, and no negative performance impact from doing so. Pools themselves don't carry much meaning or weight in Storage Spaces beyond logically limiting which disks can be used, for whatever 'reasons' someone might have. Someone will always want that, but generally speaking, don't create multiple pools. When you create virtual disks you have all the options in the world, including choosing which physical disks within a pool are allowed to be used for it (don't do this without a damn good reason).

 

If you end up with 10x 4TB in a single pool and create a new virtual disk with dual parity, 10 columns, 16KB interleave, and 128K AUS, you'll probably, and this is just guessing, only get around 500MB/s write. I would experiment with larger interleave and AUS sizes too; if most of your files are large(ish), then higher is technically better. Even so, just run some benchmarks as you add disks and see how performance increases; you might find no change is required at all (doubtful, but there's no point making work for yourself if you don't need to).
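For reference, the dual parity version of that virtual disk would be created roughly like this (the pool/disk names and size are placeholders):

New-VirtualDisk -StoragePoolFriendlyName "Pool" -FriendlyName "DualParity" `
    -ResiliencySettingName Parity -PhysicalDiskRedundancy 2 `
    -NumberOfColumns 10 -Interleave 16KB -ProvisioningType Thin -Size 28TB

# 8 data columns x 16KB interleave = 128KB, so a 128KB AUS keeps full stripes aligned
Get-VirtualDisk -FriendlyName "DualParity" | Get-Disk |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -UseMaximumSize -AssignDriveLetter |
    Format-Volume -FileSystem NTFS -AllocationUnitSize 131072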

 

If you do plan on creating a new virtual disk in the pool, then don't expand the existing virtual disk much; you need pool capacity to create the new virtual disk and migrate the data. Otherwise you have to take the data off-system and then copy it back, which is more painful.


8 hours ago, Kriz said:

If I use, let's say, an 8TB drive, will the array let me use the last 4TB that isn't reflected in the smaller disks, or will that space be reserved for additional parity until all the other disks are replaced with equal or larger drives?

[screenshot]

Add 1x 8TB disk:

[screenshot]

Can use nearly all of the 8TB disk's capacity.


4 hours ago, leadeater said:

If you end up with 10x 4TB in a single pool and create a new virtual disk with dual parity, 10 columns, 16KB interleave, and 128K AUS, you'll probably, and this is just guessing, only get around 500MB/s write. [...]

 

1 hour ago, leadeater said:

Can use nearly all of the 8TB disk's capacity. [...]

 

Once again, thanks for all the testing. 500 MB/s write would be good enough for me, so I think I'm going to stick with 4TB drives and keep expanding my array, hopefully one day maxing out my home server's HDD capacity. Still, this knowledge will be useful in the future. Energy costs in Eastern Europe are through the roof with the war and all, but I operate my server more or less as a standard Windows PC with daily power cycles, so hard drive longevity is ultimately the bigger concern.

 

Thanks for the help as always, guys!

 


3 hours ago, Kriz said:

Once again, thanks for all the testing. 500 MB/s write would be good enough for me. [...]

Just keep in mind that's only an estimate; I didn't do any performance testing, as all those disks are virtual disks on an SSD and were used just for the capacity calculations. I find doing that easier than trying to do any math on HDD sizes, counts, interleave, columns, etc. Just create some thin provisioned/dynamic VHDX files in Disk Management and then create a Storage Pool etc. with them.
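If you have the Hyper-V PowerShell module available, the same modelling trick can be scripted instead of clicking through Disk Management; a rough sketch (paths and sizes are placeholders):

# Create and attach three dynamic (thin) 8TB VHDX files to experiment with
1..3 | ForEach-Object {
    New-VHD -Path "D:\SpacesLab\disk$_.vhdx" -SizeBytes 8TB -Dynamic | Out-Null
    Mount-VHD -Path "D:\SpacesLab\disk$_.vhdx"
}

# They now show up as poolable disks for a throwaway pool
Get-PhysicalDisk -CanPool $true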


8 hours ago, leadeater said:

Just keep in mind that's only an estimate; I didn't do any performance testing, as all those disks are virtual disks on an SSD and were used just for the capacity calculations. [...]

Well, if performance scales with the number of disks, that's about what it should be, so it's not a bad figure to go by in the absence of practical testing on real hardware. My current write speeds are fine for 5 disks, but obviously if I double the disk count I would expect to see proportional scaling, at least on reads and hopefully on writes as well.

