Raid 0 SSD array for literature database

CubeSat Guy

Hello,

I need professional advice regarding the following issue: I want to set up a database server for text mining. (In case it matters: the database will contain literature information such as author, title, abstract, PDF link, and citation references to other papers.) There will be an initial dump import of about 75 GB. The database will be updated and extended by something like a crawler that issues INSERT statements. This will happen in sprint periods every few weeks. Most of the time, though, and this is much more important, there will be many parallel SELECT statements against the database to build a flat file containing a word-net graph.

I bought an LSI 9260-8i controller. I have
- a Samsung 840 Pro 256GB
- 2 Samsung 840 Evo 250 GB.

I want to upgrade it with 7/6 more SSDs to use the full Raid 0 capabilities of the controller to make it as fast as possible. There will be regular backups to another server.

My questions:
1. Are there general flaws in my logic?
2. Is it better to get 7 more 840 Pros or 6 more 840 EVOs? Before you answer this question: I know that the 840 Pro is generally considered the "better" SSD, but the EVO was released later and doesn't really seem to have much lower IOPS. I wonder if the Pro will *really* make any difference. Getting the EVOs would be much cheaper. I don't want to sacrifice too much performance for the money, though.

Thanks in advance! :)

I don't see any issues here. Depending on the controller, though, you'll probably lose 6 GB, since the array will conform to the smallest drive size.

For the price difference I would go for the Pro. It's older and uses MLC, which is slower than the EVO's TLC, but the Pros are rated to last much longer (although that might not matter to you).

If you go for Pros, you might lose 6 GB each time you add a drive, so if that bothers you and you want the Pro, then maybe get rid of the EVOs.
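To make the capacity point concrete, here is a minimal sketch in Python. It assumes, as the reply above does, that the controller truncates every RAID 0 member to the size of the smallest drive. The function names are made up for illustration, and sizes are nominal GB.

```python
def raid0_usable_gb(drive_sizes_gb):
    """Usable capacity if every member is truncated to the smallest drive (assumption)."""
    if not drive_sizes_gb:
        raise ValueError("need at least one drive")
    return min(drive_sizes_gb) * len(drive_sizes_gb)


def raid0_unused_gb(drive_sizes_gb):
    """Nominal capacity left unused on the larger drives."""
    return sum(drive_sizes_gb) - raid0_usable_gb(drive_sizes_gb)


# Current drives from the first post: one 840 Pro (256 GB), two 840 EVOs (250 GB).
current = [256, 250, 250]
print(raid0_usable_gb(current), raid0_unused_gb(current))  # 750 6
```

Each additional 256 GB drive in an array that still contains a 250 GB drive adds another 6 GB to the unused total.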

If you are going to do backups to another server, RAID 0 would be OK, but you might want to look into RAID 5/10 for some resilience against drive failures. (Then you will have two backups, one onsite and one offsite, while still getting RAID 0 benefits.)

Good point, although you're going to need at least 3 drives for RAID 5 and 4 for RAID 10.

If you're not worried about "Business Continuity" in the event of a failure, you could stay with RAID 0, but most would not recommend it.
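For comparison, a rough sketch of how much space each level mentioned so far leaves usable. The minimum drive counts are the conventional ones quoted above; a particular controller may behave differently, and the eight 250 GB drives are only an illustrative choice.

```python
# Conventional minimum drive counts; individual controllers may differ.
MIN_DRIVES = {"0": 2, "5": 3, "10": 4}


def usable_drives(level, n_drives):
    """Number of drives' worth of usable space, assuming equal-sized drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if n_drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "0":
        return n_drives        # striping only, no redundancy
    if level == "5":
        return n_drives - 1    # one drive's worth of parity
    if n_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives")
    return n_drives // 2       # every drive is mirrored


# Illustrative array: eight 250 GB drives.
for level in ("0", "5", "10"):
    print(f"RAID {level}: {usable_drives(level, 8) * 250} GB usable")
```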

Don't run in RAID 0. Seriously. It's better with SSDs than with HDDs, but you're taking a lot of risk.
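The risk grows with the number of drives, because a RAID 0 array is lost as soon as any single member fails. A small sketch of that arithmetic follows. It assumes drive failures are independent, and the 2% per-drive figure is a hypothetical placeholder, not a measured failure rate for any of these SSDs.

```python
def array_failure_probability(p_drive, n_drives):
    """Chance that at least one of n independent drives fails (which loses a RAID 0 array)."""
    if not 0.0 <= p_drive <= 1.0:
        raise ValueError("p_drive must be a probability between 0 and 1")
    if n_drives < 1:
        raise ValueError("need at least one drive")
    return 1.0 - (1.0 - p_drive) ** n_drives


P_DRIVE = 0.02  # hypothetical per-drive failure probability over some period

for n in (1, 3, 8):
    print(n, "drives:", round(array_failure_probability(P_DRIVE, n), 3))
```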

 

I would go with a RAID 6 -- SSD RAID doesn't suffer from the same problems that HDD RAID does (low random performance). You'll get plenty of redundancy while not losing out on too much space.

I also would hold off on buying all of your drives -- get a few together in a RAID 6 and measure the performance. If you need more, then add another drive. If adding that next drive doesn't get you much more performance, then you've saturated the controller.
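The "add a drive and measure" approach can be turned into a simple check. The sketch below assumes you record one benchmark figure (for example, random-read IOPS) per drive count. The function name, the 10% threshold and the sample numbers are all hypothetical.

```python
def saturation_point(results, min_gain=0.10):
    """Return the first drive count whose relative gain over the previous
    measurement is below min_gain, or None if every step still helps.

    results maps drive count -> measured performance figure.
    """
    counts = sorted(results)
    for prev, cur in zip(counts, counts[1:]):
        gain = (results[cur] - results[prev]) / results[prev]
        if gain < min_gain:
            return cur
    return None


# Made-up measurements, only to show the shape of the input.
measured = {2: 100_000, 3: 150_000, 4: 155_000}
print(saturation_point(measured))  # 4
```

Here the step from 3 to 4 drives adds only about 3%, so the fourth drive is where buying more stops paying off under this threshold.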

There will also be a point where performance increases tail off because the controller becomes saturated, and the only benefit of adding drives will be space.

On standard SATA 3 that's 3-4 drives.

Moving from 3 to 4, the speed increase is really small.

Workstation:
Intel Core i7 5820k @ 4.4Ghz, Asus Rampage V Extreme, 32Gb G.Skill Ripjaws 4 2400 DDR4,2 x Nvidia 980 Gtx Reference Cards in Sli,
1TB - 4 x 250Gb Samsung Evo 840 Raid 0, Corsair AX1200i, Lian Li PC-D600 Silver.

Isn't that the case with sequential reads/writes only, though? I would imagine my use case needs a lot of IOPS and 4K reads at a small queue depth. This shouldn't saturate the path between the PCIe 2.0 interface, the 9260-8i's processor, the SAS ports and the SATA III ports. Someone tell me if you know I'm wrong here, please!
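A back-of-the-envelope way to check the bandwidth side of this: convert an IOPS figure at 4K into MB/s and compare it with the PCIe link. PCIe 2.0 carries about 500 MB/s per lane per direction. The lane count of 8 is an assumption to verify against the card's data sheet, the 100,000 IOPS figure is hypothetical, and the sketch ignores protocol overhead and the limits of the controller's own processor, which may well be reached first.

```python
PCIE2_MB_S_PER_LANE = 500  # PCIe 2.0: 5 GT/s with 8b/10b encoding, per direction
ASSUMED_LANES = 8          # assumption: check the card's data sheet


def throughput_mb_s(iops, block_bytes=4096):
    """Bandwidth implied by a given IOPS figure at a given block size."""
    return iops * block_bytes / 1_000_000


def iops_to_fill_link(link_mb_s, block_bytes=4096):
    """IOPS at which a link of the given bandwidth would be full."""
    return link_mb_s * 1_000_000 / block_bytes


link_mb_s = PCIE2_MB_S_PER_LANE * ASSUMED_LANES
print(throughput_mb_s(100_000))      # 409.6
print(iops_to_fill_link(link_mb_s))  # 976562.5
```

Under these assumptions, small-block random reads would need close to a million IOPS before the bus itself became the limit.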

 

Thanks, wpirobotbuilder. I don't really need to be able to continue working immediately after a failure. Minor data loss (a day or two) will be OK, too, which is why I think RAID 0 with daily backups will be enough, but it's a very good idea to start with a few drives and measure the additional performance gain.

 

I'm still torn on the 840 pro vs. 840 evo topic though. Speed is my major priority, which to me means the evos - even though being the "little brother" - have an advantage.

Personally I use EVOs. I'm not sure what the saturation point would be on that controller; my points were based on SATA 3 rather than SAS. But there would be some point where it happens when using multiple drives. The PCIe bus may not be where things saturate; it could be a different point in the chain.

Obviously, if the database isn't being accessed locally, then no matter how fast a setup you use, it won't make the slightest difference if the LAN / Internet connection isn't on par.

The database will be scanned locally by a 3930K on X79 at first and later by a 24-core Xeon machine.

I would honestly stay away from RAID 5/6 with SSDs, since you get huge write penalties (4x and 6x, respectively), which will wear your SSDs out much sooner than necessary. Naturally, that is less of an issue given the insane amounts of writes SSDs can handle these days, but you will be putting the disks under some extremely heavy I/O, so it might actually matter for you.
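The 4x and 6x figures are the classic back-end I/O counts for a small random write that triggers a read-modify-write. A rough sketch of what they mean for a mixed workload follows. Note that the penalty counts reads as well as writes, so only part of it lands on flash. The workload numbers are hypothetical, and full-stripe writes or controller caching can change the picture.

```python
# Back-end I/Os per small random host write, as (reads, writes).
# The usual "write penalty" is the sum of the two.
SMALL_WRITE_COST = {
    "0": (0, 1),
    "10": (0, 2),  # data is written to both mirrors
    "5": (2, 2),   # read old data + old parity, write new data + new parity
    "6": (3, 3),   # same idea with two parity blocks
}


def write_penalty(level):
    reads, writes = SMALL_WRITE_COST[level]
    return reads + writes


def backend_iops(level, host_read_iops, host_write_iops):
    """Total drive-side I/Os needed to serve the given host workload."""
    return host_read_iops + host_write_iops * write_penalty(level)


def backend_device_writes(level, host_write_iops):
    """Only the write half of the penalty actually wears the flash."""
    return host_write_iops * SMALL_WRITE_COST[level][1]


# Hypothetical read-heavy workload: 9,000 reads/s and 1,000 writes/s.
for level in ("0", "10", "5", "6"):
    print(f"RAID {level}:", backend_iops(level, 9_000, 1_000),
          "back-end IOPS,", backend_device_writes(level, 1_000), "device writes/s")
```

For a workload dominated by SELECTs, as described in the first post, the penalty applies only to the small write share.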

 

As long as you are backing up appropriately, and you don't have an issue with rebuild/repair times, I don't see any problems with raid 0.

Get the Pro or an 850 EVO. The 840 EVO may give you issues. Plus, the 850 EVO is cheaper, I believe.

Link to issue:

http://techreport.com/review/27727/some-840-evos-still-vulnerable-to-read-speed-slowdowns
