Tips for Setting Up Network Accessible Drive on Linux

Hello All,

This is my first post to the forum! I made an account specifically to ask this question, though I have often used forum topics to work through past problems without an account (trying to avoid more user names and passwords to remember 😅).

For reasons I will explain below, I am seeking input on the best way to take 16 HDDs in a server case and make them accessible over my 10GbE LAN as a storage pool. This pool will primarily be used as a NAS for important data, but will also need to be accessible by the local machine for data processing. The server needs to run Red Hat 7 (already installed) and be compatible with macOS.

My issue/request stems from a very specific use case. My fiancé is an astronomer and she primarily uses radio telescopes for her research. We met through work (I have a computer science background in information security, so I know just enough about what I am doing to be dangerous!), where I am a telescope operator at the radio observatory she started working at after getting her PhD.

Essentially the problem boils down to this: she, like many others in the astronomy community, uses Mac OS X as her daily driver. Almost all of the computers I use for my job run Linux (Red Hat 7, RHEL 7, to be exact). Many of the programs she uses for data reduction are written first for Linux (validated for RHEL 7) and are often ported to OS X later. The problem we have is twofold.

1) Storage. The data sets her observations generate are typically 3TB on the low end. Normally storage isn't an issue, as the observatory where the research is done will hold on to the data for a number of years for her. However, there will eventually come a time when the observatories flush the data to make room for new research. When we first met, her solution was to get a new external HDD for each data set and copy all the data from a project onto one drive once the project was complete. As you may realize, there are a number of problems with this, from the lack of redundancy to the ease of physically losing the drives (there are about a dozen of these things now), etc.

2) Data reduction. After the data has been collected, there is a lot of post-processing that needs to be done for her to pull out or clean up the information she wants from the images. This presents its own set of problems. If she uses the external HDDs and her MacBook Pro, she has to leave her laptop on overnight, or longer, to do just a few passes on the images (the processing can sometimes take hundreds of passes for a final product). If she remotes into a computer at the observatory where the data was collected, or into her work computer, the network connection can time out during the data reduction, the computers can be rebooted as part of normal maintenance by the IT department at the observatories, or some other incident can interrupt the data processing, forcing her to start all over again.

To address these issues, I have taken my old gaming hardware and a 3U Supermicro box to build a rack-mounted system she can use instead. My old gaming hardware will be better (more CPU cores, more powerful GPU) than what she is currently using, either at work or with her MacBook. The Supermicro box also has 16 3.5" bays that I have populated with 500GB HDDs (size is subject to change; these are simply drives my Plex server outgrew that I am reusing to prototype this server). I currently have all the hardware in place and have installed Red Hat 7 on an SSD as the boot drive in order to avoid current or future compatibility issues with the data reduction software.

Where I am now, and the reason for this post, is that I am looking for input/help on the best way to take the 16 drives and RAID them into a single directory that can be accessed over our (10GbE) LAN from her MacBook for storage use, and that supports SSH so she can do her data processing. I have some past experience building FreeNAS-based servers: I have a Plex server and a second FreeNAS box that has been set up to do automatic backups of the Plex server and certain folders on my desktop. However, I have very little experience with OS X, and none at all when it comes to communicating between OS X and Linux. I looked at building a ZFS RAID array in RHEL 7, and while it seems doable, there are quite a few steps, and all of it is done through the command line. An option I am not familiar with, but thought may work well, is to use something like Unraid to set up the pool, but I have no experience with Unraid and am not sure if it would be a good fit for what I need to do.
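For a rough sense of what the 16 x 500GB prototype drives would yield under ZFS, the arithmetic can be sketched in Python. This is only an illustration: the three layouts and the `usable_gb` helper are assumptions chosen for comparison, not something settled in this thread, and real pools lose a bit more space to metadata and padding.

```python
# Rough usable-capacity estimate for 16 x 500 GB drives under a few ZFS layouts.
# The layouts and helper name are illustrative assumptions; treat the results
# as upper bounds, since ZFS metadata and padding are ignored here.

def usable_gb(drives: int, drive_gb: int, vdevs: int, parity_per_vdev: int) -> int:
    """Capacity left after parity when drives are split evenly across vdevs."""
    if drives % vdevs != 0:
        raise ValueError("drives must divide evenly across vdevs")
    per_vdev = drives // vdevs
    if parity_per_vdev >= per_vdev:
        raise ValueError("parity must be smaller than the vdev width")
    return vdevs * (per_vdev - parity_per_vdev) * drive_gb


layouts = {
    "1 x 16-wide raidz2": usable_gb(16, 500, vdevs=1, parity_per_vdev=2),
    "2 x 8-wide raidz2": usable_gb(16, 500, vdevs=2, parity_per_vdev=2),
    "8 x 2-way mirrors": usable_gb(16, 500, vdevs=8, parity_per_vdev=1),
}

for name, gb in layouts.items():
    print(f"{name}: {gb} GB usable")
```

Even the most space-efficient of these layouts tops out around 7TB, so with data sets of 3TB and up the prototype drives would hold roughly two of them; the same helper can be rerun with larger drive sizes.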

Thanks in advance for any help!

The primary working environment needs to be RHEL, but I have thought it might be a good idea to use Unraid and then create a VM for RHEL. I am just not sure how much overhead that will add, or if there is a cleaner way to do it.

Alright, so I hope it's okay if I try to summarize what you want in some easy bullet points, and we can talk about what your priorities really are.

You want:

1. A great big shared folder which is 16 drives in the background, but to the user is just one big network share.

2. Something that plays nice with macOS and where the machine itself can actually run some of the data processing work

3. You don't want to make your life harder than it needs to be.

The reason I added point 3 is because you wrote "I looked at building a ZFS RAID array in RHEL 7, and while it seems doable, there are quite a few steps, and all of it is done through the command line." and you are really committed to using RHEL no matter what. This is a fine goal to have, and not making my life harder is a high priority for me too whenever I'm setting something up at home.

So what you need to do is get a hardware RAID controller. Your OS itself shouldn't even know that you have 16 drives; it should just see one massive block of storage. It costs some money, but it makes your life way easier. Next, you just install RHEL as normal in the way that you like. Then create a directory for all the telescope data and share it via Samba. Then you back it all up using Duplicati to B2, some local network share, or wherever you normally back up to. If it were me, this is what I'd do, except I'd run it all on Ubuntu instead of RHEL, because Ubuntu's my preferred distro.
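To give an idea of how little configuration the Samba step needs, here is a minimal Python sketch that renders a share stanza for `/etc/samba/smb.conf`. The share name `telescope`, the path `/srv/telescope`, and the user `astro` are placeholders made up for illustration, and the option set is a bare minimum rather than a tuned macOS setup.

```python
# Render a minimal share stanza for /etc/samba/smb.conf.
# Share name, path, and user are hypothetical placeholders, not values
# taken from this thread.

def samba_share(name: str, path: str, user: str) -> str:
    """Return an smb.conf stanza for one writable, single-user share."""
    lines = [
        f"[{name}]",
        f"    path = {path}",
        f"    valid users = {user}",
        "    read only = no",
        "    browseable = yes",
    ]
    return "\n".join(lines) + "\n"


stanza = samba_share("telescope", "/srv/telescope", "astro")
print(stanza)
```

After appending the stanza, `testparm` can check the file for syntax errors, and the Mac should be able to mount the share from Finder with Go > Connect to Server and an `smb://` address pointing at the server.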

This is the easiest to set up, easiest to maintain option you're likely to find. It doesn't really use all your hardware to its full theoretical potential, but that's the trade-off.

If you wanted to throw even more money at the problem and make it even easier to maintain, buy an external hardware RAID controller and a used 2014 Mac Mini. You already know that everything runs smooth as silk on macOS, and this is the cheapest machine that'll run Big Sur when it comes out this fall.

Those 3 points are spot on.

I have a few RAID controllers already. The 16 drives are connected to the motherboard through an LSI 9211-8i that has been flashed to IT mode (IT mode makes it an HBA). I bought a pair of these when setting up my Plex server so ZFS could see the individual drives. The OS can already see all 16 of the individual drives, no problem.

From what you are saying, I may want to use the RAID controller as an actual RAID controller rather than a Host Bus Adapter as it is now. That is doable: I already flashed the card firmware one way, so flashing it back should be a similar process. Either that, or just leave the card the way it is and set up the RAID array through the OS.


I was leaning toward ZFS only because of ZFS replication (it makes backups very snappy and can be set up as an easy cron job, and I am already familiar with ZFS). I was planning to use my current backup server and ZFS replication to back up this new box like I already do my Plex box, but it should be doable to back up the way you suggested too.
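As a rough picture of what that replication job boils down to, here is a small Python sketch that only builds and prints the command line a cron job might run; it does not execute anything. The dataset `tank/telescope`, the host `backupbox`, the target `backup/telescope`, and the date-based snapshot names are all placeholders invented for the example.

```python
import shlex
from typing import List, Optional

# Build (but do not run) a snapshot + send/receive pipeline for ZFS replication.
# Dataset, host, and snapshot names used below are hypothetical placeholders.


def quote_all(argv: List[str]) -> str:
    """Join arguments into a shell-safe string."""
    return " ".join(shlex.quote(arg) for arg in argv)


def replication_command(
    dataset: str,
    new_snap: str,
    backup_host: str,
    backup_dataset: str,
    prev_snap: Optional[str] = None,
) -> str:
    """Return a shell line that snapshots `dataset` and streams it over SSH."""
    new_full = f"{dataset}@{new_snap}"
    snapshot = ["zfs", "snapshot", new_full]
    send = ["zfs", "send"]
    if prev_snap is not None:
        # Incremental stream: only blocks changed since the previous snapshot.
        send += ["-i", f"{dataset}@{prev_snap}"]
    send.append(new_full)
    receive = ["ssh", backup_host, "zfs", "receive", "-F", backup_dataset]
    return f"{quote_all(snapshot)} && {quote_all(send)} | {quote_all(receive)}"


cmd = replication_command(
    "tank/telescope", "2020-10-18", "backupbox", "backup/telescope",
    prev_snap="2020-10-17",
)
print(cmd)
```

Leaving out `prev_snap` produces a full send for the first run; later runs pass the previous snapshot name so that only the changes travel over the network.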

Thanks for your help! I mostly needed to bounce ideas around and get out of my own head!

On 10/17/2020 at 9:14 PM, ExpendableTango said:

From what you are saying, I may want to use the RAID controller as an actual RAID controller rather than a Host Bus Adapter as it is now.

Yep, this is what I'd do. Just makes life easier.

On 10/17/2020 at 9:14 PM, ExpendableTango said:

Thanks for your help! I mostly needed to bounce ideas around and get out of my own head!

Anytime! Sometimes just trying to describe what you want to somebody is the best way to figure out what you should do.
