
Plan for on-and-off-site mirrored storage

Like many of us, I have a lot of data that I don't want to lose and it deserves a proper disaster avoidance and recovery solution. I've had a plan in mind for years and I think LTT's recent JONSBO N1 NAS video has answered some questions I had but I wanted to run it by you more experienced people to be sure I'm on a sane path. Nothing has been purchased yet so I'm open to any suggestions. I've seen many threads on here about NAS and backups but I haven't run across this specific idea. I tend to only consider DIY-heavy paths that sometimes end up more difficult or expensive than using existing solutions, hence this sanity-check post.

 

Objective: Error-tolerant, fault-tolerant, and disaster-tolerant multi-terabyte expandable private storage.

 

My plan is to build two identical machines, each with multiple drives in a fault-tolerant RAID configuration. Each machine would host its own private volume, mirrored in encrypted form on the other machine, plus a shared volume mirrored and writable at both ends. One machine would live with me and the other with a friend. In return for hosting their machine, they would have use of that machine's private volume. The volumes would take periodic snapshots for time-based data recovery.

  • Error tolerance (user error, not hardware error) would be achieved with ZFS's snapshots or some other rollback mechanism.
  • Fault tolerance would be achieved with RAID parity and off-site mirroring.
  • Disaster tolerance would be achieved with off-site mirroring.

The two machines would look somewhat like this, with a VPN connecting them:

[Image: NAS backup plan diagram]

 

As a bonus it'd be nice to be able to add a third identical machine to the cluster, stored at a third location, with its own private volume mirrored on the other two and them mirrored on it. Repeat as necessary.

 

I had no idea what software would support such a feature set until Linus mentioned TrueNAS. After looking at their site I think this is all doable but I'm not certain.

 

My questions are:

  1. Is this achievable with off-the-shelf software like TrueNAS?
  2. Is there a better approach I should consider? Replace machine B with hosted storage like Amazon S3 Glacier?

  • 4 weeks later...

Hello,

I've been looking for a setup like that for quite a while now. I haven't found much about it online.

If TrueNAS can handle the data volumes part of the story, I'm planning on having WireGuard servers running on Machine A and B (as well as a client on one to connect to the other) in order to access both LANs (I don't want to open my firewall too much).

My main goal is to have a setup that allows for uneven hardware. I want to play with some Docker, some game servers, and so on, but that's not the case for the other machine.

Looking forward to an update on the subject!


The replication is the problem. Maintaining functional VPNs from your living room to somebody else's living room via residential ISPs is a royal pain and prone to incalculable points of failure. Large image files, like full backups of computers can literally take days to replicate unless you have super fast upload plans.

 

If your data is in the TB range and you don't generate that much new data, use Backblaze, Glacier, etc. If you have dozens of TBs to back up and are constantly generating more TBs of delta, you have another problem entirely.


ZFS send is excellent, but keep in mind it is a one-way transfer. It can't "sync a folder" full of changes from both the sender and receiver. (It works based on time at the block level and has no concept of files at all; that approach has pros and cons.) ZFS send can also transfer encrypted datasets incrementally over the wire and store the encrypted volume securely on an untrusted machine. Almost nothing else I know of can do that directly without decrypting and re-encrypting it.

(There is even a way to do it to a non-ZFS block store, or now even an object store, with the absolute minimum of transaction$, and that's due to ZFS's design.)

 

If you need two way replication functionality you may try rsync.

If all you need is one way replication (backups), ZFS send is gold.
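
For anyone wanting to see what the one-way, encrypted, incremental replication described above might look like in practice, here is a minimal sketch. It only builds the shell pipeline as a string and does not run anything. The dataset names, snapshot names, and host name are all made up for illustration, and it assumes an OpenZFS version with raw sends (`zfs send --raw`), an already encrypted source dataset, and SSH access to the other machine over the VPN.

```python
import shlex


def build_replication_command(dataset, prev_snap, new_snap, remote_host, remote_dataset):
    """Return a shell pipeline for an incremental raw ZFS send over SSH.

    --raw sends the blocks exactly as stored on disk, so an encrypted
    dataset stays encrypted in transit and on the receiving machine.
    -i sends only the changes between the two named snapshots.
    """
    send = ["zfs", "send", "--raw", "-i",
            f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    recv = ["ssh", remote_host, "zfs", "receive", remote_dataset]
    return " | ".join(shlex.join(part) for part in (send, recv))


# Hypothetical names, purely for illustration.
command = build_replication_command(
    dataset="tank/private",
    prev_snap="monday",
    new_snap="tuesday",
    remote_host="machine-b",
    remote_dataset="vault/machine-a-private",
)
print(command)
```

The very first transfer would be a full send (no `-i`), which is the part worth doing on a LAN before moving the second machine off-site.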

"Only proprietary software vendors want proprietary software." - Dexter's Law


I'm glad this has sparked a bit of discussion. Thanks all for the feedback so far.

 

On 4/4/2022 at 2:17 PM, Ehiztary said:

If TrueNAS can handle the data volumes part of the story

Linus has regularly used TrueNAS on his massive-volume projects without a second thought so I've been assuming that volume is the one thing I know it won't have an issue with.

 

On 4/4/2022 at 8:24 PM, wseaton said:

Maintaining functional VPNs from your living room to somebody else's living room via residential ISPs is a royal pain and prone to incalculable points of failure.

Back when I was using OpenVPN I would've wholeheartedly agreed. It, like many VPN protocols, is brittle and fails as soon as the underlying network hiccups. But then I found WireGuard and it solved all of these problems. Brilliant software.

 

On 4/4/2022 at 8:24 PM, wseaton said:

Large image files, like full backups of computers can literally take days to replicate unless you have super fast upload plans.

Yes, this is a concern that I need to design around. My assumption is that as long as I consider it an "eventually-remote" backup it should meet my needs. I plan on completing the initial sync on a LAN. Ongoing, I'll likely set it up to throttle during the day but go full speed at night. Daily theoretical capacity, assuming 10 Mbit/s for 16 hours and 30 Mbit/s for 8 hours, would be about 175 GB: ((30 Mbit/s × 8 h) + (10 Mbit/s × 16 h)) × 3600 s/h ÷ 8 bits/byte ÷ 1024 MB/GB. Certainly full-system backups would exceed this, so they'd span multiple days. My thinking is that an automated, eventually-synced system is preferable to manually moving drives between locations.
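
The capacity estimate above can be checked with a few lines of Python. This is just a sketch of the same arithmetic, using the rates and hours from the post and the same 1024 MB per GB convention; the function names and the 1000 GB example backup size are my own, for illustration only.

```python
def daily_capacity_gb(windows, mb_per_gb=1024):
    """Theoretical GB transferred per day.

    windows is a list of (rate in Mbit/s, duration in hours) pairs.
    """
    total_mbit = sum(rate * hours * 3600 for rate, hours in windows)
    return total_mbit / 8 / mb_per_gb  # Mbit -> MB -> GB


def days_to_transfer(size_gb, windows):
    """Days needed to replicate size_gb at the given schedule."""
    return size_gb / daily_capacity_gb(windows)


# 30 Mbit/s for 8 hours at night, 10 Mbit/s for 16 hours during the day.
schedule = [(30, 8), (10, 16)]
print(round(daily_capacity_gb(schedule), 1))  # 175.8

# A hypothetical 1000 GB full-system image would span several days.
print(round(days_to_transfer(1000, schedule), 1))
```

These are best-case numbers: protocol overhead and real-world ISP behaviour will push the actual figure lower.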

 

On 4/5/2022 at 11:25 AM, jde3 said:

If you need two way replication functionality you may try rsync.

If all you need is one way replication (backups), ZFS send is gold.

Thank you for this! I'll look further into ZFS send, but on the surface it seems to be the piece I was wanting but not seeing. I also realize my diagram is incorrect, since the disk layout and volume file systems should really be independent. Given your suggestion, volumes A and B would use ZFS send for one-way replication, and the shared volume would be ZFS-formatted for its snapshot abilities but mirrored with an entirely different tool, such as rsync. I've had success with rsync in the past so I'm liking this hybrid approach.
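
As a rough sketch of the rsync half of that hybrid, the shared volume could be synced by running rsync once in each direction. The code below only builds the two argument lists; the paths and host name are placeholders, not anything decided in this thread. It assumes `--update` is acceptable as the conflict rule (a file that is newer on the receiving side is left alone) and deliberately leaves out `--delete`, which means deletions do not propagate between the two ends.

```python
def build_two_way_sync(local_dir, remote_host, remote_dir):
    """Return the two rsync invocations (push, then pull) as argument lists.

    --archive preserves permissions, times, and directory structure.
    --update skips files that are newer on the receiving side.
    Trailing slashes make rsync copy directory contents, not the directory itself.
    """
    local = local_dir.rstrip("/") + "/"
    remote = f"{remote_host}:{remote_dir.rstrip('/')}/"
    base = ["rsync", "--archive", "--update"]
    return [base + [local, remote], base + [remote, local]]


# Hypothetical paths and host name, purely for illustration.
push, pull = build_two_way_sync("/mnt/tank/shared", "machine-b", "/mnt/tank/shared")
print(" ".join(push))
print(" ".join(pull))
```

Because conflicts are settled purely by modification time, taking a ZFS snapshot of the shared volume before each run would give a way back if the wrong copy wins.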

 

It sounds like my goal is achievable. My next step will be to build a proof of concept out of old hardware to suss out the details before committing.

 


TrueNAS replication can do what you want. Just keep in mind the only true disaster plan is to have a backup. It's well and good having data mirroring, snapshots, and arrays to be fault tolerant, but if that data is compromised, for mission critical data you always want a backup that is not connected.

 

There are plenty of BaaS providers out there these days, which could be an option if you only need a true backup of some of your data; cloud can be pretty convenient for that. I only actively back up about 1% of my various NASes, the part that is absolutely critical (a copy of the local backups from my machines: photos, documents, etc.). The rest is just "It'd be sh*t if I did lose it, but not the end of the world."

 


