
Recommendations for scalable storage, multi-node and load balancing on k8s?

aDataWat

Hey everyone, I was hoping for some help as I try to scale a small business using #kubernetes

 

We provide a service where each client is allocated their own instance of a database on Kubernetes. Currently, we utilize hostPath storage by mounting a local directory on the node to store data. However, we've realized that this setup may not scale well as we plan to add more nodes to our cluster.
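To illustrate the problem, our current setup looks roughly like this (names and paths simplified, so treat this as a hypothetical sketch rather than our exact manifests):

```yaml
# Hypothetical sketch of our current hostPath setup.
# The data lives in a directory on one specific node, so the pod
# must always land on that node to see its data - this is what
# stops the setup scaling to multiple nodes.
apiVersion: v1
kind: Pod
metadata:
  name: client-db
spec:
  nodeName: node-1            # pinned to the node that holds the data
  containers:
    - name: db
      image: postgres:16      # placeholder; each client gets their own instance
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      hostPath:
        path: /srv/client-db  # local directory on node-1
        type: DirectoryOrCreate
```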

 

We've explored High Availability and Mayastor for on-node distributed storage, but these require a minimum of three nodes. While this could be a viable path, it necessitates a load balancer in front of the nodes which is an additional consideration.

 

We currently plan on going from one server to two, so we recognise that HA won't be available to us right now.

 

We are now actively seeking advice on:

 

1. Alternative storage solutions that would facilitate smooth scaling as we add a second node, and potentially more nodes in the future.

 

2. Effective strategies for load balancing in a multi-node setup.

 

3. Any experiences or insights on transitioning from a single-node to a multi-node setup, particularly focusing on data storage and management.

 

Your suggestions and recommendations on potential solutions or improvements to our current setup would be amazing.


If you only need storage load balancing then Gluster and Ceph would likely be good choices.

 

6 hours ago, aDataWat said:

While this could be a viable path, it necessitates a load balancer in front of the nodes which is an additional consideration.

Is that for the storage solution you were looking at or to load balance container access? I assume the second.


I'd recommend having a look at Longhorn https://www.rancher.com/products/longhorn
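As a rough sketch of why it fits: Longhorn replicates volumes across nodes and exposes a StorageClass (named `longhorn` by default), so moving off hostPath is mostly a matter of pointing your claims at it. Something like this hypothetical PVC:

```yaml
# Hypothetical PVC backed by Longhorn's default StorageClass.
# Longhorn keeps replicas of the volume on multiple nodes, so
# the pod is no longer tied to the node that holds the data.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: client-db-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 10Gi          # size is just an example
```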

Check out TechnoTim's stuff 

 

 

 



18 hours ago, leadeater said:

If you only need storage load balancing then Gluster and Ceph would likely be good choices.

 

Is that for the storage solution you were looking at or to load balance container access? I assume the second.


The nodes would be in different locations, with the two servers having two different IPs, so the question is how we know where to send each request. We'd also like a bit of disaster recovery: if one machine goes down for some reason, the cluster would spin up the required containers on the other and re-route the traffic with zero or minimal downtime or interruption.

We're doing this self-hosted. I've updated the initial post, as I realise this is an important thing to note.
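To make the failover side concrete, what I'm imagining is roughly the following (a hypothetical sketch, with a placeholder image; the hard part, as noted above, is that the rescheduled pod still needs to reach its data):

```yaml
# Hypothetical sketch of the failover behaviour we're after:
# if the node running a client's DB dies, Kubernetes reschedules
# the pod onto the surviving node, and the Service keeps routing
# traffic to wherever the pod lands.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: client-db
spec:
  replicas: 1
  selector:
    matchLabels:
      app: client-db
  template:
    metadata:
      labels:
        app: client-db
    spec:
      containers:
        - name: db
          image: postgres:16   # placeholder image
          ports:
            - containerPort: 5432
---
apiVersion: v1
kind: Service
metadata:
  name: client-db
spec:
  selector:
    app: client-db
  ports:
    - port: 5432
      targetPort: 5432
```

This only works if the replacement pod can mount the same data from the new node, which is why the storage question comes first.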


3 hours ago, aDataWat said:

The nodes would be in different locations, with the two servers having two different IPs, so the question is how we know where to send each request. We'd also like a bit of disaster recovery: if one machine goes down for some reason, the cluster would spin up the required containers on the other and re-route the traffic with zero or minimal downtime or interruption.

We're doing this self-hosted. I've updated the initial post, as I realise this is an important thing to note.

If they are going to be distant servers then I think @Jarsky's suggestion is going to be the suitable one. Gluster and Ceph can both do remote replication, but they have minimum scale requirements for that, which would realistically be 3+3 (6 total) servers, at least for Ceph.

 

You wouldn't want anything that is synchronous replication unless the distance and latency is very low or you'll destroy your storage performance.

