Building a SAN?

Hi Guys!

Looking over the interwebs I see a ton of NAS build vlogs, but almost no SAN builds. Now, before the enterprise junkies jump all over me: I know there is a huge difference between NAS and SAN... huge. That being said, other than buying something like a quarter-million-dollar NetApp SAN, is there a way to build a single storage box that I could expand to other boxes in the future, up to about a petabyte? Maybe ten boxes connected over fiber. In short, I'm looking for a DIY datacenter solution that can get me from 10TB to almost 1PB. See my crude diagram.

 

Appreciate the feedback!

Cody

[Attached: crude diagram of the proposed multi-box storage setup]

Wtf do you need this for? I don't see you using all of this for years, even if you do hosting.

What are you looking for?

So I take it you are basically setting up a small host? Your design is pretty normal in general. I would use Ubuntu or Debian with NFS for the software side. The Linux NFS server runs in the kernel, so you should get good performance.
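If you go that route, a minimal sketch of the NFS side on a Debian/Ubuntu box could look something like the below (the export path and the 10.0.0.0/24 storage subnet are just placeholders):

  # install the kernel NFS server
  sudo apt-get install nfs-kernel-server

  # /etc/exports -- export a storage mount point to the storage network
  /srv/storage  10.0.0.0/24(rw,sync,no_subtree_check)

  # re-read the export table and verify what is being exported
  sudo exportfs -ra
  sudo exportfs -v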

My native language is C++


What type of storage do you actually need: block, file or object? What do you want to connect to this storage? A few more details will help.

 

There are some really good open-source scale-out solutions, but there are also off-the-shelf enterprise solutions that are much cheaper than NetApp.


Well, if you only want the basic functionality, you can run Linux as an NFS server and iSCSI target. You can use it as a SAN for 90% of what you do, and this will work fine if you need storage for something like VMware. If you want a real commercial-grade SAN, you're spending hundreds of thousands for a new one, or you might be able to get a used one for cheap, but it won't be much faster than a DIY one. Its only advantage will be the official approvals.
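As a rough sketch of the iSCSI target half of that, using targetcli (LIO) on a stock Linux box; the backing file, target IQN and the VMware initiator IQN below are made-up placeholders:

  # install the LIO management tool (Debian/Ubuntu package name)
  sudo apt-get install targetcli-fb

  # then, inside the targetcli shell:
  sudo targetcli
  /backstores/fileio create vmstore1 /srv/iscsi/vmstore1.img 500G
  /iscsi create iqn.2016-06.local.san:vmstore1
  /iscsi/iqn.2016-06.local.san:vmstore1/tpg1/luns create /backstores/fileio/vmstore1
  /iscsi/iqn.2016-06.local.san:vmstore1/tpg1/acls create iqn.1998-01.com.vmware:esxi01
  saveconfig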


If you decide to go the enterprise solution route, EMC's Unity midrange storage could be another option. You can search and compare at the EMC Store. http://bit.ly/1SIZ9N7


You should look into something like GlusterFS or Ceph. We currently do this by running OpenZFS and then running Gluster on top of that.
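A very rough sketch of that layering on two nodes (assuming ZFS on Linux and GlusterFS are already installed; the pool name, brick path, hostnames and raidz2 layout are just placeholders):

  # on each node: build a ZFS pool from the local disks and carve out a dataset for the brick
  sudo zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
  sudo zfs create tank/brick1
  sudo mkdir /tank/brick1/gv0

  # on one node: peer the boxes and create a replicated Gluster volume on top of the ZFS datasets
  sudo gluster peer probe node2
  sudo gluster volume create gv0 replica 2 node1:/tank/brick1/gv0 node2:/tank/brick1/gv0
  sudo gluster volume start gv0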


For this type of storage you would be looking at scale-out storage, which can be built on commodity hardware to achieve what you are trying to do.  I have experience in this field, but finding exactly what works best for you will come down to your requirements.

 

There are a number of different ways to do this; someone has already mentioned Ceph, but other options are available if you aren't familiar with primarily Linux-based administration.  It depends on whether you want to simply provide block storage with no bells and whistles, or whether you need features for clustering or S3 object storage.


I would always look at the type of disk bus you are providing from your hardware before settling on a choice.  If you are simply running pass-through AHCI/JBOD disks from an HBA that is capable of pass-through, then I would look at something along the lines of EMC ScaleIO or Windows Storage Spaces Direct (still in tech preview).  If you have hardware which can only present SCSI/RAID block devices and 'hide' the disks behind a RAID controller, then your options might be more limited, as you will find getting information out of the disks for monitoring more difficult.
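As a rough illustration of that monitoring difference (assuming smartmontools is installed and a MegaRAID-style controller; the device path and the ,0 slot number are placeholders):

  # HBA pass-through / AHCI: the OS sees the raw disk, so SMART data comes straight off it
  sudo smartctl -a /dev/sda

  # behind a MegaRAID-style controller you have to ask the controller for each physical slot
  sudo smartctl -a -d megaraid,0 /dev/sda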

The below would be options if all you want to do is present storage; there are loads more options about, but these are the ones I have successfully used without a crap tonne of drama getting them implemented.

Windows OS based options (that I have used)
EMC provides ScaleIO, a free software package that could accomplish what you are trying to achieve, but support for it is provided by the community for the project; http://uk.emc.com/storage/scaleio/index.htm

 

Windows Storage Spaces Direct - This provides HA deployments like you are looking for with internal disks, and takes advantage of SMB 3 based access to reach disks remotely.  It is still in technical preview, but I have used it quite a bit in testing and it seems stable; monitoring of disks etc. needs to be done by your own S.M.A.R.T. utilities/PowerShell scripts;  https://technet.microsoft.com/en-GB/library/mt126109.aspx

 

Linux OS based options (that I have used)
Ceph is object-based storage that leverages S3/Swift to present storage to virtual platforms etc.  It does require a lot of reading and understanding before you get stuck in.  In my experience with a 1.5PB Ceph cluster, it is not up to par in performance terms as of yet.  It is in its infancy as far as storage technology goes, but it is on the upcoming list of big hitters.  Recent changes and optimisations from SanDisk have certainly helped improve disk performance and reduce latency.  http://ceph.com/
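For a feel of how Ceph ends up presenting storage to a virtual platform, a minimal sketch using its RBD block interface (assuming a working cluster; the pool and image names are placeholders):

  # create a pool and a 100 GB RADOS block device image in it
  ceph osd pool create vm-pool 128
  rbd create vm-pool/vm-disk01 --size 102400

  # map the image on a client via the kernel RBD driver (appears as /dev/rbd0)
  sudo rbd map vm-pool/vm-disk01

  # check overall cluster health
  ceph -s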

 

QuadStor is another option which is fully open source and has a lot of options available for fibre/iSCSI, and even options for NFS and CIFS/SMB based storage to be presented.  It supports ODX, which is a bonus if you are going to be using Windows Server 2012 R2 or above for your storage needs (see the accelerated write info for ODX), and it has the ability to present storage for use in clusters;  http://www.quadstor.com/

 

In all honesty, I would need to know more about EXACTLY what you are attempting to accomplish and what feature sets you want to provide to the end point device.  Here are some things I think you should review that might narrow down the choices for you:

  • Are you going to be running any virtualisation?
  • Do you require clustered/HA storage availability (RDM for VMware / vHBA for Hyper-V, etc.)?
  • Do you require block storage only?
  • Do you require any NFS/SMB-based 'shared' storage?
  • Management overhead (e.g. are you looking for easy or hands-on)?
  • Connectivity for the end point device, e.g. fibre or Ethernet? (Fibre is handled at the back end; what are the clients connecting via?)
  • Do you require highly available disks, or is this simply for testing/verification?
  • Do you need features such as thin provisioning, deduplication or replication/mirroring?
  • How are you looking to handle hardware failures and replacements?
  • Are you running RAID, direct-connect SAS or HBA pass-through based disk options?

If it were me, I would be looking at ScaleIO or QuadStor, depending on how familiar you are with managing the operating system each resides on. So, are you a Windows or a Linux-based administrator?

Hardware-wise, it depends on whether you want to go full fibre or use 10Gb Ethernet with iSCSI. iSCSI is usually the cheaper option, but do NOT skimp on the network devices you are using.  Let me know which you prefer and I will give you some hardware configs I have used in the past to accomplish similar things.

 

Without more information it's going to be hard to give you a definitive answer.  Big reply for a first post, haha.  Feel free to ask me anything in this area though :)

Please quote or tag me if you need a reply

