SAS HBA/RAID Mezzanines; OCP vs. QCT... WHY???

Phas3L0ck

We all know that servers are growing in component density by the day, and with that there's less and less space to install add-in cards and devices that may be necessary for a specific application... but is this really a problem among machines that sit in a rack all the time and do nothing more than information storage and processing? Not likely.

 

Some servers have reached a complexity point at which add-on PCIe devices are often "nice to have," but not always necessary. For example, it's very commonplace to have a RAID card or HBA for storage systems using high-grade hard drives to accommodate varying requirements, from large file structures to high-speed, real-time data operations. And when speed becomes a priority, and even the slightest hint of solid-state media plays a part, it's time to consider a brand new 10Gb network adapter for instant communication between systems at speeds up to roughly 1.25GB/second (assuming the classic 8 bits per byte; real-world transfers land closer to 1.2GB/s after overhead), ideal for virtual system operations, multimedia distribution, and general-purpose high-volume data transfers. There are faster interfaces as well, for system loads that few have achieved or sustained outside of extremely converged infrastructure... (more expensive adapters are capable of 25Gb, 40Gb, 50Gb, 56Gb, 80Gb, 100Gb, and recently as high as 200Gb speeds).
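
For reference, here's a minimal Python sketch of the math behind those numbers; it just divides the line rate by 8 and ignores encoding/protocol overhead, which is why real-world 10GbE lands closer to 1.2GB/s than the theoretical 1.25GB/s:

# Convert Ethernet line rates (Gb/s) to rough peak throughput in GB/s.
# Assumes 8 bits per byte and ignores encoding/protocol overhead,
# so actual file-transfer speeds land a bit lower.

LINE_RATES_GBPS = [10, 25, 40, 50, 56, 80, 100, 200]

for rate in LINE_RATES_GBPS:
    gigabytes_per_sec = rate / 8  # bits -> bytes
    print(f"{rate:>3} Gb/s link = ~{gigabytes_per_sec:.2f} GB/s before overhead")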

 

On select motherboards, companies offer what you could call a 'package deal', with advanced features fully integrated into the board and pre-configured for quick and easy use right out of the box. Things like IPMI and dedicated system management (standard on newer boards), multiple gigabit LAN ports or an upgrade to hybrid 1/10GbE LAN, or even a dedicated 10Gb NIC to replace gigabit LAN altogether, are increasingly common in the server industry. Some systems also have an SAS RAID HBA chip and its dependent components (like cache DRAM) built onto the mainboard. There are some minor drawbacks that can drive attention away from these boards: things like having an SAS3108 but only supporting up to a 145W CPU (which isn't a problem if you're a normal guy who wouldn't care for the Xeon E5-2687W v4), or a board that supports up to a 160W CPU but only has a SAS3008 chip, which has no cache or RAID-5/6/50/60 support (but maybe you don't care if you intend to use Linux and software RAID). And then there's the architectural aspect: format and form factor. SSI CEB, EEB, or TEB? You tell me. (I prefer ATX/EATX.) But that's another topic for another place and time.
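
For anyone weighing that SAS3008 + Linux software RAID route, here's a back-of-the-envelope Python sketch of what those missing hardware RAID levels trade away in capacity; these are just the textbook parity formulas, and the 8 x 4TB pool with two span groups is a made-up example:

# Rough usable capacity for the RAID levels the SAS3008 lacks in hardware
# (with that chip you'd build these in software instead).
# Example figures only: 8 drives of 4TB, RAID-50/60 split into 2 groups.

def usable_tb(level, drives, size_tb, span_groups=2):
    if level == "RAID-5":    # one drive's worth of parity
        return (drives - 1) * size_tb
    if level == "RAID-6":    # two drives' worth of parity
        return (drives - 2) * size_tb
    if level == "RAID-50":   # striped RAID-5 groups, one parity drive per group
        return (drives - span_groups) * size_tb
    if level == "RAID-60":   # striped RAID-6 groups, two parity drives per group
        return (drives - 2 * span_groups) * size_tb
    raise ValueError(f"unknown level: {level}")

for level in ("RAID-5", "RAID-6", "RAID-50", "RAID-60"):
    print(level, usable_tb(level, drives=8, size_tb=4.0), "TB usable")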

 

Why do some boards have integrated extras, but not all? Reasons include, but are not limited to: cost, convenience, component availability, IP rights, design complexity, and potential benefit to the end customer...

 

Because of this, there is a debate among some as to "PCIe or not PCIe": how much effort should be put into making expansion possible, and for which systems.

As a result, Mellanox was the first to introduce the ConnectX-3 as a mezzanine network adapter using the OCP v0.5 interface from the Open Compute Project. Because the OCP connector is essentially a denser version of the PCIe x8 slot, OCP created an opportunity to build more devices that sit just above the motherboard without a riser, using a shared standard. Companies like Tyan and Gigabyte have developed storage controllers (custom variants of current ones, that is) in mezzanine form factor using the OCP standard. However, like PCI-X, not all new things catch on or turn out to be what they were intended to be. Although Dell, HP, and several others have mostly dropped production of proprietary LAN and SAS/RAID adapters in the wake of OCP, not a lot of boards make use of mezzanine connectors, partly due to enhanced PCIe riser/horizontal slot designs being more favorable for some systems. It's also because another type of mezzanine connector is increasingly popular, though, like OCP, it is still only used by a handful of board designs. Most manufacturers would refer to it as just another PCIe x8 slot, but it's actually called QCT (see the attached document for details). Supermicro made a fairly reasonable but also questionable move by staying out of this mess of what I would call "connector wars" and primarily sticking to regular PCIe slots, but they also have their own proprietary mezzanine adapters and compatible boards floating around... the interface is still PCIe x8, but the connector looks like a longer, inverted OCP slot, which I can't find any further info on. One would imagine this unusual variant is even less desirable, and thankfully it never caught on. The only use it had was a couple of custom HBA cards Supermicro made for the less-than-handful of boards the connector was incorporated into.

 

Here's a quick list of OCP and QCT HBA mezzanine cards to demonstrate what companies are doing, and how well they're doing it...

 

OCP:

Tyan M7076-3008-8I

 

Gigabyte CRAO338

 

QCT:

Quanta QS-3008-8i-IR

 

Gigabyte CSA6648

 

Dell LSI 2008 SAS Mezz for PowerEdge C6150 and C6220

 

And an honorable mention, the Gigabyte CSA6548, the one and only SAS mezzanine with external SFF-8644 ports.

 

The fact that there are more QCT SAS adapters goes to show that OCP was intended for network cards, but still has potential elsewhere. Similarly, there are a lot of SAS adapters for QCT, but still a lot more network adapters. It's worth pointing out that Gigabyte attempted to produce both OCP and QCT variants of almost the same SAS adapters, but it's clear the QCT models "came out of the oven better," so to speak. One could rationalize that design complexity was the cause of the epic failure of Gigabyte's OCP SAS card, but another would argue that, knowing how Tyan got their version right with a much simpler, more effective design, Gigabyte either didn't care or wasn't trying hard enough.

 

So what would you use and why?

Should manufacturers use OCP or QCT, or stick to vanilla PCIe slots? You decide.

 

Let me know if I should add or change anything here...

QCT Mezzanine Card Portfolio-v1.2.pdf


26 minutes ago, Phas3L0ck said:

Although Dell, HP, and several others have mostly dropped production of proprietary LAN and SAS/RAID adapters in the wake of OCP, not a lot of boards make use of mezzanine connectors, partly due to enhanced PCIe riser/horizontal slot designs being more favorable for some systems.

HPE still does them, and they're used on the ProLiant series of servers. The 'internal' RAID cards are these mezzanine cards, but in every other aspect they're a regular RAID card of the same model that also comes in PCIe slot form factor; they're QCT. The HPE internal NICs are QCT as well, and have been in use slightly longer than the SAS RAID cards, which were built into the motherboard before that (Gen8).

 

E.g. HPE Smart Array P440ar vs. HPE Smart Array P440. Both are PCIe devices using 8 lanes; the ar model splits them across a dual x4 connector.


https://buy.hpe.com/pdp?prodNum=726740-B21&country=us&locale=en&catId=329290&catlevelmulti=329290_5317224_7109730_7274889

 

 


https://buy.hpe.com/pdp?prodNum=820834-B21&country=us&locale=en&catId=329290&catlevelmulti=329290_5317224_7109730_7274881


38 minutes ago, Phas3L0ck said:

So what would you use and why?

Should manufacturers use OCP or QCT, or stick to vanilla PCIe slots? You decide.

Also, I forgot to answer this: QCT, easily. I trust the PCIe spec to get updated and advance technologically at a better rate, and with more features, than I do OCP. Internal default RAID cards and NICs don't need to be in or use PCIe slots, as that limits form factor options and the number of PCIe devices for things like hybrid blade chassis. Having dedicated slots or places for the default/standard components that still come in different options, without using up PCIe slots, makes sense. It does limit certain re-usability scenarios, but all of those I can think of are post-decommissioning lab use at home etc., where more open, non-proprietary form factors are much easier to work with.


The question is: to what gain? If they all use the same communication protocols and the only real difference is the connector shape, I could see this as beneficial depending on the use case. If I needed more SATA ports and I could attach a little surface-mount board with a couple of SFF-8087 or SFF-8643 ports to the right side of the board, and that freed up a standard PCIe slot, then great. But daughterboards take up more space on the surface of the motherboard, so unless they were stackable like Raspberry Pis, you'd be extremely limited when it comes to expanding rear connectivity.

 

I'm currently building a server that requires 40 SATA III ports. I'd have to purchase a very proprietary system for a server to have the space for such daughterboards on the right side of the board, or have that many built in (I have a hard time imagining that exists).
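
For what it's worth, a quick Python sketch of the port math for a build like that, assuming the usual 4 drives per SFF-8087/SFF-8643 connector and a typical two-connector "-8i" HBA (SAS expanders would change the picture):

# Connector and HBA count for a 40-drive SATA build with no SAS expanders.
import math

drives = 40
drives_per_connector = 4   # one SFF-8087 / SFF-8643 fans out to 4 drives
connectors_per_hba = 2     # a typical "-8i" card has two internal connectors

connectors = math.ceil(drives / drives_per_connector)   # 10 connectors
hbas = math.ceil(connectors / connectors_per_hba)       # 5 cards without expanders

print(f"{drives} drives -> {connectors} connectors -> {hbas} x '-8i' HBAs (fewer with expanders)")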

 

Considering open-standard/non-proprietary boards can come with 8 or more PCIe slots of varying bandwidth, I don't see surface-mount boards becoming mainstream except in proprietary form factor systems like leadeater discussed.

 

So I see a market for both. If density (U size) is more important than connectivity, these are very useful form factors. If expandability (within a single box) is more important, then keeping PCIe is also a good idea.


1 hour ago, leadeater said:

Also, I forgot to answer this: QCT, easily. I trust the PCIe spec to get updated and advance technologically at a better rate, and with more features, than I do OCP. Internal default RAID cards and NICs don't need to be in or use PCIe slots, as that limits form factor options and the number of PCIe devices for things like hybrid blade chassis. Having dedicated slots or places for the default/standard components that still come in different options, without using up PCIe slots, makes sense. It does limit certain re-usability scenarios, but all of those I can think of are post-decommissioning lab use at home etc., where more open, non-proprietary form factors are much easier to work with.

I totally agree with you. The popularity and active development of QCT makes it very desirable, but if you look closely enough, some of the designs are still as half-baked as Gigabyte's "idea" of an OCP SAS card. OCP is still the leading standard for LAN cards, but for SAS, I'm honestly torn between the two interfaces, since Tyan and Quanta are head to head in terms of quality and the amount of effort put into SAS mezz design. Personally, I'll stick with OCP, simply because that's what my S7086 uses for both LAN and SAS. I really wish someone would review Tyan's SAS OCP HBA for comparative reasons. At the very least, Tyan fully accounted for the surrounding structures beyond the OCP SAS card and only put chips on one side: the right side, the top.

23 minutes ago, Windows7ge said:

The question is: to what gain? If they all use the same communication protocols and the only real difference is the connector shape, I could see this as beneficial depending on the use case. If I needed more SATA ports and I could attach a little surface-mount board with a couple of SFF-8087 or SFF-8643 ports to the right side of the board, and that freed up a standard PCIe slot, then great. But daughterboards take up more space on the surface of the motherboard, so unless they were stackable like Raspberry Pis, you'd be extremely limited when it comes to expanding rear connectivity.

So I see a market for both. If density (U size) is more important than connectivity, these are very useful form factors. If expandability (within a single box) is more important, then keeping PCIe is also a good idea.

That's one of the golden reasons for making and owning a mezzanine storage adapter.

Stackable? I DREAM of seeing that on mezzanine cards!

 

Yeah, density is always a factor whenever you're building a server with lots of features. And there has to be room for airflow to cool the CPU and other components, so goodbye risers. I'm seeing less and less need or realistic use for expandability beyond a few slots, so it would make sense to "mezzinate" more devices.

 

Now if only someone would think to design an MXM slot into a server board so that people could use any slim/mobile GPU they want, we'd be all set.


1 hour ago, Windows7ge said:

I'm currently building a server that requires 40 SATA III ports. I'd have to purchase a very proprietary system for a server to have the space for such daughterboards on the right side of the board, or have that many built in (I have a hard time imagining that exists).

 

Considering open-standard/non-proprietary boards can come with 8 or more PCIe slots of varying bandwidth, I don't see surface-mount boards becoming mainstream except in proprietary form factor systems like leadeater discussed.

Outside of storage servers, the number of slots you actually want populated is small, and you want to get as many servers into the space as possible. It's ideal to have the OS stored on mirrored M.2 or SD card/USB, and an internal mezz or inbuilt SAS card for storage (if needed at all), with any PCIe slots only being used for InfiniBand/Ethernet cards or GPUs/accelerators.

 

In a half-width 1U hybrid blade you have very few PCIe slots; the half-width 2U ones are slightly better, but those are typically used for GPUs. Converged IB + Ethernet NICs are a godsend in these cases, because if you need to independently dual-path everything, that's 4 NIC cards required; dropping down to 2 means you could actually do it where it wouldn't otherwise be possible.


8 minutes ago, leadeater said:

Outside of storage servers, the number of slots you actually want populated is small, and you want to get as many servers into the space as possible. It's ideal to have the OS stored on mirrored M.2 or SD card/USB, and an internal mezz or inbuilt SAS card for storage (if needed at all), with any PCIe slots only being used for InfiniBand or Ethernet cards.

 

In a half-width 1U hybrid blade you have very few PCIe slots; the half-width 2U ones are slightly better, but those are typically used for GPUs. Converged IB + Ethernet NICs are a godsend in these cases, because if you need to independently dual-path everything, that's 4 NIC cards required; dropping down to 2 means you could actually do it where it wouldn't otherwise be possible.

Yep, this is what I mean by use case. Let's say my server wasn't for storage. Let's say it was for compute. CPU specifically. I'd probably only have use for 1 PCIe slot IF THAT. Then it'd fit in a 1U chassis with ease.

 

The daughterboards would also make for great modularity, so you could pick & choose what you need in that tight form factor.

 

I forget what they're called, but what could work well in tandem with a file server that reduced or eliminated the PCI form factor are the boxes that are pure storage. They connect via something like an external PCIe x8 cable (we're drifting out of my territory, I'm kind of throwing guesses here); this would allow a very slim server to handle mass storage (of course the storage box would take up its own space in the rack, but you get my point).


4 minutes ago, Windows7ge said:

I forget what they're called, but what could work well in tandem with a file server that reduced or eliminated the PCI form factor are the boxes that are pure storage. They connect via something like an external PCIe x8 cable (we're drifting out of my territory, I'm kind of throwing guesses here); this would allow a very slim server to handle mass storage (of course the storage box would take up its own space in the rack, but you get my point).

There isn't really a special name for them; they're just external disk shelves. You can connect them up to a larger 3U/4U server with internal disks to add more, or just use a server with none. Still SAS, by the way, just a different connector type; a couple of them, actually, since they're different for each SAS generation, plus sometimes some extra oddball ones because... why not.

 

[Image: external SAS cables, SAS3 12Gb SFF-8644 on the left and SAS2 6Gb SFF-8088 on the right]

 

Nice thing about those external shelves is they have SAS expanders in them, so you can chain shelves off a single external port (usually dual-pathed, but all one giant logical chain) up to the max number of disks SAS supports. Then you just add another card and another chain and keep going, all hanging off a single server.
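
As a rough Python sketch of that chaining math (the bay count and chain depth are made-up example figures; real limits depend on the shelf model, HBA firmware, and SAS generation):

# Drives reachable by daisy-chaining expander-based shelves off one external port.
import math

bays_per_shelf = 24       # example shelf size
shelves_per_chain = 4     # example chain depth off a single external port

drives_per_chain = bays_per_shelf * shelves_per_chain   # 96 drives per chain
target_drives = 200                                     # example build target

chains_needed = math.ceil(target_drives / drives_per_chain)  # extra ports/cards to add
print(f"{drives_per_chain} drives per chain -> {chains_needed} chain(s) for {target_drives} drives")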


1 hour ago, Windows7ge said:

I forget what they're called, but what could work well in tandem with a file server that reduced or eliminated the PCI form factor are the boxes that are pure storage. They connect via something like an external PCIe x8 cable (we're drifting out of my territory, I'm kind of throwing guesses here); this would allow a very slim server to handle mass storage (of course the storage box would take up its own space in the rack, but you get my point).

If I'm reading this right I think I know exactly what you're referring to. It's called Direct Attached Storage, but the assemblies that perform this function are more commonly referred to as JBOD boxes. This is one of the reasons that we still have those awful, crappy 1U servers. Unless you seriously know what you're doing and have a huge controller card with DRAM and cache protection, it's a pretty dirty way to go, but overall it works.

 

And actually, SFF-8643 and SFF-8644 can carry PCIe x4. Why do you think those strange U.2 NVMe things are possible?
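
For context, here's a quick Python sketch of why a x4 link is enough to make U.2 interesting, using the published PCIe signalling rates and encoding (protocol overhead ignored):

# Approximate one-direction bandwidth of a PCIe x4 link, the width U.2 carries.
GENERATIONS = {
    # generation: (GT/s per lane, encoding efficiency)
    "PCIe 2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "PCIe 3.0": (8.0, 128 / 130),  # 128b/130b encoding
}
LANES = 4

for gen, (gt_per_sec, efficiency) in GENERATIONS.items():
    gigabytes_per_sec = gt_per_sec * efficiency * LANES / 8  # bits -> bytes
    print(f"{gen} x{LANES}: ~{gigabytes_per_sec:.2f} GB/s")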


3 hours ago, Phas3L0ck said:

And actually, SFF-8643 and SFF-8644 can carry PCIe x4. Why do you think those strange U.2 NVMe things are possible?

I thought that was two different standards using the same connector but electrically different, because you can get both SFF-8643 and SFF-8644 ports and cards that do and don't support PCIe.


Yep, those are what I meant. For an application like CPU/GPU cluster computing, a new standard for the shape of the PCIe slots could definitely help push the density even further. However, we already have servers cramming 4x GPUs into a 1U chassis, so I see such technological advances only being useful in select applications.

 

I do like the idea of moving AICs that don't require external access (internal RAID/HBA controllers, PCIe SSDs, etc.) onto mezzanines, so that the only standard slots needed are for things that do require external access (NICs, SAS expanders).

