To date, deploying enterprise computing has meant building three-tier systems from servers, storage arrays and networking. Many companies saw the need to eliminate costly integration work and devised “converged infrastructure” (CI).

CI has given way to Hyperconverged Infrastructure (HCI). HCI brought new densities and deployment models with the advent of the Software-Defined Datacenter (SDDC). HCI also put even greater emphasis on the network, as bandwidth demands grew with the increased density of the workloads.

Many of the HCI platforms are CPU-rich, but physical real estate limits how many network adapters and ports can be deployed per node.

HCI continues to be a viable platform for deploying very dense workloads, especially on all-flash systems.

But new choices are emerging for enterprise-scale storage and server infrastructure, driven by the pioneers of the web: hyperscale computing.

As organisations scale their physical infrastructure, two choices are available to manage that growth – scale-up and scale-out. Scaling up means increasing the resources within the server or storage device, for example adding processors, interface cards and memory. That’s the traditional enterprise model.

The problem with this way of doing things is that redundancy in a single device translates into high cost. The alternative is to scale out, adding additional servers and storage in a distributed computing environment.

Hyperscale storage/computing is the term coined for the scale-out design of IT systems that caters for very high volumes of processing and data. Rather than use multiple redundant components within a device, the level of redundancy becomes the server itself, including its storage.

In hyperscale computing, the server and its storage become the basic unit and form part of a network or grid of thousands of physical devices. Typically, the server is not built with redundant components; in the event of a failure, its workload is failed over to another server. The faulty device can then be removed for replacement or repair.
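
To make that failover model concrete, here is a minimal sketch in Python with hypothetical node and workload names (not our actual orchestration code): each server, together with its local storage, is treated as the unit of redundancy, and workloads on a failed node are simply rescheduled onto a healthy peer.

```python
# Hypothetical inventory: the server (with its local storage) is the unit of
# redundancy; there is no redundant hardware inside a node.
servers = {
    "node-01": {"healthy": True,  "workloads": ["vm-a", "vm-b"]},
    "node-02": {"healthy": True,  "workloads": ["vm-c"]},
    "node-03": {"healthy": False, "workloads": ["vm-d"]},  # failed node
}

def fail_over(inventory):
    """Move workloads off unhealthy nodes onto the least-loaded healthy node."""
    healthy = [n for n, s in inventory.items() if s["healthy"]]
    for name, state in inventory.items():
        if state["healthy"] or not state["workloads"]:
            continue
        target = min(healthy, key=lambda n: len(inventory[n]["workloads"]))
        print(f"{name} failed: moving {state['workloads']} -> {target}")
        inventory[target]["workloads"].extend(state["workloads"])
        state["workloads"] = []  # node can now be pulled for repair

fail_over(servers)
```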

Hyperscale computing environments work with multi-petabytes of storage and tens of thousands of servers. To date, they have mainly been used by large cloud-based organisations such as Facebook and Google. The use of commodity components allows for the much higher scale required of these environments while keeping costs as low as possible.

Hyperscale computing designs work well for such emerging web companies because they take a new approach to database and application design. These environments typically consist of a small number of very large applications, in contrast to typical enterprise IT environments where there are a larger number of specialized applications.

The key behind hyperscale is NVMe (Non-Volatile Memory Express).

NVM Express allows the parallelism of modern SSDs to be fully exploited by host hardware and software. As a result, NVM Express reduces I/O overhead and brings performance improvements over previous logical device interfaces, including support for multiple deep command queues and reduced latency.
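
As a rough user-space illustration (not the NVMe driver or spec itself, and assuming a device path like /dev/nvme0n1 plus permission to read it), the Python sketch below keeps many reads in flight at once; that kind of concurrency is exactly what NVMe's deep hardware queues are built to absorb in parallel.

```python
import os
from concurrent.futures import ThreadPoolExecutor

DEVICE = "/dev/nvme0n1"   # assumed device path; requires read permission
BLOCK = 4096              # read size in bytes
IN_FLIGHT = 32            # number of concurrent outstanding reads

def read_block(fd, offset):
    # Each pread() is an independent I/O the kernel can queue to the device.
    return len(os.pread(fd, BLOCK, offset))

fd = os.open(DEVICE, os.O_RDONLY)
try:
    offsets = [i * BLOCK for i in range(IN_FLIGHT)]
    with ThreadPoolExecutor(max_workers=IN_FLIGHT) as pool:
        total = sum(pool.map(lambda off: read_block(fd, off), offsets))
    print(f"read {total} bytes with up to {IN_FLIGHT} I/Os in flight")
finally:
    os.close(fd)
```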

Reducing I/O latency is key in HCI systems. The lower the latency, the fewer outstanding I/Os there are backing up in memory. This means you can run more workloads (VMs, containers) on a single system.
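
One way to put numbers on that (a back-of-the-envelope sketch with illustrative figures, not measurements from our system): Little's Law says the average number of I/Os in flight equals throughput times latency, so cutting latency at the same IOPS directly shrinks the queue of outstanding I/Os sitting in memory.

```python
def outstanding_ios(iops, latency_s):
    """Little's Law: average I/Os in flight = arrival rate x time in system."""
    return iops * latency_s

# Illustrative numbers only.
print(outstanding_ios(100_000, 500e-6))  # 500 us latency -> ~50 I/Os in flight
print(outstanding_ios(100_000, 100e-6))  # 100 us latency -> ~10 I/Os in flight
```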

Now the danger in reducing I/O latency is that your hypervisor vendor will hate you. Rather than running eight servers at 10% CPU consumption, you may only need perhaps three at 80%. So sorry, license-by-the-socket vendors.
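
Here is a hedged sanity check on that consolidation math (illustrative numbers only; the spare-host count is our assumption for failover headroom, not a fixed rule):

```python
import math

def hosts_needed(current_hosts, current_util, target_util, spare_hosts=2):
    """Hosts required to carry the same aggregate CPU demand at a higher
    utilization, plus spare hosts kept for failover headroom."""
    demand = current_hosts * current_util          # aggregate CPU demand
    return math.ceil(demand / target_util) + spare_hosts

# Eight hosts at 10% busy, re-packed to run at 80% busy with two spares:
print(hosts_needed(8, 0.10, 0.80))  # -> 3
```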

We at Rugged Cloud got to thinking a while back: what if we combined NVMe and SSDs in the same server and provided wicked amounts of bisection bandwidth, allowing the servers to scale horizontally over a PCIe fabric while still running traditional virtual machine workloads?

A lot of the HCI vendors have to worry about power and cooling, so they lower the core count on their CPUs and, of course, use less DDRx memory. Memory runs hot and requires a lot of air to cool it.

We didn’t have that problem in a single-motherboard system, so we put dual Intel Xeon E5-2699 v4 CPUs (22 cores each) in the system along with up to 1.5TB of DRAM (24 x 64GB).

Now, in hyperscale, we have to feed this thing, so we added massive I/O bandwidth for both the SSD sub-system and the new NVMe-based fabric.

Now we need to cluster them for hyperscale. How do we do that?

Get the Rugged Cloud Hyper Scale Computing paper.