Companies are generating more data than they can store, let alone analyze. Couple that with new devices that generate data outside the customer's enterprise, and the need for application flexibility so that new analytical apps can be rapidly stood up to cull nuggets from petabytes, and it is rather obvious that we need a new architecture and new technologies to support data analytics.
At Rugged Cloud, we see a convergence of virtual High Performance Computing (vHPC), Data Analytics, and a new storage I/O technology called NVMe.
As companies scale their physical infrastructure, two choices are available to manage that growth: scale-up and scale-out. Scaling up means increasing the resources within a server or storage device, for example by adding processors, interface cards, and memory; that's the traditional enterprise model. Scaling out means adding more servers or devices and distributing the workload across them.
Along came Hyper-Converged Infrastructure (HCI), which promises to converge compute, storage, and networking into a single system.
HCI makes sense in a number of ways:
- Collapsing external components into a smaller system reduces Space, Weight and Power (SWaP).
- It can provide denser Virtual Machine support, though note that mileage will vary.
- SDDC software coupled with a proper HCI system further abstracts and virtualizes the data center.
- Further adoption of automation software can decrease system-administration costs.
What the vendors won’t tell you about HCI:
- HCI systems come with lots of CPU sockets. Your hypervisor vendor will love you.
- HCI systems are I/O poor. Systems designers have a real-estate problem: adding more CPUs means something else has to be removed from the system. They can't remove cooling (hopefully) or power. If you want to run a lot of VMs you will need a lot of memory, so they can't reduce the number of memory DIMMs. What's left? I/O, especially I/O to disk and to the network.
- The number of disk drives, SSD or otherwise, that you can fit in a single 2U HCI chassis varies. We have seen 12 and even 24 drives, with the exception of Rugged Cloud's 48. More drives mean more adapters to feed them. Given the real-estate problem, you will have fewer adapters to drive high-speed SSDs, which will affect VM performance.
- HCI vendors tout VM density as a principal reason for HCI. More VMs will require more storage. Where do you get that? Well, add external storage arrays. Not only does that violate the theme of HCI, but external storage arrays require I/O. If you use networked storage (iSCSI or NFS) you will need network adapters; if you use Fibre Channel for block mode, you will need HBAs. Where do you get the PCIe slots for those adapters?
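To see why the slots and lanes run out, a back-of-the-envelope PCIe lane budget helps. The per-drive and per-socket lane counts below are illustrative assumptions for a generic two-socket server, not any vendor's published spec:

```python
# Back-of-the-envelope PCIe lane budget for a dense NVMe chassis.
# Assumptions (not vendor specs): each NVMe SSD uses a PCIe x4 link,
# and a two-socket server exposes roughly 96 usable PCIe lanes.

LANES_PER_NVME_DRIVE = 4   # assumption: x4 link per drive
USABLE_CPU_LANES = 96      # assumption: two sockets, ~48 lanes each

def lanes_needed(drive_count: int) -> int:
    """PCIe lanes required to run every drive at full speed."""
    return drive_count * LANES_PER_NVME_DRIVE

for drives in (12, 24, 48):
    needed = lanes_needed(drives)
    verdict = "fits" if needed <= USABLE_CPU_LANES else "oversubscribed"
    print(f"{drives} drives -> {needed} lanes ({verdict})")
```

Under these assumptions a 48-drive box wants far more lanes than the sockets provide, which is exactly why designers end up choosing between drives, network adapters, and HBAs.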
The simple truth about HCI is that it was never really designed for enterprise-class workloads. Now, I know your HCI vendor's sales rep, who probably drives a really expensive BMW, told you otherwise.
But think about this: hardware costs are nothing compared to the software licenses, so hardware net margins are chump change to HCI vendors. The real money, and the shiny new BMW your sales rep drives, comes from software net margins. The more software he or she sells, the more fancy lunches they can buy you.
How does the BMW-driving sales rep sell more software? By selling you more CPUs: CPUs that will never reach a respectable utilization number. So when you run out of memory on your fancy new HCI system, your BMW-driving sales rep will sell you more underutilized CPUs in more HCI systems. We call this term life insurance for HCI. The "term" is the first HCI system. When you outgrow it (which you quickly will), the sales rep waltzes in with a different "term." You gotta love POCs sans real TCOs.
Now a few of you are saying, "I don't care. I run all my stuff off-prem. That's the cloud provider's problem!" You might want to check your cloud provider's Service Level Agreements. Way back on page 100,091, there is probably some legal mumbo jumbo about best-case performance or, even worse, no guarantees at all; again, mileage will vary.
Regardless of on-prem or off, you are going to have an I/O problem, and this will be especially true if you are running, or looking at, a hybrid cloud.
New choices are emerging for enterprise-scale storage and server infrastructures, driven by the pioneers of the web's hyperscale computing environments. It's called Hyper-Scale Computing (HSC). Hyperscale storage/computing is the term coined to describe the scale-out design of IT systems that caters to very high volumes of processing and data. Rather than using multiple redundant components within a device, the unit of redundancy becomes the server itself, including its storage.
Coupling HSC to NVMe and NVMe over Fabrics allows systems designers to overcome many of the issues inherent in HCI.
In hyperscale computing, the server and its direct-attached storage (DAS) become the basic unit, forming part of a network or grid of thousands of physical devices. Typically, the servers are organized into clusters or fabrics whose storage is NVMe, for massive I/O and massive horizontal scale.
Hyperscale computing environments work with multiple petabytes of storage and tens of thousands of servers. To date, they have mainly been used by large cloud-based organizations such as Facebook and Google. The use of commodity components allows for the much higher scale required of these environments while keeping costs as low as possible.
Hyperscale computing designs work very well for web companies because they take a new approach to database and application design. These environments typically consist of a small number of very large applications, in contrast to typical enterprise IT environments where there are a larger number of specialized applications.
New open-source platforms have emerged that offer storage and data services in hyperscale clusters. These include Ceph, a distributed storage platform; the Cassandra and Riak distributed database platforms; and Hadoop, a distributed data-storage and analysis platform.
What if we found a way to "converge" open-source technologies such as Ceph with a VMware SDDC, including vSAN, and we did it on top of HSC clusters? This would provide a massively scalable object store that can be accessed over the PCIe bus by traditional VMs running on VMware.
This new architecture offers a different operating paradigm. Data is spread across the NVMe servers in a redundant fashion, protecting against any individual server failure: if a server fails, there is no service impact. The use of multiple servers also distributes processing power across many devices, providing scale-out performance for large workloads.
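As a rough illustration of the redundant placement described above, here is a minimal Python sketch of hash-based, N-way replica placement across a cluster of NVMe servers. The node names, replica count, and placement scheme are made up for illustration; real platforms such as Ceph use far more sophisticated placement algorithms (e.g. CRUSH):

```python
import hashlib

# Toy cluster: each server is the unit of redundancy, per the HSC model.
SERVERS = [f"nvme-node-{i}" for i in range(6)]  # hypothetical node names
REPLICAS = 3  # assumption: every object lives on 3 distinct servers

def place(object_key: str) -> list[str]:
    """Deterministically pick REPLICAS distinct servers for an object."""
    digest = hashlib.sha256(object_key.encode()).hexdigest()
    start = int(digest, 16) % len(SERVERS)
    return [SERVERS[(start + i) % len(SERVERS)] for i in range(REPLICAS)]

def read_targets(object_key: str, failed=frozenset()) -> list[str]:
    """Surviving replicas: one server failure still leaves two copies."""
    return [s for s in place(object_key) if s not in failed]
```

Because placement is deterministic, any client can compute where an object lives without a central lookup, and losing one node simply redirects reads to the remaining replicas.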
Coupling HSC systems with HPC software then becomes a new possibility, where in the past hyperscale plus HPC was not for the faint of heart.
HPC has long been a niche technology reserved for universities and governments, and it has always generated huge amounts of data. Big data arguably originated in the global high-performance computing (HPC) community in the 1950s for government applications such as cryptography, weather forecasting, and space exploration. HPC moved into the private sector in the 1980s, especially for data-intensive modeling and simulation to develop physical products such as cars and airplanes. In the late 1980s, the financial services industry (FSI) became the first commercial market to use HPC technology for advanced data analytics (as opposed to modeling and simulation). Investment banks began to use HPC systems for daunting analytics tasks such as optimizing portfolios of mortgage-backed securities, pricing exotic financial instruments, and managing firm-wide risk. More recently, high-frequency trading joined the list of HPC-enabled FSI applications.
Coupling HSC with vHPC now allows for a completely different architecture and a new way to get at massive amounts of data in real time. HSC plus vHPC represents the convergence of long-standing, data-intensive modeling and simulation (M&S) methods from the HPC industry with newer high-performance analytics methods that are increasingly employed in those segments, as well as by commercial organizations that are adopting HPC.
Companies can now employ long-standing numerical M&S methods; newer methods such as large-scale graph analytics, semantic technologies, and knowledge-discovery algorithms; or some combination of the two. Given that we have fully virtualized the HSC NVMe clusters with VMware technologies, we now have a rapid way to deploy different types of applications and workloads that can quickly get at massive amounts of data in real time.
There are some interesting factors driving businesses to adopt vHPC for big data analytics:
- High complexity. vHPC technology allows companies to aim more complex, intelligent questions at their massive volumes of data, which can provide important advantages in today's increasingly competitive markets. It is especially useful when there is a need to go beyond query-driven searches in order to discover unknown patterns and relationships in data, such as for fraud detection, to reveal hidden commonalities within millions of archived medical records, or to track buying behaviors through wide networks of relatives and acquaintances. Given the complexity of HPC architectures and the need for massive I/O, we believe that vHPC and NVMe fabrics will play a crucial role in the transition from today's static searches to the emerging era of higher-value, dynamic pattern discovery, reducing the complexity of traditional HPC clusters while providing massive bandwidth and storage in a small footprint.
- High time criticality. Information that is not available quickly enough may be of little value. The move to high-performance data analysis using NVMe and vHPC technology provides a time-critical, fully flexible architecture that hyperscales.
- High variability. People generally assume that big data is "deep," meaning that it involves large amounts of data. They recognize less often that it may also be "wide," meaning that it can include many variables. Think of "deep" as corresponding to lots of spreadsheet rows and "wide" as referring to lots of columns (although a growing number of high-performance data analysis problems don't fit neatly into traditional row-and-column spreadsheets). A "deep" query might request a prioritized listing of last quarter's 500 top customers in Europe. A "wide" query might go on to analyze their buying preferences and behaviors in relation to dozens of criteria. An even "wider" analysis might employ graph analytics to identify any fraudulent behavior within the customer base. A cluster must be able to be varied according to the type of workload it will be asked to support. Normally this would require different physical HPC clusters. Using the Software-Defined Data Center construct allows designers to build a vHPC SDDC where multiple virtual vHPC clusters run on the same physical cluster.
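The deep-versus-wide distinction above can be sketched on a toy dataset. Every record, field name, and value below is fabricated purely for illustration:

```python
# Fabricated customer records: 500 rows ("deep"), several columns ("wide").
customers = [
    {"name": f"cust-{i}", "region": "EU", "q_revenue": 1000 - i,
     "channel": "web" if i % 2 else "retail", "returns": i % 5}
    for i in range(500)
]

# "Deep" query: scan many rows, few columns -- rank last quarter's
# top EU customers by revenue.
top = sorted((c for c in customers if c["region"] == "EU"),
             key=lambda c: c["q_revenue"], reverse=True)[:10]

# "Wide" query: same rows, more columns -- profile buying behavior
# of those top customers across additional criteria at once.
by_channel: dict[str, list[int]] = {}
for c in top:
    by_channel.setdefault(c["channel"], []).append(c["returns"])
profile = {ch: sum(r) / len(r) for ch, r in by_channel.items()}
```

The deep query touches one column across many rows; the wide query pivots the same rows across several columns, which is where graph and pattern analytics begin to pay off.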
The move by traditional vendors into hyperscale is bringing new hardware platforms, management tools, and support for organizations that might otherwise be wary of the home-grown nature of hyperscale computing. This means we're likely to see hyperscale techniques and architectures supplant HCI across the industry. It won't be on the scale of a Facebook or Google, but as companies evolve it will become a mainstream option for IT deployment.
And as hyperscale computing becomes more prevalent, it will be interesting to see how today’s incumbent vendors will deal with the challenge to their existing HCI product architectures.
To get the complete white paper on vHPC, Data Analytics and NVMe, send an e-mail to email@example.com.