Last week at Hewlett Packard Enterprise (HPE) Discover, the company announced its entry into the AI cloud market by expanding its HPE GreenLake portfolio. In 2019, HPE acquired supercomputing vendor Cray Inc.
At that time, Antonio Neri, the CEO of HPE, said, “Answers to some of society’s most pressing challenges are buried in massive amounts of data.” Yet supercomputing was reserved mainly for academic researchers or companies with deep pockets. The need for more advanced computing infrastructures grew exponentially as companies began adopting AI.
While cloud computing services hold the promise of virtually unlimited capacity, these services aren’t always the most efficient or cost-effective way to train large-scale models. The new HPE AI cloud service, called HPE GreenLake for large language models (LLMs), enables organizations to train, tune, and deploy AI models at scale and with security and compliance to foster responsible and sustainable AI adoption.
HPE GreenLake for LLMs runs on HPE Cray XD supercomputers. This new GreenLake offering eliminates the need for customers to purchase and manage supercomputers, which can be costly and complex. The offering leverages the HPE Cray Programming Environment and HPE’s AI and ML software suite, providing comprehensive tools for optimizing HPC and AI applications, training large-scale models, and managing data.
HPE teamed up with Aleph Alpha, a German AI startup, to offer ready-to-use large language models (LLMs) as part of its on-demand, multi-tenant supercomputing cloud service. HPE GreenLake for LLMs offers access to Luminous, Aleph Alpha’s pre-trained large language model with 13 billion parameters, available in multiple languages. HPE GreenLake for LLMs enables enterprises to use their company data to train, tune, and deploy large-scale AI models using HPE’s AI software and supercomputers. By using a company’s databases, articles, and industry-specific data, businesses can ground the model in factual data and relevant user context to mitigate issues such as hallucinations. A hallucination in artificial intelligence (also called confabulation or delusion) is a confident response by an AI that is incorrect or does not seem to be justified by its training data.
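To make the grounding idea concrete, here is a minimal sketch of retrieval-based grounding: pick the most relevant company document for a query and prepend it as context so the model answers from factual data rather than guessing. All function and document names are hypothetical illustrations, not HPE’s or Aleph Alpha’s actual API, and real systems use semantic embeddings rather than word overlap.

```python
import re

def score(query: str, doc: str) -> int:
    """Relevance as the count of overlapping words (toy stand-in for
    the embedding similarity a production retriever would use)."""
    q = set(re.findall(r"\w+", query.lower()))
    d = set(re.findall(r"\w+", doc.lower()))
    return len(q & d)

def build_grounded_prompt(query: str, company_docs: list[str]) -> str:
    """Retrieve the best-matching document and build a prompt that
    instructs the model to answer only from that context."""
    best = max(company_docs, key=lambda d: score(query, d))
    return (
        "Answer using ONLY the context below. If the answer is not "
        "in the context, say you don't know.\n"
        f"Context: {best}\n"
        f"Question: {query}"
    )

# Hypothetical company documents standing in for a real knowledge base.
docs = [
    "Warranty policy: industrial pumps are covered for 24 months.",
    "Shipping: orders over $500 ship free within North America.",
]
prompt = build_grounded_prompt("How long is the pump warranty?", docs)
```

Because the prompt constrains the model to the retrieved context, an answer like “24 months” is traceable to company data instead of the model’s generic training corpus.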
HPE didn’t announce any specific support for OpenAI, but it’s clear that OpenAI is deeply involved with Microsoft. HPE’s offering provides another option in the market if a company is looking for an alternative to Microsoft or OpenAI. What does it mean for companies? One of the common challenges with training LLMs is that training runs may fail to launch or stall mid-run due to a lack of computing power. Insufficient computing power slows down the time it takes to train the model while increasing its cost because stalled runs must be restarted.
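The restart cost is why long training runs are typically checkpointed: a stall resumes from the last saved state instead of from scratch. The sketch below illustrates the pattern with a trivial stand-in for a training step; the file name, save interval, and state format are illustrative choices, not any product’s API.

```python
import json
import os

CKPT = "train_state.json"  # hypothetical checkpoint file

def save_checkpoint(step: int, w: float) -> None:
    """Persist the current step and model state to disk."""
    with open(CKPT, "w") as f:
        json.dump({"step": step, "w": w}, f)

def load_checkpoint() -> tuple[int, float]:
    """Resume from the last checkpoint if one exists, else start fresh."""
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            state = json.load(f)
        return state["step"], state["w"]
    return 0, 0.0

start, w = load_checkpoint()
for step in range(start, 100):
    w += 0.01  # stand-in for one expensive training step
    if step % 10 == 0:
        save_checkpoint(step + 1, w)  # a crash now loses at most 10 steps
```

If the process dies and restarts, `load_checkpoint` picks up within ten steps of where it left off, so only a small slice of compute is re-spent rather than the whole run.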
Companies need a way to run HPC-focused machine learning applications without building and managing on-premises supercomputers. As noted above, another issue with using open-source LLMs today is that they are generic models trained on a large corpus of Internet data that may be inaccurate and lack specific industry data. Supercomputing for LLMs can address different sectors’ unique requirements and challenges by linking industry data with more generic LLMs. HPE said it plans to launch a series of industry and domain-specific AI applications in the future.
These applications will cater to various sectors, including climate modeling, healthcare and life sciences, financial services, manufacturing, and transportation. It makes sense to address these verticals because each industry has vast volumes of specific data. These industries also have use cases where the financial benefits of successful outcomes would far outweigh the cost of AI supercomputing.
Speed is another area where AI supercomputing stands out. For example, detecting fraud and security threats in financial services transactions requires processing high data volumes in compressed timeframes. You may be asking why you can’t use general-purpose cloud offerings for AI. The answer is that you can for specific workloads, but not all AI workloads are the same. HPC workloads require an AI-native architecture specifically designed to handle a single large-scale AI training and simulation workload at full computing capacity. Unlike general-purpose clouds that run many workloads in parallel, HPE GreenLake for LLMs runs AI and high-performance computing jobs across hundreds or thousands of CPUs or GPUs simultaneously. This infrastructure design allows for more effective and efficient AI training, resulting in more accurate models that accelerate problem-solving for enterprises. That doesn’t imply the hyperscalers are resting on their laurels.
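The idea of one job spanning many devices can be sketched as data-parallel training: each device computes gradients on its shard of the batch, and averaging the shard gradients reproduces the full-batch gradient, so adding devices shrinks wall-clock time per step. Plain Python stands in for GPUs here; this is a conceptual illustration, not HPE’s software stack.

```python
def gradient(w: float, batch: list[tuple[float, float]]) -> float:
    """Gradient of mean squared error for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def data_parallel_step(w: float, batch, n_devices: int, lr: float = 0.01) -> float:
    """One synchronous SGD step: split the batch into equal shards,
    compute each shard's gradient (in parallel on real hardware),
    then average them -- the 'all-reduce' step."""
    shard = len(batch) // n_devices
    shards = [batch[i * shard:(i + 1) * shard] for i in range(n_devices)]
    grads = [gradient(w, s) for s in shards]
    avg = sum(grads) / n_devices
    return w - lr * avg

# Toy data generated from the ground-truth model w = 3.
data = [(x, 3.0 * x) for x in range(1, 9)]
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, data, n_devices=4)
```

The averaged shard gradients equal the full-batch gradient exactly (for equal shards), which is why a synchronous job across thousands of GPUs behaves like one enormous single-device training step, just much faster.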
Google recently announced an A3 supercomputer with Nvidia H100 GPUs, while Amazon recently updated its HPC7g instances. However, HPE’s offering should be particularly appealing to existing HPE GreenLake customers that are building hybrid cloud strategies and have HPC requirements.
Sustainability is another issue within the AI landscape. Most large enterprises that Lopez Research consults with are defining sustainability metrics, adding ESG monitoring solutions, and actively seeking ways to reduce energy consumption. The processing power required to support new AI models is in many ways at odds with these goals. As a result, businesses are leaning heavily on their infrastructure providers to deliver solutions that help address sustainability concerns. All the hyperscaler cloud companies are designing various infrastructure, cooling, and optimization solutions to address this problem. HPE is also in this camp. It said GreenLake for LLMs operates in colocation facilities that run on nearly 100% renewable energy. The service requires specialized supercomputing data centers optimized for power and cooling, so it will be available in North America by the end of calendar year 2023, followed by Europe in early 2024. By partnering with QScale in North America, HPE ensures the service aligns with environmentally friendly practices.
However, not all AI use cases require supercomputing power. If your company doesn’t require supercomputing power but needs servers optimized for AI workloads, HPE has updated its ProLiant Gen11 servers with 4th Gen Intel Xeon Scalable CPUs, offering improved performance for AI inference tasks. It’s exciting and surprising to see HPE bring a new as-a-service computing offering to the market during its Discover conference. HPE’s move aims to significantly shift the AI landscape by making HPC more accessible to a broader range of organizations. This type of service appears best suited for larger enterprises with large volumes of data and the budget to pay for on-demand supercomputing, but it’s still a leap forward in the market. Even if a large enterprise could afford to build large-scale AI infrastructure, there is a well-known shortage of NVIDIA GPUs.
You can’t train models if you don’t have the proper hardware. It will be interesting to see what type of companies embrace these new services and what workloads will now be possible with access to increased processing power.