Artificial intelligence (AI) and machine learning (ML) have taken the globe by storm, and they show no sign of slowing down. While AI has primarily dominated the high-tech industry, it’s starting to spread into other markets, such as healthcare, finance, and education. In fact, NVIDIA’s healthcare vice president, Kimberly Powell, claims that AI-powered discovery and diagnosis is rising toward 50% of NVIDIA’s medical applications. Universities, drug giants, and biotech firms have AI centers that use NVIDIA technology to screen for new drugs or anticipate mutations of the COVID-19 virus. Suffice it to say, NVIDIA has positioned itself nicely to benefit from the computing demands of AI models like ChatGPT. Furthermore, IDC predicts that “75% of large enterprises will rely on AI-infused processes to enhance asset efficiency, streamline supply chains, and improve product quality.” There’s also strong interest in computer vision tasks and generative AI. Forrester predicts that 10% of Fortune 500 companies will generate content with AI tools in 2023.
More and more companies are preparing for widespread AI adoption and its repercussions. Recognizing this market demand, Meta is restructuring its data center projects to prepare for a future of high-performance computing and AI reliance. This restructuring will also accelerate Meta’s implementation of liquid cooling to support its AI-powered data center operations. At the Open Compute Summit, Meta had already outlined a roadmap toward liquid-cooled infrastructure, so it’s not surprising that the prominence of AI/ML could accelerate those plans. Meta’s plans highlight how strongly AI/ML will affect the data center market. It’s clear that the benefits of AI have arrived. Let’s dive into how data centers will have to follow suit to stay competitive.
Liquid Cooling Accelerates AI/ML Innovation
It’s no surprise that the data center market has seen immense growth over the past few years. The growth of big data shows no sign of slowing, and new technologies that use AI/ML only add to those workloads. As AI integration increases, liquid-cooled data centers will be necessary to support future projects. In 2022, Forbes predicted that liquid cooling would become inevitable, anticipating that the rise of edge and unmanned data centers, especially those in harsh climates, will require liquid cooling to function efficiently. AI workloads have contributed greatly to liquid cooling adoption, and liquid cooling is likely to play a pivotal role in both advancing AI research and facilitating development.
Dense AI Workloads Require High-Powered Processors
Growing reliance on AI/ML raises several concerns about device longevity and scalability. Dense AI workloads require advanced processing capabilities to perform well. As a result, chip manufacturers are focusing on high-powered AI processors to meet the demands of high-performance computing. AMD announced a range of new AI-centered chips at CES 2023 and debuted a new MI300 chip that promises a fivefold performance-per-watt increase for AI workloads. Furthermore, a new AI architecture, named XDNA, is changing how AMD designs future chips. AMD CEO Lisa Su claims the AI architecture empowers AMD to scale AI “to edge devices and into the cloud” and that “AI is truly the most important megatrend for future tech.” Intel also unveiled its 4th Gen Intel Xeon Scalable processors, which aim to offer new computing capabilities for AI, the cloud, the network, the edge, and some of the world’s most powerful supercomputers. Specifically, the processors aim to unlock new performance levels for AI training and workloads.
In addition, edge computing is driving AI chip development. Edge computing is a highly viable way to support AI/ML technologies and their advancement. For context, AI/ML applications perform well when data is processed close to where it’s created. Consequently, the data center market relies heavily on edge computing to fuel future AI/ML projects. As a result, chip manufacturers such as SiMa.ai, Hailo Technologies, AlphaICs, Intel, AMD, and NVIDIA are preparing to develop and produce edge AI chips. Global Market Insights even predicts edge AI will top $5B in 2023.
However, developing new accelerators is only part of the battle. AI integration, alongside higher-powered processors and edge computing technologies, results in denser workloads for data centers. Unfortunately, increased density brings increased heat, which must be adequately removed to prevent hardware damage and poor performance. As a result, AI, edge computing, and liquid cooling are strongly interdependent: the gains from the first two can only be realized if the heat they generate can be managed. Data center operators must raise the thermal design power (TDP) limits they can support to keep pace with advanced AI chips and edge computing. Fortunately, new liquid cooling solutions can support more than 1,000W TDP, helping break thermal barriers and support AI advancement. Enterprises that plan to use AI technologies need to invest in liquid cooling solutions that provide flexibility for AI/ML innovation.
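To give a rough sense of what a 1,000W TDP means for a cooling loop, the sketch below applies the standard heat-transfer relation Q = ṁ · c_p · ΔT to estimate how much coolant flow is needed to carry away a given heat load. It is a simplified illustration, not a design tool: it assumes a water-like coolant, that all chip power ends up as heat in the liquid, and a hypothetical 10 K coolant temperature rise and 32 chips per rack. None of these figures come from a specific product.

```python
# Rough coolant-flow estimate from the heat-transfer relation Q = m_dot * c_p * dT.
# All inputs below are illustrative assumptions, not vendor specifications.

WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0  # approximate value for liquid water
WATER_DENSITY_KG_PER_L = 1.0             # approximate value for liquid water


def rack_heat_load_w(chip_tdp_w: float, chips_per_rack: int) -> float:
    """Total heat load, assuming every chip runs at its full TDP."""
    return chip_tdp_w * chips_per_rack


def coolant_flow_l_per_min(
    heat_load_w: float,
    delta_t_k: float,
    specific_heat: float = WATER_SPECIFIC_HEAT_J_PER_KG_K,
    density: float = WATER_DENSITY_KG_PER_L,
) -> float:
    """Volumetric flow needed to absorb heat_load_w with a coolant
    temperature rise of delta_t_k between inlet and outlet."""
    if heat_load_w < 0 or delta_t_k <= 0:
        raise ValueError("heat load must be >= 0 and delta_t_k must be > 0")
    mass_flow_kg_per_s = heat_load_w / (specific_heat * delta_t_k)
    return mass_flow_kg_per_s / density * 60.0


if __name__ == "__main__":
    # Hypothetical scenario: one 1,000 W chip, then a rack of 32 such chips.
    per_chip = coolant_flow_l_per_min(1000.0, delta_t_k=10.0)
    per_rack = coolant_flow_l_per_min(rack_heat_load_w(1000.0, 32), delta_t_k=10.0)
    print(f"Per chip: {per_chip:.2f} L/min")
    print(f"Per rack: {per_rack:.2f} L/min")
```

Under these assumptions a single 1,000W chip needs on the order of 1.4 L/min of water, and the requirement scales linearly with the number of chips. Real systems also have to account for pump limits, coolant type, and heat from other components, so the numbers should be read as order-of-magnitude guidance only.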
Scalable and Repeatable Solutions Are the Future
Data center infrastructure must be scalable and repeatable to be competitive in the market. With the significant adoption of AI, operators must now also account for AI scalability in data center design. This refers to both geographic and load scalability. Specifically, data centers must be easily repeatable and scalable in their hardware and software components. In turn, AI software needs to ramp its performance up quickly to match the desired computing power or project requirements, which means there will be large fluctuations in compute loads. Investing in scalable liquid cooling solutions that can adapt to project-specific needs will be necessary to support an AI-powered ecosystem. However, the full potential of scalable AI may only be achievable with some standardization and integration across devices in the market. Cooling solutions that require few infrastructure changes and fit inside an air-cooled form factor will help promote scalability, enabling companies to implement AI faster and without expensive facility upgrades.
Increase Device Longevity and Support Sustainable Practices
Sustainability has been a key trend in the data center market for the past few years. AI technology can both help and hurt sustainability efforts. On the one hand, the emissions from running large machines contribute to global warming. On the other hand, society could benefit greatly from the simulations and models run on advanced AI. Furthermore, scaling AI up multiplies its energy use, adding another level of environmental concern.
For now, it’s hard to pin down the exact amount of energy needed to run large, long-running models. For context, though, one industry expert notes that some of the largest language models can take up to an entire rail car’s worth of coal to train. Simply training AI systems can require an immense amount of energy, and that’s before they start performing their end functions. In addition, Gartner predicts that without sustainable AI practices, AI will “consume more energy than the human workforce, significantly offsetting carbon-zero gains” by 2025.
Beyond the increase in emissions, developing AI chips is also extremely resource intensive. The chip shortage that began at the start of the COVID-19 pandemic is still reverberating to this day, hindering the chip supply chain. This scarcity of resources only heightens the need for device longevity, especially for AI-powered data centers. Investing in AI infrastructure is already an expensive endeavor. With prices upwards of $50K just to get started, enterprises must ensure their hardware will stand the test of time. Fortunately, liquid cooling solutions can help increase device lifetime while also cutting AI operational costs. As a result, liquid cooling makes AI implementation much more feasible for enterprises looking to move toward an AI-centered ecosystem.
Conclusion
Overall, these trends indicate the world is preparing for an AI-driven future, whether that means running advanced computing simulations or simply generating artwork for social media. As more and more users adopt these compute-intensive AI/ML solutions, data centers increasingly need efficient cooling techniques to deal with extreme heat loads. In this regard, liquid cooling systems are becoming a viable and increasingly popular solution for large-scale AI/ML applications. With a total cost of ownership lower than that of air-cooled systems, liquid cooling technologies offer an efficient way to reduce operating costs while increasing chip performance in AI-driven applications.