Rapt AI, known for its AI-driven workload automation software for GPUs and AI accelerators, has announced a long-term strategic partnership with AMD. The collaboration aims to improve the management and performance of AI inference and training workloads on AMD Instinct GPUs, giving customers a scalable, economically viable path to deploying AI applications.
As the implementation of AI technology continues to rise, many organizations are facing challenges related to resource allocation, performance bottlenecks, and the intricacies of GPU management.
By integrating Rapt's workload automation platform with AMD's Instinct MI300X and MI325X GPUs and the forthcoming MI350 series, the collaboration seeks to deliver a scalable, cost-effective solution that lets customers optimize AI inference and training efficiency across on-premises and multi-cloud environments.
A more efficient solution
AMD Instinct MI325X GPU.
According to Charlie Leeming, CEO of Rapt AI, “Today’s AI models are increasingly large, dynamic, and unpredictable, rendering traditional optimization tools inadequate. We’ve observed this trend, and organizations are investing heavily in AI talent, driven by concerns from CFOs and CIOs regarding the return on investment. Some sectors are spending tens to hundreds of millions or even billions on GPU infrastructure.”
Leeming added that Anil Ravindranath, Rapt AI's CTO, has seen organizations respond to these concerns by deploying monitoring tools to gain visibility into their infrastructure.
“We believe we have the right solution at the right time. We emerged from stealth last fall and are now working with an expanding number of Fortune 100 companies, including two utilizing our code within leading cloud service providers,” Leeming noted.
He added, “We have established strategic partnerships, and our discussions with AMD were particularly productive. They produce exceptional GPUs and AI accelerators, and we excel at maximizing GPU workloads. Inference is currently in a production phase, and the demand for AI workloads is surging. Data scientists are racing against time and require efficient automation tools to address significant inefficiencies, including a staggering 30% underutilization of GPU resources. There’s a clear need for flexibility from large customers, and many are inquiring about AMD support.”
Optimization runs that once took nine hours can now complete in roughly three minutes, according to Ravindranath, who said the Rapt AI platform can support up to ten times more model runs on the same AI compute budget. That translates into cost savings of up to 90% while eliminating manual intervention and code changes; the resulting productivity gains mean less waiting for compute resources and less time spent tuning infrastructure.
Leeming acknowledged that existing techniques have not adequately addressed these issues. While competitor Run AI operates in a similar space, he said Rapt AI differentiates itself by accommodating unpredictable workloads, assessing infrastructure in minutes rather than hours, and optimizing performance accordingly.
“Our approach involves running the model to derive solutions, which is particularly advantageous for inference workloads. It needs to occur automatically,” Ravindranath stated.
The benefits: lower costs, better GPU utilization
AMD Instinct MI300X GPU.
AMD Instinct GPUs, with their large memory capacity, combined with Rapt's intelligent resource optimization, promise to improve GPU utilization for AI workloads and significantly reduce total cost of ownership.
The Rapt platform simplifies GPU management, freeing data scientists from trial-and-error infrastructure tuning. By automatically optimizing resource allocation for each workload, it lets teams focus on innovation rather than infrastructure. It also supports AMD and other GPU environments, whether in the cloud, on premises, or in hybrid setups, for maximum infrastructure flexibility.
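To make the idea of workload-aware resource allocation concrete, here is a minimal sketch of a first-fit placement heuristic that assigns jobs to GPUs by free memory. This is purely illustrative: Rapt's actual optimization logic is proprietary, and the `allocate` function, GPU names, and job memory figures below are hypothetical (the 192 GB figure reflects the MI300X's HBM3 capacity).

```python
# Illustrative only: a toy allocator that places AI jobs on GPUs by free
# memory, largest jobs first. Not Rapt AI's actual algorithm; all names
# and job sizes are hypothetical examples.

def allocate(jobs, gpus):
    """Assign each (name, mem_gb) job to the candidate GPU with the most
    free memory; map a job to None if no GPU can currently hold it."""
    free = {name: mem for name, mem in gpus}
    placement = {}
    for name, mem_gb in sorted(jobs, key=lambda j: -j[1]):  # largest first
        candidates = [g for g, f in free.items() if f >= mem_gb]
        if not candidates:
            placement[name] = None  # job must wait for capacity
            continue
        best = max(candidates, key=lambda g: free[g])
        free[best] -= mem_gb
        placement[name] = best
    return placement

gpus = [("mi300x-0", 192), ("mi300x-1", 192)]  # MI300X: 192 GB HBM3 each
jobs = [("llm-inference", 140), ("finetune", 90), ("embedding", 60)]
print(allocate(jobs, gpus))
```

Placing the largest job first and always picking the emptiest fitting GPU is a common bin-packing heuristic; a production scheduler would also weigh compute load, interconnect topology, and job priority.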
The joint solution streamlines job density and resource allocation on AMD Instinct GPUs, improving inference performance and scalability for real-world AI applications. Rapt's auto-scaling features further support efficient resource management according to demand, minimizing latency and maximizing cost-effectiveness.
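Demand-based auto-scaling of the kind described above can be sketched as a simple rule that sizes replica count to queue depth. The thresholds and the `target_per_replica` value here are hypothetical, and a real system such as Rapt's would factor in many more signals (latency, GPU utilization, cost targets).

```python
import math

# Illustrative only: a minimal demand-based autoscaling rule. The
# parameters are hypothetical, not Rapt AI's actual policy.

def desired_replicas(queue_depth, target_per_replica=8,
                     min_replicas=1, max_replicas=16):
    """Size the replica count so each replica handles roughly
    target_per_replica queued requests, within fixed bounds."""
    if queue_depth <= 0:
        return min_replicas  # idle: scale down to the floor
    want = math.ceil(queue_depth / target_per_replica)
    return max(min_replicas, min(max_replicas, want))

print(desired_replicas(40))    # 40 queued requests at 8 per replica
print(desired_replicas(0))     # idle workload
```

Clamping between a floor and a ceiling prevents both cold-start latency spikes and runaway cost, the same trade-off the article attributes to Rapt's demand-driven resource management.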
The partnership has resulted in a platform that works seamlessly with AMD Instinct GPUs, leading to immediate performance enhancements. Continued collaboration between Rapt and AMD is expected to foster further advancements, especially in areas such as GPU scheduling and memory utilization, equipping customers with a future-ready AI framework.
Negin Oliver, AMD’s corporate vice president of business development for data center GPU, commented, “At AMD, our goal is to provide high-performance, scalable AI solutions that enable organizations to realize the full potential of their AI workloads. Our collaboration with Rapt AI merges the innovative features of AMD Instinct GPUs with Rapt’s intelligent workload automation, facilitating improved efficiency, flexibility, and cost savings across AI infrastructures.”
Source
venturebeat.com