Run.ai partners with Nvidia as it sets its sights on inferencing


Published on 07/23/2022


Run.ai, the well-funded AI workload orchestration company, made its name over the past few years by helping its users get the most out of their GPU resources, both on-premises and in the cloud, when training their models. But building models is one thing; putting them into production is quite another, and that is where many of these projects still fall short. So it may come as no surprise that the company, which views itself as an end-to-end platform, is now going beyond training to help its customers run their inferencing workloads as efficiently as possible, whether in a private or public cloud, at the edge, or anywhere else. As the result of close cooperation between the two businesses, the company's platform now also integrates with Nvidia's Triton Inference Server software.

The goal is to make deploying models as simple as possible. Run.ai promises a two-step deployment process with no need to write YAML files. With the new Nvidia integration in the Run.ai Atlas platform, users can even deploy multiple models, or multiple instances of the same model, on the Triton Inference Server, with Run.ai, which is also part of Nvidia's LaunchPad program, handling auto-scaling and prioritization on a per-model basis.
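For context, Triton itself serves models over standard HTTP and gRPC endpoints, so once Run.ai has a Triton instance running, any client can query it by model name. Here is a minimal sketch using Nvidia's tritonclient Python package; the model name and tensor names are placeholders for whatever a given Triton model repository defines:

    import numpy as np
    import tritonclient.http as httpclient

    # Connect to a running Triton Inference Server (default HTTP port is 8000).
    client = httpclient.InferenceServerClient(url="localhost:8000")

    model_name = "resnet50"  # placeholder: any model in the server's repository
    assert client.is_model_ready(model_name)

    # Build the request: tensor names and shapes must match the model's config.
    batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
    inputs = [httpclient.InferInput("input", list(batch.shape), "FP32")]
    inputs[0].set_data_from_numpy(batch)
    outputs = [httpclient.InferRequestedOutput("output")]

    # Run inference and read the result back as a NumPy array.
    result = client.infer(model_name, inputs, outputs=outputs)
    print(result.as_numpy("output").shape)

Because each model on a Triton server is addressed by name, one server process can host many models at once, which is the property Run.ai's per-model auto-scaling and prioritization builds on.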

Even though inferencing doesn't demand the same massive computational resources it takes to train a model, Nvidia's Manuvir Das, the company's VP of enterprise computing, said that these models are becoming increasingly large and that running them on a CPU simply isn't viable.

Models will only become more complex over time, Run.ai CEO Omri Geller added. There is, after all, a clear relationship between a model's accuracy and its computational complexity, he pointed out, and consequently between that complexity and the problems a business can solve with those models.

Although Run.ai first concentrated on training, many of the tools it built for that purpose carry over to inferencing as well. The resource-sharing mechanisms the company developed for training, for instance, also apply to inferencing, where certain models need extra resources to run in real time.

It's worth noting that, beyond the Nvidia partnership, Run.ai today also announced a number of other platform updates. These include new inference-focused metrics and dashboards, as well as the ability to deploy models on fractional GPUs and automatically scale them according to each model's latency service-level agreement (SLA). And because deployments can now scale all the way down to zero, the platform can also help lower costs.
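Run.ai hasn't published the internals of this autoscaler, but the underlying idea is easy to sketch. The following Python snippet is purely illustrative (the function, signals, and thresholds are hypothetical, not Run.ai's actual API): it scales a model's replica count up when observed latency breaches its SLA and down to zero when traffic stops:

    def desired_replicas(p95_latency_ms: float, sla_ms: float,
                         current: int, pending_requests: int,
                         max_replicas: int = 8) -> int:
        """Choose a replica count for one model from its latency SLA.

        Hypothetical scale-up / scale-to-zero logic, for illustration only.
        """
        if pending_requests == 0:
            return 0  # scale to zero: no traffic, release the (fractional) GPU
        if current == 0:
            return 1  # cold start on the first incoming request
        if p95_latency_ms > sla_ms:
            return min(current + 1, max_replicas)  # SLA breached: add a replica
        if p95_latency_ms < 0.5 * sla_ms and current > 1:
            return current - 1  # comfortably under the SLA: shed a replica
        return current

A real controller would evaluate a loop like this separately for each deployed model, which is exactly the granularity the per-model SLA setting implies.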

About Nvidia

Nvidia Corporation is a technology company that designs and manufactures graphics processing units (GPUs). Jen-Hsun "Jensen" Huang, Curtis Priem, and Chris Malachowsky founded the company in 1993. Its headquarters are in Santa Clara, California.

Nvidia's founders anticipated that a dedicated GPU would be necessary to advance computer graphics. Computer games were once entirely CPU-based, but gaming technology was improving and gradually shifting from MS-DOS to Windows, and the CPU's math coprocessor simply could not handle the amount of floating-point work that graphics, especially 3D graphics, required.

About Run.ai

Run.ai created the first compute-management platform for AI. By centralizing and virtualizing GPU compute resources, Run.ai gives data scientists visibility and control over resource prioritization and allocation while streamlining processes and minimizing infrastructure hassles. As a result, the productivity of data science teams increases significantly, enabling them to build and train concurrent models without resource constraints, and AI projects stay aligned with business goals.

IT and research teams gain real-time visibility while maintaining control, including the ability to observe and provision each job's run time, queueing, and GPU utilization. Using a virtual pool of resources, researchers can view and allocate compute resources across multiple sites, whether on-premises or in the cloud.

Peter Daniels
Peter Daniels is the lead journalist for InsiderApps.com

