Rafay and Netris: Partnering to speed up consumption and monetization for GPU Clouds

 

Rafay, a pioneer in delivering platform-as-a-service (PaaS) capabilities for self-service compute consumption, and Netris, a leader in network automation, abstraction, and multi-tenancy for AI and cloud operators, are collaborating to help GPU Cloud Providers speed up consumption and monetization of GPU-based infrastructure by offering self-service workflows for model training, fine-tuning, and inference use cases. By partnering with Rafay and Netris, GPU Cloud Providers can go from GPUs to an enterprise-grade GPU Cloud in weeks.

The state of the GPU Cloud industry

Telecommunications companies, data center operators and sovereign entities are racing to establish their lead in the AI space by deploying AI Clouds in their respective regions. With billions of dollars being invested across the globe, high-performance GPUs are now being made available in various geographies.

But as Cloud Service Providers (CSPs) such as Amazon Web Services, Microsoft Azure, and Google Cloud Platform have learned over 10+ years, the key to success for Cloud Providers is the ability to deliver self-service consumption of compute. Without self-service consumption workflows, adoption becomes incredibly challenging. Without fast adoption and high utilization, achieving a high return on investment is incredibly difficult, if not impossible.

Today, many first-generation GPU Clouds do not offer self-service consumption workflows and instead rely on manual work by backend staff to meet consumer needs.

Rafay and Netris

Rafay delivers an enterprise-grade PaaS experience for developers and data scientists through its workflow management, virtualization, and substrate management (aka soft tenancy) capabilities. Netris delivers CSP-grade network automation to segment the network (aka hard tenancy). Together, they provide a high degree of flexibility and segregation, so that multiple enterprises can co-exist in the same GPU Cloud without affecting each other at the network level.

The diagram below illustrates the joint value proposition:

 

With Rafay and Netris, GPU Cloud Providers can enjoy the following benefits:

        • SKU Automation and Management
          Cloud Providers can programmatically define SKUs that consist of GPUs, CPUs, AI applications, or some combination thereof.
        • Self-Service Portals for Developers & Data Scientists
          Cloud Providers can offer self-service portals for developers and data scientists to consume compute and AI applications on demand.
        • Enterprise-grade User Management
          Cloud Providers can offer support for enterprise single sign-on (SSO) and role-based access control (RBAC) to ensure secure consumption, along with deep audit trails that can be exported to enterprise SIEMs.
        • Enterprise Administration
          Cloud Providers can sell blocks of compute to enterprises and empower them to govern their allocated compute block through persona-specific configuration management portals and dashboards.
        • Kubernetes Cluster Lifecycle Management & Platform Management
          Cloud Providers can easily manage fleets of Kubernetes clusters in their data center(s) or in public cloud environments. Furthermore, they can deliver secure, multi-tenant environments that meet enterprise security requirements through features such as virtual clusters, network segmentation, RBAC, secure remote access, policy enforcement, quota enforcement, and immutable auditing.
        • Virtual Machine Provisioning & Lifecycle Management
          Cloud Providers can easily manage fleets of virtual machines for scenarios where customers prefer to leverage off-the-shelf AI applications that require virtual machines as the substrate.
        • Usage and Chargeback Data
          Cloud Providers get turnkey access to chargeback data, which can be easily integrated into billing systems for post-paid use cases.
        • Underlay (network-level) Automation
          Cloud Providers can support customers that need a large number of GPUs on demand by programmatically configuring the underlying networking layer (switches, etc.) to ensure hardware-level multi-tenancy and the highest levels of performance.
        • Network Lifecycle Management and Automation
          Cloud Providers can streamline operation and management of the underlying east-west and north-south switch fabrics, gaining stability and scalability. Built-in automation of NVIDIA networking guidelines and rigorously tested best practices cuts time to market by eliminating guesswork and in-house development.
        • Cloud Networking Functions
          Cloud Providers can also offer the essential cloud networking functions that end users expect from a competitive cloud offering, such as Internet gateways, NAT gateways, Elastic Load Balancers, Direct Connect, and others.
        • Support
          Cloud Providers gain highly qualified 24/7/365 support from Rafay and Netris to handle any issues or ongoing questions. Rafay and Netris each have 6+ years of experience supporting live cloud provider customers based on NVIDIA and other networking technologies. This is a critical value-add that gives Cloud Providers the confidence to offer world-class services.
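
To make the SKU and chargeback items above more concrete, here is a minimal Python sketch of how a provider might model SKUs and turn metered usage into post-paid charges. It is illustrative only and does not represent the Rafay or Netris APIs: the `Sku` and `UsageRecord` types, their fields, the `chargeback` function, and the prices are all hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sku:
    """Hypothetical sellable unit: a bundle of GPUs and CPUs with an hourly price."""
    name: str
    gpus: int
    cpus: int
    hourly_rate_usd: float


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical metering record: hours of one SKU consumed by one tenant."""
    tenant: str
    sku: Sku
    hours: float


def chargeback(records: list[UsageRecord]) -> dict[str, float]:
    """Sum post-paid charges per tenant from metered usage records."""
    totals: dict[str, float] = {}
    for record in records:
        charge = record.hours * record.sku.hourly_rate_usd
        totals[record.tenant] = totals.get(record.tenant, 0.0) + charge
    # Round to cents for hand-off to a billing system.
    return {tenant: round(total, 2) for tenant, total in totals.items()}


if __name__ == "__main__":
    # Example SKUs and usage; names and prices are made up for illustration.
    small = Sku("gpu-1x", gpus=1, cpus=16, hourly_rate_usd=2.50)
    large = Sku("gpu-8x", gpus=8, cpus=128, hourly_rate_usd=18.00)
    usage = [
        UsageRecord("acme", small, hours=10),
        UsageRecord("acme", large, hours=2),
        UsageRecord("globex", large, hours=5),
    ]
    print(chargeback(usage))
```

In a sketch like this, the per-tenant totals are the kind of data a billing system would ingest at the end of a billing period; a real deployment would take SKU definitions and usage records from the platform rather than from hard-coded values.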

Call to Action

If you currently operate a GPU Cloud or plan to launch GPU cloud services and would like to expand your customer base to thousands of enterprise customers by providing a CSP-like experience, please reach out to either Rafay or Netris to learn more. We will be delighted to assist.