Strengths

Ultimate Parallel Computing Capabilities
Enhance your business’s efficiency and competitive edge with advanced parallel computing capabilities that boost performance.

Quick Environment Deployment

Native Acceleration Engine
Enhance distributed training and inference speed by leveraging TACO Kit, an off-the-shelf computing acceleration engine offered by Cloud.
Solutions
All-True Internet Solution
Rendering Solution
AI-Enabled Moderation Solution for Game Live Streaming Platforms
Autonomous Driving Solution

Customer challenges:
To create an immersive, realistic environment, virtual worlds depend on robust computational power for rendering and other resource-intensive tasks. Yet most mobile phones and similar end-user devices lack the hardware to consistently handle such demanding rendering workloads. Furthermore, software packages containing the necessary rendering engines and other essential components often balloon to gigabytes in size, consuming substantial storage space on the user’s device.
Solution:
Leveraging Cloud’s robust GPU computing capabilities, this solution seamlessly incorporates enterprise-level rendering, qGPU-based container-level resource partitioning and virtualization technologies, video encoding and decoding capabilities, and cloud-based streaming solutions. It provides substantial computing power for cloud-based rendering, enabling end users to access high-performance rendering simply by connecting to the network. This approach alleviates the burden on end-user devices’ resources and storage while ensuring a seamless integration experience between the cloud and edge computing.
Benefits of cloud deployment:
The cloud-native solution enables canary releases and swift deployment for your business. Elastic scaling lets you allocate extensive resources efficiently and scale seamlessly to accommodate peak and off-peak demands. Paired with qGPU sharing technology, which isolates GPU computing power and VRAM between containers, it significantly boosts GPU utilization while reducing enterprise costs.
The next-generation Cloud GPU Service delivers high-density encoding computing power and enhanced network performance. In collaboration with NVIDIA, it introduces China’s premier all-in-one CloudXR solution, ensuring a seamless and immersive user experience for VR applications.

Customer challenges:
In sectors like film, television, advertising, and architectural design, content creators and post-production teams heavily depend on numerous machines to accomplish rendering tasks associated with visual effects, 3D animations, and design drafts.
Conventional IDC resources frequently face challenges in fulfilling the substantial rendering requirements, particularly during sporadic peak periods, resulting in underutilized resources during off-peak times. Consequently, the initial investment costs are considerable, while the return on investment is sluggish, posing obstacles to enterprise growth.
Solution:
The Cloud GPU Service offers a range of professional GPU rendering instances that work with BatchCompute, allowing teams to automate their content rendering workflows. Using the vast computing resources of the Cloud GPU Service and the job scheduling capabilities of BatchCompute, creative and technical professionals can streamline their visual creation projects by building rendering pipelines tailored to their needs.
Benefits of cloud deployment:
The Cloud GPU Service enables swift cluster creation and management, empowering you to choose GPU models and quantities that precisely match your rendering requirements.
Configurable and user-friendly jobs are effortlessly deployable and reusable. Craft a job flow tailored to your specific processes and rendering methodologies, all within the cloud environment.
With a multi-dimensional job and resource monitoring system, Cloud GPU Service alleviates concerns about operations at the IaaS layer.

Customer challenges:
As the business expanded, the self-constructed cluster experienced prolonged scaling times, struggling to support the escalating volume of video moderation tasks.
The aging hardware components within the self-built cluster lacked sufficient computing, storage, and network capabilities to meet the company’s demands for high concurrency and minimal latency.
Solution:
The GPU-driven inference cluster accommodates an extensive array of data samples and can swiftly scale to handle a large volume of concurrent requests, removing performance bottlenecks.
Employing Turbo high-throughput storage, the high-performance training cluster speeds up model training and iteration, improving video moderation accuracy and success rates.
Benefits of cloud deployment:
The Cloud GPU Service supports elastic scaling, allowing swift adjustments to service scale in accordance with your current business requirements. You can procure and use computing power as needed, significantly lowering upfront investment costs.

Customer challenges:
Autonomous driving systems gather data using in-vehicle sensors and cameras, resulting in the generation of multiple terabytes of data daily. Swift analysis, processing, and persistent storage of these extensive datasets are imperative. Thus, meeting the computing and storage requirements across various stages like annotation, training, and simulation necessitates a sizable computing cluster equipped with high IOPS, high-performance storage, and high-bandwidth network infrastructure.
Solution:
The Cloud GPU Service offers a CPM High-Performance Computing cluster powered by V100 and A100 GPUs, capable of meeting the demanding computational needs of autonomous driving systems.
Nodes within the High-Performance Computing cluster are interconnected over a 100 Gbps RDMA network, supporting efficient collaboration, especially when used with GooseFS to improve the productivity of large-scale distributed training clusters.
COS stores data redundantly across diverse infrastructure and devices, and offers remote disaster recovery and resource isolation to ensure the durability and security of data.
Benefits of cloud deployment:
The Cloud GPU Service enables the execution of extensive parallel simulations, boasting high elasticity and cost-effective storage. Offering comprehensive support, it empowers automakers and R&D teams to accelerate the development and optimization of autonomous driving technology, all while minimizing expenses.
Success Stories
The Cloud GPU Service is a flexible computing solution offering GPU processing power and robust parallel computing capabilities. It delivers readily available computing resources, effectively alleviating computational burdens, enhancing business efficiency and competitiveness, and empowering business achievements.

Ubitus
Ubitus’ cloud gaming service uses a distributed service-oriented architecture to expedite numerous compute-intensive tasks in the cloud, including multimedia conversion and game image compression. The Cloud offers a variety of GPU instance specifications and storage resources tailored to Ubitus’ workload, enhancing utilization and helping achieve operational goals at reduced costs.

WeBank
WeBank’s face recognition identity verification technology leverages the Cloud GPU Service, deploying extensive inference clusters to handle verification requests in real time. This effectively addresses the primary challenge in optimizing financial service efficiency.

The vast data on WeChat significantly increases compute resource usage and extends training time. Traditional CPU clusters would typically require several hours to finish a training task, greatly hindering service iteration speed. However, with multi-instance multi-card distributed GPU training, tasks can now be completed in just minutes.
FAQs
General
Network
Storage
Regions and AZs
Security
Image
In what cases should I use a GPU instance?
A GPU contains more arithmetic logic units (ALUs) than a CPU and excels at large-scale multi-thread parallel computing. It is particularly well-suited for the following applications:
- AI computing: Deep learning inference and training
- Graphics and image processing: Cloud game, cloud phone, cloud desktop, and CloudXR
- High-performance computing: Fluid dynamics, molecular modeling, meteorological engineering, seismic analysis, genomics, etc.
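The workloads above share a common shape: the same operation applied independently across many data elements, which is exactly what a GPU's many ALUs parallelize. As a CPU-side illustration of such a data-parallel workload (a generic NumPy sketch; GPU frameworks such as CUDA or PyTorch execute the same operation across thousands of GPU cores):

```python
import numpy as np

# A dense matrix multiply: every output element is an independent
# dot product, so all of them can be computed in parallel.
a = np.random.rand(512, 512).astype(np.float32)
b = np.random.rand(512, 512).astype(np.float32)
c = a @ b

print(c.shape)  # (512, 512)
```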
How do I select a GPU instance model?
Select an instance model based on your specific use case:
- AI training: GN10Xp, GN10X, GT4, GN8, and GN6/GN6S
- AI inference: GN7, GN10Xp, GN10X, PNV4, GI3X, GN6, and GN6S
- Graphics and image processing: GN7vw, GNV4, GNV4v, and GI1
- Scientific computing: GN10Xp, GN10X, GT4, and GI3X
How do I select a driver based on the instance model and scenario?
NVIDIA GPU instance models include both passthrough instances, which are assigned entire physical GPUs, and vGPU instances, which are assigned fractions of a GPU (for example, 1/4 of a GPU).
Passthrough instances can use either the Tesla or the GRID driver (though some models do not support the GRID driver) to accelerate computing across diverse scenarios.
vGPU instances can only use specific versions of the GRID driver to accelerate computing tasks.
Does Cloud GPU Service allow you to adjust the configuration of an instance?
GPU instance models such as PNV4, GT4, GN10X, GN10Xp, GN6, GN6S, GN7, GN8, GNV4v, GNV4, GN7vw, and GI1 within the same instance family allow for instance configuration adjustments. However, GI3X does not support instance configuration adjustment.
What should I do if the resources are sold out when I purchase an instance?
You may consider the following actions:
- Adjust the region
- Modify the availability zone (AZ)
- Update the resource configuration
If the issue persists, please reach out to us for assistance.
What is the difference between a private IP and a public IP of a GPU instance?
A private IP is an address through which an instance serves clients within the private network; a public IP is an address through which it communicates with clients over the public network. The two can be mapped directly to each other through network address translation (NAT). GPU instances in the same region can communicate over the private network, while those in different regions can only communicate over the public network.
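As a quick illustration of the distinction, Python's standard `ipaddress` module can tell whether an address falls in the reserved private ranges (a generic sketch; the actual addresses assigned to your instances depend on your VPC configuration):

```python
import ipaddress

def classify(addr: str) -> str:
    """Return 'private' or 'public' for an IPv4/IPv6 address string."""
    ip = ipaddress.ip_address(addr)
    return "private" if ip.is_private else "public"

# Typical VPC addresses fall in the RFC 1918 private ranges.
print(classify("10.0.0.8"))      # private
print(classify("192.168.1.20"))  # private
print(classify("8.8.8.8"))       # public
```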
What is an EIP?
An Elastic IP (EIP) is a fixed public IP address designed for dynamic cloud computing environments. It is tied to a specific region. You can swiftly remap an EIP to another GPU instance (or a CVM/NAT gateway instance) within your account, for example to route around a failed instance.
What storage options does Cloud GPU Service offer?
Cloud GPU Service offers a variety of data storage options for GPU instances, including cloud disks, local disks, COS, and block storage device mapping. Each option differs in performance and cost, catering to diverse use cases.
What storage options does a CPM GPU instance offer?
Certain models of CPM GPU instances offer local storage capabilities, while remote storage can be utilized as required.
1. Local storage
Specific CPM GPU instances come equipped with NVMe SSD disks offering exceptional read/write speeds, up to three times the performance of standard models, providing stable support for high-performance computing tasks.
2. Remote storage
CFS: Turbo CFS can be mounted via smart ENI technology, enabling flexible storage capacity expansion and strong consistency across three replicas.
COS: Integrated with GooseFS’s distributed cluster architecture, COS enhances data locality and employs high-speed cache functionality to boost storage performance and enhance bandwidth for writing data to COS.
How do I back up data in a GPU instance?
1. If your GPU instance uses a cloud disk, you can protect your business data by creating a custom system disk image and a snapshot of the data disk.
2. If your GPU instance uses a local disk, you can create a custom system disk image, but you must set up your own backup process for the business data on the data disk. Typically, data backup for GPU instances is done via FTP. For more information on FTP deployment, please contact us.
3. For stronger data security requirements, you can also purchase third-party backup services.
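As a minimal sketch of FTP-based backup (the hostname, credentials, and file paths below are placeholders, not values from any real deployment), Python's standard `ftplib` can upload a data-disk archive to a remote FTP server:

```python
from ftplib import FTP

def upload_backup(host: str, user: str, password: str,
                  local_path: str, remote_name: str) -> None:
    """Upload a local backup archive to an FTP server."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        with open(local_path, "rb") as f:
            # STOR transfers the file in binary mode.
            ftp.storbinary(f"STOR {remote_name}", f)

# Example call (placeholder values):
# upload_backup("ftp.example.com", "backup", "secret",
#               "/data/backup.tar.gz", "backup-2024-01-01.tar.gz")
```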
What is a region?
Regions operate in complete isolation, ensuring optimal cross-region stability and fault tolerance. We will progressively introduce nodes in additional regions to extend coverage. We advise selecting the region nearest to your end users to minimize access latency and enhance download speeds.
How do I select an appropriate region?
Our recommendation is to opt for the region nearest to your users and to align your GPU instances with the same region. This ensures communication takes place over the private network.
- Proximity to your users matters: Opting for a region geographically near your users minimizes their access latency and enhances their access speed. For instance, if the majority of your users are in Southeast Asia, choosing Singapore or Thailand would be optimal.
- Intra-region communication is key: GPU instances within the same region can communicate via a private network at no additional cost. However, if they’re in different regions, communication must occur over the public network, incurring fees. To enable private network communication between instances, you need to ensure they’re located in the same region.
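To compare candidate regions empirically, you can measure the TCP connection time from your location to an endpoint in each region (a generic sketch; the endpoint hostnames in the example are placeholders):

```python
import socket
import time

def tcp_connect_ms(host: str, port: int = 443, timeout: float = 3.0) -> float:
    """Return the TCP handshake time to host:port in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0

# Example (placeholder endpoints): pick the region with the lowest latency.
# for region in ("ap-singapore.example.com", "ap-bangkok.example.com"):
#     print(region, tcp_connect_ms(region))
```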
How are AZs isolated from each other?
Each Availability Zone (AZ) operates on its own separate and physically isolated infrastructure, engineered for high reliability. AZs do not share critical components like power supplies and cooling systems, minimizing vulnerabilities. Furthermore, they are entirely independent of each other, meaning that in the event of a natural disaster like a fire, tornado, or flood, only the affected AZ would experience disruption, while other AZs continue to operate without interruption.
Where can I find more information on security?
Cloud GPU Service offers a range of network and security solutions, including security groups, encrypted login, and EIPs, ensuring the secure, efficient, and unrestricted operation of your instances.
How do I prevent others from viewing my system?
You can control access to your GPU instances by adding them to a security group. You can also customize communication rules between security groups and define which IP subnets are allowed to communicate with your instances.
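Conceptually, a security group is an ordered list of rules matched against a request's source subnet and destination port, with the first match deciding the outcome. A simplified model of that evaluation (a hypothetical rule format for illustration only, not the provider's actual API):

```python
import ipaddress

def is_allowed(rules, src_ip: str, port: int) -> bool:
    """Evaluate rules in order; first match wins, default deny."""
    ip = ipaddress.ip_address(src_ip)
    for action, subnet, ports in rules:
        if ip in ipaddress.ip_network(subnet) and port in ports:
            return action == "allow"
    return False  # no rule matched: deny by default

rules = [
    ("allow", "203.0.113.0/24", {22}),   # SSH only from an office subnet
    ("allow", "0.0.0.0/0", {443}),       # HTTPS from anywhere
]
print(is_allowed(rules, "203.0.113.5", 22))   # True
print(is_allowed(rules, "198.51.100.9", 22))  # False
```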
Does Cloud GPU Service provide images with a preinstalled GPU driver?
Yes. When purchasing a vGPU or rendering instance, you can select a public image with a preinstalled GRID driver from the “Public images” section on the purchase page.
What types of images are available?
Cloud GPU Service offers three types of images: public images, shared images, and custom images.
What is a shared image?
You can share your custom images with other users or use images shared by other users. For the constraints and usage guidelines of shared images, see the “Sharing Custom Images” section.
How many users can I share an image with?
You can share an image with a maximum of 50 users. Shared images are not included in your personal image quota.