Two months after their debut sweeping the MLPerf inference benchmarks, NVIDIA H100 Tensor Core GPUs set world records across enterprise AI workloads in the industry group's latest AI training tests.

Together, the results show that H100 is the best choice for users who demand the utmost performance when building and deploying advanced AI models.

MLPerf is the industry standard for measuring AI performance. It's backed by a broad group that includes Amazon, Arm, Baidu, Google, Harvard University, Intel, Meta, Microsoft, Stanford University and the University of Toronto.

In a related MLPerf benchmark also released today, NVIDIA A100 Tensor Core GPUs raised the bar they set last year in high-performance computing (HPC).

NVIDIA H100 GPUs were up to 6.7x faster than A100 GPUs were when those were first submitted to MLPerf Training.

H100 GPUs (aka Hopper) raised the bar for per-accelerator performance in MLPerf Training. They delivered up to 6.7x more performance than previous-generation GPUs did when those were first submitted to MLPerf Training. By the same comparison, today's A100 GPUs deliver 2.5x more performance, thanks to software advances.

Thanks in part to its Transformer Engine, Hopper excelled in training the popular BERT model for natural language processing. BERT is among the largest and most performance-hungry of the MLPerf AI models.

MLPerf gives users the confidence to make informed buying decisions because the benchmarks cover today's most popular AI workloads: computer vision, natural language processing, recommender systems, reinforcement learning and more. The tests are peer-reviewed, so users can trust their results.

A100 GPUs hit a new high in HPC

In the separate suite of MLPerf HPC benchmarks, A100 GPUs swept all tests of training AI models in demanding scientific workloads run on supercomputers. The results show the ability of the NVIDIA AI platform to scale to the world's toughest technical challenges.

For example, A100 GPUs trained AI models in the CosmoFlow test for astrophysics 9x faster than the best results two years ago in the first round of MLPerf HPC. In that same workload, the A100 also delivered up to 66x more throughput per chip than an alternative offering.

The HPC benchmarks train models for work in astrophysics, weather forecasting and molecular dynamics. These are among many technical fields, like drug discovery, that are adopting AI to advance science.

A100 leads in MLPerf HPC
In tests around the globe, A100 GPUs came out on top in both training speed and throughput.

Supercomputing centers in Asia, Europe and the United States participated in the latest round of MLPerf HPC tests. In its debut on the DeepCAM benchmarks, Dell Technologies showed strong results using NVIDIA A100 GPUs.

An unprecedented ecosystem

In the enterprise AI training benchmarks, a total of 11 companies, including the Microsoft Azure cloud service, made submissions using NVIDIA A100, A30 and A40 GPUs. System makers including ASUS, Dell Technologies, Fujitsu, GIGABYTE, Hewlett Packard Enterprise, Lenovo and Supermicro used a total of nine NVIDIA-Certified Systems for their submissions.

In the latest round, at least three companies joined NVIDIA in submitting results on all eight MLPerf training workloads. That versatility is important because real-world applications often require a suite of diverse AI models.

NVIDIA partners participate in MLPerf because they know it is a valuable tool for customers evaluating AI platforms and vendors.

Under the hood

The NVIDIA AI platform provides a full stack from chips to systems, software and services. That enables continuous performance improvements over time.

For example, submissions in the latest HPC tests applied a suite of software optimizations and techniques described in a technical article. Together, they cut the runtime of one benchmark by about 5x, from 101 minutes to just 22 minutes.
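As a quick check on those runtime figures, the short Python sketch below computes the speedup factor and the share of runtime saved from the two numbers quoted above (101 minutes and 22 minutes). The function names are illustrative only and do not come from any MLPerf or NVIDIA tooling.

```python
def speedup(before_minutes: float, after_minutes: float) -> float:
    """Return how many times faster the new run is than the old one."""
    if before_minutes <= 0 or after_minutes <= 0:
        raise ValueError("run times must be positive")
    return before_minutes / after_minutes


def time_saved_fraction(before_minutes: float, after_minutes: float) -> float:
    """Return the fraction of the original runtime that was eliminated."""
    if before_minutes <= 0 or after_minutes <= 0:
        raise ValueError("run times must be positive")
    return 1.0 - after_minutes / before_minutes


if __name__ == "__main__":
    # Figures quoted in the article: 101 minutes before, 22 minutes after.
    before, after = 101.0, 22.0
    print(f"speedup: {speedup(before, after):.2f}x")
    print(f"runtime saved: {time_saved_fraction(before, after):.1%}")
```

The ratio works out to roughly 4.6x, which the article rounds to 5x. Put another way, the optimized run needs a little over a fifth of the original time.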

A second article describes how NVIDIA optimized its platform for enterprise AI benchmarks. For example, we used NVIDIA DALI to efficiently load and pre-process data for a computer vision benchmark.

All the software used in the tests is available from the MLPerf repository, so anyone can get these world-class results. NVIDIA continuously folds these optimizations into containers available on NGC, a software hub for GPU applications.
