Eblogtip.com

Hey Presto! Nvidia pulls software hack out of AI hat and doubles performance of H100 GPU for free

  • September 11, 2023

Nvidia is teaming up with a list of tech partners on a piece of software that’s set to double the performance of its flagship H100 Tensor Core GPUs on large language models.

The open source TensorRT-LLM update, which is set for release in the coming weeks, lets an H100 running the new software outperform the A100 eightfold, whereas H100s previously outperformed the A100 by just fourfold. This was tested with GPT-J 6B on a task that involves summarising articles from CNN and Daily Mail.

When tested on Meta’s Llama 2 LLM, TensorRT-LLM-powered H100s outperformed A100s by 4.6 times – versus 2.6 times before the update.

Nvidia H100s faster than ever

The versatility and dynamism of large language models (LLMs) can make it difficult to batch requests and execute them in parallel, which means some requests finish much earlier than others.

To solve this, Nvidia and its partners embedded TensorRT-LLM with a more powerful scheduling technique called in-flight batching. This takes advantage of the fact that text generation can be broken down into multiple subtasks.

Put simply, instead of waiting for every request in a batch to finish before moving on, the system removes finished requests from the batch and starts processing new ones while the others are still in flight.
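To make the scheduling idea concrete, here is a minimal sketch in plain Python. It is a toy simulation, not TensorRT-LLM code: the function names, the request lengths, and the rule that every active request produces one token per step are all assumptions made for illustration.

```python
from collections import deque


def static_batching(lengths, batch_size):
    """Decode steps needed when each batch runs until its longest request ends."""
    steps = 0
    for start in range(0, len(lengths), batch_size):
        # The whole batch is held until its slowest request has finished.
        steps += max(lengths[start:start + batch_size])
    return steps


def inflight_batching(lengths, batch_size):
    """Decode steps needed when a finished request frees its slot immediately."""
    waiting = deque(lengths)
    active = []  # tokens still to generate for each request in flight
    steps = 0
    while waiting or active:
        # Fill any free slots with waiting requests.
        while waiting and len(active) < batch_size:
            active.append(waiting.popleft())
        # One decode step: every active request generates one token.
        active = [remaining - 1 for remaining in active]
        steps += 1
        # Evict finished requests so new ones can join on the next step.
        active = [remaining for remaining in active if remaining > 0]
    return steps


# Hypothetical output lengths (in tokens) for eight requests.
lengths = [2, 8, 3, 8, 2, 2, 3, 2]
static_steps = static_batching(lengths, batch_size=4)
inflight_steps = inflight_batching(lengths, batch_size=4)
print(static_steps, inflight_steps)  # prints "11 8"
```

With these made-up lengths, the static scheduler needs 11 steps because short requests sit idle behind long ones, while the in-flight scheduler finishes in 8. The real gain depends on how much request lengths vary.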

TensorRT-LLM comprises the TensorRT deep learning compiler, optimized kernels, pre-processing and post-processing steps, and multi-GPU and multi-node communication primitives.

The result? Groundbreaking performance on Nvidia’s GPUs, paving the way for new large language model experimentation, quick customization, and peak performance.

This software uses tensor parallelism, in which individual weight matrices are split across devices. This allows efficient inference at scale, with each model running in parallel across multiple GPUs and multiple servers.
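The idea of splitting a weight matrix can be sketched in a few lines of plain Python. This is an illustration only and does not use any TensorRT-LLM API: the column-wise split, the helper names, and the tiny matrix are assumptions, and the "devices" are simulated with ordinary lists.

```python
def matvec(x, w):
    """Multiply a vector x (length n) by a matrix w (n rows, m columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def split_columns(w, parts):
    """Split w column-wise into equal shards (assumes columns divide evenly)."""
    size = len(w[0]) // parts
    return [[row[p * size:(p + 1) * size] for row in w] for p in range(parts)]


def tensor_parallel_matvec(x, w, parts):
    """Compute x @ w shard by shard, as if each shard lived on its own GPU."""
    shards = split_columns(w, parts)
    # On real hardware these products would run at the same time on separate devices.
    partials = [matvec(x, shard) for shard in shards]
    # Concatenating the partial outputs stands in for the gather step between devices.
    return [value for part in partials for value in part]


x = [1, 2, 3]
w = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
full = matvec(x, w)
sharded = tensor_parallel_matvec(x, w, parts=2)
print(full, sharded)  # both are [38, 44, 50, 56]
```

Each simulated device holds only half of the weight matrix, yet the combined result matches the single-device product, which is what lets a model too large for one GPU be served across several.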

TensorRT-LLM also includes fully optimized and ready-to-run versions of popular LLMs including Llama 2, GPT-2 and GPT-3, as well as Falcon, Mosaic MPT, BLOOM, and dozens of others. These can be accessed through a Python API.

The update is available in early access, and will soon be integrated into the Nvidia NeMo framework, which is part of Nvidia AI Enterprise. Researchers can access this through the NeMo framework, the NGC portal, or through the source repository on GitHub.
