Microsoft and Nvidia Build 530 Billion Parameter Language Model


Nvidia and Microsoft have joined forces to create the Megatron-Turing Natural Language Generation (MT-NLG) model, which the two companies claim is the “most powerful monolithic transformer language model trained to date”.

The AI model has 530 billion parameters and 105 layers, and runs on supercomputer-class hardware such as Nvidia's Selene. By comparison, the vaunted GPT-3 has 175 billion parameters.

“Each model replica spans 280 NVIDIA A100 GPUs, with 8-way tensor-slicing within a node, and 35-way pipeline parallelism across nodes,” the pair said in a blog post.
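The figures in the quote fit together: with 8-way tensor slicing inside each node and a 35-way pipeline across nodes, each model replica occupies 8 × 35 = 280 GPUs. A minimal sketch of that arithmetic (the function name is illustrative, not from Microsoft's or Nvidia's code):

```python
def gpus_per_replica(tensor_parallel: int, pipeline_parallel: int) -> int:
    """Each of the pipeline stages is sharded tensor-wise across GPUs,
    so the GPU count per model replica is the product of the two
    parallelism degrees."""
    return tensor_parallel * pipeline_parallel

# Degrees quoted in the blog post: 8-way tensor slicing, 35-way pipeline.
print(gpus_per_replica(8, 35))  # → 280, matching the quoted replica size
```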

The model was trained on 15 datasets totaling 339 billion tokens, and the work demonstrated how larger models need less training to perform well.

To read more: ZDNet

