Nvidia has announced the launch of TensorRT 8, claiming that the AI software is twice as accurate and powerful as its predecessor, and that inference time for language queries has been cut in half.
Siddharth Sharma, head of product marketing for Nvidia’s AI software, said, “TensorRT 8 is twice as powerful as 7, twice as accurate as TensorRT 7, and it supports sparsity, which can dramatically reduce the amount of compute and memory needed for running applications. With this achievement, you can now deploy the entire BERT-Large within a millisecond. That is huge and I believe that is going to lead to a completely new generation of conversational AI applications. A level of smartness, a level of latency that was unheard of before.”
To read more: ZDNet