By ET Bureau - August 20, 2023
Arthur, an AI startup based in New York City, has announced the release of Arthur Bench, an open-source tool for evaluating and comparing the performance of large language models (LLMs) such as OpenAI’s GPT-3.5 Turbo and Meta’s LLaMA 2.
Companies can use Arthur Bench to evaluate how various language models perform in their particular use cases. It offers metrics for evaluating models’ accuracy, readability, hedging, and other attributes.
Arthur provides a set of starter criteria for comparing LLM performance, and because the tool is open source, businesses can add their own criteria to suit their requirements.