Alibaba’s DAMO Academy, the group’s global research program, has had another major breakthrough in the machine-reading capabilities that underpin success in artificial intelligence.
On March 3, DAMO’s Natural Language Processing (NLP) model topped the GLUE benchmark rankings, an industry table widely regarded as the most important baseline test for NLP models. Alibaba’s model also significantly outperformed human baselines, marking a key milestone in the development of robust natural language understanding systems.
DAMO’s existing model has already been deployed widely across Alibaba’s ecosystem, powering its customer-service AI chatbot and the search engine on Alibaba’s retail platforms, as well as anonymous healthcare data analysis. Centers for disease control (CDCs) in several Chinese cities have used the model for text analysis of medical records and for epidemiological investigation in the fight against COVID-19.
“We are excited to achieve a new breakthrough in driving research of the NLP development,” said Si Luo, head of NLP Research at Alibaba DAMO Academy. “Not only is NLP a core technology underpinning Alibaba’s various businesses, which serve hundreds of millions of customers, but it also becomes a critical technology now in fighting the coronavirus. We hope we can continue to leverage our leading technologies and contribute to the community during this difficult time.”
General Language Understanding Evaluation (GLUE), a platform for evaluating and analyzing NLP systems, attracts submissions every year from key global AI players, including Google, Facebook, Microsoft and Stanford. Alibaba’s multitask machine-learning model, StructBERT, builds on the pre-trained language model BERT while also incorporating word and sentence structures. It delivers impressive empirical results on a variety of downstream tasks, achieving a GLUE benchmark score of 90.3 and outperforming the human baseline of 87.1. It also boosts performance in many language-understanding applications, such as sentiment analysis, textual entailment, and question-answering.
This is not the first time Alibaba’s machine-learning model has topped the rankings. On June 20, 2019, Alibaba’s model bested human scores on the Microsoft Machine Reading Comprehension (MS MARCO) dataset, one of the artificial intelligence world’s most challenging tests for reading comprehension. The model scored 0.54 in the MS MARCO question-answering task, outperforming the human score of 0.539, a benchmark provided by Microsoft. In 2018, Alibaba also scored higher than the human benchmark on the Stanford Question Answering Dataset (SQuAD), another of the most popular machine reading-comprehension challenges worldwide.
In recent months, Alibaba has leveraged its proprietary technologies to help contain the coronavirus. Alibaba DAMO Academy has teamed up with Chinese medical institutions to develop an AI system that can expedite diagnosis and analysis of the virus. In February, Alibaba Cloud made its cloud-based, AI-powered computing platform available for free to global research institutions to accelerate viral gene-sequencing, protein-screening and other research into treating or preventing the virus.