Microsoft is doubling down on multilingual large language models - and Europe stands to benefit the most

"We have learnt that, basically, one needs to record several hundred hours of people, speaking a particular language in order to support the multi-modal capability of AI."

"It's important to underscore that all of this work is designed to donate more data so that others can use it."

Microsoft is expanding multilingual LLMs in a new partnership across Europe. Currently, Europe's 24 official and 250 indigenous languages are underrepresented in web content for LLM training. To address this, Microsoft will collaborate with Hugging Face to provide multilingual data from GitHub. A call for applications will be issued to support content development in 10 underrepresented languages. Initiatives will include recording audio, creating digitized content, and supporting research. This data will be publicly accessible to European citizens, aiming to improve AI capabilities in these languages.

#microsoft #multilingual #language-models #artificial-intelligence #europe

Read at IT Pro

Unable to calculate read time

Collection

[

...

]

Microsoft is doubling down on multilingual large language models - and Europe stands to benefit the mostMicrosoft is doubling down on multilingual large language models - and Europe stands to benefit the most Briefly

Microsoft is doubling down on multilingual large language models - and Europe stands to benefit the most
Microsoft is doubling down on multilingual large language models - and Europe stands to benefit the most
Briefly