A new Arabic artificial intelligence large language model developed in Abu Dhabi has been unveiled, aiming to bring one of the world’s most widely spoken languages into the artificial intelligence mainstream.
Jais is an open-source Arabic-English bilingual model developed by Inception, a division of Abu Dhabi artificial intelligence company G42, together with Mohamed bin Zayed University of Artificial Intelligence and Silicon Valley-based Cerebras Systems.
The developers say that Jais is more accurate than other existing Arabic large language models (LLMs). It can be downloaded from the machine learning platform Hugging Face.
Inception chief executive Andrew Jackson said the launch of Jais was a further step in encouraging the science and computing community to focus more on non-English LLMs, similar to efforts in Japan and India.
“We see Jais becoming really useful in generative use cases, such as generating answers to questions, generating documents, translations, emails, and even providing comments and suggestions,” he said.
The company says Jais captures the linguistic nuances of various Arabic dialects and can understand linguistic, contextual and cultural references, “making it more accurate and contextually relevant than other models”.
Jais is named after Jebel Jais, the highest peak in the UAE, in Ras Al Khaimah, and was developed for government as well as the financial, energy, climate and healthcare industries.
Several public and private organizations in the UAE have signed up as launch partners for Jais, including the Ministry of Foreign Affairs, Ministry of Industry and Advanced Technology, Abu Dhabi Ministry of Health, ADNOC, Etihad Airways, FAB and e&, the technology group formerly known as Etisalat.
Jais was trained on Condor Galaxy, “the world’s largest artificial intelligence supercomputer” launched by G42 and Cerebras in July, using 116 billion Arabic tokens and 279 billion English tokens. The companies said the model is being expanded as more Arabic content is collected to generate new instruction sets.
Tokens are the building blocks of an LLM: the basic units of text or code that the model uses to process and generate language.
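To make the idea of a token concrete, here is a minimal Python sketch. It is not the tokenizer used by Jais: the `toy_tokenize` and `arabic_share` helpers are hypothetical names invented for illustration, and the regular expression simply splits text into words and punctuation, whereas production LLMs typically rely on learned subword vocabularies. The only figures taken from the article are the reported token counts.

```python
import re

# Token counts reported in the article, in billions.
ARABIC_TOKENS_B = 116
ENGLISH_TOKENS_B = 279


def toy_tokenize(text: str) -> list[str]:
    """Split text into word and punctuation tokens.

    Illustrative assumption only: real LLM tokenizers use learned
    subword units, not a regular expression like this one.
    """
    return re.findall(r"\w+|[^\w\s]", text)


def arabic_share(arabic_b: float, english_b: float) -> float:
    """Return the Arabic fraction of the combined token count."""
    return arabic_b / (arabic_b + english_b)


if __name__ == "__main__":
    print(toy_tokenize("Jais is bilingual."))
    print(toy_tokenize("مرحبا بالعالم"))
    share = arabic_share(ARABIC_TOKENS_B, ENGLISH_TOKENS_B)
    print(f"Arabic share of training tokens: {share:.1%}")
```

On the reported figures, Arabic accounts for 116 of 395 billion tokens, or about 29.4 per cent of the training mix.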
According to WorldData, Arabic is one of the most widely spoken languages in the world, with more than 400 million speakers. It is an official language in 22 countries and is also partially spoken in 11 other countries. However, its online presence is minimal, with about 1% of Arabic content available online, according to data provided by the companies.
Mr Jackson said Jais would help boost that figure.
“We are spearheading a program to collect more Arabic data from offline sources. So this has officially begun and is the first approach we will take to promote the Arabic language,” Mr Jackson said.
“We’re also looking at new ways to synthesize Arabic, translate existing English into Arabic and improve Arabic conversions… We still have a long way to go, but I think we have to be very optimistic and really push forward.”
Organizations have long used artificial intelligence, but the emergence of generative AI, popularized by Microsoft-backed OpenAI’s ChatGPT, has given the technology significant momentum.
Overall, it has opened up a new battlefield in the tech world, with companies racing to get a head start and expand their scope in generative artificial intelligence.
The availability of LLMs will aid companies in their efforts, especially as developers continue to improve their artificial intelligence capabilities.
“Speed performance is important to developers, not only because it allows them to bring new models into the community, production, or market more quickly, but also because it allows data scientists and machine learning researchers to quickly propose and iterate on different models,” said Mr Jackson.
Updated: August 30, 2023 7:29 am