Beyond Microsoft-G42: How data is driving UAE life-saving LLMs and predictive AI



While Microsoft's $1.5 billion investment in UAE artificial intelligence company G42 took centre stage in April, it is earlier data exchanges and partnerships that demonstrate its significance and are accelerating the Gulf nation’s rapid AI growth.

Established as the UAE’s premier AI solutions authority in 2018, the Abu Dhabi-based entity has poured billions into research and development to build the foundation of an AI global capital.

This was recently exemplified by its partnership with American AI training company Cerebras Systems, which began last year and led to the July announcement of what is being called the largest global network of AI supercomputers, Condor Galaxy.

Together, the nine cloud-connected supercomputers would reduce the time it takes to train coveted large language models (LLMs), but by how much is yet to be determined. The goal is to generate AI solutions and commercial products in energy, climate change and health care, among others, at rates beyond present capacity.

G42’s tech elite operate as its own Magnificent Seven, which includes its health-tech arm M42 and Core42, to drive enterprise and commercial business with AI at scale.

But before all that came its oldest unit, Bayanat. Abu Dhabi’s geospatial data products and services provider had been collecting data for about 50 years before joining the AI company in 2020.

Over the decades, Bayanat has trained its AI to visualise regional environments and detect specific events such as oil spills during their early stages, within minutes of receiving satellite data (which can take hours or days to transmit). It can also monitor floods to help prevent them from wreaking havoc on homes and infrastructure. The flood monitoring was used as recently as April, when the UAE faced its heaviest rainfall in nearly 75 years.

Its AI-enabled disaster management platform (AID) provides data and analytics to first responders to identify which locations need priority attention. With the advent of LLMs, it is now exploring the integration of social media data to draw on more precise information and dispel rumours and misinformation that may cause further delays.

Data inflation

LLMs, first popularised in the mainstream with ChatGPT in late 2022, are changing the game of data acquisition and utilisation to create never-thought-possible solutions, and with precision.

In the UAE, global partnerships are leading to the creation of localised data centres such as Khazna. On Wednesday, during the India-UAE Business Forum in Mumbai, G42 announced that it will soon launch its Hindi LLM, named Nanda, in partnership with Cerebras Systems and the Mohamed bin Zayed University of Artificial Intelligence. It is a 13-billion-parameter model intended to help increase the accuracy of LLM predictions and will draw on about 2.13 trillion tokens of language data sets.

Bayanat, which grew from a military surveying department before integrating with G42 four years ago, is one of the country's early adopters of AI-enabled object detection to perform better land deformation monitoring services, predict weather changes, and get a jump start on the future of mass data utilisation.

Currently, its weather detection mainly uses meteorological data provided by satellites discerning surface observations such as river gauges and wind patterns.

In the future, data scientists at Bayanat, which is set to merge with Mubadala-owned Yahsat and its fleet of satellites to form Space42, said they’ll be able to incorporate more sources by expanding their observation tools and integration. This will be enhanced by their joint endeavour from August, which included the launch of the UAE's first Synthetic Aperture Radar satellite into low Earth orbit to improve detection capabilities.

Bayanat and Yahsat launch the UAE’s first Synthetic Aperture Radar satellites as part of its Earth observation space programme on August 16, 2024. Photo: Bayanat

The possibilities include monitoring what's happening in the background of an Instagram video at a nearby location tag, incorporating messages issued by first responders on the ground, or being able to translate speech-to-data from a person sharing content on TikTok. Altogether, this could one day be used to provide better information about what's happening during a natural disaster and at speeds exponentially faster than now.

This could mean more lives saved, and less damage to properties and roads, according to Bayanat. In addition to being more efficient, the complex data can be more relatable, with the data presented in a way that people can understand, and use.

“It’s not just about throwing data at people,” said Prashanth Marpu, vice president of research and development at Bayanat.

“We need to give insights in a way that people understand on the ground and can respond to the information they’re getting,” he added. This includes translating information and guidance in the form of reports, video summaries, and PDF documents that are more accessible and easier to comprehend.

Dr Marpu is doing so using AID, Bayanat’s AI-enabled disaster management platform first announced in late 2023. It is one of many products created from the Geospatial Artificial Intelligence Solutions (gIQ) platform, developed in partnership with the UAE Space Agency.

In its public interface, users can access its open-source data to conduct their own AI-powered geospatial analysis. Organisations are granted greater access upon approval and can download and integrate applications and analysis models into their own systems to provide tailored solutions.

Dr Prashanth Marpu of Bayanat. Victor Besa / The National

What makes geospatial LLMs so special?

Imagine for a moment that you need to travel from one place to another, and it’s raining at your destination. Ideally, you would want to know the most advisable route to take.

A standard LLM, which can understand and generate human language based on large amounts of text, wouldn’t be able to help in this particular scenario simply due to the volume of image processing and remote sensing data required.

A geospatial LLM, however, can draw on a weather forecast, readily available satellite images of the desired journey, and real-time road congestion data. By combining more sources, it can generate a more accurate picture of the best possible route to take.
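To make the idea of combining sources concrete, here is a minimal sketch of how forecast, satellite and traffic signals could be folded into a single route score. It is an illustration only, not how AID or any Bayanat product works: the `Route` fields, the weights and the sample values are all assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Route:
    """Hypothetical route record; every field is an assumed input."""
    name: str
    travel_minutes: float    # baseline driving time
    rain_mm_per_hour: float  # forecast rainfall along the route
    congestion: float        # 0.0 (free-flowing) to 1.0 (gridlock)
    flooded_segments: int    # segments flagged as flooded in satellite imagery


def route_cost(route, rain_weight=2.0, congestion_weight=30.0, flood_penalty=60.0):
    """Combine all sources into one cost in 'equivalent minutes' (lower is better).

    The weights are arbitrary illustrative values, not calibrated figures.
    """
    return (
        route.travel_minutes
        + rain_weight * route.rain_mm_per_hour
        + congestion_weight * route.congestion
        + flood_penalty * route.flooded_segments
    )


def best_route(routes):
    """Pick the route with the lowest combined cost."""
    return min(routes, key=route_cost)


# Made-up sample data: the shorter road is wetter and partly flooded.
routes = [
    Route("coastal road", 35, 12.0, 0.25, 1),
    Route("inland highway", 45, 4.0, 0.5, 0),
]
print(best_route(routes).name)
```

With these sample numbers the longer inland route wins, because the flood flag and heavier rain outweigh the coastal road's shorter driving time. Travel time alone would have pointed the other way.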

It’s that sort of accurate solution, provided by platforms like AID, that is slowly but surely becoming a game-changer for first responders trying to save lives and prevent catastrophe during a severe weather event, said Dr Marpu.

Gravity of historical data

The openly sourced data needed to make all this work comes from partnerships with organisations such as the United States Geological Survey (USGS), the EU Copernicus Programme, as well as data sets compiled by Bayanat, among others.

“We are working on relations with other satellite data providers who have access to other historical data as well,” he said.

In terms of building satellites that will increase the effectiveness of AID, Bayanat is also working with ICEYE, a Finnish space and microsatellite company.

That collaboration, according to Dr Marpu, would also help Bayanat with synthetic aperture radar (SAR) data, which is critical to obtaining images from space amid cloud cover and other elements that might hinder the image quality so crucial during extreme weather events and natural disasters.

That data, in turn, allows Bayanat to determine whether certain weather events are anomalies or the result of something bigger, like climate change.

“We have to monitor all of this carefully because it’s changing fast,” he said. “Seasonality of rainfall is changing, seasonality of temperature peaks are changing … we have to understand all those things to be ready,” he added, before turning to how the data might be used after an extreme weather event.

“It’s important to look at the damage assessments and that you’re able to make recommendations for future resilience.”

For non-weather disasters, such as the deadly 2023 Turkey-Syria earthquake, Bayanat was able to apply its geospatial analytics know-how to assist search and rescue teams.

“We tasked the satellite data, ran our automated change detection algorithms we have on our platform … and we quickly uploaded a report showing damage in certain regions,” Dr Marpu said, noting that first responders were hypothetically able to use the data quickly during search and rescue.
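The article does not describe Bayanat's change detection algorithms. As a rough illustration of the general idea, the sketch below uses simple image differencing: compare a "before" and an "after" brightness grid cell by cell and flag cells that changed by more than a threshold. The function name, the threshold and the tiny sample grids are assumptions made for this example, and production systems working with SAR or optical imagery are far more sophisticated.

```python
def detect_changes(before, after, threshold=0.3):
    """Return (row, col) cells whose brightness changed by more than threshold.

    `before` and `after` are equally sized grids (lists of rows) of values
    between 0.0 and 1.0. This is plain image differencing, used here only
    as an illustrative stand-in for a real change detection pipeline.
    """
    changed = []
    for r, (row_before, row_after) in enumerate(zip(before, after)):
        for c, (b, a) in enumerate(zip(row_before, row_after)):
            if abs(a - b) > threshold:
                changed.append((r, c))
    return changed


# Made-up 3x3 grids: two cells darken sharply between the two passes.
before = [
    [0.8, 0.8, 0.2],
    [0.8, 0.7, 0.2],
    [0.1, 0.1, 0.1],
]
after = [
    [0.8, 0.3, 0.2],
    [0.2, 0.7, 0.2],
    [0.1, 0.1, 0.1],
]
print(detect_changes(before, after))
```

The flagged cells are the candidates an analyst or a downstream model would inspect first. Cells that show no change, as with the dam described in this article, can be deprioritised.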

It also followed up on news articles, videos and images from social media purporting to show deep cracks on a dam in the region around the Turkish city of Kahramanmaraş which called for immediate attention.

“We could verify that there was no damage there,” he said. “In the age of social media, false rumours fly very easily during an earthquake.”

Instead, using satellite data for fast detection, Bayanat was able to shift its focus to a nearby village which suffered significant damage but received little to no social media attention.

Bayanat's AI-enabled disaster management platform verifies that a dam reported to have suffered major damage was intact after the impact of the Syria and Turkey earthquakes in February 2023. Photo: Bayanat

Sourcing critical data

Decades of historical data are needed to make sense of the geospatial findings now being collected and analysed, which in turn requires an equally monumental ability to process and integrate the sheer magnitude of information.

Bayanat has been deploying its cloud-agnostic gIQ platform on Microsoft’s Azure cloud computing platform since late 2023 to achieve greater capability and process the fast-growing volumes of data.

Microsoft identifies this torrent of data as a limiting factor in utilising the information effectively.

“A lot of problems require a significant amount of computational power,” said Juan Lavista Ferres, vice president and chief data scientist of AI at Microsoft.

He said that Microsoft has ample experience harnessing that level of computing power in the cloud for organisations.

“In the case of satellite data, we are working with several UN agencies trying to build a map of every single structure around the world and to do that, we’re working with Planet Labs,” Mr Lavista Ferres said, referring to the California-based satellite imagery and earth data analytics company.

“They have more than 200 satellites that take a picture every single day of every single square metre in the world … using that data a human would take 400 years to go through just one day’s worth of data, just looking at that a picture every 30 seconds would take 400 years to go through it all,” he said. “You need the cloud to expedite this,” he added.
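The quoted figures can be turned into a rough image count with simple arithmetic. The sketch below assumes a reviewer who works around the clock with no breaks and a year of 365.25 days. Both are assumptions made for illustration, and the result is a back-of-the-envelope estimate rather than a figure from Microsoft or Planet Labs.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # 31,557,600 seconds (assumed year length)


def images_reviewed(years, seconds_per_image):
    """Number of images one non-stop reviewer could look at in `years`."""
    return int(years * SECONDS_PER_YEAR // seconds_per_image)


# Figures from the quote: one picture every 30 seconds, for 400 years.
print(images_reviewed(400, 30))
```

Under those assumptions, 400 years at one picture every 30 seconds comes to roughly 421 million images for a single day of satellite coverage.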

Mr Lavista Ferres said that while existing meteorological models work well for basic weather forecasting, machine learning has made it possible to increase the potential of those models and, in turn, improve accuracy.

“The world is very complex,” he said. “There’s no way these existing meteorology models can adapt to every single geography … a lot of the models might work well but they have a significant amount of assumptions.”

The machine-learning approach, he said, makes it possible to observe the earth on several granular levels with incredibly large amounts of data.

“What we have seen now in a significant amount of studies is that these models are performing better in some cases than the regular models,” he said.

Mapping out possibilities

For Microsoft, all this research combining AI, machine learning, meteorology and computing power is taking place in an entity known as the AI For Good Lab, where geospatial advancements are applied to various health and environmental issues.

Similar to Bayanat, Microsoft’s research has yielded results for first responders and offered a glimpse into how the future of disaster preparedness might be improved.

Although it may seem like an unorthodox area of focus, AI and geospatial technology are at the centre of finding out exactly where humans live in the world, according to Mr Lavista Ferres.

“If you don’t know where people live, you cannot help them if there is a disaster,” he said, noting that Afghanistan’s deadly 2023 earthquake, which claimed the lives of more than 1,300 people, proved that there’s definitely a need to use AI to expedite the process of finding people.

“We put a satellite on top of the disaster, take pictures and download the pictures and run disaster assessment maps,” he said, pointing specifically to the Afghanistan earthquake.

“The area was huge and it would’ve normally taken thousands of humans looking at the satellite data to figure out where people live and what was affected and what wasn’t,” he added. “But with the compute power it’s quite fast, and the teams on the ground that need the maps can become aware about who is affected and where they’re affected … these models do it all within hours, that’s impossible for humans to do.”

Expediting these mission-critical tasks, which organisations like the Red Cross and UNHCR often rely on, is an example of where AI is not just a solution but the only solution, and one that would not have been possible just several years ago, according to Mr Lavista Ferres.

“We could not solve this using just humans,” he said.

As satellite technology advances and the ability to take and retrieve photos of the world daily becomes more commonplace, Mr Lavista Ferres said that identifying deforestation and illegal mining will be easier than before, while disaster preparedness and disaster response will also improve. Those solutions, he said, just scratch the surface.

AI can also help monitor sounds in nature, and therefore, potentially diagnose animal and forest health.

That research is under way through something Microsoft calls Project Guacamaya, which is using AI to identify nature sounds in the Amazon rainforest, and in turn, map and monitor environmental health.

“This is a very powerful use of AI because you don’t have a lot of people in the world who can distinguish these animal and bird sounds,” he said. “A lot of these recordings would take a huge amount of time for humans to listen to and recognise patterns.”

Meanwhile, back in the UAE, Bayanat is moving full steam ahead with its AID platform, hoping to increase its technological presence in the region while also providing a much-needed service for those who rely on information to make what are sometimes life-and-death decisions depending on the weather.

That sort of technological advancement could prove to be paramount, especially with the Middle East and North Africa fast becoming one of the world’s most vulnerable regions to the impacts of climate change, according to the World Bank.

A region studying how to function under conditions of extreme desertification is well placed to create effective response models, and that understanding is something Microsoft and the rest of the world are eager to benefit from.

“Those models have to be locally trained for local conditions, and that’s something we’re looking forward to … especially for disaster management applications,” said Mr Lavista Ferres.

Updated: September 15, 2024, 4:05 PM