Revealed: Israeli military creating ChatGPT-like tool using vast collection of Palestinian surveillance data


Israel’s military surveillance agency has used a vast collection of intercepted Palestinian communications to build a powerful artificial intelligence tool similar to ChatGPT that it hopes will transform its spying capabilities, an investigation by the Guardian can reveal.

The joint investigation with Israeli-Palestinian publication +972 Magazine and Hebrew-language outlet Local Call has found Unit 8200 trained the AI model to understand spoken Arabic using large volumes of telephone conversations and text messages, obtained through its extensive surveillance of the occupied territories.

According to sources familiar with the project, the unit began building the model to create a sophisticated chatbot-like tool capable of answering questions about people it is monitoring and providing insights into the massive volumes of surveillance data it collects.

The elite eavesdropping agency, comparable in its capabilities with the US National Security Agency (NSA), accelerated its development of the system after the start of the war in Gaza in October 2023. The model was still being trained in the second half of last year. It is not clear whether it has yet been deployed.

The efforts to build the large language model (LLM) – a deep learning system that generates human-like text – were partially revealed in a little-noticed public talk by a former military intelligence technologist who said he oversaw the project.

“We tried to create the largest dataset possible [and] collect all the data the state of Israel has ever had in Arabic,” the former official, Chaked Roger Joseph Sayedoff, told an audience at a military AI conference in Tel Aviv last year. The model, he said, required “psychotic amounts” of data.

Three former intelligence officials with knowledge of the initiative confirmed the LLM’s existence and shared details about its construction. Several other sources described how Unit 8200 used smaller-scale machine learning models in the years before launching the ambitious project – and the effect such technology has already had.

“AI amplifies power,” said a source familiar with the development of Unit 8200’s AI models in recent years. “It’s not just about preventing shooting attacks, I can track human rights activists, monitor Palestinian construction in Area C [of the West Bank]. I have more tools to know what every person in the West Bank is doing.”

Details of the new model’s scale shed light on Unit 8200’s large-scale retention of the content of intercepted communications, enabled by what current and former Israeli and western intelligence officials described as its blanket surveillance of Palestinian telecommunications.

The project also illustrates how Unit 8200, like many spy agencies around the world, is seeking to harness advances in AI to perform complex analytical tasks and make sense of the huge volumes of information they routinely collect, which increasingly defy human processing alone.

A signal intelligence-gathering installation of Unit 8200, an Israeli intelligence corps unit responsible for collecting signal intelligence and code decryption, located on an observation point on the Israeli-Lebanese border near Rosh HaNikra crossing. Photograph: Eddie Gerald/Alamy

But the integration of systems such as LLMs in intelligence analysis has risks as the systems can exacerbate biases and are prone to making mistakes, experts and human rights campaigners have warned. Their opaque nature can also make it difficult to understand how AI-generated conclusions have been reached.

Zach Campbell, a senior surveillance researcher at Human Rights Watch (HRW), expressed alarm that Unit 8200 would use LLMs to make consequential decisions about the lives of Palestinians under military occupation. “It’s a guessing machine,” he said. “And ultimately these guesses can end up being used to incriminate people.”

A spokesperson for the Israel Defence Forces (IDF) declined to answer the Guardian’s questions about the new LLM, but said the military “deploys various intelligence methods to identify and thwart terrorist activity by hostile organisations in the Middle East”.

A vast pool of Arabic-language communications

Unit 8200 has developed an array of AI-powered tools in recent years. Systems such as The Gospel and Lavender were among those rapidly integrated into combat operations in the war in Gaza, playing a significant role in the IDF’s bombardment of the territory by assisting with the identification of potential targets (both people and structures) for lethal strikes.

For almost a decade, the unit has also used AI to analyse the communications it intercepts and stores, using a series of machine learning models to sort information into predefined categories, learn to recognise patterns and make predictions.

After OpenAI released ChatGPT in late 2022, AI experts at Unit 8200 envisaged building a more expansive tool akin to the chatbot. Now one of the world’s most widely used LLMs, ChatGPT is underpinned by a so-called “foundation model”, a general-purpose AI trained on immense volumes of data and capable of responding to complex queries.

Initially, Unit 8200 struggled to build a model on this scale. “We had no clue how to train a foundation model,” said Sayedoff, the former intelligence official, in his presentation. At one stage, the unit sent an unsuccessful request to OpenAI to run ChatGPT on the military’s secure systems (OpenAI declined to comment).

However, when the IDF mobilised hundreds of thousands of reservists in response to the Hamas-led 7 October attacks, a group of officers with expertise in building LLMs returned to the unit from the private sector. Some came from major US tech companies, such as Google, Meta and Microsoft. (Google said the work its employees do as reservists was “not connected” to the company. Meta and Microsoft declined to comment.)

The small team of experts soon began building an LLM that understands Arabic, sources said, but effectively had to start from scratch after finding that existing commercial and open-source Arabic-language models were trained using standard written Arabic – used in formal communications, literature and media – rather than spoken Arabic.

“There are no transcripts of calls or WhatsApp conversations on the internet. It doesn’t exist in the quantity needed to train such a model,” one source said. The challenge, they added, was to “collect all the [spoken Arabic] text the unit has ever had and put it into a centralised place”. They said the model’s training data eventually consisted of approximately 100bn words.

One well-placed source familiar with the project told the Guardian this vast pool of communications included conversations in Lebanese as well as Palestinian dialects. Sayedoff said in his presentation the team building the LLM “focused only on the dialects that hate us”.

An Israeli soldier from Unit 8200 taking part in a cyber defense challenge event in which teams compete in stopping malicious hackers from invading vital infrastructures in a simulation game. Photograph: Eddie Gerald/Alamy

The unit also sought to train the model to understand the specific military terminology of militant groups, sources said. But the massive collection of training data appears to have included large volumes of communications with little or no intelligence value about the everyday lives of Palestinians.

“Someone calling someone and telling them to come outside because they’re waiting for them outside school, that’s just a conversation, that’s not interesting. But for a model like this, it’s gold,” one of the sources said.

AI-facilitated surveillance

Unit 8200 is not alone among spy agencies experimenting with generative AI technology. In the US, the CIA has rolled out a ChatGPT-like tool to sift through open-source information. The UK’s spy agencies are also developing their own LLMs, which they are reportedly training with open-source datasets.

But several former US and UK security officials said Israel’s intelligence community appeared to be taking greater risks than its closest allies when integrating novel AI-based systems into intelligence analysis.

One former western spy chief said Israeli military intelligence’s extensive collection of the content of Palestinian communications allowed it to use AI in ways “that would not be acceptable” among intelligence agencies in countries with stronger oversight over the use of surveillance powers and handling of sensitive personal data.

Campbell, from Human Rights Watch, said using surveillance material to train an AI model was “invasive and incompatible with human rights”, and that as an occupying power Israel is obligated to protect Palestinians’ privacy rights. “We’re talking about highly personal data taken from people who are not suspected of a crime, being used to train a tool that could then help establish suspicion,” he added.

Nadim Nashif, director of 7amleh, a Palestinian digital rights and advocacy group, said Palestinians have “become subjects in Israel’s laboratory to develop these techniques and weaponise AI, all for the purpose of maintaining [an] apartheid and occupation regime where these technologies are being used to dominate a people, to control their lives”.

Several current and former Israeli intelligence officers familiar with smaller-scale machine learning models used by Unit 8200 – precursors to the foundation model – said AI made the blanket surveillance of Palestinians more effective as a form of control, particularly in the West Bank where they said it has contributed to a greater number of arrests.

Two of the sources said the models helped the IDF automatically analyse intercepted phone conversations by identifying Palestinians expressing anger at the occupation or desires to attack soldiers or people living in illegal settlements. One said that when the IDF entered villages in the West Bank, AI would be used to identify people using words it deemed to indicate “troublemaking”.

“It allows us to act on the information of many more people, and this allows control over the population,” a third source said. “When you hold so much information you can use it for whatever purpose you want. And the IDF has very few restraints in this regard.”

‘Mistakes are going to be made’

For a spy agency, the value of a foundation model is that it can take “everything that has ever been collected” and detect “connections and patterns which are difficult for a human to do alone”, said Ori Goshen, co-founder of AI21 Labs. Several of the Israeli firm’s employees worked on the new LLM project while on reserve duty.

But Goshen, who previously served in Unit 8200, added: “These are probabilistic models – you give them a prompt or a question, and they generate something that looks like magic. But often, the answer makes no sense. We call this ‘hallucination.’”

Brianna Rosen, a former White House national security official and now a senior research associate at Oxford University, said that while a ChatGPT-like tool could help an intelligence analyst “detect threats humans might miss, even before they arise, it also risks drawing false connections and faulty conclusions”.

She said it was vital for intelligence agencies using these tools to be able to understand the reasoning behind the answers they produce. “Mistakes are going to be made, and some of those mistakes may have very serious consequences,” she added.

In February, the Associated Press reported AI was likely used by intelligence officers to help select a target in an Israeli airstrike in Gaza in November 2023 that killed four people, including three teenage girls. A message seen by the news agency suggested the airstrike had been conducted by mistake.

The IDF did not respond to the Guardian’s questions about how Unit 8200 ensures its machine learning models, including the new LLM being developed, do not exacerbate inaccuracies and biases. It also would not say how it protects the privacy rights of Palestinians when training models with sensitive personal data.

“Due to the sensitive nature of the information, we cannot elaborate on specific tools, including methods used to process information,” a spokesperson said.

“However, the IDF implements a meticulous process in every use of technological abilities,” they added. “That includes the integral involvement of professional personnel in the intelligence process in order to maximize information and precision to the highest degree.”

Do you have information about this story? Email [email protected], or (using a non-work phone) use Signal or WhatsApp to message +44 7721 857348.