Google Threat Intelligence Group (GTIG) has identified and disrupted model extraction activity targeting Gemini, including one cluster of 100,000+ prompts that attempted to coerce the model into revealing “reasoning” behavior that could be used to replicate capabilities elsewhere.
GTIG described model extraction, also called distillation, as an intellectual property threat where an adversary uses legitimate access to probe a model at scale to collect outputs that can help train a “student” model.
A ‘student’ model refers to a smaller or separate model trained to imitate the target model’s outputs.
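To make the mechanics concrete, the sketch below is a toy illustration of distillation in general, not of the activity GTIG observed: a small “student” network is trained to match the output distributions of a larger “teacher” network, with random inputs standing in for large-scale probing. The model sizes, temperature and training loop are illustrative assumptions.

```python
# Toy knowledge-distillation sketch (illustrative only): a "teacher" network is
# probed for its output distributions and a smaller "student" is trained to
# imitate them.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens the teacher's distribution, a standard distillation trick

for step in range(1000):
    # Stand-in for large-scale probing: random "queries" sent to the teacher.
    queries = torch.randn(64, 32)
    with torch.no_grad():
        teacher_logits = teacher(queries)  # the collected outputs

    student_logits = student(queries)
    # KL divergence between the softened teacher and student distributions.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```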
GTIG said the activity violates Google’s terms and may be subject to takedowns and legal action.
It said it continuously detects and mitigates model extraction attempts to protect proprietary logic and specialized training data, and it described real-time defenses intended to degrade the performance of any resulting student model.
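GTIG did not describe how those defenses work. As a purely hypothetical sketch of one commonly discussed approach, the snippet below flags accounts whose query volume and prompt diversity resemble systematic probing and throttles them; the thresholds, function names and the serve_model stub are invented for illustration, not drawn from Google’s report.

```python
# Hypothetical extraction-detection heuristic (not Google's actual defenses):
# flag accounts whose query volume and prompt diversity look like systematic
# probing, then throttle or degrade their responses.
from collections import defaultdict

QUERY_THRESHOLD = 10_000      # invented threshold: prompts per account per day
DIVERSITY_THRESHOLD = 0.95    # invented threshold: share of distinct prompts

account_queries = defaultdict(list)  # account_id -> list of prompt strings

def record_query(account_id: str, prompt: str) -> None:
    account_queries[account_id].append(prompt)

def looks_like_extraction(account_id: str) -> bool:
    prompts = account_queries[account_id]
    if len(prompts) < QUERY_THRESHOLD:
        return False
    # Systematic probing tends to produce an unusually high ratio of
    # distinct prompts to total prompts.
    diversity = len(set(prompts)) / len(prompts)
    return diversity > DIVERSITY_THRESHOLD

def serve_model(prompt: str) -> str:
    return "model output"  # stub so the sketch is self-contained

def handle_request(account_id: str, prompt: str) -> str:
    record_query(account_id, prompt)
    if looks_like_extraction(account_id):
        return "Request throttled pending review."  # or return a degraded answer
    return serve_model(prompt)
```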
GTIG’s report also placed model extraction alongside a broader pattern it says it tracked in late 2025: government-backed groups integrating generative AI into familiar phases of intrusion work.
AI is accelerating familiar attack work, not creating “new magic”
The group said it observed actors linked to North Korea, Iran, the People’s Republic of China and Russia using Gemini and other AI tools for technical research, targeting, phishing and development tasks. It added that its visibility improved as it identified direct and indirect links between Gemini misuse and activity in the wild.
In examples GTIG outlined, an Iran-linked cluster it tracks as APT42 used AI to research targets and craft scenarios to establish credible pretexts for outreach. A North Korea-linked cluster it tracks as UNC2970 used AI to synthesize open-source information and profile high-value targets for campaign planning.
APT is shorthand for advanced persistent threat, a label commonly used for state-backed or state-aligned groups, while ‘UNC’ denotes an uncategorized cluster, Google’s temporary naming for activity not yet publicly attributed.
GTIG also said a China-linked cluster it tracks as APT31 prompted Gemini with an “expert cybersecurity persona” to automate vulnerability analysis and generate testing plans, including work on exploitation themes such as web application firewall bypass and SQL injection testing.
In summary, GTIG said that generative AI is raising attacker productivity in steps that defenders already recognize. It described the current phase as integration into existing tactics rather than a shift to entirely new categories of threat.
HONESTCUE shows how AI can be pulled into malware workflows
The report also described a malware family it tracks as HONESTCUE that uses the Gemini API during execution to generate C# source code for later-stage functionality.
C# is Microsoft’s programming language for building .NET and Windows software, so GTIG is saying the malware is pulling generated code to power later steps of execution.
GTIG said the fileless secondary stage compiles and runs that code in memory using the legitimate .NET CSharpCodeProvider class. Microsoft documents CSharpCodeProvider as a .NET component used to generate and compile C# code, which can enable runtime compilation rather than relying on a precompiled binary.
Since the second stage runs in process memory rather than as a file-backed payload, the tradecraft can reduce file-based traces compared with dropping a static executable to disk, though it still requires outbound calls to an AI service endpoint.
MITRE explicitly contrasts reflective/in-memory execution with execution “backed by a file path on disk,” which supports that implication.
Industry implications of model extraction
The model-extraction angle matters beyond Google because it targets the work behind a model, not only the infrastructure that serves it. Melissa Ruzzi, director of AI at AppOmni, said in a statement to TechInformed that the cost of training new models makes “model extraction attacks” an attractive illegal shortcut, comparing them to reverse engineering in other industries such as automotive.
Ruzzi added that as model providers add guardrails, attackers may try to extract models “in an effort to use their power without guardrails.”
What Google said it changed
Google said it took action against identified threat actors by disabling assets tied to the activity. It also said Google DeepMind used the observations to harden both model-level and classifier-based controls to reduce the risk of similar misuse.
The report described enforcement actions tied to abuse of Gemini sharing features and to services that marketed “custom AI” while relying on third-party models.
What enterprises can take from the report
For enterprise security teams, Ruzzi said “we can expect more and more AI to be used in attacks,” adding that “attacks on SaaS apps, such as data breaches stemming from misconfigurations may become even more common than they are now, with AI facilitating the exploitation.” She also warned that “the cost of SaaS breaches will grow” and that “proper configurations for permissions and data will become even more important than they are now.”
GTIG said it has not observed APT or information operations actors achieving “breakthrough” capabilities that fundamentally change the threat landscape. It described the activity it saw as actors experimenting with AI tools and said it is tracking whether that use becomes more consistently operational over time.