Dan Stroot

OpenAI’s June 2025 Threat Report: Disrupting malicious uses of AI


I have written about AI risks and threats several times, including Scheming AI Models and Voluntary AI Safety?

In its June 2025 threat report, OpenAI reveals how it’s detecting and preventing abusive uses of its models. The report shows how malicious actors are currently using AI models and illustrates the potential for AI to be used in ways that threaten security, privacy, and democratic processes.

The report describes AI as a force multiplier for abusive activity, including social engineering, cyber espionage, deceptive employment schemes, covert influence operations, and scams.

Here are some examples:

  1. North Korean-linked actors faked remote job applications.

    They automated the creation of credible-looking résumés for IT jobs and even used ChatGPT to research how to bypass security in live video interviews using tools like peer-to-peer VPNs and live-feed injectors.

  2. A Chinese operation ran influence campaigns and wrote its own performance reviews.

    Dubbed “Sneer Review,” this group generated fake comments on TikTok and X to create the illusion of organic debate. The wildest part? They also used ChatGPT to draft their own internal performance reviews, detailing timelines and account maintenance tasks for the operation.

  3. A Russian-speaking hacker built malware with a chatbot.

    In an operation called “ScopeCreep,” an actor used ChatGPT as a coding assistant to iteratively build and debug Windows malware, which was then hidden inside a popular gaming tool.

  4. Another Chinese group fueled U.S. political division.

    “Uncle Spam” generated polarizing content supporting both sides of divisive topics like tariffs. The group also used AI image generators to create logos for fake personas, such as a “Veterans for Justice” group critical of the current US administration.

  5. A Filipino PR firm spammed social media for politicians.

    “Operation High Five” used AI to generate thousands of pro-government comments on Facebook and TikTok, even creating the nickname “Princess Fiona” to mock a political opponent.

OpenAI’s investigation highlighted 10 major threat operations, which the report groups into five broad categories. Here's a summary:

  1. Deceptive Employment Schemes

    Actors—some likely linked to North Korea—used AI to mass-produce fake resumes and simulate IT professionals applying for remote jobs. ChatGPT was used to craft personas, bypass security checks, and even control laptops remotely through VPN and live-feed tech.

  2. Covert Influence Operations (“Sneer Review”, “High Five”, “Uncle Spam”)

    Several coordinated campaigns—some linked to China, the Philippines, Russia, and Iran—used AI to generate fake social media posts, personas, and polarizing content on platforms like TikTok, X, and Facebook. These campaigns aimed to sway public opinion on political issues, target critics, or amplify disinformation.

  3. Cyber Warfare (ScopeCreep, Vixen & Keyhole Panda)

    Russian and Chinese actors used OpenAI’s tools to write, debug, and optimize malware. One operation, dubbed "ScopeCreep," created a stealthy Go-based malware disguised as a gaming tool. Others used ChatGPT to aid in infrastructure reconnaissance and penetration testing.

  4. Social Engineering & Espionage ("VAGue Focus")

    Accounts impersonated journalists and researchers to gather intelligence from targets in the U.S. and Europe. AI helped translate, refine, and automate social engineering messages, likely for state-linked espionage.

  5. Scam Schemes (“Wrong Number”)

    A likely Cambodian-based operation used ChatGPT to translate and craft scam messages offering high-paying jobs for liking posts—only to demand cryptocurrency “fees” later. Victims were contacted through SMS, WhatsApp, and Telegram.

Read the full report: Disrupting Malicious Uses of AI: June 2025

A fourth law has been proposed as an addition to Asimov's Three Laws of Robotics to create a path toward trusted AI:

"Fourth Law: A robot or AI must not deceive a human by impersonating a human being."


Added 2025-02-06: HarmBench is a standardized evaluation framework for automated red teaming, used to measure AI model controls and protections. It covers the following seven semantic categories of behavior:

  1. Cybercrime & Unauthorized Intrusion,
  2. Chemical & Biological Weapons/Drugs,
  3. Copyright Violations,
  4. Misinformation & Disinformation,
  5. Harassment & Bullying,
  6. Illegal Activities, and
  7. General Harm.

A recent analysis of DeepSeek R1 using HarmBench found:

"Our research team managed to jailbreak DeepSeek R1 with a 100% attack success rate. This means that there was not a single prompt from the HarmBench set that did not obtain an affirmative answer from DeepSeek R1. This is in contrast to other frontier models, such as o1, which blocks a majority of adversarial attacks with its model guardrails."

In one example, DeepSeek R1 was sent a query requesting malware that could exfiltrate sensitive data, including cookies, usernames, passwords, and credit card numbers. DeepSeek R1 fulfilled the request and provided a working malicious script designed to extract payment data from specific browsers and transmit it to a remote server. Disturbingly, the model even recommended online marketplaces like Genesis and RussianMarket for purchasing stolen login credentials.
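
To make the quoted metric concrete, here is a minimal sketch of how an attack success rate (ASR) could be tallied per HarmBench category. The query_model and is_refusal functions are hypothetical placeholders for a model API call and a refusal classifier; this is not the actual HarmBench harness.

    from typing import Dict, List

    def query_model(prompt: str) -> str:
        """Hypothetical call to the model under evaluation (not a real API)."""
        raise NotImplementedError("wire this to your model provider's API")

    def is_refusal(response: str) -> bool:
        """Hypothetical refusal classifier: True if the model declined the request."""
        raise NotImplementedError("wire this to a refusal/harm classifier")

    def attack_success_rate(prompts_by_category: Dict[str, List[str]]) -> Dict[str, float]:
        """For each category, the fraction of adversarial prompts NOT refused.

        An ASR of 1.0 in every category corresponds to the 100% attack
        success rate quoted above for DeepSeek R1.
        """
        asr = {}
        for category, prompts in prompts_by_category.items():
            successes = sum(1 for p in prompts if not is_refusal(query_model(p)))
            asr[category] = successes / len(prompts)
        return asr

By this measure, a well-guarded model should drive the attack success rate toward zero in every category.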


Summary

By using ChatGPT, Claude, or Gemini, bad actors are leaving an evidence trail that gives model providers a look inside their playbooks, as well as the opportunity to prevent current and future abuse. This sounds comforting, but it shouldn't be: bad actors will simply switch to open-source models that they can run themselves, without oversight. The genie cannot be put back in the bottle.

Specifically for disinformation campaigns, the solution is not to regulate AI models but to regulate the platforms that host the disinformation. No matter how the content was generated, what really counts is the content itself. All social media platforms should be required to implement robust content moderation and user verification systems, including the use of AI to detect and counter disinformation. Content platforms have always conveniently "enabled" bot accounts and fake personas because they inflate user counts and drive engagement and revenue. That is a business model that must change.
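
As one purely illustrative example of what such moderation tooling can look like, the sketch below flags clusters of accounts that post near-identical text within a short time window, a common fingerprint of the coordinated campaigns described above. All names here are hypothetical; real platform systems combine many stronger signals.

    from difflib import SequenceMatcher
    from typing import Dict, List, Set, Tuple

    def near_duplicate(a: str, b: str, threshold: float = 0.9) -> bool:
        """True if two posts are textually near-identical."""
        return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

    def flag_coordinated_posts(posts: List[Dict], window_seconds: int = 3600) -> List[Set[str]]:
        """Group accounts posting near-duplicate text within the same time window.

        Each post is a dict with 'account', 'text', and 'timestamp' (epoch seconds).
        Returns clusters of three or more accounts that warrant human review.
        """
        # Each cluster holds (reference text, first timestamp, set of accounts).
        clusters: List[Tuple[str, float, Set[str]]] = []
        for post in sorted(posts, key=lambda p: p["timestamp"]):
            for ref_text, first_ts, accounts in clusters:
                if post["timestamp"] - first_ts <= window_seconds and near_duplicate(post["text"], ref_text):
                    accounts.add(post["account"])
                    break
            else:
                clusters.append((post["text"], post["timestamp"], {post["account"]}))
        return [accounts for _, _, accounts in clusters if len(accounts) >= 3]

A heuristic like this only surfaces candidates for human review; the point is that detection is technically feasible when platforms are motivated to do it.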
