A group of researchers has created a first-generation AI worm that can steal data through email clients, spread malware, and spam other users to spread across multiple systems. The worm has been developed and works well as designed in a test environment using a popular LLM. Based on their findings, the researchers advised GenAI developers and shared their concerns about the potential dangers such malicious programming could pose. The team shared a research paper and published a video showing how the two methods can be used to steal data and influence other email clients.
The worm was written by Ben Nassi of Cornell Tech, Stav Cohen of Israel Institute of Technology, and Ron Bitton of Intuit. They named it “Morris II” after the original Morris, the first computer worm that caused worldwide nuisance online in 1988. The worm works by targeting Gen AI apps and also works with Gen AI-enabled email assistants that generate text and images against AI models. Gemini Pro, ChatGPT 4.0, LLaVA, etc.
It works by using an adversarial self-replicating prompt used against its models, similar to a jailbreak feature that uses AI to spread harmful content. Researchers demonstrated this by creating an email system with these generative AI engines and using self-replicating prompts embedded in text or image files.
The text prompt uses LLM to infect the email assistant and uses additional data from outside the system. That data is then sent to his GPT-4 or Gemini Pro to create text content. This content successfully jailbreaks his GenAI service and steals data. The image prompt method involves encoding a self-replicating prompt into an image, having an email assistant forward messages containing propaganda or exploits to everyone, infecting new email clients, and forwarding the infected email. Both processes can allow researchers to mine sensitive information, including but not limited to credit card details and social security numbers.
That such worms work even in controlled environments means that it is no longer just a theory, but a serious consideration for effective solutions to be deployed whenever such malicious prompts are found. It proves that you need to. This is where research papers like this come in, shared with affected parties, and made available for simulation and validation by others.
GenAI leader response and deterrence deployment plan
Like all responsible researchers, the team reported its findings to Google and OpenAI. Wired contacted Google, which declined to comment on the study, but an OpenAI spokesperson responded. They said, “It appears that we have found a way to exploit a prompt injection vulnerability by relying on unchecked and unfiltered user input.” They also ensured that systems were made more resilient, adding that developers should use methods to ensure that harmful inputs are not used.
We find that such techniques can infect generated AI applications and compromise users' systems. Obtained when implemented, these are important. In some cases, AI-enabled SSDs can identify and eliminate ransomware. But on the other hand, there are worms and custom LLMs that can create malware.
This is where the industry needs to pace itself and take steps to attack or deploy effective solutions for all genAI-based products released to the public. New solutions and innovations can create new problems. As studies like this reveal such issues in the early stages of AI apps, securing potentially harmful GenAI engines should be a priority.