beyond the headlines
March 1, 2023 by Ebin Sandler, Ilana Touboul

Cybercriminals outfox ChatGPT safeguards and exploit API to create malicious content

Threat actors were observed on the underground sharing tools to skirt safeguards and attempting to incorporate ChatGPT into Telegram channels via the ChatGPT API, which has fewer restrictions than the web interface.

The Headline

On February 8, 2023, security researchers exposed cybercriminals bypassing restrictions in ChatGPT designed to prevent it from creating malicious content. While ChatGPT is designed to systematically block its abuse for generating illegal content, cybercriminals are using it to create malware and phishing services. Currently, ChatGPT rejects explicit requests to produce phishing emails or malicious code, returning messages that flag such queries as “illegal, unethical and damaging,” but malicious actors are using the ChatGPT API to bypass the restrictions on ChatGPT's web-based interface.

Using the ChatGPT API, malicious developers incorporate the AI bot into their applications with unrestricted access to potentially harmful content. According to the researchers who discovered this bypass method, the current version of OpenAI’s API seems to have very few anti-abuse protections. External apps use the API to integrate OpenAI’s GPT-3 model into settings such as Telegram conversations. Due to the current lack of restrictions in the ChatGPT API, cybercriminals can use the tool to create malicious content, such as phishing emails and sophisticated malware.

Threat actors were observed on the underground selling a service that combines the ChatGPT API with the Telegram messaging app. After 20 free queries, they charge $5.50 for every additional 100 queries. Cybercriminals have also been observed on underground forums offering free ChatGPT scripts to generate harmful content. Researchers who tested the effectiveness of the ChatGPT API bypass successfully created a phishing email and a script that steals PDF files from PCs and sends them to a third party through the File Transfer Protocol (FTP).

Initially, the anti-abuse controls in the ChatGPT web user interface were also insufficient to prevent the creation of malicious content. Throughout December 2022 and January 2023, threat actors could create malware and phishing emails using the ChatGPT web user interface. Recently, ChatGPT's anti-abuse controls were significantly improved, which appears to be the impetus for cybercriminals’ shift to the API. To date, OpenAI has not publicly addressed the creation of malicious content using the API.

ChatGPT's ability to generate malware and other malicious content could transform this revolutionary AI tool into a powerful weapon in the hands of cybercriminals. On the defense side, ChatGPT can also be used to create methods to detect threats and prevent damaging cyber-attacks. While ChatGPT’s developers no doubt believe their product’s beneficial applications far outweigh the dangers it poses, OpenAI will need to remain a step ahead of cybercriminals constantly working to exploit this powerful new tool.

Diving Deeper

Cybersixgill collected a post from an established cybercrime forum member sharing a filter bypass tool to skirt ChatGPT’s restrictions. According to the forum member, this tool enables ChatGPT users to enter any prompt they want, regardless of the query's harmful, illegal, or unethical characteristics.

The forum member claimed they tested the tool and told other members not to “let censorship hold [them] back,” inviting them to “enjoy the freedom of unrestricted conversations with ChatGPT.” Under the forum’s rules, the post’s contents can only be viewed after replying to it, which prompted many members to respond. The responses were generally positive, expressing anticipation for and excitement about the bypass method.

Figure 1: A cybercrime forum post advertising a tool to bypass ChatGPT’s restrictions

Cybersixgill also collected a post on a Russian-language cybercrime forum from a member sharing a script, allegedly written by ChatGPT, that improves previously written Python code for creating stealer malware. The forum member shared the commands used to improve the code and requested feedback from members proficient in Python.

The forum member also offered to write an article about creating malware with ChatGPT, including methods to bypass ChatGPT’s harmful content restrictions. Multiple forum members responded with interest in the article, further illustrating the attention that ChatGPT restriction-bypass methods continue to generate on the underground.

Figure 2: A forum member demonstrates ChatGPT’s potential for improving stealer malware’s code

In addition, Cybersixgill collected a post from a member of a third cybercriminal forum who claimed to have discovered several methods for creating a Telegram bot coupled with ChatGPT. The forum member also shared links to two GitHub repositories containing directions for setting up a Telegram bot to pass code and text to ChatGPT.

The installation directions in one of the repositories provided download instructions corresponding to different operating systems. While some members reported bugs, others expressed satisfaction with ChatGPT’s malware development capabilities.

Figure 3: A forum member shares GitHub links to create a Telegram bot using ChatGPT


While the revolutionary ChatGPT tool provides myriad benefits to legitimate users, the technology also opens doors for novice threat actors to launch cyber attacks. The phenomenon of hackers abusing legitimate tools is longstanding, and cybercriminals are eager to harness ChatGPT’s power for malicious ends.

In addition to the ChatGPT restriction-bypass trend, Cybersixgill observed cybercriminals creating ChatGPT-Telegram combinations which, facilitated by OpenAI’s GPT-3 model, enable unrestricted access to potentially harmful content directly from malicious actors’ applications. The ChatGPT API’s apparent lack of restrictions compared to the web-based interface represents a major boon to cybercriminals. While OpenAI is likely working to address this gap in safeguards, cybercriminals will continue to look for workarounds.
