AI Vishing
The Future of Social Engineering Attacks

We are not prepared
Introduction
Having used AI voice cloning during social engineering engagements and CTF labs this year, I quickly realized that most support desks, and most organizations overall, are not prepared to deal with this threat. To demonstrate how easy these attacks are, let's walk through the following attack setup.
Attack Flow
- Obtain the impersonated victim's phone number - This can be done in a variety of ways, from simply looking on LinkedIn to phishing.
- Obtain a voice sample of the victim - This is surprisingly easy in practice. Examples include clipping it from a podcast or investor call, or simply dialing the person and ripping it from their voicemail greeting.
- Use ElevenLabs to clone the voice - (This breaks ElevenLabs' TOS; do not do it without explicit permission.)
- Create a voicemail that requests a password reset or callback - The ElevenLabs sound effect generator can be used along with Audacity to make the call extremely realistic, adding in background noise, distortion, and distractions like a bus or airplane going by. (A minimal generation sketch follows this list.)
- Spoof the impersonated victim's caller ID - Tools like Spooftel or Spoofcard work well for testing. You can also use the AI voice to record a voicemail greeting on a callback number set up through something like Google Voice to make the attack seem even more real. (Caller ID spoofing is illegal in many jurisdictions; do not do this without permission.)
- Send the AI voicemail and gain a foothold - Through a password reset or another payload, you can achieve initial access. ("Look at me, I am the CEO now!")
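
To make the voicemail-generation step concrete, here is a minimal sketch using the ElevenLabs text-to-speech REST API. The voice ID, API key, and script text are placeholders, and the endpoint and model name reflect the public v1 API at the time of writing, so verify them against the current documentation. Again: authorized engagements only.

```python
# Minimal sketch: render a voicemail script with a cloned voice via the
# ElevenLabs text-to-speech REST API. Authorized engagements only.
import os

import requests

API_KEY = os.environ["ELEVENLABS_API_KEY"]  # never hard-code credentials
VOICE_ID = "YOUR_CLONED_VOICE_ID"           # ID returned when the voice was cloned

url = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}"
payload = {
    "text": (
        "Hey, it's me. I'm locked out of my account right before the board "
        "call. Can you reset my password and ring me back at this number?"
    ),
    "model_id": "eleven_multilingual_v2",  # model names change; check the docs
}

resp = requests.post(url, json=payload, headers={"xi-api-key": API_KEY})
resp.raise_for_status()

with open("voicemail.mp3", "wb") as f:
    f.write(resp.content)  # layer in background noise afterwards, e.g. in Audacity
```

From here, the MP3 is post-processed in Audacity as described above before it is ever delivered.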
Why It Works
In this scenario, you are calling from the CEO's number, using the CEO's voice, and requesting a callback to a number whose voicemail greeting is also in the CEO's voice. All of it is built with easily available tools.
Advanced Techniques
The technology is almost at the point where text can be rendered through the AI-cloned voice realistically and in real time. At that point, an attacker would no longer be limited to asynchronous communications like voicemail; live, interactive calls would vastly increase the effectiveness of the technique. Furthermore, I'm sure some readers have seen the AI video scams as well; that technology is rapidly approaching the point where a stressed and overloaded help desk employee would very likely not be able to tell that a Teams video message was AI-generated.
LLM-Powered Vishing
One other interesting and more sophisticated way to conduct this AI vishing attack is with a bot that is synced to the cloned voice and uses an LLM to hold the conversation, primed with the goal of obtaining the payload (password reset, callback, etc.). The conversational back and forth, including realistic pushback driven by the bot's social engineering strategy, puts the victim even more at ease. This could be further enhanced by training the LLM on the victim's writing (for example, emails, social media posts, or books), allowing the bot to impersonate their speech patterns as well. Such an attack would require significant resources, but it demonstrates the larger possibilities of AI social engineering.
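
For illustration, here is a minimal sketch of that pipeline under some assumptions: OpenAI's Whisper and chat APIs stand in for the speech-to-text and LLM components, the persona and objective prompt lives in an external file defined by the approved test plan (the `engagement_persona.txt` name is hypothetical), and the telephony bridge and the TTS step shown earlier are omitted.

```python
# Minimal pipeline sketch: speech-to-text -> LLM -> cloned-voice TTS.
# Model names are assumptions; the telephony bridge (e.g. a SIP/Twilio leg)
# and the TTS step (shown earlier) are omitted. Authorized engagements only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Persona and objective come from the approved test plan, not this file.
history = [{"role": "system", "content": open("engagement_persona.txt").read()}]

def transcribe(wav_path: str) -> str:
    """Convert the caller's last utterance to text."""
    with open(wav_path, "rb") as f:
        return client.audio.transcriptions.create(model="whisper-1", file=f).text

def next_turn(caller_text: str) -> str:
    """Generate the bot's next conversational turn, keeping full context."""
    history.append({"role": "user", "content": caller_text})
    reply = client.chat.completions.create(model="gpt-4o", messages=history)
    text = reply.choices[0].message.content
    history.append({"role": "assistant", "content": text})
    return text  # feed this into the cloned-voice TTS from the earlier sketch
```

The point of the sketch is the architecture, not the models: any STT/LLM/TTS trio with low enough latency makes the live, conversational version of this attack viable.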
Defenses
What does this all mean? What is the action item from an organization's perspective? Checklists and verification procedures must be followed every time. Does your organization have written-down procedures? Are they always followed?
Additionally, all forms of voice-based authentication should be considered insecure and obsolete. The ease of cloning voices and the wide availability of samples make the risk of bypass very high.
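
As a sketch of what "written-down procedures, followed every time" can look like when enforced by tooling rather than memory, consider the following. The directory lookup and confirmation steps are stubs standing in for your real HR and ticketing systems.

```python
# Sketch: a verification procedure enforced by tooling rather than memory.
# Never trust inbound caller ID or a familiar voice; always call back the
# number on record AND require a second, out-of-band confirmation.

DIRECTORY = {"e1001": "+15551234567"}  # stub for the HR system of record

def place_callback(number_on_record: str) -> bool:
    """Hang up and call back the number on record, never the inbound number."""
    print(f"Call back {number_on_record} and confirm the request verbally.")
    return input("Callback confirmed the request? [y/N] ").strip().lower() == "y"

def confirm_out_of_band(employee_id: str) -> bool:
    """Second channel: MFA push, manager approval, or a ticket from the user."""
    return input(f"Out-of-band confirmation for {employee_id}? [y/N] ").strip().lower() == "y"

def approve_password_reset(employee_id: str) -> bool:
    number = DIRECTORY.get(employee_id)
    if number is None:
        return False  # unknown requester: deny and escalate
    # Both checks, every time; "the CEO sounds rushed" is not a bypass.
    return place_callback(number) and confirm_out_of_band(employee_id)
```

Nothing here is clever, and that is the point: a cloned voice defeats human judgment, not a mandatory callback to a number the attacker does not control.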
Conclusion
Day by day it becomes harder to differentiate the artificially generated from the real, and this is not a trend that will change anytime soon. Social engineering is already a common real-world attack vector that results in breaches. I predict it will only increase in effectiveness due to AI tooling.
Note: Do not break the law.

All hail the AI overlords