OpenAI unveils voice cloning tool with strict safeguards

OpenAI says widely deployed 'synthetic voice' tools should have built-in ways to verify that people consented to having their voices imitated by artificial intelligence and to automatically detect audio deepfakes involving prominent people.


In a move that underscores the profound implications of artificial intelligence (AI) for society, OpenAI has revealed a voice cloning tool called "Voice Engine." Recognizing the serious risks associated with such technology, however, the company plans to keep it tightly controlled until robust safeguards are in place to thwart the spread of audio fakes meant to dupe listeners.

The Voice Engine model, as described in an OpenAI blog post, can essentially duplicate someone's speech based on a mere 15-second audio sample, a capability that has both exciting and concerning implications. While the potential applications of such technology are vast, ranging from entertainment to accessibility aids, the risks of misuse are equally significant.

"We recognize that generating speech that resembles people's voices has serious risks, which are especially top of mind in an election year," OpenAI stated. "We are engaging with U.S. and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build."

Disinformation researchers have raised alarms about the potential for rampant misuse of AI-powered voice cloning tools, especially in the context of a pivotal election year. These tools are cheap, easy to use, and difficult to trace, making them a potential weapon in the hands of bad actors seeking to sow discord and undermine public trust.

The cautious unveiling of Voice Engine comes on the heels of a recent incident in which a political consultant working for a Democratic presidential candidate admitted to being behind a robocall impersonating President Joe Biden. The AI-generated call, created by an operative for Minnesota congressman Dean Phillips, featured what sounded like Biden's voice urging people not to cast ballots in January's New Hampshire primary. The incident alarmed experts, who fear a deluge of AI-powered deepfake disinformation in the 2024 White House race and other key elections around the globe.

In response to these concerns, OpenAI has taken a proactive stance, emphasizing that it is "taking a cautious and informed approach to a broader release due to the potential for synthetic voice misuse." The company has implemented a set of safety measures, including watermarking to trace the origin of any audio generated by Voice Engine, as well as proactive monitoring of how the tool is being used.

Additionally, OpenAI has established strict rules for partners testing Voice Engine, requiring explicit and informed consent from any person whose voice is duplicated using the tool. Partners must also make clear to audiences when they are hearing AI-generated voices.

As AI continues to advance at an unprecedented pace, companies like OpenAI find themselves at the forefront of a complex ethical and societal challenge. While the potential benefits of technologies like Voice Engine are undeniable, the risks of misuse and the erosion of public trust cannot be ignored. OpenAI's cautious approach to the release of Voice Engine serves as a reminder that the responsible development and deployment of AI must be a collaborative effort, involving stakeholders from government, media, entertainment, education, civil society, and beyond.

As the world grapples with the implications of this powerful technology, a delicate balance must be struck between innovation and safeguarding the integrity of information and communication.
