OpenAI has just introduced a groundbreaking new version of its AI voice assistant, and, as reported by The New York Times, it’s already drawing comparisons to the virtual assistant from the 2013 movie “Her.” The latest model, GPT-4o, is designed to interact with users in an impressively lifelike manner, recognizing emotions, adjusting its tone, and even singing on command. Here’s a closer look at what makes this new voice assistant so revolutionary.


Key Takeaways:

  • OpenAI’s new AI voice assistant, GPT-4o, interacts in a highly lifelike manner, recognizing emotions and adjusting its tone.
  • GPT-4o addresses issues of latency and conversational nuances seen in traditional voice assistants like Siri and Alexa by using “native multimodal support” for direct audio processing, resulting in smoother and more fluid conversations.
  • GPT-4o is faster than its predecessor, GPT-4, and introduces the ability to understand and analyze audio and video files directly. However, its accuracy may still be comparable to that of the earlier version.

OpenAI demonstrated GPT-4o’s capabilities in a series of live demos. The assistant can read stories with varying levels of drama, sing “Happy Birthday,” and act as a real-time translator. Its ability to change its voice—shifting from a soprano to a contralto, adding filler phrases like “hmm” and “let’s see,” and even giggling at jokes—makes it sound more human than previous AI assistants.



Traditional AI voice assistants like Siri and Alexa often sound flat and rather impersonal, failing to pick up on conversational nuances. They also suffer from delays that remind users they’re interacting with a machine. OpenAI’s GPT-4o addresses these issues with “native multimodal support,” which allows it to process audio prompts directly, reducing latency and making conversations smoother and more fluid.
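To make the latency argument concrete, here is a minimal illustrative sketch in Python. The payload shapes and field names below are hypothetical simplifications for explanation, not OpenAI's actual API schema; the point is only that a classic assistant chains three services while a natively multimodal model makes one round trip.

```python
# Illustrative sketch only: these payload shapes are hypothetical and
# simplified, not OpenAI's real API schema.

# A traditional voice assistant chains three separate services,
# each adding its own latency and losing vocal nuance at the text step:
legacy_pipeline = [
    {"step": "speech_to_text", "input": "audio", "output": "text"},
    {"step": "language_model", "input": "text",  "output": "text"},
    {"step": "text_to_speech", "input": "text",  "output": "audio"},
]

# A natively multimodal model handles the audio in a single request,
# so tone, pauses, and emotion in the voice are never flattened to text:
multimodal_request = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user",
         "content": [{"type": "input_audio", "data": "<base64 audio>"}]}
    ],
}

def round_trips(pipeline):
    """Count service hops, a rough proxy for conversational latency."""
    return len(pipeline)

print(round_trips(legacy_pipeline))       # 3 hops for the classic pipeline
print(round_trips([multimodal_request]))  # 1 hop with native multimodal audio
```

Fewer hops means less time between the end of your sentence and the start of the reply, which is exactly the "machine-like" pause GPT-4o is trying to eliminate.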

If you fear that such interactions might be recorded and passed on to be used by other companies, you don’t need to worry. OpenAI has specifically emphasized that it values the privacy and security of its users: all interactions with GPT-4o are encrypted, and user data is handled with care, in line with industry standards.

Besides the Voice Assistant, Are There Any Differences from GPT-4 Version?

Aside from the new voice assistant capabilities, GPT-4o brings several advancements over its predecessor, GPT-4. As testing by Tom’s Guide showed, one of the most significant differences is speed: GPT-4o is notably faster at processing and responding to prompts.

It also introduces native multimodal support, allowing the model to understand and analyze audio and video files directly, a feature not available in previous versions. This makes for more fluid and dynamic interactions, improving the overall user experience. Additionally, GPT-4o offers more detailed and descriptive responses, which can be particularly useful for complex and creative tasks (like writing a haiku, as done in that testing, a poem, or even a narrative). These improvements, along with the ability to handle video content, position GPT-4o as a clear upgrade over GPT-4.

The one thing that hasn’t improved much is the accuracy of responses. Many users reported that, when working with the voice assistant on the mobile app, they had to try a couple of times before getting a correct response.


Since this is still a demo (full access to the GPT-4o voice assistant will open a bit later), there is hope that accuracy will improve over time, especially as the model learns from its mistakes.


OpenAI’s new voice assistant, GPT-4o, is undoubtedly a significant leap forward in AI technology. Looking ahead, the company plans to continue enhancing GPT-4o’s capabilities; potential updates include better integration with smart home devices, expanded language support, and even more sophisticated AI-driven features. As artificial intelligence continues to evolve, tools like this one are set to play an increasingly important role in our daily lives, providing assistance that feels almost human. What does this mean for us? It’s hard to say just yet, but we will almost certainly see a change in our day-to-day interactions with technology.


