OpenAI's ChatGPT-4o launched earlier this month and its new capabilities are designed to improve performance, support more multimodal inputs, and offer broader language support. Let's delve into the significance of this latest version:
I use ChatGPT in my everyday life. When I need the answer to a burning question, ChatGPT is invariably my first choice. Traditional Google search is merely my reluctant plan B if I cannot find an acceptable answer on ChatGPT.
So far, I have not seen a notable difference in ChatGPT-4o when using OpenAI's own interface. That said, ChatGPT powers so many other applications, and their developers decide when to adopt the latest version, so it's difficult for a layman to test drive all of the new features.
To get a closer look at the new features, I decided to ask ChatGPT-4o about itself directly through OpenAI's ChatGPT interface. How meta! Here's what it had to say about itself, and my response to each point.
ChatGPT-4o: One of the standout features of ChatGPT-4o is its ability to process and generate responses not only from text but also from images and, potentially, audio inputs (aka multimodal capabilities). This allows users to interact with the AI in more dynamic ways, such as uploading a picture of a menu to get a translation or asking questions about visual content.
Brian (Reclaim Saturday): This capability is becoming more advanced but still needs to mature before it's truly useful. I tried a few tests. For example, I uploaded a casual photo of myself and asked ChatGPT-4o to turn it into a professional headshot, and it merely cropped the image (and not well). I also asked it to turn an audio file of one of Beyoncé's latest hits into a visual; it thought the song was Frank Sinatra's "Come Fly with Me" and returned an irrelevant image. In my experience, some third parties offer much higher quality AI-generated image and audio services than those offered directly through OpenAI's ChatGPT interface. Because OpenAI makes its models available through an API, third-party developers are able to fine-tune OpenAI's pre-trained models and enrich them with their own data to better perform specific tasks. This highlights the importance of ChatGPT's growing developer network.
ChatGPT-4o: ChatGPT-4o shows advanced capabilities in maintaining the context of a conversation over longer interactions. This improvement makes it particularly useful for detailed and complex dialogues in customer support, therapy sessions, or educational tutoring.
Brian (Reclaim Saturday): I haven't yet noticed a big difference here. This could be because of how I use the service, which is typically for shorter, highly focused tasks.
ChatGPT-4o: The model now supports more languages, greatly expanding its accessibility and effectiveness for a global user base. This makes ChatGPT-4o a valuable tool for international businesses and educational platforms that cater to diverse populations.
Brian (Reclaim Saturday): I take this at face value. As more global users adopt ChatGPT, it will be interesting to see if and how OpenAI's governance model evolves.
ChatGPT-4o: Enhancements in the model's efficiency not only mean quicker responses but also more energy-efficient operations. This is crucial for integrating AI into consumer devices and applications that require real-time feedback without draining system resources.
Brian (Reclaim Saturday): If anything, I am finding the new version slower than previous versions. I expect this performance will improve over time.
ChatGPT-4o: OpenAI has continued to refine the safety features of its models, reducing the risk of generating harmful or biased content. This makes ChatGPT-4o a safer choice for businesses and developers who need to ensure the ethical use of AI technologies.
Brian (Reclaim Saturday): When I first started using ChatGPT I was frustrated that I could not understand the sources it consulted to generate its responses. OpenAI now exposes more information about ChatGPT's content sources, and I would like to see this transparency increase further.
I ended by asking ChatGPT when its next version will be released. Apparently we can expect ChatGPT-5.0 (or will they stick with the cute "o" naming convention?) sometime this summer. In addition to further advancing some of the features described above, ChatGPT-5.0 is "also expected to support autonomous AI agents capable of performing real-world tasks without human oversight." This certainly makes sense given our understanding that AI technology like ChatGPT will start to become more independent from human developers as it evolves.
What would you like to see in the next version of ChatGPT?