OpenAI Unveils New ChatGPT Capable of Advanced Math and Science Reasoning

Marco

OpenAI has introduced a new version of its popular chatbot, ChatGPT, equipped with advanced capabilities to handle complex tasks in math, coding, and science. Powered by a new AI technology called OpenAI o1, this update aims to address some of the common limitations seen in chatbots, such as struggling with simple math problems or generating incomplete code.

On Thursday, OpenAI unveiled this improved version of ChatGPT, which is designed to "reason" through tasks more effectively than its predecessors. Unlike earlier models, which generated responses instantly, the new ChatGPT takes a more thoughtful approach. "This model can take its time, think through the problem in English, and break it down to find the best solution," said Jakub Pachocki, OpenAI’s chief scientist.

During a demonstration, Pachocki and OpenAI technical fellow Szymon Sidor showcased the chatbot solving an acrostic puzzle, answering a Ph.D.-level chemistry question, and diagnosing an illness based on a patient’s detailed medical history. These tasks demonstrated the bot’s enhanced ability to handle complex reasoning.

The OpenAI o1 technology is part of a broader movement to create AI systems capable of logical, step-by-step problem-solving—much like how humans approach complex issues. Competitors like Google and Meta are developing similar systems, while Microsoft, which partners with OpenAI, plans to integrate this technology into its products.

The aim is to develop AI that can solve problems in a structured way, making it a valuable tool for a range of applications, such as helping programmers write code or creating automated tutors for subjects like math. Additionally, OpenAI said this new model could assist physicists with generating complex mathematical formulas and support healthcare researchers in their experiments.

Since its debut in late 2022, ChatGPT has transformed how people interact with AI, answering questions, writing papers, and even generating code. However, earlier versions were not without flaws: at times they made mistakes, generated buggy code, or repeated misinformation found online.

OpenAI has worked to reduce these issues in the new system through reinforcement learning, a process where the AI learns from trial and error over an extended period. For instance, the system can work through math problems repeatedly to identify successful strategies and discard flawed ones. While this approach improves accuracy, the AI is still prone to making mistakes. "It’s not going to be perfect," acknowledged Sidor, "but it’s more likely to provide the right answer by working harder."
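To make the trial-and-error idea concrete, here is a minimal toy sketch in Python. It is not OpenAI's training method, which the article does not describe in detail. It only illustrates the general pattern of trying strategies repeatedly, tracking which ones succeed, and favoring those. The strategy names, the practice problems, and the `train` function are all hypothetical, chosen purely for illustration.

```python
import random

# Hypothetical candidate "strategies" for answering "what is a + b?".
# Only one of them is actually correct.
STRATEGIES = {
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "add": lambda a, b: a + b,
}

# Hypothetical practice problems: (a, b, expected answer).
PROBLEMS = [(2, 3, 5), (4, 7, 11), (10, 5, 15), (1, 8, 9)]


def train(strategies, problems, rounds=2000, epsilon=0.1, seed=0):
    """Toy trial-and-error loop that returns each strategy's success rate."""
    rng = random.Random(seed)
    attempts = {name: 0 for name in strategies}
    successes = {name: 0 for name in strategies}

    def success_rate(name):
        # Untried strategies get an optimistic 1.0 so each one is tried.
        if attempts[name] == 0:
            return 1.0
        return successes[name] / attempts[name]

    for _ in range(rounds):
        a, b, expected = rng.choice(problems)
        if rng.random() < epsilon:
            # Explore: occasionally try a random strategy.
            name = rng.choice(list(strategies))
        else:
            # Exploit: use the strategy with the best record so far.
            name = max(strategies, key=success_rate)
        attempts[name] += 1
        if strategies[name](a, b) == expected:
            successes[name] += 1

    return {name: success_rate(name) for name in strategies}


if __name__ == "__main__":
    rates = train(STRATEGIES, PROBLEMS)
    print("Best strategy:", max(rates, key=rates.get))
```

In this sketch the flawed strategies are quickly abandoned because their recorded success rate drops, while the one that keeps producing right answers is used more and more. Real reinforcement learning for language models is far more involved, but the feedback loop has the same basic shape.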

The upgraded ChatGPT technology is now available to subscribers of ChatGPT Plus and ChatGPT Team, as well as businesses and developers looking to integrate it into their own applications.

According to OpenAI, the new model has significantly improved performance on certain standardized tests. On the International Mathematical Olympiad (IMO) qualifying exam, the previous version of ChatGPT scored 13%, whereas the OpenAI o1 model achieved an 83% score. However, experts warn that test performance does not always reflect real-world capabilities. While the system may excel at answering math test questions, it may still face challenges in tutoring students effectively.

"There’s a difference between problem-solving and assistance," noted Angela Fan, a research scientist at Meta. "New models that reason can solve problems, but that doesn’t necessarily mean they can guide someone through their homework."

Despite its limitations, OpenAI’s latest innovation marks a significant step forward in the development of AI systems that can tackle more complex, real-world challenges.
