Apple researchers show that AI can't even solve grade-school math problems very well

By Iskra Petrova

Published: Oct 17, 2024, 2:08 AM

Apple researchers have identified some serious flaws in generative AI's reasoning, especially when it comes to numbers and math. It seems AI isn't as "smart" as many believe: it couldn't achieve a stellar result even when solving basic elementary-school math problems.

A newly published paper from six Apple researchers, "GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models", shows that the mathematical reasoning of advanced large language models (LLMs) can be inaccurate and fragile.

The researchers started with GSM8K, a dataset of more than 8,000 high-quality, linguistically diverse grade-school math word problems.

This is a common benchmark for testing LLMs. The researchers then slightly altered the problems, changing details such as names and numbers while keeping the underlying logic intact, and called the resulting test GSM-Symbolic.
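To make the idea concrete, here is a minimal Python sketch of a templated word problem in which the surface details vary while the arithmetic stays the same. The template text, the names, and the `make_variant` helper are all invented for illustration and are not taken from the paper.

```python
import random

# Hypothetical template: the logic (boxes * per_box - given) is fixed,
# while the name and the numbers change from one variant to the next.
TEMPLATE = (
    "{name} buys {boxes} boxes of pencils with {per_box} pencils in each box. "
    "{name} gives away {given} pencils. How many pencils does {name} have left?"
)
NAMES = ["Sophie", "Liam", "Ava"]  # illustrative names only


def make_variant(rng):
    """Return one (question, answer) pair drawn from the template."""
    name = rng.choice(NAMES)
    boxes = rng.randint(2, 9)
    per_box = rng.randint(5, 12)
    given = rng.randint(1, boxes * per_box - 1)  # keeps the answer positive
    question = TEMPLATE.format(
        name=name, boxes=boxes, per_box=per_box, given=given
    )
    answer = boxes * per_box - given
    return question, answer


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        question, answer = make_variant(rng)
        print(question, "->", answer)
```

A model that genuinely reasons through the problem should score about the same on every variant, since only the surface details differ.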

How smart could AI be? | Image Credit - Solen Feyissa on Unsplash

In the first set of tests, performance dropped by between 0.3 percent and 9.2 percent. In the second set, where some of the problems included a statement that had nothing to do with the answer, the researchers recorded "catastrophic performance drops" ranging from about 17.5 percent to a massive 65.7 percent.
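The second kind of change can be sketched in a few lines of Python: an irrelevant sentence is slipped into a problem, and the correct answer stays exactly the same. The example problem and the `add_distractor` helper below are made up for illustration and do not come from the paper.

```python
def add_distractor(question, distractor):
    """Insert an irrelevant sentence just before the final question sentence."""
    body, sep, final = question.rpartition(". ")
    if not sep:
        # Single-sentence problem: put the distractor in front.
        return distractor + " " + question
    return body + ". " + distractor + " " + final


# Hypothetical problem, invented for this sketch.
base = (
    "Maya picks 12 apples on Monday and 15 apples on Tuesday. "
    "How many apples does Maya have?"
)
noisy = add_distractor(
    base, "Three of the apples are slightly smaller than the rest."
)

# The extra sentence has no bearing on the arithmetic.
correct_answer = 12 + 15

if __name__ == "__main__":
    print(noisy)
    print("Correct answer for both versions:", correct_answer)
```

The size of an apple has nothing to do with how many there are, so a model that subtracts the three smaller apples is reacting to the wording, not the logic.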

For some people, this is not at all surprising. I've personally seen AI struggle with some simple tasks related to numbers. The results suggest that AI doesn't truly reason its way through math problems but instead uses simple "pattern matching" to convert statements to operations without grasping what it all means.

It seems that the AI failed to solve simple math problems because the wording was too confusing for it or didn't follow the exact pattern it expected. All in all, AI appears to give only the illusion of "reasoning" and instead relies on hoarding data and then processing it.

But what does that mean for the bigger picture? We've all been way too focused on AI recently, and it seems that some people are expecting wonders from it (I'm also guilty of similar thinking). But it has serious limitations, and I'm not sure whether those can be mitigated. Of course, I'm not an AI scientist, but I'm very curious to see where AI's growth will stall (well, apart from math, that is!).
