Apple has its own AI model coming: Is the tech giant finally catching up in the AI race?
TL;DR:

- Apple recently revealed research on its own AI model, MM1, which can understand both text and images.
- This development suggests Apple is working on more powerful AI capabilities for its products.
- The research indicates Apple is playing catch-up and gearing up for a bigger role in the AI race.

You've probably seen how the tech world is going wild for generative artificial intelligence, but guess who's been a bit low-key about it? Yep, Apple. Recent chatter says the Cupertino tech giant is talking to Google about borrowing its Gemini AI to give Siri a boost and jazz up iOS with some new AI tricks. And now, there's even more info popping up.
Last week, Apple discreetly dropped a research paper (via Wired) detailing its work on a multimodal large language model (MLLM) dubbed MM1 that can handle both text and images. The paper shows MM1 answering questions about photos and flaunting a wide range of general-knowledge skills akin to chatbots like ChatGPT. While Apple hasn't explained the name, MM1 might just stand for MultiModal 1.
Apple's iPhone already features an AI assistant, Siri. However, with the rapid emergence of competitors like ChatGPT, Siri's once-groundbreaking capabilities are starting to feel constrained and outmoded. Both Amazon and Google have announced plans to incorporate large language model (LLM) technology into their respective assistants, Alexa and Google Assistant. Google has even enabled Android phone users to swap out the Assistant for Gemini.
MM1 seems to share similarities in design and complexity with recent AI models from other tech titans, like Google's Gemini and Meta's open-source Llama 2. Research conducted by Apple's competitors and academic circles indicates that models of this caliber can fuel proficient chatbots or develop "agents" capable of executing tasks by coding and taking actions such as interacting with computer interfaces or websites. This hints that MM1 might eventually become a key component in Apple's lineup of products.
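To make that "agent" idea concrete, here is a minimal sketch of the observe-think-act loop such systems typically run. Everything in it is a hypothetical stand-in for illustration: `query_model` and the action strings are made up, not anything described in Apple's paper.

```python
# Hypothetical sketch of an LLM-driven "agent" loop: the model reads the
# current state of a task, proposes an action, and a harness executes it.
# `query_model` is a placeholder, not a real API call.

def query_model(prompt: str) -> str:
    """Stand-in for a call to a multimodal LLM (hypothetical)."""
    return "click checkout_button"  # canned response for illustration

def run_agent(goal: str, max_steps: int = 5) -> None:
    observation = "page loaded"
    for step in range(max_steps):
        prompt = f"Goal: {goal}\nObservation: {observation}\nNext action?"
        action = query_model(prompt)
        print(f"step {step}: {action}")
        if action == "done":
            break
        observation = f"executed '{action}'"  # a real agent would act on a UI

run_agent("buy the items in the cart")
```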
In a thread on X, Brandon McKinzie, an Apple researcher and the lead author of the MM1 paper, commented:
> This is just the beginning. The team is already hard at work on the next generation of models. Huge thanks to everyone that contributed to this project!
MM1 is a multimodal large language model, or MLLM, which means it is trained on both images and text. This dual training enables the model to respond to text prompts and tackle intricate questions about specific images.
In an example from the Apple research paper, MM1 was given a picture of a restaurant table with beers and a menu. When prompted about the expected cost of "all the beer on the table," the model accurately identified the prices and calculated the total expense.
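To see the arithmetic behind that kind of answer, here is a minimal sketch of the query. The beer count, the menu price, and the `answer_about_image` helper are all made-up illustrative values, not figures from the paper.

```python
# Illustrative sketch of the beer-menu example. The helper below is
# hypothetical, and the prices are invented, not taken from Apple's paper.

beers_on_table = 2          # assumed count of beers visible in the photo
menu_price_per_beer = 6.00  # assumed price the model reads off the menu

def answer_about_image(question: str) -> str:
    """Stand-in for an MLLM query: it 'reads' the menu price, counts the
    beers, and multiplies, mirroring the reasoning in Apple's example.
    The `question` argument is accepted for shape only; a real MLLM
    would actually parse it alongside the image."""
    total = beers_on_table * menu_price_per_beer
    return (f"The menu lists beer at ${menu_price_per_beer:.2f}; "
            f"{beers_on_table} beers cost ${total:.2f} in total.")

print(answer_about_image("How much will all the beer on the table cost?"))
```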
With competitors like Samsung and Google rolling out a slew of generative AI features for their smartphones, Apple is under pressure to stay competitive. Apple CEO Tim Cook has assured investors that the company will unveil more details about its generative AI initiatives this year.
What's more, Apple recently acquired DarwinAI, a Canadian AI startup known for developing compact and efficient AI systems. All this suggests that Apple is gearing up to make a big splash in the AI arena, so keep an eye out for further developments!