In a bold move to challenge the hype surrounding artificial general intelligence (AGI), French AI researcher François Chollet has introduced the $1 million ARC Prize. The prize aims to debunk the notion that large language models (LLMs) like ChatGPT can achieve AGI. Chollet argues that OpenAI’s focus on LLMs has hindered AGI progress, describing them as an “off-ramp on the path to AGI.” His ARC test requires AI systems to demonstrate genuine adaptability and reasoning beyond mere memorization, a capability current LLMs struggle with.
Apple’s Ingenious Approach to AI: Privacy and Precision
Apple’s latest AI strategy, revealed at WWDC, subtly revolutionizes the field by prioritizing user privacy and task-specific efficiency. Unlike competitors grappling with issues like hallucinations and overpromising, Apple employs compressed, on-device models fine-tuned for specific tasks such as summarization, proofreading, and auto-replies. This method ensures most AI tasks are securely and effectively handled on the device, with complex queries sent to Apple’s servers or ChatGPT, all under stringent privacy controls. This innovation not only enhances AI performance but also offers a compelling reason to upgrade Apple devices.
Google’s AI Fiasco: The Pizza Glue Incident
Google’s AI continues to struggle with accuracy, as highlighted by its infamous advice to add glue to pizza. This week, it repeated the blunder, citing past erroneous reports. This recursive error showcases a critical flaw in Google’s AI training process, which inadvertently reinforces incorrect information reported by journalists. Verge journalist Elizabeth Lopatto remarked, “Every time someone like me reports on Google’s AI getting something wrong, we’re training the AI to be wronger.”
OpenAI Researchers Predict AGI Within Three Years
Despite skepticism, OpenAI researchers Leopold Aschenbrenner and James Betker foresee the advent of AGI within the next three years. Aschenbrenner’s comprehensive analysis suggests AI models could soon match human researchers’ capabilities, potentially leading to superintelligence. Betker shares a similar outlook, predicting significant advancements in system-2 thinking and embodiment within a few years. However, critics argue that technological progress might plateau, delaying these breakthroughs.
LLMs Fail Basic Reasoning Tests
New research underscores the limitations of LLMs in tackling novel problems. Despite passing bar exams, models like GPT-4, Claude, and Gemini struggle with basic reasoning. For instance, when asked a straightforward question about siblings (such as how many sisters a girl’s brother has, given the family’s composition), these models often deliver nonsensical answers with unwarranted confidence. This finding reinforces the argument that LLMs excel at data retrieval but lack true understanding and reasoning skills. Yann LeCun highlighted, “Reasoning abilities and common sense should not be confused with an ability to store and approximately retrieve many facts.”
AI Models Mislead on Election Information
A study by GroundTruthAI reveals that AI models, including Google Gemini and ChatGPT, frequently provide incorrect information about the 2024 U.S. election. The research found substantial error rates: Gemini answered correctly only 57% of the time, while GPT-4 achieved 81% accuracy. Common mistakes include incorrect ages of candidates and misinformation about voting regulations. Consequently, Google’s and Microsoft’s chatbots now decline to answer election-related queries to prevent the spread of misinformation.