AI Updates on 2025-12-22
AI Model Announcements
- Google DeepMind launches YouTube Playables Builder powered by Gemini 3, enabling creators to develop bite-sized games using text, video, or image prompts without coding @GoogleDeepMind
- Z.ai releases GLM-4.7, surpassing GLM-4.6 with substantial improvements in coding, complex reasoning, and tool use, setting a new open-source standard @Zai_org
- Google launches Gemini 3 Flash for small business applications, capable of analyzing customer feedback, drafting launch emails, and coding branded landing pages @GeminiApp
- Google integrates Gemini 3 into Google Search, introducing GenUI and frontier AI experiences @OfficialLoganK
AI Industry Analysis
- OpenAI publishes methodology on continuously hardening ChatGPT Atlas and other agents against novel prompt-injection attacks through automated red teaming, reinforcement learning, and rapid response loops @cryps1s
- YouTube Playables Builder demonstrates potential to usher in the next 100 million developers by making game creation accessible without traditional programming languages like C/C++/C# @OfficialLoganK
- Demis Hassabis suggests Google is positioning itself as a game publishing house for the public, potentially running AAA games on Google's platform under a subscription model @AndrewCurran_
- Truemed raises $34 million Series A led by a16z to shift healthcare spending toward prevention, enabling consumers to use HSA and FSA dollars on evidence-based lifestyle interventions rather than treating chronic conditions after illness @a16z
- Amazon reportedly investing up to $10 billion in OpenAI, raising questions about how to define real revenue with circular deals where investment money returns to purchase the investor's products @TechCrunch
AI Ethics & Society
- Demis Hassabis challenges Yann LeCun's claim that general intelligence doesn't exist, arguing LeCun confuses general intelligence with universal intelligence, and that human brains and AI foundation models are approximate Turing Machines capable of learning anything computable given sufficient time, memory, and data @demishassabis
- Francois Chollet warns that the goal of AI should be to expand human thought and agency, not replace it, citing Dune's 1965 warning about turning thinking over to machines @fchollet
- Journal editors lack consensus on how to adjust peer review for the flood of AI-written papers: bad papers now read like good ones, which makes reviewing harder and often requires a second read to assess quality @emollick
- Simon Willison uses the Claude browser agent to navigate the Cloudflare control panel, his first successful use of a browser agent to solve a real problem @simonw
AI Applications
- Meta's Segment Anything Models advance flood monitoring and disaster response, with USRA and USGS fine-tuning SAM to automate river mapping for faster, scalable, and cost-effective disaster preparedness @AIatMeta
- Apple's Live Translate enables 30-minute conversation between users with language barriers, though accuracy issues persist with complex ideas and fast talking in languages like Chinese @brian_lovin
- Developer successfully uses AI agent to launch overnight run after exhausting manual debugging attempts, demonstrating practical automation of complex development tasks @aidan_mclau
- Gemini successfully builds interactive simulation explaining collider bias from a single prompt, working on first attempt with Canvas enabled @emollick
- NotebookLM introduces Data Tables feature powered by Google DeepMind research on data curation, helping users structure complex information and export to Google Sheets @lindsaywillmore
- OpenAI launches "Your Year with ChatGPT" personalized review feature, rolling out to users in US, UK, Canada, New Zealand, and Australia with chat history enabled @OpenAI
- Splat's app uses AI to transform photos into coloring pages for children @TechCrunch
- Developer builds robot that can see, hear, and move using Claude Code for heavy lifting in robotics debugging, with both apps reaching official app store @BioInfo
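The collider bias mentioned in the Gemini item above can be reproduced in a few lines. This is an independent minimal sketch, not the simulation Gemini generated: two traits are drawn independently, and a sample is kept only when their sum clears a cutoff (the "collider"). The function names, the cutoff of 1.0, the sample size, and the seed are all illustrative assumptions.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_collider(n=20000, threshold=1.0, seed=0):
    """Return (correlation in full sample, correlation in selected sample).

    Hypothetical setup: traits a and b are independent standard normals,
    and a row is "selected" only if a + b exceeds the threshold.
    """
    rng = random.Random(seed)
    pairs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    selected = [(a, b) for a, b in pairs if a + b > threshold]
    r_all = pearson(*zip(*pairs))
    r_selected = pearson(*zip(*selected))
    return r_all, r_selected


if __name__ == "__main__":
    r_all, r_selected = simulate_collider()
    print(f"full sample correlation:     {r_all:+.3f}")
    print(f"selected sample correlation: {r_selected:+.3f}")
```

In the full sample the correlation should sit near zero, while the selected subset shows a clearly negative correlation even though the traits were generated independently. Conditioning on the sum is what creates the spurious association.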
AI Research
- Ethan Mollick analyzes correlations between METR long-task measurement and other key benchmarks using GPT-5.2 Pro, finding high correlations across all benchmarks including ARC-AGI, suggesting either all benchmarks measure the same thing or AI improves uniformly across all measures @emollick
- Francois Chollet describes LLMs as representing the "library" phase of AI, with the next "scientist" phase focusing on finding answers that don't exist yet through algorithmic processes similar to science @fchollet
- Physical Intelligence demonstrates fine-tuned robots successfully performing tasks including washing pans, cleaning windows, and making peanut butter sandwiches, with implications for Moravec's paradox and large models in embodied AI @physical_int
- Research suggests reinforcement learning can learn new capabilities beyond base model knowledge as long as entropy collapse is avoided, contrary to early pass@k experiments that suggested RL only sharpens existing knowledge @ChenSun92
- Researchers demonstrate transformers' potential for economic modeling beyond LLMs, fitting a transformer to data simulated from a New Keynesian (NK) model and reporting successful out-of-sample performance @alexolegimas
- Midjourney focuses on tools for guidance, curation, and creating variation among options rather than instruction following from text, emphasizing experimentation and refinement in image generation @emollick
- Ethan Mollick argues that high-quality image generators like Nano Banana Pro unlock new AI abilities, including research and compelling slide generation, highlighting the importance of addressing bottlenecks @emollick
- Context windows and compaction identified as critical unsolved problems to resolve in 2026 @Suhail
- Robot Olympics proposed as method to regularize hype, with participants facing unknown environments and tasks to test generalization capabilities, addressing current robots' failure at generalization despite successful fine-tuning @Suhail
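For readers unfamiliar with the pass@k metric referenced in the reinforcement learning item above, the following is a minimal sketch of the standard unbiased estimator, 1 - C(n-c, k) / C(n, k), where n attempts are sampled per problem and c of them are correct. The per-problem counts in the example are hypothetical and are not taken from the cited research. They only illustrate the shape of the earlier "RL only sharpens" argument, in which a tuned model wins at k=1 but a base model wins at large k.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: chance that at least one of k samples, drawn
    without replacement from n attempts with c correct, is correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect attempts exist, so any draw of k hits a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(correct_counts, n, k):
    """Average pass@k over problems, given the correct count for each."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


if __name__ == "__main__":
    n = 100
    # Hypothetical correct counts on two problems (illustrative only).
    base_model = [30, 2]   # solves both problems, but rarely
    tuned_model = [90, 0]  # much sharper on one, never solves the other
    for k in (1, 64):
        base = mean_pass_at_k(base_model, n, k)
        tuned = mean_pass_at_k(tuned_model, n, k)
        print(f"k={k:>2}  base={base:.3f}  tuned={tuned:.3f}")
```

With these made-up counts the tuned model leads at k=1 while the base model leads at k=64, which is the pattern the earlier experiments read as "sharpening". The item above reports that this pattern need not hold when entropy collapse is avoided.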