AI Updates on 2025-12-22

AI Model Announcements

Google DeepMind launches YouTube Playables Builder powered by Gemini 3, enabling creators to develop bite-sized games using text, video, or image prompts without coding @GoogleDeepMind
Z.ai releases GLM-4.7, surpassing GLM-4.6 with substantial improvements in coding, complex reasoning, and tool usage, setting new open-source standards @Zai_org
Google launches Gemini 3 Flash for small business applications, capable of analyzing customer feedback, drafting launch emails, and coding branded landing pages @GeminiApp
Google integrates Gemini 3 into Google Search, introducing GenUI and frontier AI experiences @OfficialLoganK

AI Industry Analysis

OpenAI publishes methodology on continuously hardening ChatGPT Atlas and other agents against novel prompt-injection attacks through automated red teaming, reinforcement learning, and rapid response loops @cryps1s
YouTube Playables Builder demonstrates potential to usher in the next 100 million developers by making game creation accessible without traditional programming languages like C/C++/C# @OfficialLoganK
Demis Hassabis suggests Google is positioning itself as a game publishing house for the public, potentially running AAA games on Google's platform with subscription model @AndrewCurran_
Truemed raises $34 million Series A led by a16z to shift healthcare spending toward prevention, enabling consumers to use HSA and FSA dollars on evidence-based lifestyle interventions rather than treating chronic conditions after illness @a16z
Amazon reportedly investing up to $10 billion in OpenAI, raising questions about how to define real revenue in circular deals, where investment money flows back to the investor through purchases of its own products @TechCrunch

AI Ethics & Society

Demis Hassabis challenges Yann LeCun's claim that general intelligence doesn't exist, arguing LeCun confuses general intelligence with universal intelligence, and that human brains and AI foundation models are approximate Turing Machines capable of learning anything computable given sufficient time, memory, and data @demishassabis
Francois Chollet warns that the goal of AI should be to expand human thought and agency, not replace it, citing Dune's 1965 warning about turning thinking over to machines @fchollet
Journal editors lack consensus on how to adjust peer review for the flood of AI-written papers: bad papers now look like good ones, which makes reviewing harder and requires a second read to assess quality @emollick
Simon Willison uses Claude browser agent to navigate the Cloudflare control panel, his first successful experience using a browser agent to solve a real problem @simonw

AI Applications

Meta's Segment Anything Models advance flood monitoring and disaster response, with USRA and USGS fine-tuning SAM to automate river mapping for faster, scalable, and cost-effective disaster preparedness @AIatMeta
Apple's Live Translate enables a 30-minute conversation between users with a language barrier, though accuracy issues persist with complex ideas and fast speech in languages like Chinese @brian_lovin
Developer successfully uses AI agent to launch overnight run after exhausting manual debugging attempts, demonstrating practical automation of complex development tasks @aidan_mclau
Gemini successfully builds interactive simulation explaining collider bias from a single prompt, working on first attempt with Canvas enabled @emollick
NotebookLM introduces Data Tables feature powered by Google DeepMind research on data curation, helping users structure complex information and export to Google Sheets @lindsaywillmore
OpenAI launches "Your Year with ChatGPT" personalized review feature, rolling out to users in US, UK, Canada, New Zealand, and Australia with chat history enabled @OpenAI
Splat's app uses AI to transform photos into coloring pages for children @TechCrunch
Developer builds robot that can see, hear, and move using Claude Code for heavy lifting in robotics debugging, with both apps reaching official app store @BioInfo

AI Research

Ethan Mollick analyzes correlations between METR long-task measurement and other key benchmarks using GPT-5.2 Pro, finding high correlations across all benchmarks including ARC-AGI, suggesting either that all benchmarks measure the same thing or that AI improves uniformly across all measures @emollick
Francois Chollet describes LLMs as representing the "library" phase of AI, with the next "scientist" phase focusing on finding answers that don't exist yet through algorithmic processes similar to science @fchollet
Physical Intelligence demonstrates fine-tuned robots successfully performing tasks including washing pans, cleaning windows, and making peanut butter sandwiches, with implications for Moravec's paradox and large models in embodied AI @physical_int
Research suggests reinforcement learning can learn new capabilities beyond base model knowledge as long as entropy collapse is avoided, contrary to early pass@k experiments that suggested RL only sharpens existing knowledge @ChenSun92
Researchers demonstrate transformers' potential for economic modeling beyond LLMs, fitting a transformer to data simulated from a New Keynesian (NK) model with successful out-of-sample performance @alexolegimas
Midjourney focuses on tools for guidance, curation, and creating variation among options rather than instruction following from text, emphasizing experimentation and refinement in image generation @emollick
Ethan Mollick argues that high-quality image generators like Nano Banana Pro unlock new AI abilities including research and compelling slide generation, highlighting importance of addressing bottlenecks @emollick
Context window limits and compaction identified as a critical unsolved problem requiring resolution in 2026 @Suhail
Robot Olympics proposed as method to regularize hype, with participants facing unknown environments and tasks to test generalization capabilities, addressing current robots' failure at generalization despite successful fine-tuning @Suhail