AI Updates on 2025-12-17
AI Model Announcements
- Google DeepMind releases Gemini 3 Flash, combining Pro-grade reasoning with Flash-level latency and efficiency at $0.50 input/$3.00 output per million tokens, outperforming Gemini 2.5 Pro across most benchmarks while being 3x faster @GoogleDeepMind
- Gemini 3 Flash achieves 84.7% on ARC-AGI-1 and 33.6% on ARC-AGI-2 at substantially lower cost than other frontier models, representing a new score/cost Pareto frontier @arcprize
- Gemini 3 Flash scores 71 on the Artificial Analysis Intelligence Index, a 13-point improvement over Gemini 2.5 Flash, making it the most intelligent model in its price range despite using 160M tokens to run the index (more than double 2.5 Flash) @ArtificialAnlys
- Gemini 3 Flash ranks #3 on the LMArena leaderboard and in the top 5 across Text, Vision, and WebDev categories, making it the most cost-efficient frontier model @arena
- Gemini 3 Flash achieves state-of-the-art performance on SWE-bench Verified, outperforming both the 2.5 series and Gemini 3 Pro in coding tasks @GoogleDeepMind
- Gemini 3 Flash scores 161.8/190 on the Korean Sator Square Test, placing it 2nd or 3rd among all tested models, with a 60-point improvement over Gemini 2.5 Flash reasoning @Hangsiin
- xAI launches Grok Voice Agent API, ranking #1 on Big Bench Audio with 92.3% accuracy and running nearly 5x faster than the closest competitor, at a flat rate of $0.05 per minute @xai
- OpenAI releases ChatGPT Images powered by GPT Image 1.5, featuring stronger instruction following, precise editing, detail preservation, and 4x faster generation; it is now at the top of the Image Arena leaderboard @OpenAI
- GPT-5 Pro ranks as the best reasoning model of 2025 according to Scale AI's SEAL leaderboards, excelling at answering complicated questions and solving multi-step problems @scale_AI
- GPT-5.2-xhigh shows significant qualitative improvements in Codex, representing a major jump in coding capabilities @jam3scampbell
- Microsoft releases TRELLIS 2, a 4B parameter flow-matching transformer that converts single images to textured 3D meshes at up to 1536³ resolution with open weights under MIT license @_akhaliq
- Browser Use releases BU-30B-A3B-Preview open source model with 30B parameters and 3B active, achieving state-of-the-art quality for web agents at real-time speed, enabling hundreds of browser tasks on $1 of compute @gregpr07
- Apple releases Sharp, a model that turns images into 3D splats, and joins Hugging Face Enterprise, with 150+ models, datasets, and applications shared on the platform @jeffboudier
AI Industry Analysis
- Amazon announces major AI leadership changes: Peter DeSantis will lead new Amazon AI organization including AGI team, silicon development and quantum computing, while current AI chief Rohit Prasad departs; Pieter Abbeel named new AGI Head @haydenfield
- Amazon reportedly in talks to invest $10B in OpenAI as circular deals between tech companies remain popular @TechCrunch
- Coursera and Udemy enter merger agreement valued at around $2.5B @TechCrunch
- GitHub faces developer backlash over plan to charge for self-hosted GitHub Actions runners, later postponing the billing change to re-evaluate approach after community feedback @github
- GitHub operates without a CEO: Microsoft never backfilled Thomas Dohmke's role, and GitHub now reports into the "CoreAI" group, raising concerns that it is losing touch with the developer community @GergelyOrosz
- Warsaw emerges as major European engineering hub with offices from OpenAI, Mistral AI, ElevenLabs, Google, NVIDIA, Netflix, Meta, and other top tech companies @michuk
- Perplexity launches native iPad app optimized for iPadOS, designed for real work with desktop features including multitasking support via Stage Manager @perplexity_ai
- Cursor adds Gemini 3 Flash to its platform, finding it works well for quickly investigating bugs @cursor_ai
- Figma integrates Gemini 3 Flash into Figma Make, offering exceptionally quick results with most prompts returning in 30-60 seconds @figma
- Monzo's board reportedly pushed out CEO TS Anil over IPO timing disagreements @TechCrunch
- Rad Power Bikes files for bankruptcy and seeks to sell the business @TechCrunch
- Meta pauses its plan to share Quest's Horizon OS with third-party headset makers @TechCrunch
- YouTube will stream the Oscars exclusively beginning in 2029 @TechCrunch
- Yann LeCun to leave Meta at end of year to launch startup focused on world models, AI systems that learn by observing and simulating physical environments @NYUDataScience
AI Applications
- 67% of doctors use AI daily, 84% say it makes them better doctors, and 42% say it makes them more inclined to stay in medicine, with primary use cases being administrative tasks and research assistance @emollick
- GPT-5 is evaluated on optimizing wet lab experiments, demonstrating an ability to improve experimental protocols; the work includes an autonomous robot pilot that executes Gibson cloning protocols from natural language @MilesKWang
- Linear's Product Intelligence has logged 350k accepted suggestions and assigned 26k issues in recent months, helping teams find duplicates, add attributes, and route issues to the right person @karrisaarinen
- Leona raises $14M seed round led by a16z to build an AI-native operating system for healthcare providers that is integrated into WhatsApp, processing millions of patient interactions across Latin America @Leona_health
- Fisia (Nike's Brazil distributor) achieved 150% more in-store conversions, a 45% jump in average order size, and 128% ROI using NVIDIA-powered virtual try-on technology @NVIDIAAI
- Researchers from MIT developed speech-to-reality system combining generative AI with robotic assembly to create physical objects including furniture and decor in minutes @medialab
- World Labs' Marble enables researchers to generate simulation-ready robotics environments that integrate with NVIDIA Isaac Sim for training and evaluation without manual setup @theworldlabs
- Arcway launches real-time 3D engine where anyone can design homes, allowing buyers to explore, change materials, furnish spaces, and visualize construction projects @calebarclay
AI Research
- Meta research introduces Parallel-Distill-Refine (PDR) framework showing that strategic parallelism and distillation can beat brute-force sequence extension, achieving 93.3% accuracy on AIME 2024 versus 79.4% for standard long chain-of-thought at matched latency @prfsanjeevarora
- Physical Intelligence discovers emergent property in VLAs (π0/π0.5/π0.6): as pre-training scales up, models learn to align human videos and robot data, enabling natural learning from human video once robot control is established @physical_int
- Berkeley researchers demonstrate that LLMs can learn a general skill for evading activation monitors that transfers zero-shot to unseen deception/harmfulness monitors; they call these models Neural Chameleons @sertealex
- AugE-Toolkit released as open-source package for augmenting robot embodiments, converting demo data between different robot arms/grippers; OXE-AugE dataset provides over 2M new trajectories, tripling original dataset size @Lawrence_Y_Chen
- MIT Camera Culture group built a virtual petri dish, a computational framework in which digital creatures evolve over millions of simulated years and develop eyes optimized for specialized roles @medialab
- Training on tough benchmarks like SWE-bench leads to better results on other benchmarks as well, according to Xiaomi MiMo paper findings @OfirPress
- OLMo 3 paper released on arXiv after November launch, demonstrating benefits of open science in progressing AI research together @kylelostat
AI Ethics & Society
- Senator Bernie Sanders proposes moratorium on data center construction powering AI development, arguing democracy needs time to catch up and ensure technology benefits all citizens, not just the 1% @SenSanders
- Judge rules Tesla engaged in deceptive marketing for Autopilot and Full Self-Driving features @TechCrunch
- The lack of reliable measures of human error rates on intellectually demanding tasks makes it hard to know which AI hallucination thresholds, once crossed, could lead to sudden leaps in usefulness and adoption @emollick
- Ethan Mollick shows that rapid gains in AI ability at ever-decreasing costs continue with no sign of ending, though the GPQA Diamond benchmark is likely close to being maxed out @emollick
- Francois Chollet argues that general intelligence exists as a collective human capability, with Science as an intelligent system able to solve any solvable problem given appropriate resources, and that digital general intelligence is achievable @fchollet
- Debate emerges around AG