AI Updates on 2025-12-12

AI Model Announcements

  • OpenAI releases GPT-5.2 with a knowledge cutoff of August 2025, priced at 1.4x GPT-5.1, showing significant improvements in long-context handling and needle-in-haystack tasks @simonw
  • GPT-5.2 Pro (X-High) achieves 90.5% on ARC-AGI-1 at $11.64/task, representing a 390x efficiency improvement over an unreleased o3 (High) version from a year ago that scored 88% at $4.5k/task @simonw
  • Ai2 releases Olmo 3.1 with 32B Think and 32B Instruct models, extending their RL run for three additional weeks and achieving continued performance improvements on AIME and coding benchmarks at approximately $250K total cost @natolambert
  • Google releases updated Gemini 2.5 Flash Native Audio model with improvements to handle complex workflows, navigate user instructions, and hold natural conversations @GoogleAI
  • Gemini 2.5 Flash and 2.5 Pro Text-to-Speech preview models bring improved adherence to style prompts, precision pacing with context-aware speed adjustments, and character voice consistency for multi-speaker scenarios @GoogleAI
  • Moonshot AI releases Kimi K2 Thinking model, now available on the Tinker platform with extensive search capabilities @AndrewCurran_
  • ByteDance releases Dolphin-v2, a 3B document parsing model under the MIT license that works on PDFs, scans, and photos, understanding 21 types of content with pixel-level precision @AdinaYakup
  • OpenAI releases circuit-sparsity model on Hugging Face @_akhaliq
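
The 390x efficiency figure in the ARC-AGI-1 item above can be sanity-checked from the two per-task costs it quotes. The sketch below is only arithmetic on those quoted numbers; the constant and function names are illustrative, and it assumes "$4.5k" means exactly $4,500.

```python
# Per-task costs as quoted in the ARC-AGI-1 bullet; not independently measured.
O3_HIGH_COST_PER_TASK = 4500.00      # "$4.5k/task" (assumed to mean exactly $4,500)
GPT_5_2_PRO_COST_PER_TASK = 11.64    # "$11.64/task"


def cost_ratio(old_cost: float, new_cost: float) -> float:
    """Return how many times cheaper new_cost is than old_cost."""
    return old_cost / new_cost


ratio = cost_ratio(O3_HIGH_COST_PER_TASK, GPT_5_2_PRO_COST_PER_TASK)
print(f"{ratio:.1f}x cheaper per task")
```

The ratio comes out a little under 390, so the headline number appears to be rounded. It compares cost per task only; the two scores being compared (88% and 90.5%) also differ.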

AI Industry Analysis

  • Anthropic revealed as Broadcom's mystery $10 billion customer from September, with an additional $11 billion order placed for AI infrastructure @AndrewCurran_
  • OpenAI announces collaboration with BBVA to expand ChatGPT Enterprise deployment to 120,000 employees, supporting BBVA's shift toward AI-native banking @gdb
  • OpenAI CEO Sam Altman indicates enterprise AI will be a massive priority for OpenAI in 2026, signaling a major strategic shift @gdb
  • Pinterest CEO reports taking open source models, fine-tuning them, and achieving similar performance to the best proprietary models at less than 10% of the cost @jeffboudier
  • NVIDIA considers increasing H200 chip output due to robust China demand despite export restrictions @AndrewCurran_
  • Ethan Mollick expresses certainty that even if AI development stopped today, society would experience massive rolling disruption for the next ten years as people figure out how to harness existing model capabilities @emollick
  • Industry observers note potential for model fatigue with LLMs similar to app install fatigue with mobile apps, where even superior products struggle to gain adoption @GergelyOrosz
  • Analysis suggests the industry has reached the peak of proprietary APIs and is entering a more balanced world where open-source, training, and alternative platforms will gain a larger share of attention, usage, and revenue @ClementDelangue
  • Satirical post highlights enterprise AI adoption challenges, describing a $1.4M Microsoft Copilot deployment with minimal actual usage but successful metrics reporting for board presentations @gothburz

AI Ethics & Society

  • President Trump signs National Policy Framework for Artificial Intelligence executive order declaring the US must have one minimally burdensome national standard for AI rather than 50 discordant state laws @AndrewCurran_
  • The executive order includes tools such as a DOJ litigation task force, withholding federal funds from states with onerous AI laws, FTC efforts to curb state attempts to force AI models to alter truthful outputs, and FCC efforts to curb disclosure requirements @AndrewCurran_
  • YouTube announces AI-based age verification system using Gemini to automatically determine user age by analyzing viewing patterns, with users incorrectly estimated as under 18 required to verify via credit card or government ID @AndrewCurran_
  • Princeton researcher Arvind Narayanan publishes paper arguing that algorithmic fairness is a category error, advocating for studying entire sociotechnical systems rather than just technical subsystems when designing algorithmic bureaucracies @random_walker
  • Analysis suggests that if individuals have short timelines to transformative AI and believe some human values are fundamentally irreconcilable, ensuring the winning model enshrines their ethical framework will increasingly feel like the most important thing in the world @AndrewCurran_

AI Applications

  • Perplexity's Comet Android demonstrates ability to debug code from a phone by analyzing CI logs, tracing failures, figuring out fixes, and opening ready-to-merge pull requests @AravSrinivas
  • ChatGPT now includes a /home/oai/skills folder with skill definitions for PDFs, docs, and spreadsheets, with experimental support also added to Codex CLI @simonw
  • Google Translate rolls out Gemini-powered live speech-to-speech translation in beta, bringing real-time audio translation that captures the nuance of human speech @TechCrunch
  • Adobe launches free ChatGPT-integrated apps for Photoshop, Acrobat, and Express on desktop, web, and iOS, allowing users to access Adobe apps directly from within ChatGPT @gdb
  • OpenAI announces partnership with Disney to bring Sora and image generation capabilities for Disney characters, enabling users to generate content with Disney IP @sama
  • Microsoft announces MahaCrimeOS AI collaboration with Maharashtra to support victims of cybercrime and financial fraud @satyanadella
  • Moonlake introduces Reverie, a real-time programmable diffusion model trained for games, capable of conditioning beyond pixels and allowing gameplay to be restyled to any aesthetic while maintaining game mechanics @chrmanning
  • User reports GPT-5.2 provides impressive long-context analysis of game scripts, picking up subtle details and offering interpretations comparable to someone who played the game deeply, with almost no hallucinations @AndrewCurran_
  • Kimi K2 demonstrates extensive search behavior during reasoning, repeatedly searching to support claims, look at counterexamples, and verify information before providing final answers @AndrewCurran_

AI Research

  • Ai2's Olmo 3.1 32B Think demonstrates that RL scaling can continue far beyond initial expectations, with performance increasing over 125K H100 hours at approximately $250K cost, comparable to DeepSeek R1's resource usage @natolambert
  • Research introduces Fast Flow Joint Distillation (F2D2), cutting NFEs for both sampling and likelihood evaluation by two orders of magnitude in flow-based models while preserving sample quality @rsalakhu
  • Google DeepMind presents research on evaluating Gemini Robotics Policies in a Veo World Simulator, introducing a generalist evaluator for testing robot safety without breaking physical objects @Majumdar_Ani
  • Francois Chollet argues AI will evolve from automation machine to invention machine, requiring a fundamentally new paradigm with symbolic search as its core rather than curve-fitting @fchollet
  • Chollet explains that fluid intelligence measured by ARC is distinct from exploration, goal-setting, and planning capabilities needed for autonomous agents, with exploration being the hardest and planning the easiest among these open problems @fchollet
  • Starcloud-1 trains the first LLM in space on an NVIDIA H100 and is also the first to run a version of Google's Gemini in space, using highly efficient open source Gemma models @demishassabis
  • New text embedding methodology released, using a tiny ReLU network to approximate a large transformer from lexical features, achieving fast CPU-only performance for document similarity, clustering, and classification @lukemerrick_
  • Unique LLM project trains a model on 90GB of text exclusively from the 1800s and earlier to create a language model with zero modern bias contamination, serving as a true time capsule @Teknium
  • OpenAI's London Training team reports remarkable internal impact alongside San Francisco colleagues, with contributions now landing in production @gdb
  • Sebastien Bubeck notes OpenAI has cracked pretraining and reasoning, now experimenting with new techniques that maximally leverage their interaction, with GPT-5 being just the first step @SebastienBubeck
  • Anthropic Fellows Program expands for 2026 with two rounds beginning in May and July, providing funding, compute, and mentorship for four-month safety and security research projects, with 40% of first cohort joining Anthropic full-time @AnthropicAI
  • llama.cpp now features Ollama-style model management with auto-discovery of GGUFs from cache, load on first request, per-model processes, and OpenAI-compatible API routing @victormustar
  • Continuous batching in transformers achieves 10-14.5% throughput gains across 500 requests through optimizations such as eliminating torch sync and moving more operations to the GPU @remi_or_
  • PyTorch Foundation welcomes NeuralOperator, a PyTorch-native library for learning neural operators and modeling mappings between function spaces for AI-driven science and engineering @PyTorch
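
The Olmo 3.1 item above quotes both a compute budget (125K H100 hours) and an approximate cost ($250K), which together imply a per-GPU-hour rate. The sketch below derives that rate from the two quoted figures only; the names are illustrative, and the result is an implied average, not a price reported by Ai2.

```python
# Figures as quoted in the Olmo 3.1 research bullet; both are approximate.
H100_HOURS = 125_000          # "125K H100 hours"
TOTAL_COST_USD = 250_000      # "approximately $250K cost"


def implied_hourly_rate(total_cost: float, gpu_hours: float) -> float:
    """Return the average cost per GPU-hour implied by a total cost and hour count."""
    return total_cost / gpu_hours


rate = implied_hourly_rate(TOTAL_COST_USD, H100_HOURS)
print(f"Implied rate: ${rate:.2f} per H100-hour")
```

Since both inputs are rounded, the implied rate of about $2 per H100-hour should be read as an order-of-magnitude check rather than an exact figure.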