AI Updates on 2025-12-12

AI Model Announcements

OpenAI releases GPT-5.2 with a knowledge cutoff of August 2025, priced at 1.4x the cost of GPT-5.1, showing significant improvements in long-context handling and needle-in-a-haystack tasks @simonw
GPT-5.2 Pro (X-High) achieves 90.5% on ARC-AGI-1 at $11.64/task, representing a 390x efficiency improvement over an unreleased o3 (High) version from a year ago that scored 88% at $4.5k/task @simonw
Ai2 releases Olmo 3.1 with 32B Think and 32B Instruct models, extending their RL run for three additional weeks and achieving continued performance improvements on AIME and coding benchmarks at approximately $250K total cost @natolambert
Google releases updated Gemini 2.5 Flash Native Audio model with improvements to handle complex workflows, navigate user instructions, and hold natural conversations @GoogleAI
Gemini 2.5 Flash and 2.5 Pro Text-to-Speech preview models bring improved adherence to style prompts, precision pacing with context-aware speed adjustments, and character voice consistency for multi-speaker scenarios @GoogleAI
Moonshot AI's Kimi K2 Thinking model is now available on the Tinker platform, with extensive search capabilities @AndrewCurran_
ByteDance releases Dolphin-v2, a 3B document parsing model with MIT license that works on PDFs, scans, and photos, understanding 21 types of content with pixel-level precision @AdinaYakup
OpenAI releases circuit-sparsity model on Hugging Face @_akhaliq
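
As a quick sanity check on the ARC-AGI-1 efficiency figure quoted above, the ratio can be recomputed from the two per-task costs. This is a minimal sketch: the function name is hypothetical, and the $4.5k figure is rounded in the source, so the result only approximates the reported 390x.

```python
def cost_ratio(old_cost_per_task: float, new_cost_per_task: float) -> float:
    """Return how many times cheaper the new per-task cost is than the old one."""
    return old_cost_per_task / new_cost_per_task


# Figures as quoted in the item above; 4500.0 is the rounded "$4.5k/task".
O3_HIGH_COST = 4500.0
GPT_5_2_PRO_COST = 11.64

ratio = cost_ratio(O3_HIGH_COST, GPT_5_2_PRO_COST)
print(f"Approximate cost reduction: {ratio:.1f}x")
```

With the rounded inputs the ratio comes out slightly under 390, which is consistent with the reported figure having been computed from an unrounded o3 cost.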

AI Industry Analysis

Anthropic revealed as Broadcom's mystery $10 billion customer from September, with an additional $11 billion order placed for AI infrastructure @AndrewCurran_
OpenAI announces collaboration with BBVA to expand ChatGPT Enterprise deployment to 120,000 employees, supporting BBVA's shift toward AI-native banking @gdb
OpenAI CEO Sam Altman indicates enterprise AI will be a massive priority for OpenAI in 2026, signaling a major strategic shift @gdb
Pinterest CEO reports taking open source models, fine-tuning them, and achieving similar performance to the best proprietary models at less than 10% of the cost @jeffboudier
NVIDIA considers increasing H200 chip output due to robust China demand despite export restrictions @AndrewCurran_
Ethan Mollick expresses certainty that even if AI development stopped today, society would experience massive rolling disruption for the next ten years as people figure out how to harness existing model capabilities @emollick
Industry observers note potential for model fatigue with LLMs similar to app install fatigue with mobile apps, where even superior products struggle to gain adoption @GergelyOrosz
Analysis suggests the industry has reached the peak of proprietary APIs and is entering a more balanced world where open-source models, training, and alternative platforms will gain a larger share of attention, usage, and revenue @ClementDelangue
Satirical post highlights enterprise AI adoption challenges, describing a $1.4M Microsoft Copilot deployment with minimal actual usage but successful metrics reporting for board presentations @gothburz

AI Ethics & Society

President Trump signs National Policy Framework for Artificial Intelligence executive order declaring the US must have one minimally burdensome national standard for AI rather than 50 discordant state laws @AndrewCurran_
The executive order includes tools such as a DOJ litigation task force, withholding federal funds from states with onerous AI laws, FTC efforts to curb state attempts to force AI models to alter truthful outputs, and FCC efforts to curb disclosure requirements @AndrewCurran_
YouTube announces AI-based age verification system using Gemini to automatically determine user age by analyzing viewing patterns, with users incorrectly estimated as under 18 required to verify via credit card or government ID @AndrewCurran_
Princeton researcher Arvind Narayanan publishes paper arguing that algorithmic fairness is a category error, advocating for studying entire sociotechnical systems rather than just technical subsystems when designing algorithmic bureaucracies @random_walker
Analysis suggests that if individuals have short timelines to transformative AI and believe some human values are fundamentally irreconcilable, ensuring the winning model enshrines their ethical framework will increasingly feel like the most important thing in the world @AndrewCurran_

AI Applications

Perplexity's Comet for Android demonstrates the ability to debug code from a phone by analyzing CI logs, tracing failures, working out fixes, and opening ready-to-merge pull requests @AravSrinivas
ChatGPT now includes a /home/oai/skills folder with skill definitions for PDFs, docs, and spreadsheets, with experimental support also added to Codex CLI @simonw
Google Translate rolls out Gemini-powered live speech-to-speech translation in beta, bringing real-time audio translation that captures the nuance of human speech @TechCrunch
Adobe launches free ChatGPT-integrated apps for Photoshop, Acrobat, and Express on desktop, web, and iOS, allowing users to access Adobe apps directly from within ChatGPT @gdb
OpenAI announces a partnership with Disney to bring Disney characters to Sora and image generation, enabling users to generate content with Disney IP @sama
Microsoft announces MahaCrimeOS AI collaboration with Maharashtra to support victims of cybercrime and financial fraud @satyanadella
Moonlake introduces Reverie, a real-time programmable diffusion model trained for games, capable of conditioning beyond pixels and allowing gameplay to be restyled to any aesthetic while maintaining game mechanics @chrmanning
User reports GPT-5.2 provides impressive long-context analysis of game scripts, picking up subtle details and offering interpretations comparable to someone who played the game deeply, with almost no hallucinations @AndrewCurran_
Kimi K2 demonstrates extensive search behavior during reasoning, repeatedly searching to support claims, look at counterexamples, and verify information before providing final answers @AndrewCurran_

AI Research

Ai2's Olmo 3.1 32B Think demonstrates that RL scaling can continue far beyond initial expectations, with performance increasing over 125K H100 hours at approximately $250K cost, comparable to DeepSeek R1's resource usage @natolambert
Research introduces Fast Flow Joint Distillation (F2D2), cutting NFEs for both sampling and likelihood evaluation by two orders of magnitude in flow-based models while preserving sample quality @rsalakhu
Google DeepMind presents research on evaluating Gemini Robotics Policies in a Veo World Simulator, introducing a generalist evaluator for testing robot safety without breaking physical objects @Majumdar_Ani
Francois Chollet argues AI will evolve from automation machine to invention machine, requiring a fundamentally new paradigm with symbolic search as its core rather than curve-fitting @fchollet
Chollet explains that fluid intelligence measured by ARC is distinct from exploration, goal-setting, and planning capabilities needed for autonomous agents, with exploration being the hardest and planning the easiest among these open problems @fchollet
Starcloud-1 trains the first LLM in space on an NVIDIA H100 and is also the first to run a version of Google's Gemini in space, using highly efficient open source Gemma models @demishassabis
New text embedding methodology uses a tiny ReLU network to approximate a large transformer from lexical features, achieving fast CPU-only performance for document similarity, clustering, and classification @lukemerrick_
LLM project trains a model on 90GB of text drawn only from the 1800s and earlier, aiming for a language model with no modern bias contamination that serves as a time capsule @Teknium
OpenAI's London Training team reports remarkable internal impact alongside San Francisco colleagues, with contributions now landing in production @gdb
Sebastien Bubeck notes OpenAI has cracked pretraining and reasoning, now experimenting with new techniques that maximally leverage their interaction, with GPT-5 being just the first step @SebastienBubeck
Anthropic Fellows Program expands for 2026 with two rounds beginning in May and July, providing funding, compute, and mentorship for four-month safety and security research projects, with 40% of first cohort joining Anthropic full-time @AnthropicAI
llama.cpp now features Ollama-style model management with auto-discovery of GGUFs from cache, load on first request, per-model processes, and OpenAI-compatible API routing @victormustar
Continuous batching in transformers achieves 10-14.5% throughput gains across 500 requests through optimizations such as eliminating torch sync and moving more operations onto the GPU @remi_or_
PyTorch Foundation welcomes NeuralOperator, a PyTorch-native library for learning neural operators and modeling mappings between function spaces for AI-driven science and engineering @PyTorch
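
The Olmo 3.1 figures reported earlier in this section (125K H100 hours at approximately $250K) imply an effective hourly rate that can be worked out directly. This is a rough sketch using only the two rounded numbers from that item; the function name is hypothetical, and the actual pricing behind the run is not stated in the source.

```python
def implied_rate_per_gpu_hour(total_cost_usd: float, gpu_hours: float) -> float:
    """Return the effective cost in USD per GPU hour implied by a total budget."""
    return total_cost_usd / gpu_hours


# Rounded figures as quoted in the Olmo 3.1 32B Think item.
TOTAL_COST_USD = 250_000.0
H100_HOURS = 125_000.0

rate = implied_rate_per_gpu_hour(TOTAL_COST_USD, H100_HOURS)
print(f"Implied rate: ${rate:.2f} per H100 hour")
```

Because both inputs are approximate, the result should be read as an order-of-magnitude estimate rather than a quoted price.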