AI Updates on 2025-12-11

AI Model Announcements

OpenAI releases GPT-5.2, described as the smartest generally-available model in the world, particularly strong at real-world knowledge work tasks including spreadsheets, presentations, and coding. The model comes in three variants: GPT-5.2 Instant for everyday work, GPT-5.2 Thinking for complex reasoning and long-context tasks, and GPT-5.2 Pro for difficult questions and scientific work @OpenAI
GPT-5.2 achieves 55.6% on SWE-Bench Pro, 52.9% on ARC-AGI-2, and 40.3% on FrontierMath, with a 70.9% win/tie rate against industry experts on the GDPval benchmark, which measures knowledge work tasks across 44 occupations @sama
GPT-5.2 Pro achieves a state-of-the-art 90.5% score on ARC-AGI-1 at $11.64 per task, roughly a 390x efficiency improvement over last year's o3 preview, which scored 88% at $4,500 per task @arcprize
Alibaba announces Qwen Learn Mode powered by Qwen3-Max, featuring Socratic-style dialogue and adaptive learning paths grounded in cognitive psychology @Alibaba_Qwen
Cohere launches Rerank 4 with two versions (Fast and Pro), featuring the largest context window in their Rerank series, self-learning capabilities without annotated data, and support for over 100 languages with state-of-the-art retrieval in 10 major business languages @cohere
Google introduces Gemini Deep Research agent for developers, built on Gemini 3 Pro and trained using multi-step reinforcement learning to autonomously navigate the web and produce detailed reports with citations. It achieves state-of-the-art performance on the DeepSearchQA benchmark and the highest score yet on BrowseComp @GoogleDeepMind
Google updates Gemini TTS models with richer tone versatility, stricter adherence to style prompts, smarter context-aware speed adjustments, and consistent character voices in multi-speaker scenarios @OfficialLoganK
Mistral AI announces Devstral 2 is #1 trending on OpenRouter and teases another model drop coming in a few days @MistralAI
Google announces Gemini integration with Google Maps, serving up local results in a rich visual format with photos, ratings, and real-world information @GeminiApp
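
As a quick sanity check on the ARC-AGI-1 item above, the quoted "390x" figure can be reproduced from the two per-task costs in that item. This is a minimal sketch: the variable names are illustrative, and it assumes "efficiency" here means cost per task only, ignoring the score difference (88% vs 90.5%).

```python
# Per-task costs as quoted in the ARC-AGI-1 item above (USD).
o3_preview_cost_per_task = 4500.00
gpt52_pro_cost_per_task = 11.64

# Assumption: "efficiency improvement" is the ratio of old cost to new cost.
cost_ratio = o3_preview_cost_per_task / gpt52_pro_cost_per_task

# Round to the nearest ten to compare with the headline figure.
rounded_ratio = round(cost_ratio, -1)

print(f"Cost ratio: {cost_ratio:.1f}x (about {rounded_ratio:.0f}x)")
```

The exact ratio comes out a little under 390, so the headline number is a rounded figure rather than an exact one.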

AI Industry Analysis

VC fundraising has dropped 75% from its 2022 peak to approximately $45B in Q3 2025, returning to levels last seen 8 years ago, while capital deployment remains high at ~$330B over the last 4 quarters. The growing gap between funds deployed and funds raised suggests it will become significantly harder for startups to find capital @deedydas
For the first time in history, over one-third of startups founded in 2025 were started by solo founders @julianweisser
Perplexity announces adoption by law firm Gunderson Dettmer for legal services, highlighting lawyers' need for accurate AI that can pull references reliably @AravSrinivas
Disney signs a three-year licensing deal with OpenAI allowing Sora to generate AI videos featuring 200 of its characters, with exclusivity for the first year. Disney will set guardrails for character usage and curate videos for Disney+ @TechCrunch
Harness raises $240M at $5.5B valuation to automate AI's "after-code" gap in software delivery @TechCrunch
Runware raises $50M Series A to help make image and video generation easier for developers @TechCrunch
Port raises $100M at $800M valuation to compete with Spotify's Backstage for developer portals @TechCrunch
Opera launches Neon, an AI-powered browser priced at $20 per month @TechCrunch
Worktrace raises $9M seed round led by 8VC to help businesses uncover automation opportunities, founded by former OpenAI product manager Angela Jiang and UIUC CS professor Deepak Vasisht @worktrace_ai
Vybe raises $10M seed round led by First Round to enable vibe-coding for internal business applications with production data integration @qhoang09
Oboe raises $16M Series A led by a16z for its personalized learning platform @NirZicherman
Unconventional AI raises $475M seed round co-led by a16z to develop highly efficient AI-first chips using analog computing approaches inspired by biological brains @a16z
Hugging Face announces text-generation-inference is now in maintenance mode, recommending users migrate to vLLM, SGLang, llama.cpp or MLX for optimized inference @LysandreJik
Cursor introduces visual design editing directly in codebase, allowing users to select elements, modify them visually, and have Cursor write the code, aiming to bridge design and engineering workflows @cursor_ai
Runway releases its first world model and adds native audio to latest video model @TechCrunch
Rivian announces major autonomy push with custom silicon, lidar, and hints at robotaxis, with AI assistant coming to EVs in early 2026 @TechCrunch

AI Ethics & Society

Ethan Mollick demonstrates GPT-5.2 Pro creating visually complex shader code in a single shot, highlighting the difficulty of distinguishing AI-generated content from human-created work @emollick
OpenAI announces investment in cybersecurity preparedness as models grow more capable, working with global experts to strengthen safeguards and give defenders an advantage @OpenAI
Disney issues cease-and-desist to Google claiming massive copyright infringement @TechCrunch
TIME names "Architects of AI" as 2025 Person of the Year, including Fei-Fei Li, recognizing AI's transformational impact on humanity @drfeifei
xAI partners with El Salvador to bring personalized Grok tutoring to over 1 million public school students, creating the world's first nationwide AI tutor program @xai
Anthropic announces Model Context Protocol (MCP) is now part of the Agentic AI Foundation under the Linux Foundation, with OpenAI, Anthropic, and Block as co-founders @AnthropicAI
ICML 2026 announces new policy allowing reviewers and authors to choose between conservative or permissive LLM use, with matching based on preferences @icmlconf
Ethan Mollick notes that open weights AI models lack the same economics as open source software, with no clear path to capture value despite increasing model costs, raising questions about sustainability @emollick
Stanford researchers find that 1 in 20 AI benchmarks have serious flaws, meaning the industry has been promoting underperforming models and penalizing better ones due to broken evaluation methods @StanfordHAI

AI Applications

Linear introduces AI agent integration with Intercom, Zendesk, Gong, and Slack Workflows, enabling automatic issue creation from customer calls and tickets with a single click @karrisaarinen
Google debuts Disco, a Gemini-powered tool for making web apps from browser tabs @TechCrunch
Google launches AI try-on feature for clothes that works with just a selfie @TechCrunch
Andrew Ng shares recipe for building highly autonomous agents using open source aisuite package, allowing frontier LLMs to use tools like disk access and web search for complex tasks, though noting most practical agents need more scaffolding @AndrewYNg
Simon Willison publishes comprehensive guide on patterns for vibe-coding single-file HTML tools, covering CORS-enabled APIs, localStorage, URL state management, and rich copy-paste functionality after creating 150 different tools @simonw
Microsoft Research introduces Agent Lightning, which decouples how agents work from how they're trained by turning each agent step into reinforcement learning data, enabling developers to improve agent performance with minimal code changes @MSFTResearch
Satya Nadella demonstrates a chain-of-debate app for deep research that uses multiple models and decision frameworks, and announces its integration into Copilot @satyanadella
Swiggy uses Microsoft Fabric to process billions of data points in near real-time for delivery innovations @satyanadella

AI Research

On the GDPval benchmark, which measures well-specified knowledge work tasks across 44 occupations, GPT-5.2 Thinking is the first model to perform at human expert level, with GPT-5.2 Pro winning 71% of head-to-head comparisons against human experts on tasks requiring 4-8 hours, as judged by other humans @emollick
Francois Chollet announces ARC 3 benchmark releasing in Q1 2026 to target exploration, goal-setting, and interactive planning as new bottlenecks beyond fluid intelligence. Notes that while ARC 1 is saturating, state-of-the-art models are not yet human-level on an efficiency basis, and ARC 2 remains largely unsaturated @fchollet
Mike Knoop estimates human efficiency for solving simple ARC v1 tasks is 10,000x higher than GPT-5.2 Pro on an energy basis, down from 1,000,000x compared to last year's o3 preview @mikeknoop
Google Deep