AI Updates on 2025-12-10

AI Model Announcements

Alibaba releases upgraded Qwen3-Omni-Flash (2025-12-01 version) with enhanced multi-turn video/audio understanding, customizable AI personality through system prompts, support for 119 text languages and 19 speech languages, and human-like voice quality @Alibaba_Qwen
Mistral releases Devstral 2 and Devstral Small 2 models with 123B and 24B parameters respectively, though with restrictive licensing that prohibits use by companies with over $20M in monthly revenue @simonw
Mistral doubles Vibe context limit from 100k to 200k tokens @MistralAI
Nous Research open sources Nomos 1, a 30B parameter model that scored 87/120 on the Putnam mathematics competition, a score that would have ranked #2 out of 3,988 participants in 2024 @NousResearch
StepFun introduces Parallel Coordinated Reasoning (PaCoRe), enabling an 8B model to achieve 94.5% on HMMT25 (beating GPT-5's 93.2%) and 78.2% on LiveCodeBench by scaling test-time compute to multi-million-token reasoning @StepFun_ai

AI Industry Analysis

Bloomberg reports Meta's superintelligence lab is using Google's Gemma, OpenAI's open source model, and Alibaba's Qwen to train its next large model, code-named Avocado, marking a potential shift away from its open source strategy @AndrewCurran_
ChatGPT becomes Apple's most downloaded app of 2025 in the US; separately, Pew Research finds 64% of US teens use AI chatbots and 33% use them daily @AndrewCurran_
Big Tech giants announce approximately $68B in India investments over the next 5 years, positioning India as the second-biggest revenue driver after the US for AI development @deedydas
Hugging Face now hosts over 2.2 million models with 50,000+ models having API providers, demonstrating rapid growth in open-source AI ecosystem @_akhaliq
Google launches sub-$5 AI Plus plan in India to compete with ChatGPT Go @TechCrunch
Oboe raises $16M Series A led by a16z for its AI-powered course generation platform that creates personalized learning experiences @TechCrunch
Cursor releases version 2.2 with Debug Mode that instruments code and streams runtime data to agents, plus Plan Mode improvements and multi-agent judging capabilities @cursor_ai

AI Ethics & Society

OpenAI announces upcoming models will reach 'High' capability under their Preparedness Framework for cybersecurity, requiring strengthened safeguards and collaboration with global experts to give defenders an advantage @OpenAI
Ethan Mollick warns that restrictive licensing on Mistral models (prohibiting use by companies with over $20M in monthly revenue) could limit open source contributions, as historically much of the labor comes from for-profit firms @emollick
Gergely Orosz observes LinkedIn aggressively pushing AI products everywhere, with AI-generated content flooding the platform and making inbound job applications mostly useless @GergelyOrosz
Brian Lovin reports that new X accounts are shown extremely low-quality AI-generated content, politically charged material, and bottom-of-the-barrel posts as their default feed @brian_lovin
Ethan Mollick notes the GPT-5 Auto router creates perception problems, as many examples of "ChatGPT got X wrong" are actually "ChatGPT-5 Instant got things wrong," leading to inaccurate beliefs about AI capabilities @emollick
John Carmack proposes using LLM chat history as job references, arguing multi-year chat histories provide better signals than traditional resumes and could optimize fit between people and jobs for both employers and employees @ID_AA_Carmack

AI Applications

Google partners with multiple publishers including Der Spiegel, The Guardian, The Times of India, and The Washington Post to test AI engagement features including audio briefings by Gemini in Google News @AndrewCurran_
Google launches managed MCP servers allowing AI agents to plug into its tools, plus Preferred Sources feature in Search for customizing Top Stories from valued outlets @TechCrunch
Figma launches AI-powered object removal and image extension tools in Design and Draw, enabling users to erase distractions, expand backgrounds, and isolate objects @figma
Mikhail Parakhin introduces SimGym, a system creating "digital customers" that behave like real ones to reveal optimization opportunities and enable A/B testing with zero live traffic @MParakhin
Ethan Mollick demonstrates Nano Banana Pro in NotebookLM can generate high-quality presentation decks from source materials with rare hallucinations, positioning it as a potential PowerPoint replacement @emollick
Andrej Karpathy creates an auto-grading system using the GPT-5.1 Thinking API to analyze 930 Hacker News discussions from December 2015 with hindsight, identifying the most prescient comments for $60 in 1 hour @karpathy
Linear reports their AI agent has been one of their most loved features, with a significant uptick in new issues created after launch @karrisaarinen
Satya Nadella highlights Microsoft's partnership with India's Labour Ministry using AI to connect over 300 million informal workers to better jobs and social security @satyanadella
CTGT launches Mentat, an OpenAI-compatible API using mechanistic interpretability to give enterprises deterministic control over LLM behavior, adding safety policy guarantees without retraining @CyrilGorlla
Spotify tests more personalized, AI-powered 'Prompted Playlists' feature @TechCrunch

AI Research

Google DeepMind and Google Research develop FACTS Benchmark Suite, the industry's first comprehensive test evaluating LLM factuality across four dimensions: internal model knowledge, web search, grounding, and multimodal inputs, with Gemini 3 Pro achieving top score of 68.8% @GoogleDeepMind
Google Cloud introduces AlphaEvolve, a Gemini-powered coding agent for designing advanced algorithms that uses LLMs to propose intelligent code modifications in a feedback loop @GoogleCloudTech
Stanford researchers find 1 in 20 AI benchmarks have serious flaws, which could mean the industry has been promoting underperforming models and penalizing better ones @StanfordHAI
Microsoft Research introduces Promptions, helping developers add dynamic, context-aware controls to chat interfaces so users can guide generative AI responses without writing long instructions @MSFTResearch
Nathan Lambert releases comprehensive talk covering every stage of building Olmo 3 Think, including changes to pretraining, evaluation, and post-training with focus on reinforcement learning infrastructure @natolambert
LeRobot Community Datasets v3 releases 50K episodes across 46 robot types from 235 contributors worldwide, representing one of the largest open-source crowdsourced robot demonstration collections @danaaubakir
Adi Oltean announces the training of the first LLM in space using an NVIDIA H100 onboard Starcloud-1, successfully training a nanoGPT model on Shakespeare's complete works and running inference @AdiOltean
Jeff Clune emphasizes that the fastest path to self-improving AI comes from embracing quality diversity, open-endedness, and AI-generating algorithms, with concepts like OMNI and Darwin-complete search spaces enabling recursively self-improving AI @KevinWang_111