AI Updates on 2025-09-11

Alibaba releases Qwen3-Next-80B-A3B with 80B parameters but only 3B activated per token, achieving 10x cheaper training and 10x faster inference than Qwen3-32B, especially at 32K+ context lengths @Alibaba_Qwen
The Qwen3-Next-80B-A3B-Instruct model approaches the performance of Alibaba's 235B flagship model, while Qwen3-Next-80B-A3B-Thinking outperforms Gemini-2.5-Flash-Thinking @Alibaba_Qwen
Google announces support for its SOTA Gemini Embeddings model in the Batch API at a 50% discount versus regular pricing, also available through the OpenAI compatibility layer @OfficialLoganK

Perplexity's valuation jumped to $20 billion from $18 billion just two months earlier, demonstrating rapid growth in AI-powered search @TechCrunch
Oracle's hiring surge and all-time-high valuation are revealed to be driven by its data center push for AI infrastructure @GergelyOrosz
Professional developers report that AI coding tools are most valuable for **migrations** rather than generating software from scratch, saving significant time and improving developer satisfaction @GergelyOrosz
Anthropic's quiet release strategy for major capability improvements in applications like Excel, PowerPoint, and personal assistant functions may be underselling how practically useful those improvements are @emollick
Hugging Face launches integration with GitHub Copilot Chat in VS Code, providing access to frontier open-source LLMs like Qwen3-Coder, gpt-oss, and GLM-4.5 through world-class inference partners @hanouticelina

FTC launches inquiry into AI chatbot safety, particularly focusing on companion chatbots and their impact on children, targeting major companies including OpenAI, Alphabet, Meta, and xAI @AndrewCurran_
California proposes SB 243, which would make it the first state to require safety protocols for AI companions and hold companies legally accountable if chatbots fail to meet safety standards @TechCrunch
Stanford HAI releases framework for approximating **political neutrality** in AI models, acknowledging true neutrality is technically impossible but offering 8 techniques to approach it @StanfordHAI

Claude demonstrates advanced **phone assistant** capabilities, successfully handling complex requests involving common sense and complicated constraints, though still requiring the larger Opus model for optimal performance @emollick
Replit Agent showcases end-to-end debugging and testing capabilities, able to click around applications and iterate for hours while providing full process playback and log analysis @tylerangert
Microsoft Research explores the **Model Context Protocol (MCP)** as a new standard for agent collaboration across fragmented tool ecosystems as agentic AI systems become more complex @MSFTResearch
Box releases new AI tools at its BoxWorks conference, advancing CEO Aaron Levie's vision for AI-led transformation of enterprise workflows @TechCrunch

Berkeley AI Research introduces **RecA (Reconstruction Alignment)** which significantly improves unified multimodal models with just 8k images and 4 hours of training on 8 GPUs, achieving major performance gains on GenEval, DPGBench, and ImgEdit benchmarks @XDWang101
NVIDIA develops an AlphaEvolve-like framework for autonomously evolving solvers for SAT, an NP-complete problem, representing an advance in evolutionary coding agents @richardcsuwandi
Research demonstrates that AI evaluations are fundamentally **data science** work, requiring skills in data analysis, visualization, and metrics design, with AI tools making the PyData ecosystem more accessible @HamelHusain
New study challenges the assumption that long context windows make RAG less important, with experiments across 18 different models showing RAG remains valuable @HamelHusain
PyTorch and Google develop local checkpointing solution using DCP to reduce training overhead and improve goodput for large-scale distributed training jobs @PyTorch