AI Updates on 2025-10-16

Alibaba releases Qwen3-4B-SafeRL, a safety-aligned model fine-tuned via reinforcement learning that achieves a significant safety improvement on WildJailbreak (64.7 → 98.1) without compromising general task performance @Alibaba_Qwen
Alibaba launches Qwen3-VL-Flash on Alibaba Cloud Model Studio, a vision-language model that combines reasoning and non-reasoning modes with ultra-long context support (up to 256K tokens) and enhanced image/video understanding @Alibaba_Qwen
OpenAI updates Sora 2: storyboards are now available on web to Pro users, and video generation is extended to up to 15 seconds for all users and up to 25 seconds for Pro users on web @OpenAI
Google releases Veo 3.1 with significantly improved texture and surface detail rendering, making hair, fabrics, and surfaces appear more lifelike @GeminiApp
Google AI announces DeepSomatic for cancer diagnostics and the Gemma C2S-Scale 27B model, which generated a novel hypothesis for converting "cold" tumors into "hot" tumors for immunotherapy treatment @GoogleAI

OpenAI reportedly pitched companies on a "sign in with ChatGPT" feature where startups could shift API costs to customers by charging against their ChatGPT capacity limits instead of paying OpenAI directly @btibor91
Anthropic introduces Claude integration with Microsoft 365 and enterprise search capabilities, allowing users to search SharePoint, OneDrive, Outlook and Teams for tailored responses @AnthropicAI
Microsoft reports a rapid increase in AI use by nation states over the last year in its 2025 Digital Defense Report, highlighting AI's growing role in cybersecurity threats @AndrewCurran_
Big Tech employment among graduates of top US universities has grown 3-4x, from less than 10% to well over 20%, in the past 20 years, making Big Tech the #1 career choice for most elite university graduates @deedydas
Deel raises $300M at a $17.3B valuation and reports that it has been profitable for three years and has surpassed $1 billion in ARR @TechCrunch

Senior engineers in private Slack channels are reportedly dismissing claims about AI usage at scale as lies, showing denial rather than curiosity about AI capabilities in enterprise settings @clairevo
Pinterest rolls out new controls allowing users to limit AI-generated content in their feeds and makes AI content labels more visible to address user concerns about synthetic content @TechCrunch
EFF files lawsuit alleging the Trump administration is monitoring and punishing non-citizens who express social media views that the government disfavors, raising concerns about AI-powered surveillance @TechCrunch

Google DeepMind partners with Commonwealth Fusion Systems to use reinforcement learning for discovering novel real-time control strategies to accelerate fusion energy development @AndrewCurran_
OpenAI launches its "OpenAI for Science" initiative, with a physicist as the first hire, to advance scientific discovery using AI @AndrewCurran_
Waymo partners with DoorDash to expand robotaxi services into delivery, marking a potential return to delivery applications for autonomous vehicles @TechCrunch
Kayak introduces "AI Mode", which lets travelers research, plan, and book trips through a built-in chatbot directly on its main platform @TechCrunch
Microsoft introduces the first commercially available ambient experience built for nursing workflows to help nurses focus on patient care @satyanadella
Perplexity AI launches language learning features with practice words, basic terms, and flashcards for advanced phrases on iOS and web @perplexity_ai

Andrew Ng emphasizes that the single biggest predictor of AI agent development progress is the team's ability to drive disciplined processes for evaluations and error analysis, rather than using the latest buzzy techniques @AndrewYNg
Andrej Karpathy completes training of the nanochat d32 model for $1,000, achieving a CORE score of 0.31 (above GPT-2's ~0.26) and a GSM8K improvement from ~8% to ~20%, demonstrating micro-model capabilities @karpathy
Research paper "The Art of Scaling Reinforcement Learning Compute for LLMs" provides the first comprehensive analysis of scaling RL with large language models @natolambert
MIT CSAIL introduces "GLASS Flows" approach that boosts text-image alignment for large-scale models at inference time using ODEs to simulate random changes without retraining @MIT_CSAIL
Hugging Face re-launches HuggingChat v2 with 115 open source models in a single interface and introduces HuggingChat Omni for automatic model selection across different providers @reach_vb
Tiny Recursion Model (TRM) achieves 40% on ARC-AGI-1 at $1.76/task and 6.2% on ARC-AGI-2 at $2.10/task, contributing open source research to the community @arcprize
World Labs releases RTFM, a real-time, persistent, and 3D-consistent generative world model that runs on a single H100 GPU @drfeifei