AI Updates on 2025-10-21

Alibaba releases Qwen3-VL-2B and Qwen3-VL-32B models, with the 32B version outperforming GPT-5 mini and Claude 4 Sonnet across STEM, VQA, OCR, video understanding, and agent tasks while matching models up to 235B parameters @Alibaba_Qwen
Alibaba upgrades Qwen Deep Research to create not only reports but also live webpages and podcasts, powered by Qwen3-Coder, Qwen-Image, and Qwen3-TTS @Alibaba_Qwen
OpenAI launches ChatGPT Atlas, an AI-powered browser for macOS that can see web pages, answer questions in context, and complete tasks through agent mode for Plus and Pro users @OpenAI
Google's Veo 3.1 tops LMArena video leaderboards with significant improvements over Veo 3.0 for text-to-video (+30) and image-to-video (+70) generation @demishassabis
Google launches new AI-first coding experience in AI Studio optimized for building AI applications with Gemini @OfficialLoganK

Airbnb CEO reveals heavy reliance on Alibaba's Qwen model for production use, citing it as "very good, fast and cheap" while using OpenAI's latest models less frequently due to cost considerations @natolambert
AWS outage demonstrates how cloud dependencies can break seemingly local products, with the Postman API development tool and Eight Sleep smart beds becoming unusable while AWS was down @GergelyOrosz
Cloudflare CEO urges regulators to rein in Google's AI practices, arguing the tech giant's search dominance gives it an unfair edge in the AI race @TechCrunch
Warner Bros explores potential sale of media holdings after interest from multiple parties including Netflix, potentially affecting access to major IP for generative media applications @AndrewCurran_

Simon Willison expresses concerns about browser agents, stating that security and privacy challenges remain insurmountable for the category @simonw
Stanford faces challenges with students using ChatGPT to cheat during midterms, but professors cannot proctor exams due to honor code policies that require multi-year bureaucratic processes to change @polynoamial
Research shows 66% of Americans have never used ChatGPT, with a new position paper arguing that LLM research is being shaped around adopters while leaving non-adopters' needs behind @KaitlynZhou
YouTube launches likeness detection technology allowing creators to request removal of AI content using their face and voice @TechCrunch

Anthropic launches sandboxing in the Claude Code CLI to make it safer and faster, reducing permission prompts by 84% through controlled directory and network access @_catwu
Microsoft Research introduces SentinelStep to enable AI agents to handle long-running monitoring tasks like watching for emails or tracking prices by managing both when agents check and how much context they carry @MSFTResearch
Serval uses agentic AI to automate IT service management, with an approach meant to harness the strengths of agents while avoiding their common pitfalls @TechCrunch
WhatsApp and Messenger implement AI-powered safety features, with WhatsApp warning users before screen sharing with unknown contacts and Messenger flagging suspicious messages @TechCrunch
Google adds AI-enhanced audio to phone calls, reducing background noise and improving voice clarity even when speaking to landlines or older devices @TechCrunch
Casio's Moflin robot pet uses AI to develop a personality over time, representing advances in AI-powered companion devices @TechCrunch

New research reverse engineers Claude Haiku's mechanisms for performing perceptual tasks, discovering feature families, manifolds, geometric transformations, and distributed attention algorithms @wesg52
Andrej Karpathy explores whether pixels are better inputs to LLMs than text tokens, suggesting that rendering text as images could provide better information compression, more general input streams, and eliminate tokenizer dependencies @karpathy
Research demonstrates that AI models continue to improve across medical benchmarks, with many cases where current AI beats human doctors, though real-world performance studies remain limited @emollick
Studies examine the debate over when AI should be used to label data, finding that AI labels differ from human labels but may sometimes be better, which highlights how hard data labeling remains in AI development @emollick
Berkeley AI presents Botany-Bot at IROS 2025, which creates segmented 3D models of plants using Gaussian splats and uses robot arms to expose hidden plant anatomy details for phenotyping @funmilore
Analysis of self-play in AI reveals why it works well for two-player zero-sum games like chess and poker but faces challenges in real-world domains due to equilibrium strategies being untethered from human utility @polynoamial