AI Updates on 2025-10-03

OpenAI releases Sora 2 Pro with higher resolution and 15-second clips (up from 10 seconds), now rolling out to Pro accounts @AndrewCurran_
Anthropic announces improvements to Claude Sonnet 4.5 for cybersecurity tasks, making it comparable or superior to Opus 4.1 while being faster and cheaper @AnthropicAI

Sierra Agent OS demonstrates how supervisory models, filtering, and evaluations provide industry-leading performance in enterprise AI applications @btaylor
MIT CSAIL report shows AI startups spend heavily on general LLM assistants and coding tools, highlighting how AI augments some employees while turning other roles into broadly deployed skills @MIT_CSAIL
a16z analysis reveals software is targeting the $13 trillion US labor market compared to just $300 billion for SaaS, with AI enabling software to perform work itself and charge on outcomes @a16z
Microsoft emphasizes building fungible and flexible AI infrastructure to meet real-world needs across inference and training, powering major workloads like Copilot and ChatGPT @satyanadella

Anthropic warns that AI's impact on cybersecurity is at an inflection point, with Claude now outperforming human teams in some competitions while attackers also use AI to expand operations @AnthropicAI
Ethan Mollick observes that when given tools to create anything, people primarily make videos of cats, celebrities, and anime characters, suggesting AI creativity tools may need different curation approaches @emollick
Mustafa Suleyman argues AI memory represents more than personalization, evolving into co-memory that remembers the world with users and proactively resurfaces information @mustafasuleyman

Ethan Mollick demonstrates Sora 2 creating highly specific content including academic references, suggesting an LLM is involved in the pipeline between prompt and video output @emollick
Comet browser gains rapid adoption on both Windows and Mac, with AI integration that doesn't feel intrusive or forced on users @AravSrinivas
Physical Intelligence releases pi0.5 Vision-Language-Action model on Hugging Face, designed for open-world generalization across physical, semantic, and environmental levels through co-training on heterogeneous data sources @ClementDelangue

Research shows that training AI models on enough video enables them to reason about images in ways they were never trained for, including solving mazes and puzzles, with larger models performing better on out-of-distribution tasks @emollick
Sora 2 achieves 55% on the GPQA Diamond benchmark, matching Claude 3 Opus's performance at launch and raising questions about whether this reflects pure video model capabilities or involves additional language model components @AndrewCurran_
GPT-5 Pro demonstrates improved error detection capabilities in academic work, catching subtle citation errors that human reviewers missed @emollick
Stanford researchers introduce the RLAD framework for training LLMs to discover reasoning abstractions: natural-language hints that encode procedural knowledge for structured exploration in complex reasoning problems @Anikait_Singh_