AI Updates on 2025-08-05

OpenAI releases the gpt-oss family of two open-weight reasoning models, gpt-oss-120b (117B total/5.1B active parameters) and gpt-oss-20b (20.9B total/3.6B active parameters), under the Apache 2.0 license, reporting that the larger model roughly matches o4-mini and the smaller roughly matches o3-mini on reasoning benchmarks @OpenAI
Anthropic launches Claude Opus 4.1, an upgrade to Claude Opus 4 with improvements in agentic tasks, real-world coding, and reasoning, achieving a state-of-the-art 74.5% on SWE-bench Verified @AnthropicAI
Google DeepMind unveils Genie 3, a world model that creates interactive, playable environments from text prompts with real-time capabilities at 720p and 24 FPS, featuring long-horizon consistency with visual memory extending up to 1 minute @GoogleDeepMind
Qwen releases APIs for Qwen3-Coder-Flash and Qwen3-2507 models supporting 1M token context length, with Qwen-Plus-Latest also upgraded to 1M context support @Alibaba_Qwen
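
To put the total/active parameter counts quoted for gpt-oss in perspective, the short sketch below computes the share of weights used per token. The helper name active_fraction is hypothetical, and the only inputs are the figures quoted in the item above; it is plain arithmetic, not anything taken from OpenAI's code.

```python
def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters active per token (both counts in billions)."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected total > 0 and 0 <= active <= total")
    return active_b / total_b


# (total, active) parameter counts in billions, as quoted in the digest.
MODELS = {
    "gpt-oss-120b": (117.0, 5.1),
    "gpt-oss-20b": (20.9, 3.6),
}

for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {share:.1%} of weights active per token")
```

On these figures the larger model activates about 4.4% of its weights per token and the smaller about 17.2%.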

OpenAI's shift to open-weight models marks a significant strategic change, driven by pressure from Meta's Llama models, Chinese competitors, and the Trump administration; CEO Sam Altman had previously said the company was "on the wrong side of history" regarding open source @TechCrunch
Perplexity acquires Invisible HQ to strengthen infrastructure for AI agents, combining expertise in multi-agent orchestration with Comet browser capabilities @AravSrinivas
Cognition offers Windsurf employees exit packages just three weeks after the acquisition, providing accelerated equity vesting and nine months of additional pay for those who opt out @TechCrunch
App generation market analysis suggests segmentation rather than winner-take-all dynamics, with different platforms specializing in prototypes, personal tools, or production apps as complements rather than competitors @a16z
Microsoft Copilot integrates Shopify's commerce tools including Checkout Kit, Shopify Catalog, and Universal Cart to enable seamless embedded commerce experiences in AI conversations @tobi

OpenAI conducts a first-of-its-kind safety analysis by adversarially fine-tuning gpt-oss models to maximize biosecurity and cybersecurity capabilities, finding that the models do not reach High capability under its Preparedness Framework @Eric_Wallace_
OpenAI launches a $500K Red Teaming Challenge to strengthen open-source safety, inviting researchers worldwide to uncover novel risks in its open models @OpenAI
Cloudflare controversy emerges over blocking AI crawlers, with critics arguing the company is "dangerously misinformed on the basics of AI" and is prioritizing its own interests over open web access @perplexity_ai

Meta FAIR releases Open Direct Air Capture 2025 dataset, the largest open dataset for discovering advanced materials that capture CO2 directly from air, enabling rapid screening of carbon capture materials using AI @AIatMeta
Meta introduces FastCSP workflow that generates stable crystal structures for organic molecules, accelerating material discovery from months to days, along with the Open Molecular Crystals (OMC25) dataset of 25 million structures @AIatMeta
Google Gemini launches Storybook feature allowing users to create personalized, illustrated storybooks with read-aloud narration from text prompts and photos @GeminiApp
Stability AI introduces enterprise Solutions offering custom models and workflows for marketing, advertising, and design verticals, including product photography, brand style generation, and digital twins @StabilityAI
ElevenLabs launches AI music generator cleared for commercial use, expanding beyond voice synthesis into music creation @TechCrunch
Perplexity's Comet browser demonstrates AI-powered web navigation, with users reporting it successfully finding difficult-to-locate website sections through natural language commands @brextonpham

Google DeepMind's Genie 3 demonstrates emergent environmental consistency, maintaining object persistence even when objects are out of sight, representing significant progress in world model development, from 16 frames of 2D generation to 1 minute of real-world generation @AndrewCurran_
OpenAI's gpt-oss models are trained for agentic workflows with function calling, web search, Python execution, and configurable reasoning effort, using harmony response format for chain-of-thought reasoning and tool use @OpenAI
Circuit analysis research collaboration between Anthropic, Google DeepMind, Goodfire AI, EleutherAI, and Decode Research extends circuit tracing work with new methods for training transcoders and crosscoders and for comparing attribution graphs @neuronpedia
Research demonstrates that training models to generate next frames auto-regressively teaches them to maintain physical consistency across time, enabling world models to understand environmental persistence @agrimgupta92
Stanford NLP celebrates team member Luong Minh-Thang leading Google DeepMind's gold medal achievement at the International Mathematical Olympiad, with models operating end-to-end in natural language and producing proofs directly from the official problems @stanfordnlp
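
For a sense of scale on the Genie 3 figures quoted earlier (24 FPS, consistency over up to 1 minute, versus 16 frames in earlier work), the sketch below converts the time window into a frame count. It assumes the 1-minute window is sustained at the full 24 FPS, which the digest does not state outright; the function name frames_in_window is hypothetical.

```python
def frames_in_window(fps: int, seconds: float) -> int:
    """Number of frames generated over a time window at a fixed frame rate."""
    return round(fps * seconds)


GENIE3_FPS = 24        # frame rate quoted for Genie 3
MEMORY_SECONDS = 60    # "up to 1 minute" of consistency
EARLIER_FRAMES = 16    # earlier horizon quoted in the digest

# Assumption: the full window runs at the quoted frame rate.
genie3_frames = frames_in_window(GENIE3_FPS, MEMORY_SECONDS)
ratio = genie3_frames / EARLIER_FRAMES

print(f"{genie3_frames} frames, {ratio:.0f}x the earlier 16-frame horizon")
```

Under that assumption the window covers 1,440 frames, about 90 times the 16-frame horizon.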