AI Updates on 2025-05-24

AI Model Announcements

  • Google's Veo 3 video generation model is now available in 71 new countries, with Pro subscribers getting a trial pack and Ultra subscribers receiving increased generation limits @GoogleAI @JeffDean @sundarpichai @demishassabis

AI Research

  • Berkeley AI Research published work on efficiently simulating phylodynamics for populations with billions of individuals, applicable to viral evolution and cancer genomics @berkeley_ai
  • Nathan Lambert suggests that RLVR (Reinforcement Learning with Verifiable Rewards) papers show mostly formatting improvements rather than new skills because they allocate too little compute to RL, estimating o3 uses closer to 5% of total compute for RL @natolambert

AI Applications

  • o3 was used to find a security vulnerability in the Linux kernel, demonstrating advanced capabilities in code analysis @gdb @aidan_mclau
  • Greg Brockman used Codex's "Ask" functionality to understand settings usage across an entire codebase, highlighting the value of AI-enhanced code reading @gdb
  • Replit has completely rewritten its documentation, adding new features including LLM support, AI chat, and search capabilities @amasad
  • Microsoft is building an AI agent for basic mitigation of on-call alerts, attempting to solve a painful problem for developers @GergelyOrosz
  • Code Four is building an AI Copilot for law enforcement that auto-generates reports, verifies narratives, and surfaces evidence, reducing desk time by 60% @ycombinator
  • The LLM Data Company has launched tooling to write, version, and execute evaluations for models and agents, helping measure performance and define rewards for reinforcement learning @ycombinator
  • Aegis helps healthcare providers automatically appeal denied insurance claims using AI @ycombinator
  • Kirana AI is building a full-stack manager for grocery stores that handles back-office tasks and integrates with camera systems for theft detection and inventory management @ycombinator
  • Galen AI serves as a 24/7 healthcare assistant powered by clinical and wearable data @ycombinator

AI Industry Analysis

  • Garry Tan questions why AI progress appears so even across multiple leading labs (xAI, OpenAI, Anthropic, Google) despite their differing resources, suggesting equalizing forces are currently beating inflationary forces @garrytan
  • Eugene Yan suggests RAG (Retrieval Augmented Generation) can be a "black hole" of resources for marginal improvements, with embedding-based retrieval potentially being a dead end for complex queries @eugeneyan
  • Aravind Srinivas tested browser agents for autonomous tasks and believes that reliable agents with full autonomy and recursive feedback loops are "around the corner" despite current limitations @AravSrinivas
  • Ethan Mollick argues companies are excited about agents because they think agents will let them skip the hard task of integrating AI into work processes, but more value will come from tackling that challenge directly @emollick

AI Ethics & Society

  • Scott Belsky explores the concept of "collective memory" in AI, asking what it means to share an AI's memory of us with colleagues and family, and raising concerns about privacy, status, and trust @scottbelsky
  • Hamel Husain shares insights on systematic failure mode analysis for LLM applications, emphasizing the importance of diverse traces, manual review, and letting categories emerge from data rather than imposing predetermined frameworks @HamelHusain
  • Garry Tan advises everyone to identify "toilsome tasks" in work and life that AI could handle, suggesting there's "massive alpha" in being the first expert in your field to leverage AI effectively @garrytan @ycombinator
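
The failure mode analysis workflow in the Hamel Husain item above (review diverse traces by hand, then let categories emerge from the notes) can be sketched roughly as follows. This is a minimal illustration, not his actual tooling: the trace records, the "note" field, and the tally_failure_modes helper are all hypothetical names assumed for the example.

```python
from collections import Counter

# Hypothetical reviewer notes: one free-text label per manually reviewed trace.
# The labels are written while reading traces, not picked from a fixed taxonomy.
reviewed_traces = [
    {"trace_id": "t1", "note": "Ignored user date constraint"},
    {"trace_id": "t2", "note": "hallucinated order id"},
    {"trace_id": "t3", "note": "ignored user date constraint "},
    {"trace_id": "t4", "note": ""},  # reviewed, no failure observed
    {"trace_id": "t5", "note": "wrong tool called"},
]


def tally_failure_modes(traces):
    """Count how often each reviewer-written label appears, most common first."""
    # Normalize case and whitespace so near-identical notes group together;
    # traces with an empty note are treated as having no observed failure.
    labels = [t["note"].strip().lower() for t in traces if t.get("note")]
    return Counter(labels).most_common()


if __name__ == "__main__":
    for label, count in tally_failure_modes(reviewed_traces):
        print(f"{count:>3}  {label}")
```

The categories here come entirely from the notes themselves, so the most frequent labels suggest where to look first; merging similar labels into broader categories would still be a manual step.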